This post slides from my talk at Agile Austin on February 12, 2013.
I want to thank the group for the opportunity and the audience for the great interaction!
Link to slides: DevOps Beyond the Basics – FINAL
As mentioned in the last post, once there is a “whole system” understanding of an application system, the next problem is that there are really multiple variants of that system running within the organization at any given time. There are notionally at least three: Development, Test, and Production. In reality, however, most shops frequently have multiple levels of test and potentially more than one Development variant. Some even have Staging or “Pre-production” areas very late in test where the modified system must run for some period before finally replacing the production environment. A lot of this environment proliferation is based on historic processes that are themselves a product of the available tooling and lessons organizations have learned over years of delivering software.
Tooling and processes are constantly evolving. The DevOps movement is really a reflection of the mainstreaming of Agile approaches and cloud-related technologies and is ultimately a discussion of how to best exploit it. That discussion, as it applies to environment proliferation, means we need to get to an understanding of the core problems we are trying to solve. The two main problem areas are maintaining the validity of the sub-production environments as representative of production and tracking the groupings of changes to the system in each of the environments.
The first problem area, that of maintaining the validity of sub-production envrionments, is a more complex problem than it would seem. There are organizational silo problems where multiple different groups own the different environments. For example, a QA group may own the lab configuraitons and therefore have a disconnect relative to the production team. There are also multipliers associated with technical specialities, such as DBAs or Network Administration, which may be shared across some levels of environment. And if the complexity of the organization was not enough, there are other issues associated with teams that do not get along well, the business’ perception that test environments are less critical than production, and other organizational dynamics that make it that much more difficult to ensure good testing regimes are part of the process.
The second key problem area that must be addresssed is tracking the groups of changes to the application system that are being evaluated in a particular sub-production environment. This means having a unique identifier for the combination of application code, the database schema and dataset, system configuration, and network configuration. That translates to five version markers – one for each of the main areas of the application system plus one for the particular combination of all four. On the surface, this is straightforward, but in most shops, there are few facilities for tracking versions of configurations outside of software code. Even when they are, they are too often not connected to one another for tracking groupings of configurations.
They typical pattern for solving these two problems actually begins with the second problem first. It is difficult to ensure the validity of a test environment if there is no easy way to identify and understand the configuration of the components involved. This is why many DevOps initiatives start with configuration management tools such as Puppet, Chef, or VMWare VCenter. It is also why “all-in-one” solutions such as IBM’s Pure family are starting to enter the market. Once an organization can get a handle on their configurations, then it is substantially easier to have fact-based engineering conversations about valid test configurations and environments because everyone involved has a clear reference for understanding exactly what is being discussed.
This problem discussion glosses over the important aspect of being able to maintain these tools and environments over time. Consistently applying the groups of changes to the various environments requires a complex system by itself. The term system is most appropirate because the needed capabilities go well beyond the scope of a single tool and then those capabilities need to be available for each of the system components. Any discussion of such broad capabilities is well beyond the scope of a single blog post, so the next several posts in this series will look at framework for understanding the capabilities needed for such a system.
As discussed last time, having a clear understanding of the thing being changed is key to understanding how to change it. Given that, this post will focus on creating a common framework for understanding the “Change-ee” systems. To be clear, the primary subject of this discussion are software application systems. That should be obvious from the DevOps discussion, but I prefer not to assume things.
Application systems generally have four main types of components. First, and most obviously, is the software code. That is often referred to as the “application”. However, as the DevOps movement has long held, that is a rather narrow definition of things. The software code can not run by itself in a standalone vacuum. That is why these posts refer to an application *system* rather than just an application. The other three parts of the equation are the database, the server infrastructure and the network insfrastructure. It takes all four of these areas working together for an application system to function.
Since these four areas will frame the discussion going forward, we need to have a common understanding about what is in each. It is important to understand that there are variants of each of these components as changes are applied and qualified for use in the production environment. In other words, there will be sub-production environments that have to have representative configurations. And those have to be considered when deciding how to apply changes through the environment.
The complicating factor for these four areas is that there are multiple instances of each of them that exist in an organization at any given time. And those multiple instances may be at different revision levels. Dealing with that is a discussion unto itself, but is no less critical to understanding the requirements for a system to manage your application system. The next post will examine this aspect of things and the challenges associated with it.
This is the first post in a series which will look at common patterns among DevOps environments. Based on these patterns, they will attempt to put a reasonable structure together that will help organizations focus DevOps discussions, prioritize choices, and generally improve how they operate.
In the last post, I discussed how many shops take the perspective of developing a system for DevOps within their environments. This notion of a “system for changing systems” as a practical way of approaching DevOps requires two pieces. The first is the system being changed – the “change-ee” system. The second is the system doing the changing – the “DevOps”, or “change-er” system. Before talking about automatically changing something, it is necessary to have a consistent understanding of the thing being changed. Put another way, no automation can operate well without a deep understanding of the thing being automated. So this first post is about establishing a common language for generically understanding the application systems; the “change-ee” systems in the discussion.
A note on products, technologies and tools… Given the variances in architectures for application (“change-ee”) systems, and therefore the implied variances on the systems that apply changes to them, it is not useful to get product prescriptive for either. In fact, a key goal with this framework is to ensure that it is as broadly applicable and useful as possible when solving DevOps-related problems in any environment. That would be very difficult if it overly focused on any one technology stack. So, these posts will not necessarily name names other than to use them as examples of categories of tools and technologies.
With these things in mind, these posts will progress from the inside-out. The next post will begin the process with a look at the typical components in an application system (“change-ee”). From there, the next set of posts will discuss the capabilities needed to systematically apply changes to these systems. Finally, after the structure is completed, the last set of posts will look at the typical progression of how organizations build these capabilities.
The next post will dive in and start looking at the structure of the “change-ee” environment.