October 29, 2012

A System for Changing Systems – Part 5 – Top-level Categories

The first step to understanding the framework is to define the broad, top-level capability areas. A very common problem in technology is the over-use of terms that can have radically different meanings depending on the context of a conversation. So, as with any effort to clarify the discussion of a topic, it is critical to define terms and hold to those definitions during the course of the discussion.

Top level categories of capabilities around various environments in which applications typically must run.

Top level capability areas for sustaining application systems across environments.

At the top level of this framework are six capability groupings:

Change Management – This category is for capabilities that ensure that changes to the system are properly understood and tracked as they happen. This is a massively overused term, but the main idea for this framework is that managing changes is not the same thing as applying them. Other capabilities deal with that. This capability category is all about oversight.
Orchestration – This category deals with the ability to coordinate activity across different components, areas, and technologies in a complex distributed application system in a synchronized manner.
Deployment – This category covers the activities related to managing the lifecycles of an application system’s artifacts through the various environments. Put more simply, this area deals with the mechanics of actually changing out pieces of an application system.
Monitoring – The monitoring category deals with instrumenting the environment for various purposes. This instrumentation concept covers all pieces of the application system and provides feedback in the appropriate manner for interested stakeholders. For example, capacity usage for operations and feature usage for development.
System Registry – This refers to the need for a flexible and well-understood repository of shared information about the infrastructure in which the application system runs. This deals with the services on which the application system depends and which may need to be updated before a new instance of the application system can operate correctly.
Provisioning – This capability is about creating and allocating the appropriate infrastructure resources for an instance of the application system to run properly. This deals with the number and configuration of those resources. While this area is related to deployment, it is separate because in many infrastructures it may not be desirable or even technically possible to provision fresh resources with each deployment, and linking the two would blunt the relevancy of the framework.
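One way to hold to these definitions during a discussion is to write them down as data. The following is only a minimal sketch and not part of the framework itself: the six names come from the list above, while the 0-3 scoring scale, the sample scores, and the `weakest_areas` helper are assumptions made up for illustration.

```python
from enum import Enum


class Capability(Enum):
    """The six top-level capability areas named in the list above."""
    CHANGE_MANAGEMENT = "Change Management"
    ORCHESTRATION = "Orchestration"
    DEPLOYMENT = "Deployment"
    MONITORING = "Monitoring"
    SYSTEM_REGISTRY = "System Registry"
    PROVISIONING = "Provisioning"


def weakest_areas(scores):
    """Return the capability areas with the lowest score, sorted by name.

    `scores` maps Capability -> int on a hypothetical 0-3 scale;
    areas missing from the mapping are treated as 0 (not assessed).
    """
    full = {cap: scores.get(cap, 0) for cap in Capability}
    lowest = min(full.values())
    return sorted((cap for cap, s in full.items() if s == lowest),
                  key=lambda cap: cap.value)


# Hypothetical self-assessment for a single test environment.
test_env_scores = {
    Capability.CHANGE_MANAGEMENT: 2,
    Capability.ORCHESTRATION: 1,
    Capability.DEPLOYMENT: 3,
    Capability.MONITORING: 1,
    Capability.SYSTEM_REGISTRY: 2,
    Capability.PROVISIONING: 2,
}

for cap in weakest_areas(test_env_scores):
    print(cap.value)
```

Using a fixed enumeration means every environment is assessed against the same six terms, which is the point of defining them up front.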

The next few posts will dig into the sub-categories underneath each of these top-level items.

Posted in Agile, Cloud, DevOps, Management
Tagged Agile, application change management, Cloud, configuration management, deployment, dev ops, DevOps, software-development

October 15, 2012

A System for Changing Systems – Part 4 – Groundwork for Understanding the Capabilities of a System Changing System

In the last couple of posts, we have talked about how application systems need a change application system around them to manage the changes to the application system itself. A “system to manage the system” as it were. We also talked about the multi-part nature of application systems and the fact that the application systems typically run in more than one environment at any given time and will “move” from environment to environment as part of their QA process. These first three posts seek to set a working definition of the thing being changed so that we can proceed to a working definition of a system for managing those changes. This post starts that second part of the series – defining the capabilities of a change application system. This definition will then serve as the base for the third part – pragmatically adopting and applying the capabilities to begin achieving a DevOps mode of operation.

DevOps is a large problem domain with many moving parts. Just within the first set of these posts, we have seen how four rather broad area definitions can multiply substantially in a typical environment. Further, there are aspects of the problem domain that will be prioritized by different stakeholders based on their discipline’s perspective on the problem. The whole point of DevOps, of course, is to eliminate that perspective bias. So, it becomes very important to have some method for unifying the understanding and discussion of the organization’s capabilities. In the final analysis, it is not as important what that unified picture looks like as it is that the picture be clearly understood by all.

To that end, I have put together a framework that I use with my customers to help in the process of understanding their current state and prioritizing their improvement efforts. I initially presented this framework at the Innovate 2012 conference and subsequently published an introductory whitepaper on the IBM developerWorks website. My intent with these posts is to expand the discussion and, hopefully, help folks get better faster. The interesting thing to me is to see folks adopt this either as is or as the seed of something of their own. Either way, it has been gratifying to see folks respond to it in its nascent form and I think the only way for it to get better is to get more eyeballs on it.

So, here is my picture of the top-level of the capability areas (tools and processes) an organization needs to have to deliver changes to an application system.

Overview of capability areas required to sustain environments

The quality and maturity of these within the organization will vary based on their business needs – particularly around formality – and the frequency with which they need to apply changes.

I applied three principles when I put this together:

The capabilities had to be things that exist in all environments in which the application system runs (i.e. dev, test, prod, or whatever layers exist). The idea here is that such a perspective will help unify tooling and approaches toward a theoretical ideal of one solution for all environments.
The capabilities had to be broad enough to allow for different levels of priority / formality depending on the environment. The idea is not to burden a more volatile test environment with production-grade formality, or vice versa, but to allow a structured discussion of how the team will deliver that capability in a unified way to the various environments. DevOps is an Agile concept, so the notion of minimally necessary applies.
The capabilities had to be generic enough to apply to any technology stack that an organization might have. Larger organizations may need multiple solutions based on the fact that they have many application systems that were created at different points in time, in different languages, and in different architectures. It may not be possible to use exactly the same tool / process in all of those environments, but it most certainly is possible to maintain a common understanding and vocabulary about it.

In the next couple of posts, I will drill a bit deeper into the capability areas to apply some scope, focus, and meaning.

Posted in Agile, Cloud, DevOps, Enterprise
Tagged Agile, Automation, Cloud, dev ops, DevOps, enterprise-it, virtualization

October 1, 2012

A System for Changing Systems – Part 3 – How Many “Chang-ee”s

As mentioned in the last post, once there is a “whole system” understanding of an application system, the next problem is that there are really multiple variants of that system running within the organization at any given time. There are notionally at least three: Development, Test, and Production. In reality, however, most shops frequently have multiple levels of test and potentially more than one Development variant. Some even have Staging or “Pre-production” areas very late in test where the modified system must run for some period before finally replacing the production environment. A lot of this environment proliferation is based on historic processes that are themselves a product of the available tooling and lessons organizations have learned over years of delivering software.

This is a simplified, real-world example flow through some typical environments. Note the potential variable paths – another reason to know what configuration is being tested.

Tooling and processes are constantly evolving. The DevOps movement is really a reflection of the mainstreaming of Agile approaches and cloud-related technologies and is ultimately a discussion of how to best exploit them. That discussion, as it applies to environment proliferation, means we need to get to an understanding of the core problems we are trying to solve. The two main problem areas are maintaining the validity of the sub-production environments as representative of production and tracking the groupings of changes to the system in each of the environments.

The first problem area, that of maintaining the validity of sub-production environments, is a more complex problem than it would seem. There are organizational silo problems where multiple different groups own the different environments. For example, a QA group may own the lab configurations and therefore have a disconnect relative to the production team. There are also multipliers associated with technical specialties, such as DBAs or Network Administration, which may be shared across some levels of environment. And if the complexity of the organization were not enough, there are other issues associated with teams that do not get along well, the business’ perception that test environments are less critical than production, and other organizational dynamics that make it that much more difficult to ensure good testing regimes are part of the process.

The second key problem area that must be addressed is tracking the groups of changes to the application system that are being evaluated in a particular sub-production environment. This means having a unique identifier for the combination of application code, the database schema and dataset, system configuration, and network configuration. That translates to five version markers – one for each of the main areas of the application system plus one for the particular combination of all four. On the surface, this is straightforward, but in most shops, there are few facilities for tracking versions of configurations outside of software code. Even when such facilities exist, they are too often not connected to one another for tracking groupings of configurations.
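To make the five version markers concrete, here is a minimal sketch of one possible approach. It assumes each of the four areas already exposes some version string of its own; the `SystemVersion` class, the hash-based combined identifier, and the sample version strings are all hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemVersion:
    """Version markers for the four main areas of an application system."""
    application_code: str
    database: str
    system_config: str
    network_config: str

    def combined_id(self) -> str:
        """Fifth marker: a short, deterministic ID for this exact combination.

        Assumes the individual version strings contain no newlines, since a
        newline is used as the separator before hashing.
        """
        parts = [self.application_code, self.database,
                 self.system_config, self.network_config]
        digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
        return digest[:12]


# Hypothetical version strings; real ones would come from each area's tooling.
qa = SystemVersion("app-4.2.0", "schema-31", "os-img-17", "net-9")
staging = SystemVersion("app-4.2.0", "schema-31", "os-img-18", "net-9")

# The two environments differ only in system configuration, so comparing the
# combined IDs is enough to see that they are not the same grouping of changes.
print(qa.combined_id(), staging.combined_id())
```

Deriving the fifth marker from the other four is a design choice, not a requirement: a manually assigned release label would work as well, as long as it maps to exactly one combination.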

The typical pattern for solving these two problems actually begins with the second problem first. It is difficult to ensure the validity of a test environment if there is no easy way to identify and understand the configuration of the components involved. This is why many DevOps initiatives start with configuration management tools such as Puppet, Chef, or VMware vCenter. It is also why “all-in-one” solutions such as IBM’s Pure family are starting to enter the market. Once an organization can get a handle on their configurations, then it is substantially easier to have fact-based engineering conversations about valid test configurations and environments because everyone involved has a clear reference for understanding exactly what is being discussed.
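As an illustration of what a fact-based conversation can rest on, the sketch below compares two configuration snapshots and reports where a test environment differs from production. It assumes the configurations have already been exported as flat dictionaries; the `config_drift` helper and the sample settings are hypothetical and do not represent the output of any particular tool.

```python
def config_drift(reference, candidate):
    """Compare two flat configuration snapshots (dicts of setting -> value).

    Returns a dict of setting -> (reference_value, candidate_value) for every
    setting that differs. A setting absent from one side is reported as None,
    so this sketch cannot distinguish "absent" from an explicit None value.
    """
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_val = reference.get(key)
        cand_val = candidate.get(key)
        if ref_val != cand_val:
            drift[key] = (ref_val, cand_val)
    return drift


# Hypothetical snapshots of a production and a QA lab configuration.
production = {"os_image": "17", "jvm_heap_mb": 4096, "db_schema": "31"}
qa_lab = {"os_image": "17", "jvm_heap_mb": 2048, "db_schema": "31",
          "debug_logging": True}

for setting, (prod_val, qa_val) in config_drift(production, qa_lab).items():
    print(f"{setting}: production={prod_val!r} qa={qa_val!r}")
```

An empty result means the two snapshots agree on every recorded setting, which is only as meaningful as the snapshots are complete.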

This problem discussion glosses over the important aspect of being able to maintain these tools and environments over time. Consistently applying the groups of changes to the various environments requires a complex system by itself. The term system is most appropriate because the needed capabilities go well beyond the scope of a single tool and then those capabilities need to be available for each of the system components. Any discussion of such broad capabilities is well beyond the scope of a single blog post, so the next several posts in this series will look at a framework for understanding the capabilities needed for such a system.

Posted in Agile, DevOps, Enterprise, Management
Tagged Agile, application change, application systems, Change Management, DevOps, enterprise-it, management, software-development

September 24, 2012

A System for Changing Systems – Part 2 – The “Chang-ee”

As discussed last time, having a clear understanding of the thing being changed is key to understanding how to change it. Given that, this post will focus on creating a common framework for understanding the “Change-ee” systems. To be clear, the primary subject of this discussion is software application systems. That should be obvious from the DevOps discussion, but I prefer not to assume things.

Application systems generally have four main types of components. First, and most obviously, is the software code. That is often referred to as the “application”. However, as the DevOps movement has long held, that is a rather narrow definition of things. The software code cannot run by itself in a standalone vacuum. That is why these posts refer to an application *system* rather than just an application. The other three parts of the equation are the database, the server infrastructure, and the network infrastructure. It takes all four of these areas working together for an application system to function.

Since these four areas will frame the discussion going forward, we need to have a common understanding about what is in each. It is important to understand that there are variants of each of these components as changes are applied and qualified for use in the production environment. In other words, there will be sub-production environments that have to have representative configurations. And those have to be considered when deciding how to apply changes through the environment.

Application Code – This is the set of functionality defined by the business case that justifies the existence of the application system in the first place and consists of the artifacts created by the development team for the solution including things such as server code, user interface artifacts, business rules, etc.
Database & Data – This is the data structure required for the application to run. This area includes all data-related artifacts, whether they are associated with a traditional RDBMS, a “NoSQL” system, or just flat files. This includes data, data definition structures (e.g. schema), test datasets, and so forth.
Server Infrastructure (OS, VM, Middleware, Storage) – This represents the services and libraries required for the application to run. A broad category ranging from the VM/OS layer all the way through the various middleware layers and libraries on which the application depends. This area also includes storage for the database area.
Network Infrastructure – This category is for all of the inter-system communications components and links required for users to derive value from the application system. This includes the connectivity to the users, connectivity among servers, connectivity to resources (e.g. storage), and the devices (e.g. load balancers, routers, etc.) that enable the application system to meet its functional, performance, and availability requirements.

Conceptual image of the main system component areas that need to be in sync in order for a system to operate correctly

The complicating factor for these four areas is that there are multiple instances of each of them that exist in an organization at any given time. And those multiple instances may be at different revision levels. Dealing with that is a discussion unto itself, but is no less critical to understanding the requirements for a system to manage your application system. The next post will examine this aspect of things and the challenges associated with it.

Posted in Agile, DevOps, Enterprise, Management
Tagged Agile, application change, application systems, Change Management, development, DevOps, software-development, virtualization

September 17, 2012

A System for Changing Systems – Part 1 – Approach

This is the first post in a series which will look at common patterns among DevOps environments. Based on these patterns, the posts will attempt to put a reasonable structure together that will help organizations focus DevOps discussions, prioritize choices, and generally improve how they operate.

In the last post, I discussed how many shops take the perspective of developing a system for DevOps within their environments. This notion of a “system for changing systems” as a practical way of approaching DevOps requires two pieces. The first is the system being changed – the “change-ee” system. The second is the system doing the changing – the “DevOps”, or “change-er” system. Before talking about automatically changing something, it is necessary to have a consistent understanding of the thing being changed. Put another way, no automation can operate well without a deep understanding of the thing being automated. So this first post is about establishing a common language for generically understanding the application systems; the “change-ee” systems in the discussion.

A note on products, technologies and tools… Given the variances in architectures for application (“change-ee”) systems, and therefore the implied variances on the systems that apply changes to them, it is not useful to get product prescriptive for either. In fact, a key goal with this framework is to ensure that it is as broadly applicable and useful as possible when solving DevOps-related problems in any environment. That would be very difficult if it overly focused on any one technology stack. So, these posts will not necessarily name names other than to use them as examples of categories of tools and technologies.

With these things in mind, these posts will progress from the inside-out. The next post will begin the process with a look at the typical components in an application system (“change-ee”). From there, the next set of posts will discuss the capabilities needed to systematically apply changes to these systems. Finally, after the structure is completed, the last set of posts will look at the typical progression of how organizations build these capabilities.

The next post will dive in and start looking at the structure of the “change-ee” environment.

Posted in Agile, DevOps, Enterprise, Management
Tagged application change, application systems, DevOps, enterprise-it, management, software-development

September 10, 2012

DevOps is about Developing a Change Application System

As the DevOps movement rolls on, there is a pattern emerging. Some efforts are initiated by development, seeking relief on test environment management. Others are initiated by operations departments trying to get more automation and instrumentation into the environments they manage. I frequently hear comments that are variations on “same stuff I’ve been doing for xx years, different title” from people who have DevOps in their job title or job description. Their shops are hoping that if they encourage folks to think about DevOps and maybe try some new tools, they will get the improvements promised by DevOps discussions online. Well, just like buying a Ferrari and talking it up won’t make you Michael Schumacher, having Puppet or Chef to do your configuration management won’t “make you DevOps” (whatever that means). Successful DevOps shops are bypassing the window dressing and going at DevOps as a project unto itself.

There are a number of unique aspects to undertaking a project such as this. They require a holistic perspective on the process, touch a very broad range of activities, and provide an approach for changing other systems while being constantly changed themselves.

These projects are unique in the software organization because they require a team to look at the whole end-to-end approach to delivering changes to the application systems within that organization FROM THE SIDE, rather than from a position somewhere in the middle of the process. This is an important difference in approach, because it forces a change in perspective on the problem. Someone looking from either the development or the operations “end” of the process will often suffer from a perception problem where the “closer” problems in the process look bigger than the ones “farther” up or down the process line. It is a very human thing to be deceived by the perspective of our current position. After all, there are countless examples of using perspective for optical illusions. Clever Leaning Tower of Pisa pictures (where someone appears to be holding it up) and the entire Lord of the Rings movie trilogy (the actors playing the hobbits are not that short) provide easy examples. Narrowness of perspective is, in fact, a frequent reason that “grassroots” efforts fail outside of small teams. Successfully making large and impactful changes requires a broader perspective.

The other breadth-related aspect of these programs is that they touch a very wide range of activities over time and seek to optimize for flow both through and among each. That means that they have some similarities with supply chain optimization and ERP projects if not in scale, then in complexity. And the skills to look at those flows probably do not exist directly within the software organization, but in the business units themselves. It can be difficult for technology teams, that see themselves as critical suppliers of technology to business units, to accept that there are large lessons to be learned about technology development from the business units. It takes a desire to learn and change at a level well above a typical project.

A final unique part is that there must be ongoing programs for building and enhancing a system for managing consistent and ongoing changes in other systems. Depending on your technology preference, there are plenty of analogies from pipelines, power grids, and aircraft that apply here. Famous and fun ones are the flight control systems of intrinsically unstable aircraft such as the F-16 fighter or B-2 bomber. These planes use technology to adjust control surfaces within fractions of a second to maintain steady and controlled flight within the extreme conditions faced by combat aircraft. Compared to that, delivering enhancements to a release automation system every few weeks sounds trivial, but maintaining the discipline and control to do so in a large organization can be a daunting task.

So the message here is to deliberately establish a program to manage how changes are applied. Accept that it is going to be a new and unusual thing in your organization and that it is going to require steady support and effort to be successful. Without that acceptance, it will likely not work out.

My next few posts are going to dig into this deeper and begin looking at the common aspects of these programs, how people approach the problem, how they organize and prioritize their efforts, and the types of tools they are using.

Posted in Agile, Cloud, DevOps, Enterprise
Tagged Change Management, Cloud, development, DevOps, enterprise-it, software-development

September 4, 2012

Another Example of Grinding Mental Gears

I recently got a question from a customer who was struggling with the ‘availability’ of their sub-production environments. The situation brought into focus a fundamental disconnect between the Ops folks who were trying to maintain a solid set of QA environments for the Dev team and what the Dev teams needed. To a large extent this is a classic DevOps dilemma, but the question provides an excellent teaching moment. Classic application or system availability as defined for a production situation does not really apply to Dev or multi-level Test environments.

Look at it this way. End user productivity associated with a production environment is based upon the “availability” of the application. Development and Test productivity is based upon the ability to view changes to the application in a representative (pre-production) environment. In other words, the availability of the _changer_ in pre-production is more valuable to Dev productivity than any specific pre-production instance of the application environment. Those application environment instances are, in fact, disposable by definition.

Disposability of a running application environment is a bit jarring to Ops folks when they see a group of users (developers and testers in this case) needing the system. Everything in Ops tools and doctrine is oriented toward making sure that an application environment gets set up and STAYS that way. That focus on keeping things static is exactly the point to which DevOps is a reaction. Knowing that does not make it easy to make the mental shift, of course. Once made, however, it is precisely why tools that facilitate rapidly provisioning environments are frequently the earliest arrivals when most organizations seek to adopt DevOps.

Posted in Agile, DevOps, Enterprise, Management
Tagged Agile, Change Management, DevOps, enterprise-it

June 4, 2012

DevOps is NOT a Job Title

Given my recent posts about organizational structure, I feel like I need to clarify my stance on this…

You know a topic is hot when recruiters start putting it in job titles. I do believe that most organizations will end up with a team of “T-shaped people” focused on using DevOps techniques to ensure that systems can support an Agile business and its development processes. However, I am not a fan of hanging DevOps on the title of everyone involved.

Here’s the thing, if you have to put it in the name to convince yourself or other people you are doing it, you probably are not. And the very people you hope to attract may well avoid your organization because it fails the ‘reality’ test. In other words, you end up looking like you don’t get it. A couple of analogies come to mind immediately.

First, let’s look at a country that calls itself the “People’s Democratic Republic of” somewhere. That is usually an indicator that it is not any of those modifiers and the only true statement is the ‘somewhere’ part. Similarly, putting “DevOps Sysadmin” on top of a job description that, just last week, said “Sysadmin” really isn’t fooling anyone.

Second, hanging buzzwords on job titles is like a 16 year old painting racing stripes on the four door beater they got as their first car. With latex house paint. You may admire their enthusiasm and optimism. You certainly wish them the best. But you have a pretty realistic assessment of the car.

Instead, DevOps belongs down in the job description. DevOps in a job role is a mindset and an approach used to define how established skills are applied. You are looking for a Release Manager to apply DevOps methods in support of your web applications. Put it down in the requirements bullet points just as you would put things like ‘familiar with scripting languages’, ‘used to operating in an [Agile/Lean/Scrum] environment’, or ‘experience supporting a SaaS infrastructure’.

I realize that I am tilting at windmills here. We went through a spate of “Agile” Development Managers and the number of “Cloud” Sysadmins is just now tapering. So, I guess it is DevOps’ turn. To be sure, it is gratifying and validating to see such proof that DevOps is becoming a mainstream topic. I should probably adopt a stance of ‘whatever spreads the gospel to the masses’. But I really just had to get this rant off my chest after seeing a couple of serious “facepalm” job ads.

Posted in Agile, DevOps, Enterprise, Management
Tagged Agile, culture, development, DevOps, enterprise-it, software-development

May 7, 2012

Start Collaboration with Teaching

Every technology organization should force everyone in the group to regularly educate the group on what they are doing. This should be a cross-discipline activity – not a departmental activity. There are three reasons to do this. The first is obvious – there is an intrinsic value in sharing the knowledge. The second is that the teachers themselves get better at what they are teaching about for the reasons described above. The third is that it serves to create relationships among the groups that will open channels of collaboration as the organization grows.

This will create more opportunities for someone to have a critical insight on a situation and invent something valuable as a result. It may be as basic as the fact that the team is faster at solving problems because they know who to call and have a relationship with that person. It also means that you have a better chance of keeping your ‘bus number’ at healthier levels, thereby making your organization more resilient overall. Of course, it will also make your overall organization more cohesive, meaning people will be somewhat more likely to stay and ensuring that you have fewer ‘bus number’ situations in the first place – or at least fewer that were not caused by a bus.

April 30, 2012

Classic Metrics for How Good You Are

One of the ‘best metrics ever’ is the classic “bus number”. This measures how many people in an organization can be hit by a bus before that organization’s operations or progress is severely hindered due to their absence. It is a slightly funny way of measuring the resilience of an organization versus the anti-pattern of knowledge hoarding in an individual’s brain. The idea is that a resilient organization should have a very high bus number and not be vulnerable to ‘critical staff’.
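The metric is easy to estimate once you write down who knows what. The sketch below takes one simple interpretation: the bus number is the smallest number of people whose absence would leave some area of the system with nobody who can support it. The `bus_number` and `riskiest_areas` helpers, the area names, and the people are all made up for illustration.

```python
def bus_number(knowledge):
    """Smallest number of absences that leaves some area with no one who knows it.

    `knowledge` maps an area of the system -> set of people who can support it.
    An area becomes unsupported only when all of its people are gone, so the
    answer is the size of the smallest such set.
    """
    if not knowledge:
        raise ValueError("need at least one area to assess")
    return min(len(people) for people in knowledge.values())


def riskiest_areas(knowledge):
    """Areas whose number of knowledgeable people equals the bus number."""
    n = bus_number(knowledge)
    return sorted(area for area, people in knowledge.items() if len(people) == n)


# Hypothetical team knowledge map.
team_knowledge = {
    "billing module": {"ana", "raj"},
    "build scripts": {"ana"},
    "db schema": {"raj", "li", "sam"},
}

print(bus_number(team_knowledge))
print(riskiest_areas(team_knowledge))
```

In this made-up team, a single departure is enough to orphan the build scripts, which is the kind of exposure the bus number is meant to surface.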

Think about it next time you are looking at any part of your system. Ask yourself who you would ask about that particular module, image, or whatever. Then ask yourself who you would go to if the first person was unavailable. How confident are you that you would quickly / expediently get your answer? How confident are you that you could just look the information up in a Wiki or other documentation?

If you look at the questions above and start thinking that ‘we would figure it out after a while’ or making other excuses, you minimally have a problem with communications and collaboration. You almost certainly have a process problem. And you may well have a cultural problem. Make no mistake, what it means is that your team/organization/project are playing in traffic and simply waiting for the inevitable to happen.

And when something does happen, say, for example, one of the project’s “hero coders” takes a new job, it will be miserable for all who remain as they try to figure out what the hero was doing. Meanwhile the project’s progress languishes and the deadline becomes unachievable. Morale goes down as frustration goes up. Maybe someone else decides to leave out of a sense of futility, making the problem worse. And it will have been completely avoidable. It will be completely the fault of the leadership that was either not assertive enough to make the hero share their knowledge or not disciplined enough to include sustainability in their coaching, plans, and day-to-day execution priorities.

This is serious stuff and is worth the investment of time to solve. The habit of focusing on the overall sustainability of the organization is something that successful, resilient, and sustainable organizations emphasize. This is well documented in the classic book “Good to Great” by Jim Collins, which describes the organizations being built as sophisticated machines using the analogy of clock building. The notion is simple, really: the goal is to build a lasting thing that continues on as people come and go. The project / organization must be bigger than any individual, the individuals involved must understand that, and management must encourage or enforce that mindset. In the book, the organizations that did this radically outperformed their peers in the same markets in the same timeframe.

The reality is that you will probably always have some stuff (ideally only non-critical or very new stuff) that is not well disseminated, but view those items through the lens of what they are and triage / prioritize them so that you do not accumulate the knowledge gap as technical debt. Or, if you do, you should do so consciously, visibly, and at a level at which you know you can tolerate the risk.

Posted in Agile, DevOps, Management
Tagged Agile, DevOps, enterprise-it, management

Crossing Silos

DevOps only works if you cross boundaries

Category Archives: Agile
