In my last post, I mentioned three things that need to happen reliably in order to achieve a faster, more predictable release process. The first was to unify change management for the system between the Ops and Development sides. On the surface, this sounds straightforward. After all, a tool rationalization exercise is almost routine in most shops. It happens regularly due to budget reviews, company mergers or acquisitions, etc.
Of course, as we all know, it is never quite that easy, even when the unification involves nominally identical teams. For example, your company buys another and you have to merge the accounting systems. Pretty straightforward – money is money, accountants are accountants, right? Those mergers always go perfectly smoothly, right? Right?
In fact, unifying this aspect of change management can be especially thorny because of the intrinsic differences in tradition between the two organizations. Even though complex modern systems evolve together as a whole, few sysadmins would see themselves as ‘developing’ the infrastructure, for example. There are other obstacles as well. Operations teams are frequently seen as service providers who need to provide SLAs for change requests submitted through the change management system. And a lot of the operational tracking in the ticketing system is just that – operational – and does not relate to actual configuration changes or updates to the system itself.
The key to dealing with this is the word “change”. Put simply, system changes should be handled the same way code changes are handled, whatever that way might be. For example, a system change could be a user story in the backlog. The “user” might be a middleware patch that a new feature depends on, and the work items might be about submitting tickets to progressively roll that patch up the environment chain into production. The goal is to track needed changes to the system as first-class items in the development effort. The non-change operational stuff will almost certainly stay in the ticketing system. This is a simple example, but applying the principle means that the operating environment of a system evolves right along with its code – not as a retrofit or afterthought when it is time to deploy to a particular environment or when there is a problem.
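To make the middleware patch example a bit more concrete, here is a rough sketch of a backlog where a system change sits next to a code change as the same kind of item. Everything in it is hypothetical and for illustration only: the `Story` and `WorkItem` classes, the `system_change_story` helper, and the four-stage environment chain are my own assumptions, not the data model of any particular tracking tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ChangeKind(Enum):
    CODE = "code"
    SYSTEM = "system"  # a change to the operating environment


# Hypothetical environment chain; substitute your own path to production.
ENVIRONMENT_CHAIN = ("dev", "qa", "staging", "production")


@dataclass
class WorkItem:
    description: str
    done: bool = False


@dataclass
class Story:
    title: str
    kind: ChangeKind
    work_items: list[WorkItem] = field(default_factory=list)
    depends_on: list[Story] = field(default_factory=list)

    def is_done(self) -> bool:
        # Done only when the story has work items and all of them are done.
        return bool(self.work_items) and all(item.done for item in self.work_items)

    def is_blocked(self) -> bool:
        # Blocked while any story it depends on is still unfinished.
        return any(not dep.is_done() for dep in self.depends_on)


def system_change_story(title: str, environments=ENVIRONMENT_CHAIN) -> Story:
    """Build a system-change story with one promotion work item per environment."""
    items = [
        WorkItem(f"Submit ticket to roll '{title}' into {env}")
        for env in environments
    ]
    return Story(title=title, kind=ChangeKind.SYSTEM, work_items=items)


# The middleware patch and the feature that needs it live in the same backlog.
patch = system_change_story("Middleware patch")
feature = Story(
    title="New feature",
    kind=ChangeKind.CODE,
    work_items=[WorkItem("Implement and test the feature")],
    depends_on=[patch],
)
backlog = [patch, feature]

for story in backlog:
    status = "blocked" if story.is_blocked() else "ready"
    print(f"[{story.kind.value}] {story.title}: {status}")
```

In this sketch the feature shows as blocked until every promotion work item on the patch is marked done, which is the point: the environment change is visible in the same place, and with the same weight, as the code that depends on it.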
The tool part is conceptually easy – someone manages the changes in the same system where the backlog, stories, and work items are handled. However, there is also the matter of the “someone” in that sentence. An emerging pattern I have seen in several shops is to embed an Ops-type with the development team. Whether these people are called the ‘ops representative’ or ‘infrastructure developers’, their role is to focus on evolving the environment along with the code and ensuring that the path to production stays in sync with how things are being developed. This is usually a relatively senior person who can advise on things, knows when to say no, and knows when to push. The real shift is that changes to the operating environment of an application become first-class citizens at the table when code or test changes are being discussed, and they can now be tracked as part of the work required to deliver on an iteration.
These roles have started popping up in various places with interesting frequency. To me, this is the next logical step in Agile evolution. Having QA folks in the standups is accepted practice nowadays, and organizations are figuring out that the Ops guys should be at the table as well. This does a great job of proactively addressing a lot of the release and promotion headaches that slow things down as changes move toward production. Done right, this takes a lot of stress and time out of the overall Agile release cycle.