Old Habits Make DevOps Transformation Hard

My father is a computer guy. Mainframes and all of the technologies that were cool a few decades ago. I have early memories of playing with fascinating electro-mechanical stuff at Dad’s office and its datacenter. Printers, plotters, and their last remaining card punch machine in a back corner. Crazy cool stuff for a kid if you have ever seen that gear in action. There’s all kinds of noise and things zipping around.

Now the interesting thing about talking to Dad is that he is seriously geeky about tech. He has always been fascinated by how tech would be applied in the future, and he completely groks the principles and potential of new technology even if he does not really get the specific implementations. Recently he had a problem printing from his iPhone. He had set it up a long time ago and it worked great. He’s 78 and didn’t bat an eye at connecting his newfangled mobile device to his printer. What was interesting was his behavior when the connection stopped working. He tried mightily to fix the connection definition rather than deleting the configuration and simply recreating it with the wizard. That got me thinking about “fix it” behavior and troubleshooting behavior in IT.

My dad, as an old IT guy, had long experience and training that you fix things when they got out of whack. You certainly didn’t expect to delete a printer definition back in the day – you would edit the file, you would test it, and you would fiddle with it until you got the thing working again. After all, you had just the relatively few pieces of equipment in the datacenter and offices. That makes no sense in a situation where you can simply blow the problematic thing away and let the software automatically recreate it.

And that made me think about DevOps transformations in the enterprise.

I run into so many IT shops where people far younger than my dad struggle mightily to troubleshoot and fix things that could (or should) be easily recreated. To be fair – some troubleshooting is valuable and educational, but a lot is over routine stuff that is either well known, industry standard, or just plain basic. Why isn’t that stuff in an automated configuration management system? Or a VM snapshot? Or a container? Heck – why isn’t it in the Wiki, at least?! And the funny thing is that these shops are using virtualization and cloud technologies already, but treat the virtual artifacts the same way as they did the long-lasting, physical equipment-centric setups of generations past. And that is why so many DevOps conversations come back to culture. Or perhaps ‘habit’ is a better term in this case.

Breaking habits is hard, but we must if we are to move forward. When the old ways do not work for a retired IT guy, you really have to think about why anyone still believes they work in a current technology environment.

This article is on LinkedIn here: https://www.linkedin.com/pulse/old-habits-make-devops-transformation-hard-dan-zentgraf

Your Deployment Doc Might Not be Useful for DevOps

One of the most common mistakes I see people making with automation is the assumption that they can simply wrap scripts around what they are doing today and be ‘automated’. The assumption is based around some phenomenally detailed runbook or ‘deployment document’ that has every command that must be executed. In ‘perfect’ sequence. And usually in a nice bold font. It was what they used for their last quarterly release – you know, the one two months ago? It is also serving as the template for their next quarterly release…

It’s not that these documents are bad or not useful. They are actually great guideposts and starting points for deriving a good automated solution to releasing software in their environment. However, you have to remember that these are the same documents that are used to guide late night, all hands, ‘war room’ deployments. The idea that their documented procedures are repeatably automate-able is suspect, at best, based on that observation alone.

Deployment documents break down as an automate-able template for a number of reasons. First, there are almost always some number of undocumented assumptions about the state of the environment before a release starts. Second, using the last release's document does not account for procedural, parameter, or other changes between the prior and the upcoming releases. Third, the processes usually unconsciously rely on interpretation or tribal knowledge on the part of the person executing the steps. Finally, there is the problem that steps that make sense in a sequential, manual process will not take advantage of the intrinsic benefits of automation, such as parallel execution, elimination of data entry tasks, and so on.
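To make the first and last of those points concrete, here is a minimal Python sketch, not a real deployment tool. It assumes the environment state can be summarized as a simple dictionary and that the per-host steps are independent of each other. The names `check_preconditions`, `deploy_to`, and `run_release`, along with the example keys and host names, are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def check_preconditions(env, required):
    """Turn the runbook's unwritten assumptions into explicit checks.

    Returns a list of problems; an empty list means the environment
    matches what the release expects.
    """
    problems = []
    for key, expected in required.items():
        actual = env.get(key)
        if actual != expected:
            problems.append(f"{key}: expected {expected!r}, found {actual!r}")
    return problems


def deploy_to(host):
    # Placeholder for the real per-host steps taken from the document.
    return f"{host}: deployed"


def run_release(env, required, hosts):
    """Refuse to start on a bad environment, then deploy hosts in parallel."""
    problems = check_preconditions(env, required)
    if problems:
        raise RuntimeError("environment not ready: " + "; ".join(problems))
    # A manual runbook lists hosts one after another; independent hosts
    # can be handled concurrently once the steps are automated.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(deploy_to, hosts))
```

The point of the sketch is the shape, not the details: assumptions become checks that fail loudly before anything changes, and sequencing that only existed because a person was typing goes away.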

The solution is to always set the expectation, particularly with those who hold organizational power, that the document is only a starting point. Build the automation iteratively and schedule multiple iterations at the start of the effort. This can be a great way to introduce Agile practices into the traditionally waterfall approaches used in operations-centric environments. This approach allows for the effort that will be required to fill in gaps in the document’s approach, negotiate standard packaging and tracking of deploy-able artifacts, add environment ‘config drift’ checks, or any of the other common ‘pitfall’ items that require more structure in an automated context.
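As one example of the kind of item that tends to show up in a later iteration, a ‘config drift’ check can start out very small. The sketch below assumes the expected and actual configuration can each be flattened into a dictionary. The function name `find_drift` and the sample keys are hypothetical, and a real check would read the actual values from the target environment.

```python
def find_drift(expected, actual):
    """Compare two flat config dictionaries and describe each difference.

    Returns a dict mapping each drifted key to a tuple of
    (kind, expected_value, actual_value).
    """
    drift = {}
    for key in sorted(set(expected) | set(actual)):
        if key not in actual:
            drift[key] = ("missing", expected[key], None)
        elif key not in expected:
            drift[key] = ("unexpected", None, actual[key])
        elif expected[key] != actual[key]:
            drift[key] = ("changed", expected[key], actual[key])
    return drift


# Example: the heap size was hand-edited and a debug flag was left on.
expected = {"port": 8080, "heap": "2g"}
actual = {"port": 8080, "heap": "4g", "debug": True}
print(find_drift(expected, actual))
```

Running a check like this before every automated deployment is one way to surface the hand edits that a manual process silently absorbs.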

This article is also on LinkedIn here: https://www.linkedin.com/pulse/your-deployment-doc-might-useful-devops-dan-zentgraf

Automation and the Definition of Done

“Done” is one of the more powerful concepts in human endeavor. Knowing that something is “done” enables us to move on to the next endeavor, allows us to claim compensation, and sends a signal to others that they can begin working with whatever we have produced. However, assessing done can be contentious – particularly where the criteria are undefined. This is why the ‘definition of done’ is a major topic in software delivery. Software development has a creative component that can lead to confusion and even conflict. It is not a trivial matter.

Automation in the software delivery process forces the team to create a clear set of completion criteria early in the effort, thus reducing uncertainty around when things are ‘done’ as well as what happens next. Though they at first appear to be opposites, with done defining a stopping point and automation being much more about motion, the link between ‘done’ and automation is synergistic. Being good at one makes the team better at the other and vice-versa. Being good at both accelerates and improves the team’s overall capability to deliver software.

A more obvious example of the power of “done” appears in the Agile community. For example, Agile teams often have a doctrine of ‘test driven development’ where developers should write the tests first. Further examples include the procedural concepts for completing the User Stories in each iteration, or sprint, so that the team can clearly assess completion in the retrospective. Independent of these examples, validation-centric scenarios are an obvious area where automation can help underpin “done”. In the ‘test-driven development’ example, test suites that run at various points provide unambiguous feedback on whether more work is required. Those test suites become part of the Continuous Integration (CI) job so that they run every time a developer commits new code. If they pass, then the build automatically deploys into the integration environment for further evaluation.

Looking a bit deeper at the simple process of automatically testing CI builds reveals how automation forces a more mature understanding of “done”. Framed another way, the fact that the team has decided to have that automated assessment means that they have implicitly agreed to a set of specific criteria for assessing their ‘done-ness’. That is a major step for any group and evidence of significant maturation of the overall team.
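One way to picture that implicit agreement is to write the criteria down as named checks that gate the deployment. The following is a sketch under stated assumptions rather than how any particular CI system works: the build result is assumed to be a plain dictionary, and `DoneReport`, `assess_done`, `on_commit`, and the coverage threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DoneReport:
    done: bool          # True only if every criterion passed
    failed: List[str]   # names of the criteria that did not pass


def assess_done(build: dict, criteria: Dict[str, Callable[[dict], bool]]) -> DoneReport:
    """Evaluate every agreed criterion against a build result."""
    failed = [name for name, check in criteria.items() if not check(build)]
    return DoneReport(done=not failed, failed=failed)


def on_commit(build: dict, criteria, deploy) -> DoneReport:
    """Deploy to the integration environment only when the build is 'done'."""
    report = assess_done(build, criteria)
    if report.done:
        deploy(build)
    return report


# The team's definition of done, written down where automation can use it.
criteria = {
    "unit tests pass": lambda b: b["tests_failed"] == 0,
    "coverage at least 80%": lambda b: b["coverage"] >= 0.80,  # assumed threshold
}
```

Once the criteria live in a structure like this, changing the definition of done becomes a visible, reviewable change instead of a hallway conversation.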

That step of maturation is crucial, as it enables better flows across the entire lifecycle. For example, understanding how to map ‘done-ness’ into automated assessment is what enables advanced delivery methodologies such as Continuous Delivery. Realistically, any self-service process, whether triggered deliberately by a button push or autonomously by an event, such as delivering code, cannot exist without a clear, easily communicated, understanding of when that process is complete and how successful it was. No one would trust the automation were it otherwise.

There is an intrinsic link between “Done” and automation. They are mutual enablers. Done is made clearer, easier and faster by automation. Automation, in turn, forces a clear definition of what it means to be complete, or ‘done’. The better the software delivery team is at one, the stronger that team is at the other.

 

This article is also on LinkedIn here: https://www.linkedin.com/pulse/automation-definition-done-dan-zentgraf