Imagine this hypothetical conversation I didn’t have with someone last week…
THEM: “Is there a DevOps framework?”
ME: “Noooooo, it doesn’t work like that”
ME: “Well DevOps is more like a philosophy, or a set of values and principles. The way you apply those principles and values varies from one organisation to the next, so a framework wouldn’t really work, especially if it was quite prescriptive, like Scrum”
THEM: “But I really want one”
ME: “Ok, I’ll tell you what, I’ll hack an existing framework to make it more devopsy, does that work for you?”
THEM: “Take my money”
So, as you can see, in a hypothetical world, there is real demand for a DevOps framework. The trouble with a DevOps framework, as is always the problem with anything to do with DevOps, is that nobody can actually agree what the hell DevOps means, so any framework is bound to upset a whole bunch of people who simply disagree with my assumption of what DevOps means.
So, with that massive elephant in the room, I’m just going to blindly ignore it and crash on with this experimental little framework I’m calling DevOpScrum.
In my view (which is obviously the correct view) DevOps is a lot more than just automation. It’s not about Infrastructure as Code and Containers and all that stuff. All that stuff is awesome and allows us to do things in better and faster ways than we ever could before, but it’s not the be-all-and-end-all of DevOps. DevOps for me is about the way teams work together to extract greater business value, and produce a better quality solution by collaborating, working as an empowered team, and not blaming others (and also playing with cool tools, obvs). And if DevOps is about “the way teams work together” then why the hell shouldn’t there be a framework?
The best DevOps framework is the one a team builds itself, tailored specifically for that organisation’s demands, and sympathetic to its constraints. Incidentally, that’s one reason why I like Kanban so much: it’s so adaptable that you have the freedom to turn it into whatever you want, whereas Scrum is more prescriptive, and if you meddle with it you not only confuse people, you anger the Scrum gods. However, if you don’t have time to come up with your own DevOps framework, and you’re familiar with Scrum already, then why not just hack the Scrum framework and turn it into a more DevOps-friendly solution?
Which brings us nicely to DevOpScrum, a DevOps Framework with all the home comforts of Scrum, but with a different name so as not to offend Scrum purists.
The idea with DevOpScrum is to basically extend an existing framework and insert some good practices that encourage a more operational perspective, and encourage greater collaboration between Dev and Ops.
How does it work?
Start by taking your common-or-garden Scrum framework, and then add the following:
Operability features on the backlog
A definition of Done that includes “deployable, monitored, scalable” and so on (i.e. it doesn’t just focus on “has the product feature been coded?”)
Continuous Delivery as a mandatory practice!
And there you have it. A scrum-based DevOps Framework.
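To make the extended definition of Done a bit more concrete, here’s a minimal Python sketch. The checklist entries and the `Story` class are my own illustrative assumptions, not an official DevOpScrum artefact; swap in whatever criteria your team actually agrees on.

```python
from dataclasses import dataclass, field

# Hypothetical checklist: these criteria names are illustrative assumptions.
DEFINITION_OF_DONE = ("coded", "tested", "deployable", "monitored", "scalable")


@dataclass
class Story:
    title: str
    criteria_met: set = field(default_factory=set)

    def missing_criteria(self):
        """Return the Done criteria not yet met, in checklist order."""
        return [c for c in DEFINITION_OF_DONE if c not in self.criteria_met]

    def is_done(self):
        """A story is Done only when every criterion is met, not just 'coded'."""
        return not self.missing_criteria()


# Example story (made up): the feature is coded and tested, but nothing more.
story = Story("Customer can reset password", {"coded", "tested"})
print(story.is_done())
print(story.missing_criteria())
```

The point of the sketch is simply that a story which is coded and tested still isn’t Done until the operational criteria are ticked off too.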
Let’s look into some of the details…
We’ll start with The Team…
A product owner (who appreciates operability – what we once called “Non-Functional Requirements” in the olden days. That term is so not cool anymore. It’s less cool than bumbags).
Devs, Testers, BAs, DBAs and all the usual suspects.
Infrastructure/Ops people. Some call them DevOps people these days. These are people who know infrastructure, networking, the cloud, systems administration, deployments, scalability, monitoring and alerting – that sort of stuff. You know, the stuff Scrum forgot about.
Roles & Responsibilities
Pretty similar to scrum, to be fair. The Product Owner has ultimate responsibility for deciding priorities and is the person you need to lobby if you think your concerns need to be prioritised higher. For this reason, the Product Owner needs to understand the importance of Operability (i.e. the ability to deploy, scale, monitor, maintain and so on), which is why I recommend Product Owners in a DevOps environment get some good DevOps training (by pure coincidence we run a course called “The DevOps Product Owner” which does exactly what I just described! Can you believe that?!).
There’s no scrum master in this framework, because it isn’t scrum. There’s a DevOpScrum coach instead, who basically does the scrum master role and is responsible for evangelising and improving the application of the DevOps values and principles.
DevOps Engineers – One key difference in this framework is that the team must contain the relevant infrastructure and Ops skills to get stuff done without relying on an external team (such as the Ops team or Infrastructure team). This role will have the skills to provide Continuous Delivery solutions, including deployment automation, environment provisioning and cloud expertise.
Yep, there’s sprints. 2 weeks is the recommended length. Anything longer than that and it’s hardly a sprint, it’s a jog. Whenever I’ve worked in 3 week sprints in the past, I’ve usually seen people take it really easy in the first couple of weeks, because the end of the sprint seemed so far away, and then work their asses off in the final week to hit their commitments. It’s neither efficient nor sustainable.
Another big difference with scrum is that the Product Backlog MUST contain operability features. The backlog is no longer just about product functionality, it’s about every aspect of building, delivering, hosting, maintaining and monitoring your product. So the backlog will contain stories about the infrastructure that the application(s) run on, their availability rates, disaster recovery objectives, deployability and security requirements (to name just a few). These things are no longer assumed, or lie outside of the team – they are considered “first class citizens” so to speak.
I recommend twice-weekly backlog grooming sessions of about an hour, to make sure the backlog is up-to-date and that the stories are in good shape prior to Sprint Planning.
Because the backlog is different, sprint planning will be subtly different as well. Obviously we’ve got a broader scope of stories to cover now that we’ve got operational stories in the backlog, but it’s important that everyone understands these “features”, because without them, you won’t be able to deliver your product in the best way possible.
I encourage the whole team to be involved, as per scrum, and treat each story on merit. Ask questions and understand the story before sizing it.
I recommend INVEST as a guiding principle for stories. Don’t be tempted to put too much detail in a story if it’s not necessary. If you can get the information through conversation with people, and they’re always available, then don’t bother writing that stuff up in detail, it’s just wasting time and effort.
The difference between Scrum and DevOpScrum in respect to stories is that in DevOpScrum we expect to see a large number of stories not written from an end-user’s perspective. Instead, we expect to see stories written from an operations engineer’s perspective, or an auditor’s perspective, or a security and compliance perspective. This is why I often depart from the “As a… I want… So that…” template for non-“user” stories, and go with a “What:… Why:…” approach, but it doesn’t matter all that much.
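Here’s a small Python sketch of the two story templates side by side. The class names and the example stories are hypothetical, made up purely for illustration; the only things taken from the text are the “As a… I want… So that…” and “What:… Why:…” shapes.

```python
from dataclasses import dataclass


@dataclass
class UserStory:
    """Classic end-user template: As a... I want... So that..."""
    role: str
    want: str
    benefit: str

    def render(self):
        return f"As a {self.role}, I want {self.want}, so that {self.benefit}."


@dataclass
class OperabilityStory:
    """What/Why template for stories without an obvious end user."""
    what: str
    why: str

    def render(self):
        return f"What: {self.what}\nWhy: {self.why}"


# Both example stories below are invented for illustration.
user = UserStory("shopper", "to save my basket", "I can finish my order later")
ops = OperabilityStory(
    "Alert when the checkout error rate exceeds an agreed threshold",
    "The team should hear about failures before customers report them",
)

print(user.render())
print(ops.render())
```

Both kinds of story sit on the same backlog and get sized in the same planning session; only the wording changes.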
Daily stand-ups are the same as in Scrum, but if I catch anyone doing that tired old “what I did yesterday, what I’m doing today, blockers…” nonsense I’ll personally come and find you and make a really, really annoying noise.
Please come up with something better, like “here’s what I commit to doing today and if I don’t achieve it I’ll eat this whole family pack of Jelly Babies” or something. Maybe something more sensible than that. Maybe.
At the end of your sprint, get together and work out what you’ve learned about the way you work, the technology and tools you’ve used, the product you’re working on and the general agile health of your team. Also take a look at how the overall delivery of your product is looking. Most importantly, ask yourself if you’re collaborating effectively, in a way that’s helping to produce a well-rounded product, that’s not only feature-rich but operationally polished as well.
Learn whatever you can and keep a record of what you’ve learnt. If any of these lessons can be turned into stories and put on the backlog as improvements, then go for it. Just make sure you don’t park all of your lessons somewhere and never visit them again!
Deliver Working Software
As with Scrum, in DevOpScrum we aim to deliver something every 2 weeks. But it doesn’t have to just be a shiny front-end to demo to your customers, you could instead deliver your roll-back, patching or Disaster Recovery process and demo that instead. Believe it or not, customers are concerned with that stuff too these days.
I personally believe Continuous Delivery (CD) should be the guiding practice behind DevOpScrum. If you’re not familiar with CD then Dave Farley and Jez Humble’s book (entitled Continuous Delivery, for reasons that become very obvious when you read it) is still just about the best material on the subject (apart from my blog, of course).
As with Continuous Integration, CD is more than just a tool, it’s a set of practices and behaviours that encourage good working practices. For example, CD requires high degrees of automation around testing, deployment, and more recently around server provisioning and configuration.
So there it is in some of its glory, the DevOpScrum framework (ok, it’s just a blog about a framework, there’s enough material here to write an entire book if any reasonable level of detail was required). It’s nothing more than Scrum with a few adjustments to make it more DevOps aligned.
As with Scrum, this framework has the usual challenges – it doesn’t cater for interruptions (such as production incidents) unless you add in a triage function to manage them.
There’s also a whole bunch of stuff I’ve not covered, such as release planning, burn-ups, burn-downs and Minimum Viable Products. I’ve decided to leave these alone as they’re simply the same as you’d find in scrum.
Does this framework actually work? Yes. The truth is that I’ve actually been working in this way for several years, and I know other teams are also adapting their scrum framework in very similar ways, so there’s plenty of evidence to suggest it’s a winner. Is it perfect? No, and I’m hoping that by blogging about it, other people will give it a try, make some adjustments and help it evolve and improve.
The last thing I ever wanted to do was create a DevOps framework, but so many people are asking for a set of guidelines or a suggestion for how they should do DevOps, that I thought I’d actually write down how I’ve been using Scrum and DevOps for some time, in a way that has worked for me. However, I totally appreciate that this worked specifically for me and my teams. I don’t expect it to work perfectly for everyone.
As a DevOps consultant, I spend much of my time explaining how DevOps is a set of principles rather than a set of practices, and the way in which you apply those principles depends very much upon who you are, the ways in which you like to work, your culture and your technologies. A prescriptive framework simply cannot transcend all of these things and still be effective. This is why I always start any DevOps implementation with a blank canvas. However, if you need a kick-start, and want to try DevOpScrum then please go about it with an open mind and be prepared to make adjustments wherever necessary.
I’ve said before that I’m a big believer that there’s no “one size fits all” solution for DevOps, and nothing in my experience as a DevOps Consultant has led me to change my mind on that one. Each organisation is different enough to warrant its own approach to adopting, and then succeeding with, DevOps.
However, I do think there are some good patterns for successful DevOps adoption. “The right ingredients” you might say. But as with cookery and chemistry experiments, it’s the quantity of, and order in which you introduce these ingredients that makes all the difference (I discovered this first-hand as a chemistry undergraduate :) ).
Below is a list of 5 steps for starting out on a successful DevOps journey (“DevOps journey” = 100 cliché points btw). It’s not a solution for scaling DevOps – that’s step 6! But if you’re looking for somewhere to start, these 5 steps are essentially the blueprint I like to follow.
- Agree what your goals are, what problems you’re trying to solve, and what DevOps means to you (is it just automation or is it a mindset?). You all need to be on the same page before you start, otherwise you’ll misunderstand each other, and without knowing your goals, you won’t know why you’re doing what you’re doing.
- Build the platform. DevOps relies heavily on fast feedback loops, so you need to enable them before you go any further. This means putting in place the foundations of a highly automated Continuous Delivery platform – from requirements management through to branching strategy, CI, test automation and environment automation. Don’t try to create an enterprise-scale solution, just start small and do what you need to do to support 1 team, or this thing will never get off the ground. You’ll probably need to pull together a bunch of DevOps engineers to set this platform up – this is often how “DevOps teams” come about, but try to remember that this team should be a transitional phase, or at least vastly scaled down later on.
- Assemble the team. We’re talking about a cross-functional delivery team here. This team will include all the skills to design, build, test, deliver and support the product, so we’re looking at a Product Owner, Business Analyst, Developers, Testers, and Infrastructure Engineers among others (it largely depends on your product – it may need to be extended to include UX designers, Security and so on).
- Be agile, not waterfall. Waterfall’s just not going to work here I’m afraid. We’re going to need a framework that supports much faster feedback and encourages far greater collaboration at all times. So with that in mind, adopt a suitable agile framework like scrum or Kanban, but tailor it appropriately so that the “Ops” perspective isn’t left out. For example – your “definition of done” should stretch to include operability features. “Done” can no longer simply mean “passed UAT”, it now needs to mean “Deployable, monitorable and working in Pre-Live” at the very minimum. Another example: Your product backlog doesn’t just contain product functionality, it needs to include operability features too, such as scalability, maintainability, monitoring and alerting.
- Work together to achieve great things. Let the delivery team form a strong identity, and empower them to take full ownership of the product. The team needs autonomy, mastery and purpose to fully unlock its potential.
Once you’ve achieved step 5, you’re well on your way to DevOps, but it doesn’t end there. You need to embrace a culture of continuous improvement and innovation, or things will begin to stagnate.
As I mentioned earlier, you still need to scale this out once you’ve got it working in one team, and that’s something that a lot of people struggle with. For some reason, there’s a huge temptation to try and get every team on-board at the same time, and make sure that they all evolve at the same rate. There’s no reason to do this, and it’s not the right approach.
If you have 20 teams all going through a brand new experience at the same time, there’s going to be a great deal of turmoil, and they’re probably going to make some of the same mistakes – which is totally unnecessary. Also, teams evolve and change at different rates, and what works for one team might not work for another, so there’s no use in treating them the same!
A much better solution is to start with one or two teams, learn from your experience, and move on to a couple more teams. The lessons learnt won’t always be transferrable from one team to the next, but the likelihood is that you’ll learn enough to give yourself a huge advantage when you start the next teams on their journey.
Sure, this approach takes time, but it’s more pragmatic and, in my experience, more successful.
One final comment on the steps above concerns step 2 – building the Continuous Delivery platform. It’s easy to get carried away with this step, but try to focus on building out a Minimum Viable Product here. There’s no getting away from the need for a high degree of automation, especially around testing. The types of testing you might need to focus on will depend on your product, its maturity, complexity and the amount of technical debt you’re carrying.
Other aspects you’ll need to cover in your Continuous Delivery MVP are deployment and environment automation (of course). Thankfully there are external resources available to give you a kick-start here if you don’t have sufficient skills in-house (there are plenty of contractors who specialise in DevOps engineering, not to mention dedicated DevOps consultancies such as DevOpsGuys :) ). Don’t spend months and months assessing different cloud providers or automation tools. Speak to someone with experience and get some advice, and crack on with it. Picking the wrong tool can be painful, but no more painful than deferring the decision indefinitely. Anyway, it’s relatively easy to move from Chef to Ansible, or from AWS to Azure (just examples) these days.
Many years ago I worked for a company that spent over a year assessing TFS, while continuing to use VS etc in the meantime. I worked with another company more recently who spent a year assessing various cloud providers, all the while struggling along with creaking infrastructure that ended up consuming everyone’s time. My point is simply that it’s better to make a start and then switch than it is to spend forever assessing your options. It’s even better to take some expert advice first.
I was at DevOps World last week (nothing like Disney World, by the way) and happened to be paying attention to a talk by a chap called Jonathan who worked at Barclays Bank. He briefly mentioned a couple of KPIs that they measure to track the success of their DevOps initiative. He mentioned these:
- Lead Times
This list looked quite good to me. I thought “They sound pretty sensible, I’ll remember those for the next time someone asks me about DevOps KPIs”. The reason I thought this, you see, is because I get asked “What are good DevOps KPIs?” almost every week. Colleagues, clients, friends & family, random strangers, the dog… Everyone asks me. It’s like I’m wearing a T-Shirt that says “Ask me about DevOps KPIs” or something.
So, the time has come to formulate a decent answer. Or, more specifically, write a blog on it, so I can then tell people to read my blog! Hurrah!
A couple of months ago, while discussing a DevOps transformation with a global telecoms company, the subject of metrics and KPIs came up. We’d spent the previous hour or so hearing about how one particular part of the business was so unique and different to all the others, and that any DevOps transformation would need to be specifically tailored to accommodate this business’s unique demands. I totally agree with this approach. However, when the subject of KPIs came up, the “one-size-fits-all” approach was favoured.
It’s common for organisations to want KPIs that span the whole organisation. It’s convenient and allows management to compare and contrast (for whatever good that’ll bring). But does this “one-size-fits-all” approach work? Or does it encourage the wrong behaviours?
You can’t manage what you can’t measure
Personally, I think you need to be very careful about selecting your KPIs and metrics. Peter Drucker once observed that “you can’t manage what you can’t measure”, which sounds sensible enough, but this leads us towards trying to measure everything (because we want to manage as much as we can, right?). But that’s where things get a bit tricky. As soon as we start measuring things, they change. This is closely related to Goodhart’s Law, which says that when a measure becomes a target, it ceases to be a good measure. But what I’m talking about specifically is people changing their behaviours because they’re being measured.
Once you measure something, it changes
If we’re being measured on utilisation level, we try to expand our work to fill the time we have available, in order to look fully utilised. It’s what people do! By doing this, people lose the “downtime” they used to have, the time when people are most creative, and as a result, innovation suffers.
So what should we measure?
It depends on what you’re trying to achieve, and what side-effects you’re able to tolerate. Think very carefully about how your metrics and KPIs could be interpreted by both subordinates and management.
For example, I’m currently working with a team who until recently measured the age of stories in the backlog. The thought was, the larger the number, the longer it’s taking to get stuff done. The reality was different. In reality, there was an increasing number of low priority stories, which were often (and quite legitimately) overlooked in favour of higher priority stories. So what did the metric really prove? That the team were slow or that the team were effective at prioritising?
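To show how that backlog-age metric can mislead, here’s a minimal Python sketch using made-up numbers (they’re not from the team in question). The `mean_age` helper and the two priority labels are assumptions for illustration only.

```python
from statistics import mean

# Invented backlog of (priority, age in days). Not real data from any team.
backlog = [
    ("high", 3), ("high", 5), ("high", 4),
    ("low", 90), ("low", 120), ("low", 150),
]


def mean_age(stories, priority=None):
    """Mean story age in days, optionally restricted to one priority."""
    ages = [age for prio, age in stories if priority is None or prio == priority]
    return mean(ages)


overall = mean_age(backlog)            # every story lumped together
high_only = mean_age(backlog, "high")  # the stories the team is working on
low_only = mean_age(backlog, "low")    # the stories being deprioritised

print(f"Overall mean age: {overall} days")
print(f"High priority mean age: {high_only} days")
print(f"Low priority mean age: {low_only} days")
```

With these numbers the headline figure is 62 days, which looks slow, while the high priority stories are only 4 days old on average. The single number says more about the pile of parked low priority stories than about how quickly the team gets things done, which is exactly why the stat needs a narrative.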
I think generally speaking that most stats need to be accompanied by a narrative, otherwise they’re open for misinterpretation. But we know that there’s often very little room for narrative, and that the fear of misinterpretation drives people to try to “game” the stats (that is to say, legitimately manipulate the results). And this is another reason why we have to be very careful when we’re planning KPIs and reporting metrics.
Data Driven Metrics
In 2014 Gartner produced a report entitled “Data Driven DevOps: Use Metrics to Help Guide Your Journey” in which they listed a range of typical DevOps metrics, categorised by type, such as “Business Performance”, “Operational Efficiency” and so on. I’ve picked out a few of the metrics in the table below. I’ve also added some others which I’ve been using in one form or another. This is by no means an exhaustive list of DevOps KPIs, but it might be somewhere to start if you’re looking for inspiration.
Measuring tangibles and intangibles
One thing to be conscious of is that you can’t really measure things like “culture” and “collaboration” directly. Culture, for example, is an intangible asset, and you can only really measure the result of Culture, rather than the culture itself. The same goes for collaboration.
In the table above, be conscious of things like “happiness”, “value” and “sharing” as these can sometimes be hard to measure directly, not to mention being somewhat subjective.
At IPExpo in London a couple of weeks ago, I was asked if it was possible to “Do DevOps in an ITIL environment”.
My simple answer is “yes”.
ITIL and DevOps are two different things, but they both attempt to provide a set of “best practices”: ITIL for Service Delivery and Maintenance, DevOps for Software Delivery and Support.
DevOps is mostly concerned with a couple of things:
- The mechanics of building and delivering software changes (we’re talking about Continuous Delivery, deployment automation, Configuration automation and so on).
- The behaviours, interactions and collaboration between the different functions involved in delivering software (Business, Dev, Test, Ops etc)
ITIL largely stays away from anything to do with the mechanics, and doesn’t touch on culture and collaboration – preferring instead to focus more on the tangible concepts of IT service support. It’s essentially a collection of procedures and processes for delivering and supporting IT services. Most of those procedures and practices are just common sense good ideas.
DevOps isn’t a prescriptive framework, it’s more like a philosophy (in the same way as Agile isn’t a framework). Because it’s not prescriptive, it can work with any framework (such as scrum) provided that framework isn’t at odds with the DevOps philosophy (such as waterfall).
ITIL provides a set of concepts which you then implement in your own way. For example, ITIL promotes the concepts of Incident and Problem Management. It doesn’t tell you exactly HOW you should do them, it simply suggests that these are good processes to have. There are recommendations around actions such as trend analysis and root-cause analysis, but it doesn’t prescribe how you should implement these.
Change Control
Probably the area with the greatest amount of cross-over is change management. ITIL explicitly mentions it as a procedure for the efficient handling of all changes, and goes on to talk about Change Advisory Boards, Types of Change, Change Scheduling and a bunch of other “things to do with deploying changes to an environment”.
DevOps also advocates smooth and efficient processes for deploying changes through environments – so there’s no conflict here. The only slight misalignment is that in ITIL, change management is seen as an activity that happens during the Service Transition phase, while in DevOps we tend to advocate the identification and promotion of pre-authorised changes (standard change), which means the change management process effectively starts prior to service transition. But that’s about it really.
Some people get a bit carried away with the role of the Change Advisory Board in ITIL, and insist that every change must pass through some sort of CAB process (usually involving a monthly CAB meeting, where a bunch of stakeholders review all changes queued up for a production deployment, which usually only serves to cause a delay in your software delivery process and add very little value). ITIL doesn’t explicitly say it has to happen this way – it’s not that prescriptive!
Similarly, DevOps doesn’t say you can’t have a CAB process. If you’ve got a highly complex and unstable environment that’s receiving some sporadic high-risk changes, then CAB review is probably a good idea. The only difference here is that DevOps would encourage these Change Advisory Board reviews to happen earlier in the process to ensure risk is mitigated right from the start, rather than right at the end.
So, in summary, ITIL and DevOps are not having a fight in the schoolyard at home time, there’s nothing to see here, go about your business.
We all know a good agile organisation, or at least we’ve all heard about them, where everyone just *gets it*, they’re agile through-and-through, from the top down, bottom up, agile in the middle, and everyone’s a mini Martin Fowler. Yay for them.
We’ve also heard about these DevOps companies, who are leveraging automation in every step of their delivery pipeline. And they’re deploying to production 8,000 times a day with zero downtime and they rebuild their live VMs every 12 seconds. Great work.
Unfortunately the rest of the world sits outside those two extremes (recall Rogers’ Diffusion of Innovations curve, principally the early and late majority). A lot of organisations simply don’t know what Agile and DevOps are, where they’ve come from, what the point is, and most importantly, how to do it.
So here’s what happens:
- To become agile they “go scrum” and hire a scrum master or ten
- To be “DevOps” they automate their environments and deployments
Why do they do this? I suspect it’s a number of reasons, but largely it’s because there’s a shit tonne of material out there that supports the view that Scrum is the best agile framework and DevOps means automating stuff.
The results are fairly predictable:
If you “do scrum” instead of understanding agile, you get what’s called Agile Cargo Cult. That basically ends up with people doing all these great scrum practices and ceremonies, but things don’t actually improve, and eventually they start to get worse, so to rectify the situation, teams apply the scrum ceremonies and practices with even greater rigour. Obviously this gets them nowhere, and eventually people within the organisation start to believe “Agile doesn’t work here”, blissfully unaware that they were never actually “agile” in the first place.
Organisations who think DevOps is about automating the Ops tasks just end up “slinging shit quicker”. If you don’t sort out the real problems in your system, you’re basically just making localised optimisations. There’s just no point. If your problem is that your software is hard to run, scale, operate and maintain – don’t try to automate your deployments.
Also, many DevOps initiatives, in my experience, are either driven by Dev, or Ops, but not usually both. And that says it all really.
So, for a lot of organisations who are new to this whole Agile and DevOps thing, there’s clearly an easy path sucking a lot of people in. And that’s a shame, because it results in a lot of frustration. It would be easy to laugh at these organisations, but it’s not their fault. Scrum has become a self-serving framework, seemingly more interested in its own popularity than its effectiveness, and DevOps is anything to anyone.
So, in summary, don’t do scrum, be agile. And don’t confuse DevOps with automating the Ops work.