A look at the last 10 years of DevOps: what have we learned in the DevOps movement since around 2007? Early CFEngine saturation, based on principles of convergence, desired state, and idempotency. Puppet becomes mainstream in 2008-2010, replacing a lot of CFEngine for early web-scale IaaS. Chef arrives in 2009, driving reusable infrastructure with recipes and cookbooks. Continuous Delivery becomes mainstream around 2010. Containers and immutable delivery models follow in 2015-2016. Deming and Goldratt influence DevOps via Gene Kim. Lean principles are formalized: value stream mapping (VSM) via Rother’s “Learning to See”, as well as Kanban for software from David J. Anderson. Mike Rother’s Kata and Dr. Spear bring further Lean influences to the practices of DevOps. Michael Nygard’s “Release It!” circuit breaker patterns go mainstream and are becoming state of the art in Google’s new Envoy/Istio design. In 2010, Dr. Cook’s “How Complex Systems Fail” is referenced in John Allspaw’s Web Operations book. Dr. Sidney Dekker’s Drift into Failure serves as input to the Netflix design. The thread runs all the way up to the recent 2017 STELLA Report from Dr. Cook, Dr. Woods, and John Allspaw. We finish with a look at where we are today and a quick glimpse at some potentially interesting new influences (Cynefin and Simon Wardley’s Mapping).
Are you really “DevOps”? You have spent years going through an amazing transformation. You claim that you are now officially “DevOps”. But what now? Your customers don’t really care if you are a DevOps shop/factory; they care about how your products, built from that shop/factory, add value to their lives. Common sense stuff, right? Well, unfortunately, no. You’d be surprised how many people still don’t quite get what the true spirit of DevOps is. You often hear things like: “We are a team of developers doing operations - we are DevOps!”, or “We are a Dev team continually deploying to production - we are DevOps!”.
While none of these claims is false, they fail to draw a complete picture of what DevOps is truly about. The first thing that comes to my mind when I hear them is, “Well, all this is great, but now what? Who are you doing all this for? For whom are all these changes to your culture, process, and tools? Where do your customers fit into all this? How are all these changes impacting your customers? Most importantly, what are you learning about your customers?”
Come find out more as I explain why and how “DevOps” is more about customer feedback and quick learning than just culture/process/tools. I will focus on four major “DevOps” patterns (from a customer’s perspective); I will throw in some anti-patterns as well while I am at it. These are insights I have gathered over more than a decade of operating some of the top internet sites and games in the world.
Microsoft issued me a Mac when they hired me to help people use Linux on Azure. If this sounds like the beginning of a nerdy joke, it’s because we need to question long-held opinions, let go of deeply-cherished stereotypes, and welcome this new era of open collaboration.
Let’s take the 10,000-foot tour of today’s cloud, containers, and orchestration landscape before diving into specifics we can use when making calls on microservices, backing data stores, and app decomposition. We’ll talk public cloud, containers, and k8s from the “Open at Microsoft” perspective!
Engineers spend a lot of time building dashboards to improve monitoring, yet they still spend a lot of time trying to figure out what’s going on and how to fix it when they get paged. Building more dashboards isn’t the solution; using dynamic query evaluation and integrating tracing is.
The word dojo means “the place of the way.” Originally, dojos were learning halls attached to temples. At Verizon, we’ve created engineering dojos to accelerate our learnings around real challenges in our organization. Through a combination of coaching, doing, and evaluating, dojo participants transform knowledge into wisdom.
One of the main things we do as coaches is to establish shared understanding of the desired outcome by using a technique called story mapping. We’ll show teams how having a visual map helps in all aspects of the delivery lifecycle including development, testing, and operations. Next, we’ll focus on testability first and dive into what kinds of tests we need to create to maximize our coverage around the desired outcome. Finally, we’ll dive right into the code with developers and help teams on things like code redesign, CI pipelines, branching, and AWS.
A Dojo engagement usually runs 6 weeks, with 2½-day sprints. We keep our sprints short on purpose, mainly so that coaches and participants can focus on coaching needs and surface any issues quickly. The short sprints also help in evaluating whether a given technical approach is working and whether we need to pull in other coaches for advice.
The lessons we learned at Verizon are relevant to anyone attempting to undergo a transformation as they reflect our attempts to address the cultural and communications ingredients that are vital in DevOps. In this talk, I’ll share what we learned about starting and running a learning Dojo, why we feel the Dojo model works, and how we are addressing some common DevOps challenges going forward.
Making a great cocktail isn’t hard, but it requires getting a few key details right… just like DevOps.
Do you Amazon Web Services? I do! It’s pretty great! Sometimes, when using one of the many services available, you find that it doesn’t behave the way you want it to. Do we walk away in anger, throw our hands up in the air with disgust? NO! We can fix it, yes we can! USING ITSELF!
Stories are a powerful way for us to connect with one another. They elicit a range of emotions and can drive us to take action. When presenting data, think about the story it tells, and the actions desired of the audience.
An Ignite that plays off Ambrose Bierce’s The Devil’s Dictionary to present satirical and painfully true definitions of words you hear at conferences all the time.
DevOps talks tend to emphasize SDLC and CI/CD situations. In this talk I teach 5 of the most important DevOps principles (Small Batches, MVP, and The Three Ways of DevOps) by applying those principles outside of the usual web application environment. Case studies include improving a new-hire onboarding process, improving a website failover process, deploying a monitoring system, building a PXE-install system, and writing a book. By the end of this talk you’ll have a better understanding of the principles of DevOps in a way that helps you apply them at home and at work.
When a datacenter goes offline, a server gets overloaded, or a binary hits a crashing bug, we usually have a contingency plan. We reduce damage, redirect traffic, page someone, drop low-priority requests, follow documented procedures. But why do many failures still come as a surprise? In this talk, we look at some real life analogs to preventing and managing software failures. Fire partitions. Public safety campaigns. Smoke alarms. Sprinkler systems. Doors that say “This is not an exit”. And fire escapes. What can we learn from the real world about expecting failure and designing for it?
Welcome, bold adventurer! Are you ready to work with me to explore the multiple paths of deployment, activation, and feature flags? The audience drives the talk by choosing different deployment and activation options in real time and we’ll see how the story ends.
We’ll use this format to explore how feature flags can work with CI/CD or other rapid deployment models, and how we can work to make deployments safer and faster incrementally.
Audiences will get a fun, fast-paced exploration of forking deployments. They’ll walk away with an understanding of how to reduce the risk of major changes, how to follow multiple streams of deployment, and how to evaluate the best way to speed deployment in their environments.
This really is a dynamic, audience-driven presentation. The audience will select branching options, and I’ll create the talk on the fly based on pre-determined options. As such, it’s going to be hard to show you the outline. Topics I intend to cover include feature flags, scaled deployments, and kill switches.
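To make the listed topics concrete, here is a minimal sketch of a feature flag with a percentage rollout and a kill switch. It is an illustration only, not the speaker's implementation: the `FeatureFlag` class, its fields, and the `checkout` function are hypothetical names, and real deployments usually read flag state from a flag service or config store rather than from an in-process object.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    """Hypothetical flag: percentage rollout plus a kill switch."""
    name: str
    rollout_percent: int = 0   # 0 = off for everyone, 100 = on for everyone
    killed: bool = False       # kill switch: overrides the rollout entirely

    def is_enabled(self, user_id: str) -> bool:
        if self.killed:
            return False
        # Hash flag name + user id so each user lands in a stable bucket
        # (0-99) and keeps the same experience as the rollout widens.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.rollout_percent


def checkout(user_id: str, flag: FeatureFlag) -> str:
    """The new code path is deployed but only activated behind the flag."""
    if flag.is_enabled(user_id):
        return "new checkout"
    return "old checkout"
```

Deployment and activation are separated here: the new path ships dark at `rollout_percent=0`, is widened step by step, and can be turned off instantly by setting `killed=True` without a redeploy.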
Moneyball is about baseball. But it’s also about breaking down accepted preconceptions and finding new ways to look at individual skills and how they mesh as a team. We often inherit already-existing teams, and believe that the structure and operation of the team takes best advantage of individual abilities. When we have the opportunity to add people to a team, we often look for skills we think the team is lacking, rather than what will make a more effective team. Sometimes the characteristics that we believe the team needs aren’t all that important in assessing and improving the quality and productivity of that team.
Moneyball is also about people deceiving themselves, believing something to be true because they think they witnessed it. Likewise, when building multidisciplinary teams, we let our observations about what the team does, and how it works, stand in for collecting and analyzing data to determine how the team can best perform its mission. In fact, some of the team’s accepted practices may have less of an impact on the quality of the end result than we would like.
This presentation examines how to use data in our daily jobs to tell the right story about our state of affairs, and our success in delivering successful outcomes. It takes a look at some of our preconceptions about individual skills, and whether those preconceptions are actually supported by data and research. It identifies characteristics that can give pointers on building and running a high-performance team.
It applies the Moneyball approach to building and managing DevOps teams, giving teams the best bang for their buck in evaluating their own capabilities and project requirements, looking at their work in a new way, and delivering the highest quality results possible.
At some point, we all find ourselves at a SQL prompt making edits to the production database. We know it’s a bad practice, and we always intend to put in place safer infrastructure before we need to do it again, but what does a better system actually look like?
This talk progresses through 5 strategies for teams using a Python stack to do SQL writes against a database, to achieve increasing safety and auditability:
Raw SQL queries
Local one-off scripts
Deploy and run scripts from an application server
Run scripts from Jenkins with command line arguments
Build a Script Runner application
We’ll talk about the pros and cons of each strategy, and help you determine which one is right for your specific needs.
By the end of this talk, you’ll be ready to start upgrading your infrastructure for making changes to your production database safely!
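As a rough illustration of what the script-based strategies add over a raw SQL prompt, the sketch below wraps a single write in a transaction, defaults to a dry run, refuses to commit when the affected row count differs from what the operator expected, and records every attempt. It is an assumption-laden sketch, not the speaker's tooling: `run_write`, `make_demo_db`, and the `write_audit` table are hypothetical names, and SQLite stands in for the production database so the example is self-contained.

```python
import sqlite3
from datetime import datetime, timezone


def run_write(conn, sql, params, expected_rows, operator, dry_run=True):
    """Run one parameterized write; commit only if it is safe to do so.

    Returns (affected_rows, committed).
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, params)  # parameterized: no string formatting
    except sqlite3.Error:
        conn.rollback()
        raise
    affected = cur.rowcount
    committed = (not dry_run) and affected == expected_rows
    if committed:
        conn.commit()
    else:
        conn.rollback()  # dry run, or the row count was a surprise
    # Audit in a separate transaction so dry runs and refusals are logged too.
    conn.execute(
        "INSERT INTO write_audit (ran_at, ran_by, statement, params, affected, committed) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), operator, sql,
         repr(params), affected, int(committed)),
    )
    conn.commit()
    return affected, committed


def make_demo_db():
    """In-memory database with a sample table and the hypothetical audit table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute(
        "CREATE TABLE write_audit (ran_at TEXT, ran_by TEXT, statement TEXT, "
        "params TEXT, affected INTEGER, committed INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users (id, email) VALUES (?, ?)",
        [(1, "a@example.com"), (2, "b@example.com")],
    )
    conn.commit()
    return conn
```

The same function could be called from a local script, a Jenkins job with command line arguments, or a script runner application; what changes along that progression is who can invoke it and where the audit trail lives.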
Most companies slow down as they get larger, but some actually get faster. This talk will discuss the speaker’s experiences leading high-performing engineering teams at Google, eBay, and Stitch Fix, and will discuss the organization, the processes, and the culture that can help a company move fast, and even accelerate, as it grows. Modern software-service models take advantage of the great benefits of having the same team both build the software and operate it in production. We call this DevOps, or simply “You Build It; You Run It”. What does this mean in practice?
Organizationally, it means small teams with well-defined areas of responsibility, directly aligned with the business. The teams are cross-functional, meaning that each team has all the skill sets it requires to do its job, while at the same time relying on other teams for supporting services, tools, and libraries.
Process-wise, it means doubling down on practices like test-driven development and continuous delivery. Using continuous delivery practices, high-performing teams can and do release their applications and services multiple times a day. This enables them to iterate rapidly, experiment courageously, and fail more quickly.
Culturally, it means end-to-end ownership. Each team owns its software end-to-end, from design to development to deployment to retirement. The same engineers who are responsible for the features are responsible for quality, performance, operations, and maintenance. This ownership puts incentives in the right place to encourage building maintainable, observable, and operable systems from the start.
All these techniques and approaches are available to everyone, and practical examples in this talk will help other organizations on their journey.
In this talk I’ll lay out a few key principles for monitoring microservices and the containers they are based on.
If you ever need to validate certificates or certificate chains before deploying them, Golang provides a near-foolproof test method.
A third party developed a tool that was then handed off to our DevOps team to manage and maintain. Before I could do any re-engineering work, I had to resolve a critical issue: the certificates on the ELBs were about to expire and needed updating.
I assumed that if the ELB, NGINX, or httpd started, it was a good sign. This was a false assumption on my part and I ended up serving a bad chain for a few minutes. This did not break the site, but it was definitely not the way I wanted things to remain.
I needed a tool that would fail if the certificate chain provided was incorrect. I wanted a lightweight tool that could be publicly accessible. Conducting a third-party analysis of the certificates and configuration was a requirement. There were no tools that I could find meeting this need, so I decided to build my own. I turned to the open source language, Golang.
This is a detailed breakdown of how I built a tiny web server to fit my needs, along with what each package is doing, as detailed in the article linked above.
DevOps professionals are required for teams to advocate on behalf of software. Without Developer Advocates, code exists as text in a repository. By spreading the value of good information we can help good practices become part of the DevOps experience.
DevOps teams can recognize and avoid groupthink by questioning group decisions and advocating multiple approaches to solving a given problem.
DevOps trends are clear on measuring a system’s Mean Time To Recovery (MTTR) rather than its Mean Time Between Failures (MTBF). I argue that worrying about time between failures actually causes more harm than worrying about recovery. But do we think of our human systems the same way as our digital ones? I’ll apply lessons learned in SysOps to HumanOps.
I’ll talk about how our complex social systems act like complex computer systems and how focusing on MTTR rather than MTBF is a good thing between people, not just machines. I’ll cover the environmental requirements for focusing on MTTR and discuss potential conflict resolution steps as a jumping-off point in your organization or community.
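For readers less familiar with the two metrics, the sketch below computes them from a list of incidents using the conventional definitions (MTTR as average outage duration, MTBF as uptime divided by number of failures) and derives steady-state availability as MTBF / (MTBF + MTTR). The function names and the sample incident data are made up for illustration and do not come from the talk.

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean time to recovery: average duration of an outage."""
    downtime = sum((end - start for start, end in incidents), timedelta())
    return downtime / len(incidents)


def mtbf(incidents, window_start, window_end):
    """Mean time between failures: uptime in the window per failure."""
    downtime = sum((end - start for start, end in incidents), timedelta())
    uptime = (window_end - window_start) - downtime
    return uptime / len(incidents)


def availability(mtbf_value, mttr_value):
    """Steady-state availability: MTBF / (MTBF + MTTR), as a fraction."""
    return mtbf_value / (mtbf_value + mttr_value)


# Made-up example: a 100-hour window with a 1-hour and a 3-hour outage.
window_start = datetime(2018, 1, 1)
window_end = window_start + timedelta(hours=100)
incidents = [
    (window_start + timedelta(hours=10), window_start + timedelta(hours=11)),
    (window_start + timedelta(hours=60), window_start + timedelta(hours=63)),
]
```

With this data, MTTR is 2 hours, MTBF is 48 hours, and availability is 0.96. Halving MTTR to 1 hour lifts availability to about 0.98 with the same number of failures, which is the arithmetic behind preferring fast recovery over chasing fewer failures.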
I interviewed 14 of the 17 Agile Manifesto authors for a special podcast project with the intent to chronicle the manifesto story. What emerged was much more: the story of why the event was needed, what the vision was, and how that vision was ruined in industry. Learn how DevOps is the true agility enabler.