Watch How Netflix Updates Its Cloud Code


Ever wanted a walkthrough of how a major tech company builds and releases code? Netflix is offering just such a thing on its corporate blog.

Netflix claims 75 million members worldwide, a massive customer base for any cloud-based service. And with other tech firms such as Amazon elbowing into the streaming-content space, any prolonged degradation in service can easily translate into customers jumping ship for a rival service. In other words, no pressure on Netflix’s developers and engineers to deliver as seamless an experience as possible, even as they constantly update the backend.

At the core of Netflix’s software operations is Spinnaker, a “Continuous Delivery Platform” available on GitHub (should you want to tinker with it). The platform can deploy and manage clusters simultaneously on both AWS and Google Cloud, which is useful when you’re trying to deliver the latest episode of “Daredevil” to 5 million people.


Netflix created Spinnaker by studying how its teams delivered assets to its cloud servers, then breaking down that process into discrete stages (e.g., deploying to a cloud provider or waiting on a manual judgment). Spinnaker allows teams to create, implement, and manage those stages, either serially or in parallel.
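To make the serial-versus-parallel idea concrete, here is a minimal Python sketch. It is not Spinnaker's actual API: the `Stage` type, the helper functions, and the stage names are all hypothetical, chosen only to illustrate how a pipeline might run some stages one after another and others side by side.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stage type: any zero-argument callable that returns a
# status string. This is an illustration, not Spinnaker's data model.
Stage = Callable[[], str]


def make_stage(name: str) -> Stage:
    """Build a toy stage that just reports that it finished."""
    def stage() -> str:
        return f"{name}: done"
    return stage


def run_serial(stages: List[Stage]) -> List[str]:
    """Run stages one after another, in list order."""
    return [stage() for stage in stages]


def run_parallel(stages: List[Stage]) -> List[str]:
    """Run stages concurrently; results are returned in list order."""
    if not stages:
        return []
    with ThreadPoolExecutor(max_workers=len(stages)) as pool:
        futures = [pool.submit(stage) for stage in stages]
        return [future.result() for future in futures]


def run_example() -> List[str]:
    """A serial step, then two parallel deploys, then a serial step."""
    results = run_serial([make_stage("bake")])
    results += run_parallel([
        make_stage("deploy to AWS"),
        make_stage("deploy to Google Cloud"),
    ])
    results += run_serial([make_stage("manual judgment")])
    return results
```

Calling `run_example()` returns one status line per stage. The two deploy stages may execute at the same time, but their results are collected in the order they were listed.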

Right now, Netflix developers build and test Java applications via Nebula, a set of Gradle plugins. Any changes are committed to a central Git repository. “Once the change is committed, a Jenkins job is triggered,” explained Netflix’s blog. “Our use of Jenkins for continuous integration has evolved over the years. We started with a single massive Jenkins master in our datacenter and have evolved to running 25 Jenkins masters in AWS.”

Netflix doesn’t modify its AWS instances live; instead, it creates a new Amazon Machine Image (AMI) for each deployment. The company has a custom-built platform, dubbed “The Bakery,” that exposes an API that allows for the creation of AMIs globally.
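The "never modify a live instance" approach can be illustrated with a small Python model. Everything here is an assumption made for illustration: `Image`, `ImageBuilder`, and `Cluster` are hypothetical names, and nothing below reflects the Bakery's real API or calls AWS.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for a machine image.

    frozen=True means an Image cannot be changed once it is built.
    """
    image_id: str
    app_version: str


class ImageBuilder:
    """Toy image-building service that hands out a new image per bake."""

    def __init__(self) -> None:
        self._count = 0

    def bake(self, app_version: str) -> Image:
        # Every bake yields a brand-new image with its own ID.
        self._count += 1
        return Image(image_id=f"image-{self._count:04d}",
                     app_version=app_version)


class Cluster:
    """Toy cluster whose instances are only ever replaced, never edited."""

    def __init__(self) -> None:
        self.instances: List[Image] = []

    def deploy(self, image: Image, count: int) -> None:
        # Swap in fresh instances launched from the new image.
        self.instances = [image] * count
```

In this sketch, shipping version 1.1 means calling `bake("1.1")` and then `deploy(...)` with the result. The 1.0 image is left untouched, so it remains available if the team needs to go back to it.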

After that, Spinnaker pipelines deploy the AMI to as many instances as desired. “A successful bake will trigger the next stage of the Spinnaker pipeline, a deploy to the test environment,” the blog added. “From here, teams will typically exercise the deployment using a battery of automated integration tests. The specifics of an application’s deployment pipeline becomes fairly custom from this point on.”
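The "a successful bake will trigger the next stage" behavior amounts to a gated chain: each step runs only if the one before it succeeded. A rough Python sketch of that idea follows. The `run_gated` function and the step names are hypothetical, and the final "team-specific stage" is a placeholder for whatever custom steps a given team adds.

```python
from typing import Callable, List, Tuple

# Hypothetical step: a name plus a callable returning True on success.
Step = Tuple[str, Callable[[], bool]]


def run_gated(steps: List[Step]) -> List[str]:
    """Run steps in order and stop after the first one that fails."""
    log: List[str] = []
    for name, action in steps:
        succeeded = action()
        log.append(f"{name}: {'ok' if succeeded else 'failed'}")
        if not succeeded:
            break  # later steps are never triggered
    return log


def example_pipeline(tests_pass: bool) -> List[str]:
    """Bake, deploy to test, run integration tests, then continue."""
    return run_gated([
        ("bake", lambda: True),
        ("deploy to test", lambda: True),
        ("integration tests", lambda: tests_pass),
        ("team-specific stage", lambda: True),  # placeholder step
    ])
```

With `example_pipeline(False)`, the log ends at the failed integration tests and the team-specific stage never runs. With `example_pipeline(True)`, all four steps are logged as ok.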

What’s especially interesting about Netflix’s setup is its reliance on small, relatively autonomous tech teams, as opposed to a centralized, top-down hierarchy. Those teams are free to go about “building and pushing changes at a speed they are comfortable with” (per the blog). The company’s move from a “monolithic, datacenter-based Java application” to “cloud-based Java microservices” facilitated that team-oriented approach.