18F is shipping products left and right lately. Every Kid in a Park, College Scorecard, and Federalist have all debuted in the last month, and we’ve got more on the way. Other teams might ask how we are shipping so many products, so close together, with enough resilience to stand up to the traffic generated by the President’s weekly address. Do we have an army of infrastructure people on each one? Are we compromising on security? The answer to both is no. Instead, a small team has been tackling the core infrastructure issues so that our small development teams can deliver 18F products more effectively.
Boat-builders don’t launch ships by backing a trailer down a boat ramp. Every boat they make is going to get launched, so they use a shipyard to make it as easy as possible to build right near the water and get it launched quickly and easily, even at battleship scale.
At 18F, we’ve developed cloud.gov, a shipyard aimed at ensuring our teams can get even the smallest of boats in the water in a production-worthy state from day one of development.
Scaling infrastructure expertise and reducing grunt work
Previously, infrastructure meant dealing with data centers, procuring racks of hardware, flushing out single points of failure, cabling, and managing baroque networking. Now, providers like Amazon Web Services, Google Compute Engine, and other cloud vendors have commoditized that infrastructure into something that you can buy rather than build and manage yourself. This Infrastructure as a Service (IaaS) allows teams to work more effectively and focus their time on problems that are unique to the organization.
But even with IaaS in our toolbox, infrastructure experts are needed in order to best manage, configure, and secure these IaaS resources. Those experts need to ensure security-hardened operating system versions are in use, and that vulnerability scans and software updates are happening regularly. They need to understand how traffic gets routed from the outside world and balanced across our hosts to deal with surges of traffic. The government domain further requires them to have compliance expertise in order to ensure the service being delivered satisfies a byzantine regulatory framework, and then generate a mountain of documentation to prove it to other people.
At 18F, we want our teams to stay small and quick but still deliver best-practice services that can stand up to punishment. A core challenge has been reducing the need for those highly skilled infrastructure experts to be on every one of our teams. We could have focused on scaling up a centralized infrastructure team to look after all of these concerns for all of our development teams, but that is not viable in the long term. So instead we have focused on enabling people with broad and shallow development expertise to accomplish things that would normally require specialized experts.
The magic happens when an infrastructure team encapsulates its expertise, and then exposes that expertise as a service that developers can use directly. This is what’s known as Platform as a Service (PaaS), and it’s a force multiplier that bridges the gap between small service teams and advanced infrastructure skill sets, while keeping your headcount under control. 18F has built on the open source project Cloud Foundry to create our own PaaS that we call cloud.gov.
Creating a feedback loop
Creating cloud.gov has had a tremendous effect on our teams. The job of the infrastructure team is shifting from “trying to look after all the detailed infrastructure work on all the teams” to “providing spot support and consultancy directly when teams run into trouble using cloud.gov, then using what was learned to increase cloud.gov’s capabilities.” Meanwhile, the development teams can incorporate advanced operational capabilities without spending a big portion of their efforts on them.
This results in a feedback loop where the infrastructure team steers the development of cloud.gov in the direction of making common irritations go away. It’s that agile, user-centered iteration loop we strive for in all our products, only the users in this case are development teams.
Giving those teams more time to work on their boat and less time worrying about the boat launch allows 18F to keep our teams small, our products robust, and our launches frequent.
Getting your own shipyard
We’re now exploring whether 18F’s cloud.gov meets the needs of several federal agencies that are participating in a cloud.gov pilot program. We plan to see what works and what doesn’t, fix any process bottlenecks that pop up, and then, we hope, roll the service out more broadly.
If you’re interested in cloud.gov, be sure to drop your email address in the cloud.gov form, or drop by our #devops-public Slack channel to chat about it.