Unleashing the 🐙

By Shraya Ramani

Today we are open-sourcing sso, our single-sign-on authentication proxy (known internally as the S.S. Octopus), which we use to secure access to internal services at BuzzFeed. This post covers the history of authentication at BuzzFeed, why we wrote sso, and a little about how it works.

We think sso is a useful solution to a problem that many organizations face. Because we’ve benefited so much from the open source community, we’re excited to give back by making it available for anyone to use!

BuzzFeed’s software ecosystem comprises hundreds of microservices that interact with each other in a variety of ways. A subset of those services are applications exposed on the public internet that must be accessible only to privileged users.

As BuzzFeed’s global employee base grew, the need to expose tools to the internet for internal use became more apparent, which created an equally growing need to secure those applications with user authentication. To establish a single source of truth for identity, we standardized how we protect those applications by using Bitly’s open source oauth2_proxy, a reverse proxy that uses third-party OAuth2 providers (Google, GitHub, and others) to authenticate and authorize requests.

Google’s Identity-Aware Proxy, which embodies their BeyondCorp philosophy, was an inspiration for us. It’s a generally effective pattern for microservices because it allows developers to focus on their services’ primary functionality instead of reimplementing authentication every time. BuzzFeed’s use of oauth2_proxy allowed developers to rapidly grow the number of internal applications deployed on the platform.

For a while, using oauth2_proxy in front of services was an easy drop-in solution for developers creating services; however, as the number of services grew more rapidly over the years, the solution was not as scalable as we had hoped.

As operators, managing the proliferation of boilerplate auth proxy services proved difficult. Critical security fixes required 100+ patches and deploys, since each protected microservice had its own auth proxy service to go with it. Auditing and controlling access across those services was also an ongoing challenge.

Scalability issues were not exclusive to the operators and developers. End users were required to sign into each application separately, which could be frustrating and confusing. These separate logins also prevented the development of seamless workflows between related tools. Finally, this had the unintended side effect of training users to blindly click through the OAuth2 login flow, instilling bad security habits.

Our solution to these pain points is sso, which allowed us to replace every individual oauth2_proxy service with a single system providing a seamless and secure single sign-on experience, easy auditing, rich instrumentation, and a painless developer experience.

sso is an OAuth2-friendly adaptation of the Central Authentication Service protocol (CAS). The CAS protocol uses a “federated” approach, where all authentication is handled by a centralized service, instead of individual applications.

Our implementation consists of two services, sso-auth and sso-proxy, that cooperate to perform a nested authentication flow and proxy requests:

  • sso-auth is the central authentication service, which directs a user through an OAuth flow with a third-party provider (e.g. Google).
  • sso-proxy ensures all requests are authenticated and authorized according to sso-auth before proxying them to upstream services, and signs requests to allow verification that the requests originate from sso-proxy.
  • Both sso-auth and sso-proxy store user session information in long-lived, encrypted cookies, but sso-proxy transparently re-validates the user’s session with sso-auth on a short, configurable interval to ensure quick propagation of authentication/authorization changes.
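Two of the sso-proxy behaviours in the list above, request signing and periodic session re-validation, can be illustrated with a small Python sketch. This is not sso’s actual code: the HMAC scheme, the shared secret, and the names `sign_request`, `verify_request`, `Session`, and `check_session` are all assumptions chosen for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable

# Hypothetical shared secret; a real deployment would load this from configuration.
SHARED_SECRET = b"example-secret"


def sign_request(method: str, path: str, body: bytes,
                 secret: bytes = SHARED_SECRET) -> str:
    """Return a hex HMAC-SHA256 digest over the request method, path, and body."""
    message = b"\n".join([method.upper().encode(), path.encode(), body])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(method: str, path: str, body: bytes, signature: str,
                   secret: bytes = SHARED_SECRET) -> bool:
    """Upstream-side check that the signature was produced with the shared secret."""
    expected = sign_request(method, path, body, secret)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)


@dataclass
class Session:
    email: str
    valid_until: float  # time after which the proxy must re-check with sso-auth
    expires_at: float   # end of the long-lived cookie session


def check_session(session: Session, now: float,
                  revalidate: Callable[[str], bool], interval: float) -> bool:
    """Accept the session, re-validating with the auth service when it is stale."""
    if now >= session.expires_at:
        return False  # the long-lived session is over; the user must log in again
    if now >= session.valid_until:
        if not revalidate(session.email):
            return False  # access was revoked since the last check
        session.valid_until = now + interval
    return True
```

Here `revalidate` stands in for a call from sso-proxy to sso-auth. A shorter `interval` means revoked access propagates faster, at the cost of more calls to the auth service.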
  1. When an organization would like to secure their services behind sso, they create a wildcard DNS entry *.sso.pacworld.com which points at their deployment of sso-proxy *
  2. They want to use sso to secure their service, which is deployed at ghost-land-internal.pacworld.com **
  3. So, they add ghost-land to their sso-proxy configuration ✝✝ file:

       - service: ghost-land
         default:
           from: ghost-land.sso.pacworld.com
           to: ghost-land-internal.pacworld.com
           allowed_groups:
             - ghosts@pacworld.com

  4. Now, their employees can securely access Ghost Land at ghost-land.sso.pacworld.com

* In practice, we usually create a more user-friendly domain like ghost-land.pacworld.com that points to ghost-land.sso.pacworld.com.
** sso upstreams are defined using a static config file, and ‘service discovery’ is handled by DNS.
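To make the example concrete, here is a minimal Python sketch of how a proxy could map an incoming Host header to an upstream using a static configuration like the one above. The in-memory `CONFIG` structure and the function names `build_routes` and `resolve_upstream` are assumptions for illustration, not sso’s real implementation.

```python
# Python equivalent of the ghost-land configuration entry (hypothetical in-memory form).
CONFIG = [
    {
        "service": "ghost-land",
        "default": {
            "from": "ghost-land.sso.pacworld.com",
            "to": "ghost-land-internal.pacworld.com",
            "allowed_groups": ["ghosts@pacworld.com"],
        },
    },
]


def build_routes(config):
    """Index the static config by the public hostname each service is served from."""
    return {entry["default"]["from"]: entry["default"] for entry in config}


def resolve_upstream(host, routes):
    """Return the upstream host for a request's Host header, or None if unknown."""
    hostname = host.split(":", 1)[0].lower()  # drop any port, normalise case
    route = routes.get(hostname)
    return route["to"] if route else None


routes = build_routes(CONFIG)
upstream = resolve_upstream("ghost-land.sso.pacworld.com", routes)
```

Because the wildcard DNS entry sends every `*.sso.pacworld.com` hostname to the same proxy, the lookup table is the only place a new service has to be registered.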

When a user visits an sso-protected site for the first time, they are redirected to sso-auth and prompted to authenticate with an authoritative third party provider (Google) before proceeding to their destination.

When that same user (call them Pinky) visits a different sso-protected site for the first time, their browser is redirected to sso-auth and then immediately back to sso-proxy because they have already authenticated, which logs them in automatically and transparently.
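The two cases above differ only in which sessions already exist. The following Python sketch models the sequence of hops a browser makes; it is a simplification that assumes each protected site keeps its own sso-proxy session while sso-auth keeps one shared session, and the function name `login_hops` is hypothetical.

```python
from typing import List


def login_hops(has_proxy_session: bool, has_auth_session: bool) -> List[str]:
    """List the services a browser passes through before reaching the upstream."""
    hops = ["sso-proxy"]
    if not has_proxy_session:
        # No session for this site yet: sso-proxy redirects to sso-auth.
        hops.append("sso-auth")
        if not has_auth_session:
            # First login anywhere: sso-auth sends the user to the OAuth provider.
            hops += ["provider", "sso-auth"]
        # sso-auth redirects back to the proxy, which sets its own session.
        hops.append("sso-proxy")
    hops.append("upstream")
    return hops


first_site = login_hops(has_proxy_session=False, has_auth_session=False)
second_site = login_hops(has_proxy_session=False, has_auth_session=True)
```

On the second site the provider step is skipped entirely, which is why the login appears instantaneous to the user.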

Every step along the way, sso-auth ensures that the user is authenticated against the authoritative third party OAuth provider, and sso-proxy ensures that the user is authorized to access each specific upstream based on their email address and group membership.
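The per-upstream authorization step could look roughly like the Python sketch below. It assumes the proxy already knows the authenticated user’s email address and group memberships; the function name `is_authorized` and the `allowed_emails` parameter are hypothetical and only illustrate the idea of checking both email address and group membership.

```python
def is_authorized(email, user_groups, allowed_groups, allowed_emails=()):
    """Allow access if the email is listed explicitly or any group matches."""
    if email.lower() in {e.lower() for e in allowed_emails}:
        return True
    # Any overlap between the user's groups and the upstream's allowed groups suffices.
    return bool(set(user_groups) & set(allowed_groups))


pinky_allowed = is_authorized(
    "pinky@pacworld.com",
    user_groups=["ghosts@pacworld.com"],
    allowed_groups=["ghosts@pacworld.com"],
)
```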

With sso, adding authentication to a service is now much more straightforward: just a simple configuration change. Maintaining the security of our services is also much simpler; a security bug fix now only needs to be made in one place, rather than 100. Users love it too: they only have to log in once to access all of the services behind sso, rather than logging in many times!

Rich instrumentation baked into the system, including statsd metrics and structured logging, gives us clear visibility into sso and a better understanding of our internal services.

Detailed instrumentation gives us great visibility into the usage and performance of our internal applications
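As a rough illustration of what statsd metrics and structured logging look like on the wire, here is a short Python sketch. The metric names and log fields are made up for the example and are not the ones sso emits; only the plain statsd line format (`name:value|type`) is standard.

```python
import json


def statsd_counter(name, value=1):
    """Format a counter increment in the plain statsd line protocol."""
    return f"{name}:{value}|c"


def statsd_timer(name, milliseconds):
    """Format a timing measurement in the plain statsd line protocol."""
    return f"{name}:{milliseconds}|ms"


def log_line(event, **fields):
    """Render one structured log entry as a single JSON object."""
    return json.dumps({"event": event, **fields}, sort_keys=True)


# Hypothetical metric and field names, for illustration only.
counter = statsd_counter("sso_proxy.request")
timer = statsd_timer("sso_proxy.request_duration", 12.5)
entry = log_line("request", service="ghost-land", status=200, user="pinky@pacworld.com")
```

In practice the statsd lines would be sent over UDP to a statsd daemon, and the JSON entries written to a log stream that can be filtered by field, for example by service or user.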

As mentioned above, sso is built on top of Bitly’s open source oauth2_proxy, which has been community verified and hardened. Throughout its development at BuzzFeed we have made sso a priority target for penetration testing by researchers on our bug bounty program — we’ve paid bounties for a number of reported issues!

In preparation for open sourcing, we also engaged Security Innovation, a widely respected agency that counts Microsoft, Symantec, and Amazon as clients, to do a more in-depth, week-long assessment with full access to source code and design documents. The assessment found no major issues, which gives us the confidence to open source sso today. However, being mindful that the security landscape changes rapidly, we will continue to make sso available for BuzzFeed’s bug bounty program and encourage responsible disclosure of any security issues there!

Here is the link to the GitHub repo and quickstart guide. We’d love your feedback, so please try it out and open some issues (or pull requests)!

First of all, this project would not exist if it weren’t for Justin Hines, who developed the central ideas and helped bring it to life with our original founding team, Michael Hansen, Will McCutchen, and myself. We’d also like to thank Andrew Mulholland, Dan Katz, Dan Meruelo, Eleanor Saitta, Logan McDonald, Lystra Batchoo, and Matt Reiferson, for their work on open sourcing this project. Thanks to Kelsey Scherer for our amazing octoboi logo, which we love very much. Finally, we extend a huge thank you to the BuzzFeed organization as a whole for valuing open source work and supporting our team throughout this process, especially the Infrastructure squads!

To keep in touch with us here and find out what’s going on at BuzzFeed Tech, be sure to follow us on Twitter @BuzzFeedExp where a member of our Tech team takes over the handle for a week!