Three years ago, we wrote ‘What does Unsplash cost?’ to give a totally transparent look at the bills associated with hosting one of the largest photography sites in the world.
Since then, Unsplash has continued to grow tremendously, now powering more image use than the major image media incumbents, Shutterstock, Getty, and Adobe, combined.
With Unsplash’s public API, we power more than 1,000 mainstream applications, including Medium, Trello, Squarespace, Tencent, Naver, Square, Adobe, and Dropbox.
All of that growth means two things: more traffic and bigger bills.
In the interest of transparency, Chris and I thought we were overdue for an update.
It’s 2019. What does it cost to host Unsplash?
Back in 2016, Unsplash had just crossed 1 billion images viewed and 5.5M photos downloaded per month.
Our team was smaller and our product was a lot less developed, which meant fewer services and less in-house processing. We had one main application, a traditional Rails monolith, that consumed a handful of services to create the basic Unsplash experience.
Heavy features like search and realtime photo stats were in their infancy, so our data processing requirements were much simpler and we could rely on third-party services like Keen and a handful of cron jobs.
The final monthly breakdown for April 2016 was:
- Web Servers: $2,731.23
- Monitoring: $630.00
- Data Processing: $1,000.00
- Image Hosting: $11,170.00
- Other: $2,127.39
Total (USD): $17,658.62
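As a quick sanity check on the numbers above, here is a minimal Python sketch that re-adds the April 2016 line items and shows each category's share of the bill. The amounts are copied from the list above; the names `costs_2016` and `share` are just illustrative, not anything from our codebase.

```python
from decimal import Decimal

# Line items copied from the April 2016 breakdown above (USD per month).
costs_2016 = {
    "Web Servers": Decimal("2731.23"),
    "Monitoring": Decimal("630.00"),
    "Data Processing": Decimal("1000.00"),
    "Image Hosting": Decimal("11170.00"),
    "Other": Decimal("2127.39"),
}

# Decimal avoids floating-point rounding surprises when adding currency.
total = sum(costs_2016.values(), Decimal("0"))


def share(amount: Decimal, total: Decimal) -> Decimal:
    """Return amount as a percentage of total, rounded to one decimal place."""
    return (amount / total * 100).quantize(Decimal("0.1"))


if __name__ == "__main__":
    # Print categories from most to least expensive.
    for name, amount in sorted(costs_2016.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<16} ${amount:>10,.2f}  {share(amount, total)}%")
    print(f"{'Total':<16} ${total:>10,.2f}")
```

The line items add up to the stated total of $17,658.62, with image hosting making up well over half of the 2016 bill.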
A lot has changed.
For one, Unsplash is a hell of a lot bigger. 10+ times bigger. We now get more traffic from our API partners than from our own website and official apps, even though those have grown significantly too.
Partnering with some of the largest consumer-facing apps in the world has pushed our engineering team to match their practices around redundancy, monitoring, and availability, which requires more supporting resources and services.
Our product team has continued to push the envelope for core features like search and contributor stats, requiring more and more data to be processed in greater and greater volumes.
All of these things have pushed our architecture to be more complex, while also increasing the baseline costs.
Total monthly cost: $29,763
We continue to use Heroku as our main web platform. Despite its premium cost over AWS, Azure, and Google Cloud, Heroku’s built-in deployment and configuration tools allow our team to move faster, more confidently, and more reliably.
As we’ve detailed previously, the alternatives would undoubtedly be cheaper on paper. But in reality, the simplicity and freedom Heroku offers a small, product-focused team is a major cost saving.
In addition to our main web servers and databases using Heroku, we use Fastly for distributed CDN caching, Elastic Cloud for our Elasticsearch clusters, and Stream for our feed and notification architecture.
Total monthly cost: $7,679
Our team is small for Unsplash’s size, with our total product team counting in at just 11 people.
With no one dedicated to dev-ops, ensuring Unsplash runs smoothly and never goes down requires a lot of instrumentation and reporting.
Despite the volume of metrics we monitor and report on, New Relic, Sentry, and Datadog remain fairly inexpensive solutions. Our logging is certainly our largest monitoring expense, but the detailed information is crucial when debugging issues or rolling out new features.
Total monthly cost: $15,223
Data processing has been the area with the largest relative increase since 2016. Back then, analytics and data were an afterthought in our development process. We relied on tools like Google Analytics for user analytics and Keen for product metrics like photo views and downloads.
Since then, we’ve needed to expand our data collection, aggregation, and reporting significantly, both from a product and a company perspective. As Unsplash has grown, the volume has also increased considerably, with hundreds of millions of events tracked every day.
We’ve replaced Google Analytics and Keen with an open-source data pipeline, Snowplow Analytics. Snowplow takes care of the data collection and formatting, allowing Tim, our data engineer, to focus on data aggregation, modelling, and visualization.
We’ve also expanded the role of the data architecture in the product to handle all of our machine learning and search processing. As we go forward, we expect this to continue to be the biggest area of expansion.
Total monthly cost: $42,408
Imgix is our single biggest expense, but we love it. Yes, there are cheaper options, but trust us when we say that they aren’t as good for what we do.
We send petabytes of data through Imgix’s CDN and render more than 250 million variations of our source images every month. Their reliability, performance, and flexibility are unmatched, and negotiating our contract through them actually lowers our CDN costs, thanks to their bulk negotiations with CDN providers.