COSCUP 2024

Building Petabyte-Scale PostgreSQL Clusters with Bagger
2024-08-03, 13:00–13:30 (Asia/Taipei), TR409-2

This talk discusses the challenges of storing petabytes of log data in PostgreSQL and some potential solutions to them. PostgreSQL itself doesn't have everything you might need for a distributed transaction-processing database, but it is remarkably capable in other areas. This talk provides one example.


When I was at Adjust, we replaced Elasticsearch with an in-house solution built on PostgreSQL to avoid the scalability limits we had hit in Elasticsearch at about 1 PB of data. This talk covers:

  • The design of a system that scales linearly to vast amounts of data
  • Why and how we patched PostgreSQL to support our endeavor
  • An open source project called Bagger, built on our experience
  • Scalability tradeoffs of the design

Chris Travers has over 25 years of experience with PostgreSQL and other open source technologies. He has worked as a software developer and engineer, database administrator, engineering manager, and consultant. He formerly led both the platform teams (using Gentoo Linux) and the database teams (using PostgreSQL) at Adjust. He has also contributed to a variety of open source projects, including PostgreSQL.