Monday, April 27, 202
I am often asked if Postgres supports compression, and my answer is always a complicated dance around what "compression" level they are asking about. There are six possible levels of database compression:
- single field
- across rows in a single page
- across rows in a single column
- across all columns and rows in a table
- across tables in a database
- across databases
Number one (single field) is currently done by toast. Number two (across rows in a single page) is a practical optimization where a compression routine blindly looks for repeating values in a page without understanding its structure. The difficulty of implementing this happens when a page is stored using its compressed length (rather than the uncompressed 8k), the page contents change, and the new contents compress less well than the previous contents. In this case, the compressed page contents would be larger and it would be very complex to fit the page into the existing space in the file. A different file layout is really required for this, so pages can be placed anywhere in the file, without affecting index access. A team is working on adding this feature using Postgres's table access method interface.
Number three (across rows in a single column) is the classic definition of a columnar database. A team is also working on that. Just like number two, this requires using a different storage layout than Postgres's default, and the table access method interface makes this possible.
Number four can be done using file system compression. Numbers five and six would be nice, but it unclear how this could be done efficiently without adding unacceptable complexity to the database.