Thursday, 6 August 2015

Directory format and default compression

After upgrading some clusters to PostgreSQL 9.4.4 I noticed an increase in the size of the database backups. Because the databases are quite large, I'm taking advantage of the parallel export introduced with PostgreSQL 9.3.

The parallel dump uses PostgreSQL's snapshot export with multiple backends, so every worker sees the same consistent snapshot. This functionality requires the dump to be in directory format, where a toc file is saved alongside the compressed exports, one for each table saved by pg_dump.

Initially I believed that the latest release had changed the default compression level. However, I noticed that with the custom format the resulting file was smaller.

For testing purposes I created a database with two tables of 700 MB each. Using the custom format with the default compression level they are saved in 442 MB. The directory format stores the same data in 461 MB, about 4% more.
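The 4% figure is simply the relative difference between the two dump sizes. Here is a minimal Python sketch of that arithmetic. The helper names are my own, and the directory walker is only an assumption about how one might measure a directory-format dump on disk.

```python
import os


def directory_size(path):
    """Sum the sizes of all regular files below path (e.g. a directory-format dump)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def overhead_percent(baseline, other):
    """Relative size difference of other against baseline, in percent."""
    return (other - baseline) / baseline * 100.0


# Sizes from the test above, in MB: custom format vs directory format.
print(round(overhead_percent(442, 461), 1))  # 4.3
```

To compare real dumps, one could pass os.path.getsize() of the custom-format file as the baseline and directory_size() of the dump directory as the other value.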

After some research I spotted a recent change introduced in version 9.4.2:
"In pg_dump, fix failure to honor -Z compression level option together with -Fd"

With another search, this time on the committers mailing list, I found this commit.

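Basically, the change in 9.4.2 "ignored the fact that gzopen() will treat "-1" in the mode argument as an invalid character", followed by the digit 1. When no -Z option is given, the directory format therefore ends up compressing at level 1, the lowest level that still compresses, instead of zlib's default.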
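To make the failure mode concrete, here is a small Python model of how a gzopen()-style mode string ends up at level 1. It is a deliberately simplified sketch, not zlib's real parser: I assume the mode is built by appending the numeric level to "w", that digits set the level, and that unknown characters such as "-" are skipped. The function name is hypothetical.

```python
Z_DEFAULT_COMPRESSION = -1  # zlib's marker for "use the default level"


def level_from_mode(mode):
    """Simplified model: a digit sets the level, any other character is skipped."""
    level = Z_DEFAULT_COMPRESSION
    for ch in mode:
        if ch in "0123456789":
            level = int(ch)
        # 'w', 'b', '-' and anything else leave the level untouched in this model
    return level


# Mode strings built by formatting the level into "w%d" (an assumption).
print(level_from_mode("w%d" % -1))  # 1: the "-" is dropped, the "1" wins
print(level_from_mode("w%d" % 6))   # 6
print(level_from_mode("w"))         # -1: no digit, so the default applies
```

In this model the default level only survives when no digit is written into the mode string at all, which is why formatting -1 into it silently selects level 1.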

The fix is already committed and should appear in the next minor release. Until then, if you use the directory format, the workaround is to pass the flag -Z 6 to pg_dump.
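As an illustration of the workaround, the following Python sketch assembles a parallel directory-format pg_dump command with the compression level set explicitly. Level 6 is zlib's usual default. The function name, database name, output path and job count are placeholders of mine, and the command is only built here, not executed.

```python
def pg_dump_directory_args(dbname, outdir, jobs=4, level=6):
    """Build the argument list for a parallel, directory-format dump.

    -Fd selects the directory format, -j sets the number of parallel jobs,
    -Z sets the compression level explicitly and -f names the output directory.
    """
    return [
        "pg_dump",
        "-Fd",
        "-j", str(jobs),
        "-Z", str(level),
        "-f", outdir,
        dbname,
    ]


# Placeholder names, shown as the equivalent shell command line.
args = pg_dump_directory_args("db_test", "/tmp/db_test_dump")
print(" ".join(args))
```

The resulting list can be handed to subprocess.run(args, check=True) on a host where pg_dump is installed and the connection settings come from the environment.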

The bug has been backpatched down to version 9.1. However, I never noticed the issue until I switched from 9.3.9 to 9.4.4.
