Friday 1 August 2014

Chapter 8's final two parts. pg_dumpall and the export's validation

pg_dumpall

The pg_dumpall program is mostly a wrapper for pg_dump. It's usage is mainly for dumping all the databases in the cluster. Having lesser options than pg_dump is also less flexible. However one option is absolutely useful and should be included in any backup plan to ensure a rapid recovery in case of disaster.
The -globals-only option saves on the standard output all the cluster wide options like the tablespaces, the roles and the privileges. The passwords are dumped as well encrypted in md5 format. The dump format for this utility is only text. The best way to save the globals is to specify the -f option followed by the file name. This file can be loaded into an empty cluster to recreate the global objects. The tablespaces, if any present, must have the filesystem already in place before running the sql as PostgreSQL doesn't create the filesystem structure.

This example shows the program call and the contents of the output file.
postgres@tardis:~/dmp$ pg_dumpall --globals-only -f main_globals.sql
postgres@tardis:~/dmp$ cat main_globals.sql 
--
-- PostgreSQL database cluster dump
--

SET default_transaction_read_only = off;

SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;

--
-- Roles
--

CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION;






--
-- PostgreSQL database cluster dump complete
--

postgres@tardis:~/dmp$

Backup validation

There's little advantage in having a backup if this is not restorable. The corruption danger is at various levels and unfortunately the problem appears only when the restore is needed.
The corruption have many causes. If the dump is saved locally a damaged filesystem can hide the problem meanwhile the corrupted block is not read. Also the disk subsystem with undetected problems will result in a silent corruption. In order to limit this kind of problems the filesystem of choice should be a solid one with strong journaling support.

The disk subsystem should guarantee the data reliability rather the speed. Slow disks when backing up the data, in particular if in the compressed format don't limit the speed, being the cpu power the real bottleneck.
If the dump file is transferred over the network is a good idea to generate a md5 checksum to check the file integrity after the transfer.

All those measures don't give the security the backup is restorable. The only test capable to ensure the backup is good is a test restore on separate server. This can be a single test or a more structured check. Which strategy to adopt is determined by the amount of data, the time required for the restore and the backup schedule.

The general purpose databases, which size is measurable in hundreds of gigabytes, the restore can complete in few hours and the continuous test is feasible. For the VLDB, which size is measured in terabytes, the restore can take more than one day, in particular if there are big indices requiring expensive sort on disk for the build. In this scenario a weekly restore gives a good perception if there are potential problems with the saved data.

No comments:

Post a Comment