Monday, 25 August 2014

The missing chapter 6 parts 1 and 2, data integrity

I've started the sixth chapter, the one on data integrity I'd forgotten. Here are the first two parts along with the introduction. I've also updated the book on Slideshare with the new cover and the latest incomplete chapter for the developers. The beautiful cover was made by Chiaretta & Bon. Kudos and many thanks.

I've also uploaded the LaTeX sources to GitHub for anybody to fork and review my crappy English. My former colleague and friend Craig Barnes has already started reviewing the tex files; many thanks for the priceless help.


Here's the GitHub repository URL: https://github.com/the4thdoctor/pgdba_books

Data integrity

There's just one thing worse than losing the database: having the data set full of rubbish. Data integrity has been part of PostgreSQL since the beginning, and the database offers various levels of strength for ensuring the data is clean and consistent. In this chapter we'll have a brief look at the various constraints available. PostgreSQL's constraints can be grouped into two kinds: table constraints and column constraints. Table constraints are defined in the table definition after the field list. Column constraints appear in the field definition after the data type. Usually the primary keys and the unique keys are written as table constraints.
A constraint is enforced on every row of the table without exception. When creating a constraint on a fully populated table the existing data is validated first, and any validation error aborts the constraint creation. However, foreign keys and check constraints accept the clause NOT VALID. With this clause the database assumes the data is valid and skips the validation, so the creation is almost immediate. The new constraint is then enforced only for the new data. When using this option the existing data must be consistent.
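As an illustration, here is how a check constraint could be added to an already populated table skipping the validation; the table, column and constraint names are invented for the example.

-- hypothetical table used only for this example
CREATE TABLE t_example
(
        i_id      serial,
        i_value   integer
);

-- the constraint is created without validating the existing rows
ALTER TABLE t_example
        ADD CONSTRAINT chk_positive_value CHECK (i_value > 0) NOT VALID;

-- later, when the existing data is known to be clean, the constraint can be validated
ALTER TABLE t_example
        VALIDATE CONSTRAINT chk_positive_value;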

Primary keys

A primary key is the unique row identifier. Having this constraint enforced ensures the row can be addressed directly using the key value. A primary key can be enforced on a single column or on multiple columns. The key values must be unique at the strictest level, meaning NULL values are not permitted in the columns participating in the primary key. Creating a primary key implicitly adds a new unique index on the affected fields. In order to avoid the exclusive lock on the affected table, the unique index can be built beforehand with the CONCURRENTLY clause and then used in the primary key definition, as shown in 8.3. Using the primary key is the fastest way to access the table's contents.
Here is the primary key definition written as a table constraint and as a column constraint.
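The sketch below uses table and field names invented just for the illustration.

-- primary key as a column constraint
CREATE TABLE t_user_col
(
        i_id_user  serial PRIMARY KEY,
        v_username text
);

-- primary key as a table constraint, with an explicit constraint name
CREATE TABLE t_user_tab
(
        i_id_user  serial,
        v_username text,
        CONSTRAINT pk_t_user_tab PRIMARY KEY (i_id_user)
);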



With the table constraint definition it is possible to specify the constraint name and to have a multi column constraint. When writing a multi column constraint the participating columns should be listed separated by commas.

The most common primary key implementation, and probably the best, is to have a serial column as primary key. A serial field is a shorthand for an integer NOT NULL column whose default value is the nextval of an automatically generated sequence. Because the sequence's upper limit is the bigint upper limit, the generated values do not wrap during the table's lifetime. If the primary key is expected to reach the value of 2,147,483,647 the type of choice should be bigserial rather than serial. This creates the primary key's field as bigint, whose upper limit is 9,223,372,036,854,775,807.
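For example, a hypothetical table declared with a serial primary key is expanded by PostgreSQL roughly as follows.

-- shorthand form
CREATE TABLE t_log
(
        i_id_log  serial PRIMARY KEY,
        t_message text
);

-- roughly equivalent expanded form
CREATE SEQUENCE t_log_i_id_log_seq;
CREATE TABLE t_log
(
        i_id_log  integer NOT NULL DEFAULT nextval('t_log_i_id_log_seq') PRIMARY KEY,
        t_message text
);
ALTER SEQUENCE t_log_i_id_log_seq OWNED BY t_log.i_id_log;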

However it's still possible to alter the field later in order to match the new requirements. Changing the data type requires a complete table rewrite, and any view referencing the affected column will cause the change to abort.
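As a sketch, assuming t_data has an integer key field named i_id_data (the field name is invented for the example), the type change would be run like this.

-- raise the client message level to show the rewrite activity
SET client_min_messages TO 'debug3';

-- change the hypothetical field i_id_data from integer to bigint
ALTER TABLE t_data
        ALTER COLUMN i_id_data SET DATA TYPE bigint;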

Here's the t_data type change output with the client message level set to debug3.



Dealing with a big amount of data also presents the problem of having enough space to fit twice the original table, plus the downtime caused by the exclusive lock on the affected relation. A far better approach is to add a new nullable bigint column without a default value. A trigger on the inserts keeps the new column in sync with the original primary key for the incoming rows. An update procedure then backfills the value for the existing rows. This should run in small batches in order to avoid filling the pg_xlog directory with long running transactions. When everything is in place the new column can become NOT NULL and a unique index finally enforces the uniqueness of the new field.
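Here is a minimal sketch of the procedure, assuming again the t_data table with the integer key i_id_data; the names of the new field, the function and the trigger are invented for the example.

-- new nullable bigint field, added without a default value to avoid the table rewrite
ALTER TABLE t_data ADD COLUMN i_id_data_new bigint;

-- trigger function keeping the new field in sync for the incoming rows
CREATE OR REPLACE FUNCTION fn_sync_new_key()
RETURNS trigger AS
$$
BEGIN
        NEW.i_id_data_new := NEW.i_id_data;
        RETURN NEW;
END;
$$
LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_new_key
        BEFORE INSERT ON t_data
        FOR EACH ROW
        EXECUTE PROCEDURE fn_sync_new_key();

-- backfill the existing rows in small batches, one batch per transaction
UPDATE t_data
        SET i_id_data_new = i_id_data
        WHERE i_id_data BETWEEN 1 AND 100000;
-- repeat for the following key ranges until the backfill is complete

-- finally enforce the NOT NULL and the uniqueness on the new field
ALTER TABLE t_data ALTER COLUMN i_id_data_new SET NOT NULL;
CREATE UNIQUE INDEX CONCURRENTLY idx_i_id_data_new ON t_data (i_id_data_new);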

The primary key can then be dropped and recreated using the new unique index. This is permitted only if there's no foreign key referencing the field; otherwise the drop and create of the involved constraints must be combined in a single statement. The final cleanup should include the drop of the trigger and the removal of the old primary key field. Any view using the old primary key should be rebuilt before the drop.
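Continuing the sketch above, the swap onto the new unique index could look like this; the old constraint name pk_t_data is invented for the example.

-- drop the old primary key and promote the unique index in a single statement;
-- any foreign key referencing the field must be dropped and recreated here as well
ALTER TABLE t_data
        DROP CONSTRAINT pk_t_data,
        ADD CONSTRAINT pk_t_data_new PRIMARY KEY USING INDEX idx_i_id_data_new;

-- final cleanup: drop the sync trigger and remove the old key field
DROP TRIGGER trg_sync_new_key ON t_data;
DROP FUNCTION fn_sync_new_key();
ALTER TABLE t_data DROP COLUMN i_id_data;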

Unique keys

The unique keys are similar to the primary keys. They enforce the uniqueness using an implicit index, but they allow the presence of NULL values. Their purpose is to enforce uniqueness on columns not used as the primary key. Like the primary key, a unique constraint is backed by a unique index; in fact there's little difference between a unique index and a unique key except the presence of the latter in the system table pg_constraint.
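For example, reusing the hypothetical t_user_tab table from the sketch above, the uniqueness of the username could be enforced in either of these two ways.

-- unique key as a table constraint, recorded in pg_constraint
ALTER TABLE t_user_tab
        ADD CONSTRAINT uq_t_user_tab_username UNIQUE (v_username);

-- alternatively, a plain unique index enforcing the same rule
-- without an entry in pg_constraint
CREATE UNIQUE INDEX idx_uq_username ON t_user_tab (v_username);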

Sunday, 17 August 2014

Chapter 10 part 2 - The binary formats

The three binary formats supported by pg_dump are the custom, the directory and the tar format. The first two can be accessed randomly by the restore program and support the parallel restore, being the best choice for a flexible and reliable restore. Before version 9.3 the only format supporting the parallel restore was the custom format. With this version the directory format also accepts the -j switch. This feature, combined with the parallel dump seen in 9.3, is a massive improvement for saving big amounts of data. The tar format has a 12 GB limit on the archive's file size and offers neither the parallel nor the selective restore.

The custom format is a binary archive with a table of contents pointing to the various archive sections. The directory format is a directory whose name is the value provided with the -f switch. The directory contains a toc.dat file, where the table of contents and the schema are stored. For each table there is a gzip file whose name is a number corresponding to the toc entry for the saved relation; those files store the relation's data.
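For instance, a dump of the db_addr database used later in this post could be taken in the directory format like this; the directory name is just an example.

postgres@tardis:~/dump$ pg_dump -Fd -f db_addr_dir db_addr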

The restore from the binary formats requires the usage of pg_restore. Because almost all of pg_dump's switches are supported by pg_restore we'll not repeat the run through; take a look at 9.1 for the switch meanings. Anyway, this is the pg_restore help output.

pg_restore restores a PostgreSQL database from an archive created by pg_dump.

Usage:
  pg_restore [OPTION]... [FILE]

General options:
  -d, --dbname=NAME        connect to database name
  -f, --file=FILENAME      output file name
  -F, --format=c|d|t       backup file format (should be automatic)
  -l, --list               print summarized TOC of the archive
  -v, --verbose            verbose mode
  -V, --version            output version information, then exit
  -?, --help               show this help, then exit

Options controlling the restore:
  -a, --data-only              restore only the data, no schema
  -c, --clean                  clean (drop) database objects before recreating
  -C, --create                 create the target database
  -e, --exit-on-error          exit on error, default is to continue
  -I, --index=NAME             restore named index
  -j, --jobs=NUM               use this many parallel jobs to restore
  -L, --use-list=FILENAME      use table of contents from this file for
                               selecting/ordering output
  -n, --schema=NAME            restore only objects in this schema
  -O, --no-owner               skip restoration of object ownership
  -P, --function=NAME(args)    restore named function
  -s, --schema-only            restore only the schema, no data
  -S, --superuser=NAME         superuser user name to use for disabling triggers
  -t, --table=NAME             restore named table(s)
  -T, --trigger=NAME           restore named trigger
  -x, --no-privileges          skip restoration of access privileges (grant/revoke)
  -1, --single-transaction     restore as a single transaction
  --disable-triggers           disable triggers during data-only restore
  --no-data-for-failed-tables  do not restore data of tables that could not be
                               created
  --no-security-labels         do not restore security labels
  --no-tablespaces             do not restore tablespace assignments
  --section=SECTION            restore named section (pre-data, data, or post-data)
  --use-set-session-authorization
                               use SET SESSION AUTHORIZATION commands instead of
                               ALTER OWNER commands to set ownership

Connection options:
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before restore

If no input file name is supplied, then standard input is used.

Report bugs to <pgsql-bugs@postgresql.org>.
If the database connection is omitted pg_restore sends the output to the standard output. The -f switch sends the output to a file instead. This is very useful to check that the original dump file is readable, for example executing a restore to the /dev/null device.
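For example, the readability of the db_addr.dmp archive created later in this post could be checked like this, discarding the generated SQL.

postgres@tardis:~/dump$ pg_restore -f /dev/null db_addr.dmp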
The custom and directory formats show their power when restoring over a database connection on a multi core system. Using the -j switch it's possible to specify the number of parallel jobs for the data and the post-data sections. This can massively improve the recovery time, running the most time consuming actions in multiple processes.
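As a sketch, a restore using four parallel jobs would look like this; the job count is only an example and should match the available cores.

postgres@tardis:~/dump$ pg_restore -j 4 -d db_addr_restore_bin db_addr.dmp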

The word parallel can be confusing in some way. PostgreSQL does not support multithreading, which means each backend process will use just one CPU. In this context each job takes care of a different area of the restore's table of contents. The TOC is split into many queues, each with a fixed list of objects to process. For example, one queue will contain the data restoration for a table together with the relation's indices and constraints.

The --section switch offers fine grained control over which section of the archived data will be restored. In the custom and directory formats there are three distinct sections.
  • pre-data This section restores only the schema definitions which do not affect the speed and reliability of the data restore, e.g. the tables' DDL, the functions, the extensions, etc.
  • data The data restore itself, by default saved as COPY statements to speed up the process.
  • post-data This section restores all the objects enforcing the data integrity, like the primary and foreign keys, the triggers and the indices, whose presence during the data reload would slow it down massively.
The -C switch creates the target database before starting the restore. To do this the connection must first happen to another database.
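For example, a one shot restore recreating the db_addr database could first connect to the postgres maintenance database.

postgres@tardis:~/dump$ pg_restore -C -d postgres db_addr.dmp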

We'll now see how to restore the database seen in 10.1 with the same two step approach, using the custom format.

Let's start with a complete database dump using the custom format.
postgres@tardis:~/dump$ pg_dump -Fc -f db_addr.dmp  db_addr
pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: reading extensions
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined operator families
pg_dump: reading user-defined text search parsers
pg_dump: reading user-defined text search templates
pg_dump: reading user-defined text search dictionaries
pg_dump: reading user-defined text search configurations
pg_dump: reading user-defined foreign-data wrappers
pg_dump: reading user-defined foreign servers
pg_dump: reading default privileges
pg_dump: reading user-defined collations
pg_dump: reading user-defined conversions
pg_dump: reading type casts
pg_dump: reading table inheritance information
pg_dump: reading event triggers
pg_dump: finding extension members
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: finding the columns and types of table "t_address"
pg_dump: finding default expressions of table "t_address"
pg_dump: finding the columns and types of table "t_city"
pg_dump: finding default expressions of table "t_city"
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: reading indexes for table "t_address"
pg_dump: reading indexes for table "t_city"
pg_dump: reading constraints
pg_dump: reading foreign key constraints for table "t_address"
pg_dump: reading foreign key constraints for table "t_city"
pg_dump: reading triggers
pg_dump: reading triggers for table "t_address"
pg_dump: reading triggers for table "t_city"
pg_dump: reading rewrite rules
pg_dump: reading large objects
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = on
pg_dump: saving database definition
pg_dump: dumping contents of table t_address
pg_dump: dumping contents of table t_city
We'll use a second database for the restore.
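For example, the empty target database could be created with createdb.

postgres@tardis:~/dump$ createdb db_addr_restore_bin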





We'll then restore just the schema using the following command.
postgres@tardis:~/dump$ pg_restore -v -s -d db_addr_restore_bin db_addr.dmp 
pg_restore: connecting to database for restore
pg_restore: creating SCHEMA public
pg_restore: creating COMMENT SCHEMA public
pg_restore: creating EXTENSION plpgsql
pg_restore: creating COMMENT EXTENSION plpgsql
pg_restore: creating TABLE t_address
pg_restore: creating SEQUENCE t_address_i_id_addr_seq
pg_restore: creating SEQUENCE OWNED BY t_address_i_id_addr_seq
pg_restore: creating TABLE t_city
pg_restore: creating SEQUENCE t_city_i_id_city_seq
pg_restore: creating SEQUENCE OWNED BY t_city_i_id_city_seq
pg_restore: creating DEFAULT i_id_addr
pg_restore: creating DEFAULT i_id_city
pg_restore: creating CONSTRAINT pk_i_id_city
pg_restore: creating CONSTRAINT pk_id_address
pg_restore: creating FK CONSTRAINT fk_t_city_i_id_city
pg_restore: setting owner and privileges for DATABASE db_addr
pg_restore: setting owner and privileges for SCHEMA public
pg_restore: setting owner and privileges for COMMENT SCHEMA public
pg_restore: setting owner and privileges for ACL public
pg_restore: setting owner and privileges for EXTENSION plpgsql
pg_restore: setting owner and privileges for COMMENT EXTENSION plpgsql
pg_restore: setting owner and privileges for TABLE t_address
pg_restore: setting owner and privileges for SEQUENCE t_address_i_id_addr_seq
pg_restore: setting owner and privileges for SEQUENCE OWNED BY t_address_i_id_addr_seq
pg_restore: setting owner and privileges for TABLE t_city
pg_restore: setting owner and privileges for SEQUENCE t_city_i_id_city_seq
pg_restore: setting owner and privileges for SEQUENCE OWNED BY t_city_i_id_city_seq
pg_restore: setting owner and privileges for DEFAULT i_id_addr
pg_restore: setting owner and privileges for DEFAULT i_id_city
pg_restore: setting owner and privileges for CONSTRAINT pk_i_id_city
pg_restore: setting owner and privileges for CONSTRAINT pk_id_address
pg_restore: setting owner and privileges for FK CONSTRAINT fk_t_city_i_id_city
The dump file is specified as the last parameter. The -d switch tells pg_restore which database to connect to for the restore. By default the postgres user connects locally using the ident operating system authentication or the trust method. That's the reason why in this example there's no need to specify the username or enter a password.

The second restore step is the data load. In the example seen in 10.1 we used pg_dump with the --disable-triggers switch in order to avoid failures caused by constraint violations. With the custom format the switch is used at restore time.
postgres@tardis:~/dump$ pg_restore --disable-triggers -v -a -d db_addr_restore_bin db_addr.dmp 
pg_restore: connecting to database for restore
pg_restore: disabling triggers for t_address
pg_restore: processing data for table "t_address"
pg_restore: enabling triggers for t_address
pg_restore: executing SEQUENCE SET t_address_i_id_addr_seq
pg_restore: disabling triggers for t_city
pg_restore: processing data for table "t_city"
pg_restore: enabling triggers for t_city
pg_restore: executing SEQUENCE SET t_city_i_id_city_seq
pg_restore: setting owner and privileges for TABLE DATA t_address
pg_restore: setting owner and privileges for SEQUENCE SET t_address_i_id_addr_seq
pg_restore: setting owner and privileges for TABLE DATA t_city
pg_restore: setting owner and privileges for SEQUENCE SET t_city_i_id_city_seq
However, this approach does not prevent the slowness caused by the indices when reloading the data. If a restore with multiple steps is required (e.g. creating the database schema and checking that all the relations are in place before loading), the --section switch is a better choice. Let's see how it works with the previous example.

We'll first restore the pre-data section.
postgres@tardis:~/dump$ pg_restore --section=pre-data -v  -d db_addr_restore_bin db_addr.dmp 
pg_restore: connecting to database for restore
pg_restore: creating SCHEMA public
pg_restore: creating COMMENT SCHEMA public
pg_restore: creating EXTENSION plpgsql
pg_restore: creating COMMENT EXTENSION plpgsql
pg_restore: creating TABLE t_address
pg_restore: creating SEQUENCE t_address_i_id_addr_seq
pg_restore: creating SEQUENCE OWNED BY t_address_i_id_addr_seq
pg_restore: creating TABLE t_city
pg_restore: creating SEQUENCE t_city_i_id_city_seq
pg_restore: creating SEQUENCE OWNED BY t_city_i_id_city_seq
pg_restore: creating DEFAULT i_id_addr
pg_restore: creating DEFAULT i_id_city
pg_restore: setting owner and privileges for DATABASE db_addr
pg_restore: setting owner and privileges for SCHEMA public
pg_restore: setting owner and privileges for COMMENT SCHEMA public
pg_restore: setting owner and privileges for ACL public
pg_restore: setting owner and privileges for EXTENSION plpgsql
pg_restore: setting owner and privileges for COMMENT EXTENSION plpgsql
pg_restore: setting owner and privileges for TABLE t_address
pg_restore: setting owner and privileges for SEQUENCE t_address_i_id_addr_seq
pg_restore: setting owner and privileges for SEQUENCE OWNED BY t_address_i_id_addr_seq
pg_restore: setting owner and privileges for TABLE t_city
pg_restore: setting owner and privileges for SEQUENCE t_city_i_id_city_seq
pg_restore: setting owner and privileges for SEQUENCE OWNED BY t_city_i_id_city_seq
pg_restore: setting owner and privileges for DEFAULT i_id_addr
pg_restore: setting owner and privileges for DEFAULT i_id_city
Again pg_restore loads the objects and sets the ownership and privileges. What's missing is the constraint creation. The second step is the data section load.
postgres@tardis:~/dump$ pg_restore --section=data -v  -d db_addr_restore_bin db_addr.dmp 
pg_restore: connecting to database for restore
pg_restore: implied data-only restore
pg_restore: processing data for table "t_address"
pg_restore: executing SEQUENCE SET t_address_i_id_addr_seq
pg_restore: processing data for table "t_city"
pg_restore: executing SEQUENCE SET t_city_i_id_city_seq
pg_restore: setting owner and privileges for TABLE DATA t_address
pg_restore: setting owner and privileges for SEQUENCE SET t_address_i_id_addr_seq
pg_restore: setting owner and privileges for TABLE DATA t_city
pg_restore: setting owner and privileges for SEQUENCE SET t_city_i_id_city_seq
This section simply loads the table data and sets the sequence values. Apart from setting the ownership no further action is performed. Finally we'll run the post-data section.
postgres@tardis:~/dump$ pg_restore --section=post-data -v  -d db_addr_restore_bin db_addr.dmp 
pg_restore: connecting to database for restore
pg_restore: creating CONSTRAINT pk_i_id_city
pg_restore: creating CONSTRAINT pk_id_address
pg_restore: creating FK CONSTRAINT fk_t_city_i_id_city
pg_restore: setting owner and privileges for CONSTRAINT pk_i_id_city
pg_restore: setting owner and privileges for CONSTRAINT pk_id_address
pg_restore: setting owner and privileges for FK CONSTRAINT fk_t_city_i_id_city
With this run the constraints (and any indices) are created in the most efficient way possible when dealing with bulk data loads.

Loading the data without the indices in place maximises the speed. Building the constraints and indices with the data already loaded results in a faster build and a fresh index without any bloat.