Tuesday 4 March 2014

Chapter 3 -part 1 - Install structure

Install structure

In this chapter we'll look at the PostgreSQL installation. Whatever installation method you did, the PostgreSQL binaries are the same. The packaged version comes with few extra utilities we'll take a look later as those present some caveats to be aware.
The source installation puts all the binaries into the bin sub directory in the target location specified by the -prefix parameter.
The packaged install puts the binaries into a folder organised per major PostgreSQL version.
e.g. /usr/lib/postgresql/9.3/bin/

The core binaries

We'll look first at the core binaries like postgres or psql.
A separate section is dedicated to wrappers and the contributed modules.


postgres

Is the database itself. It's possible to start it directly or using the pg_ctl utility.
The latter is the best way to control the instance except in case of the XID wraparound failure. In this case the only way to run the instance is executing postgres in single user mode.
For historical reason there's usually also the postmaster symbolic link to postgres.


pg_ctl

As mentioned before this is the utility for managing the PostgreSQL instance.
It can start,stop, reload the postgres process. It's also capable to send kill signals to the running instance.
pg_ctl accepts on the command line various options. The most important are the -D or -pgdata= to specify the database storage area and the -m to specify the shutdown mode. Check the section 4.2.2 for details.
pg_ctl also requires the action to perform on the instance.
The supported actions are

  • init[db] initialises a directory as PostgreSQL data area
  • start starts a PostgreSQL instance
  • stop shutdowns a PostgreSQL instance
  • reload reloads the configuration's files
  • status checks the PostgreSQL instance running status
  • promote promotes a standby server
  • kill sends a custom signal to the running instance


initdb

Is the binary which initialises the PostgreSQL data area. initdb requires an empty directory to initialise. Various options can be specified on the command line, like the character enconding or the collation order.


psql

Is the PostgreSQL command line client. Despite his look very essential is one of the most flexible tools available to interact with the server. As is part of the core distribution is always present.


pg_dump

Is the binary dedicated to the backup. Generates consistent backups in various formats. The default is plain text. Supports the parallel dump implemented using the snapshot exports.
The switch to set is -F followed by a letter to indicate the wanted output format.

  • p saves the sql statements to reconstruct the schema and/or data in plain text with no compression.
  • c is the custom PostgreSQL format. Supports parallel restore, compression and object search.
  • d the dump is saved in a directory. With this format is possible to dump in parallel.
  • t saves the dump in the standard tar format.
Please on't be confused by the pg_dumpall . This does look more like a wrapper for pg_dump rather a dedicated program.
As pg_dumpall doesn't support all the pg_dump features is still very useful to save the cluster wide objects like the users with the switch -globals-only.


pg_restore

As the name suggests this is used to restore the database's dump. It can read the backups generated in all formats by pg_dump. The restore target can be a PostgreSQL connection or a file. If the backup's format is directory or custom then pg_restore can run the data load and index/key creations in multiple jobs.


pg_controldata

The program query the pg_control file where instance's vital informations are stored. The pg_control is one of the most important cluster's files. With a corrupted pg_control the instance cannot start.


pg_resetxlog

If the WAL files or the get corrupted the instance cannot perform a crash recovery. pg_resetxlog can solve the problem and make the instance startable but keep in mind this must the last chance, after trying any other possible solution.
The reset removes the WAL files and creates a new pg_control. The XID are also restarted.
The instance becomes startable at the cost of losing any reference between the transactions and the data files. All the physical data integrity is lost and any attempt to run DML queries results in data corruption.
The on line manual is absolutely clear on this point.
After running pg_resetxlog the database must start without user access, the entire content must be dumped, the data directory must be dropped and recreated from scratch using initdb and then the dump file can be restored using psql or pg_restore

No comments:

Post a Comment