Wednesday 12 February 2014

Chapter 1 - PostgreSQL at glance

With this article I'll start a publication series of the WIP books on postgresql.
I've decided to split all the original book in three or more parts simpler for me to manage and write.
I'll publish a new chapter when is finished and the book's pdf when is complete.
The document is targeted to the DBA and to people who want to learn the PostgreSQL administration.
This first chapter is a mere introduction, is surely full of errors and typos. Please forgive me, I'm not English after all and I've studied French at school.

Thanks for reading my stuff.

PostgreSQL at glance

PostgreSQL is a first class product with high end enterprise class level features. This first chapter is a general review on the product with a brief talk on the database's history.

Long time ago in a galaxy far far away...

Following the works of the Berkeley's Professor Michael Stonebraker, in the 1996 Marc G. Fournier asked for any volunteer interested in revamping the Postgres 95 project.

The answer came from Bruce Momjian,Thomas Lockhart e Vadim Mikheev, the very first PostgreSQL Global Development Team.


Every time a new major release is released, new powerful features join the rich set of the product's functionalities. There's a small excerpt of what the latest version offer in terms of flexibility and reliability.

Write ahead logging

Like any RDBMS worth of this name PostgreSQL have the write ahead logging feature. In short, when a data block is updated the change is saved in a reliable location, the so called write ahead log. The effective write on the datafile is performed later. Should the database crash the WAL is scanned and the saved blocks are replayed during the crash recovery. PostgreSQL stores the redo records in fixed size segments, usually 16 MB. When the wal segment is full PostgreSQL switches to a newly created or recycled wal segment in the process called log switch.

Point in time recovery

When the log switch happens is possible to archive the previous segment in a safe location. Taking an inconsistent copy of the data directory is possible to restore a fully functional cluster because the archived wal segments have all the informations to replay the physical data blocks on the inconsistent data files. The restore can be, optionally stopped at a given point in time. For example is possible to recover a PostgreSQL cluster to one second before the a catastrophic happening (e.g. a table drop).

Standby server and high availability

The inconsistent snapshot can be configured to stay up in continuous archive recovery. PostgreSQL 8.4 supports the warm standby configuration where the standby server does not accept connections. From the version 9.0 is possible to enable the hot standby configuration to access the standby server in read only mode.

Streaming replication

The wal archiving doesn't work in real time. The wal shipping happens only after the log switch and in a low activity server this can leave the standby behind the master for a while. Using the streaming replication a standby server can get the wal blocks over a database connection in almost real time.


PostgreSQL fully supports the transactions and is ACID compliant. From the version 8.0 the save points were introduced.

Procedural languages

Amongst the rich of feature procedural language pl/pgsql, many procedural languages such as perl or python are available for writing database functions. The DO keyword was introduced in the 9.1 to have anonymous function's code blocks.


The partitioning, implemented in PostgreSQL is still very basic. The partitions are tables connected with one empty parent table using the table's inheritance. Defining check constraints on the partitioned criteria the database can exclude, querying the parent table, the partitions not affected by the where condition. As the physical storage is distinct for each partition and there's no global primary key enforcement nor foreign keys can be defined on the partitioned structure.

Cost based optimizer

The cost based optimizer, or CBO, is the one of PostgreSQL's point of strenght. The query execution is dynamically determined and self adapting to the underlying data structure or the estimated amount of data affected. PostgreSQL supports also the genetic query optimizer GEQO.

Multi platform support

PostgreSQL nowadays supports almost any unix flavour and from the the version 8.0 is native to Windows.


The tablespace support permits the data files fine grain distribution on the OS filesystems.


The way PostgreSQL keeps things consistent is the MVCC which stands for Multi Version Concurrency Control. The mechanism is neat and efficient, offering great advantages and one single disadvantage. We'll see in detail further but keep in mind this important sentence.
There's no such thing like an update in PostgreSQL.


Triggers to execute automated tasks on when DML is performed on tables and also views are supported at any level. The events triggers are also supported.


The read only views are well consodlidated in PostgreSQL. In the version 9.3 was added the support for the materialized and updatable. Also the implementation is still very basic as no incremental refresh for the mat views nor update is possible on complex views. Anyway is still possible to replicate this behaviour using the triggers and procedures.

Constraint enforcement

PostgreSQL supports primary keys and unique keys to enforce local data meanwhile the referential integrity is guaranteed with the foreign keys. The check constraint to validate custom data sets is also supported.

Extension system

PostgreSQL implements the extension system. Almost all the previously known contrib modules are now implemented in this efficient way to add feature to the server using a simple SQL command. 

No comments:

Post a Comment