Tuesday 16 September 2014

Book done!

The book is now complete. There's still a lot to do for reviewing the writing and fixing the bad grammar. Anyway is about 107 pages and I'm pretty satisfied. I've worked on this document for 4 months in my spare time and I became more confident with my English during the  writing.

So, what's next? I'll spend the next couple of weeks reviewing and fixing the book. After that I'll put the pdf  on lulu.com free for download and in hardcopy. The latter will have a production/shipping cost of course. I'll also put the mobi version on amazon for kindle distribution. Unfortunately amazon doesn't allow free books. I'll put the minimum price possible 0.99$. Anyway if anybody wants the mobi for free just give me a shout.

I've already started the second volume. This one will cover the PostgreSQL's advanced topics like the memory manager and the HA.

And now there are the final two sections. The restore performance and the dba advice.

Restore performances

When restoring a database, in particular in a disaster recovery scenario, the main goal is to have the data back on line as fast as possible. Usually the data section's restore is fast. If the dump has been taken using the copy statements, which are enabled by default, the reload requires a fraction of the entire restore's time. Taking the advantage of the parallel jobs, available for the custom and directory format, it's still possible to improve the data section's reload.

The other face of the coin is the post-data section. Because the objects in this section are mostly random access operations, the completion can require more time than the data section itself; even if the size of the resulting objects is smaller than the table's data. This happens because the unavoidable sort operations are CPU and memory bound. The parallel restore gives some advantage, but as seen in 10.2 each loop's process is single threaded.

Setting up an emergency postgresql.conf file can speed up the restore, reducing the time up to 40% than the production's configuration. What it follows assumes the production's database is lost and the data is restored from a custom format's backup.


When reloading the data from the dump, the database performs a so called bulk load operation. The PostgreSQL's memory manager have a subroutine which protects the shared segment from the block eviction caused by IO intensive operations. It's then very likely the ring buffer strategy will be triggered by the restore, sticking the IO in a small 4 MB buffer protecting the rest of the memory. A big shared buffer it can cache the data pages when in production but becomes useless when restoring. A smaller shared buffer, enough to hold the IO from the restore processes will result in more memory available for the backends when processing the post-data section. There's no fixed rule for the sizing. A gross approximation could be 10 MB for each parallel job with a minimum cap of 512 MB.


The wal_level parameter sets the level of redo informations stored in the WAL segments. By default is set to minimal, enabling the xlog skip. Having the database in with a standby, or simply using the point in time recovery as alternate backup strategy requires the parameter to be set to archive or hot_standby. If this is the case and you have a PITR or standby to failover, stop reading this book and act immediately. Restoring from a physical backup is several time faster than a logical restore. If you have lost the standby or PITR snapshot then before starting the reload the wal_level must be set to minimal.


Turning off fsync can improve massively the restore's speed. Having this parameter turned off is not safe, unless the cache is have the backup battery to prevent data loss in case of power failure. However, even without the battery at restore time having the fsync off is not critical. After all the database is lost, what else can happen?

checkpoint_segments, checkpoint_timeout

The checkpoint is a vital event in the database activity. When occurs all the pages not yet written to the data files are synced to disk. This in the restore context is a disturbance. Increasing the checkpoint segments and the timeout to the maximum allowed values will avoid any extra IO. In any case the dirtied blocks will be written on disk when the buffer manager will need to free some space.


There's no point in having vacuumed the tables after a complete reload. Unfortunately autovacuum does not know if a table is being restored. When the limit for the updated tuples is recognised the daemon starts a new process wasting precious CPU cycles. Turning off temporarily the setting will let the backends to stay focused on the main goal. The data restore.


Limiting the max connections to number of restore jobs is a good idea. It's ok also giving a slight headroom for one or two connections, just in case there's need to log in and check the database status. This way the available memory can be shared efficiently between the backends.


This parameter affects the index builds which are stored in the restore's post-data section. Low values will results in the backends sorting on disk and slowing down the entire process. Higher values will keep the index build in memory with great speed gain. The value should be carefully sized keeping in mind the memory available on the system. This value should be reduced by a 20% if the total ram is up to 10 GB and by 10% if bigger. This reduction is needed to consider the memory consumed by the operating system and the other processes. From the remaining ram must be subtracted the shared_buffer's memory. The remaining value must be divided by the expected backends to perform the restore. For example if we have a system with 26GB a shared_buffer of 2 GB and 10 parallel jobs to execute the restore, the maintenance_work_mem is 2.14 GB.
26 - 10% =  23.4
23.4 - 2 = 21.4
21.4 / 10 = 2.14
Ignoring this recommendation can trigger the swap usage resulting in a slower restore process.

Get DBA advice

The database administration is weird. It's very difficult to explain what a DBA does. It's a job where the statement ``failure is not an option" is the rule number zero. A DBA usually works in antisocial hours, with a very limited time window. A DBA holds a lot on the shoulders and that's the reason why those people can seem hostile or living in their own world.
Despite the strange reputation, a database expert is an incredible resource for building up efficient and scalable designs. Nowadays is very simple to set up a PostgreSQL instance. Even with the default configuration parameter the system is so efficient that basically doesn't show any problem when running at low or medium load. If this, at first sight, seems a fantastic feature actually is a really bad thing. Any mistake at design level is hidden and when the problem sooner or later appears is too late.

This final advice is probably the most important of the entire chapter. If you have a DBA don't be shy. Ask for any suggestion, even if the solution seems obvious or if the task is small. The database layer is a completely different universe. A small mistake can result in a very big problem.

Of course if there's no DBA, that's bad. Never sail without a compass. Never start a database project without an expert. A professional to take care of the most important part, the foundations.

If this is the case, ask your employer to hire a PostgreSQL DBA as soon as possible.

Saturday 13 September 2014

Chapter 11 - A couple of things to know before start coding...

 This is almost the entire chapter 11. I'm still writing the final section, I'd like to put into a separate post though. I've also almost finished the restore's performance. After this the book is complete. I will start a review to make it a decent writing before publishing onto lulu.com and amazon kindle.

I'm not sure amazon permits to sell books for free I'll find a solution anyway.

A couple of things to know before start coding...

This chapter is completely different from the rest of the book. It's dedicated to the developers. PostgreSQL is a fantastic infrastructure for building powerful and scalable applications. In order to use al its potential there are some things to consider. In particular if coming from other DBMS there are subtle caveats that can make the difference between a magnificent success or a miserable failure.

SQL is your friend

Recently the rise of the NOSQL approach,has shown more than ever how SQL is a fundamental requirement for managing the data efficiently. All the shortcuts, like the ORMs or the SQL engines implemented over the NOSQL systems, sooner or later will show their limits. Despite the bad reputation around it, the SQL language is very simple to understand. Purposely built with few simple English words is a powerful tool set for accessing, managing or defining the data model. Unfortunately this simplicity have a cost. The language must be parsed and converted into the database structure and sometimes there is a misunderstanding between what the developer wants and what the database understands.
Mastering the SQL is a slow and difficult process and requires some sort of empathy with the DBMS. Asking for advice to the database administrator is a good idea to get introduced in the the database mind. Having a few words with the DBA is a good idea in any case though.

Design comes first

One of the worst mistakes when building an application is to forget about the foundation, the database. With the rise of the ORM this is happening more frequently than it could be expected. Sometimes the database itself is considered as storage, one of the biggest mistake possible.
Forgetting about the database design is like building a skyscraper from the top. If unfortunately the project is successful there are good chances to see it crumbling down instead of making money.
In particular one of the PostgreSQL's features, the MVCC, is completely ignored at design time. In 7.6 it's explained how is implemented and what are the risks when designing a data model. It doesn't matter if the database is simple or the project is small. Nobody knows how successful could be a new idea. Having a robust design will make the project to scale properly.

Clean coding

One of the the first things a developer learns is how to write formatted code. The purpose of having clean code is double. It simplifies the code's reading to the other developers and improves the code management when, for example, is changed months after the writing. In this good practice the SQL seems to be an exception. Is quite common to find long queries written all lowercase, on one single line with keywords used as identifier. Trying just to understand what such query does is a nightmare. What follow is a list of good practice for writing decent SQL and avoid a massive headache to your DBA. If you don't have a DBA see 11.4.

The identifier's name

Any DBMS have its way of managing the identifiers. PostgreSQL transforms the identifier's name in lowercase. This doesn't work very well with the camel case, however it's still possible to mix upper and lower case letters enclosing the identifier name between double quotes. This makes the code difficult to read and to maintain. Using the underscores in the old fashion way it makes things simpler.

Self explaining schema

When a database structure becomes complex is very difficult to say what is what and how it relates with the other objects. A design diagram or a data dictionary can help. But they can be outdated or maybe are not generally accessible. Adopting a simple prefix to add to the relation's name will give an immediate outlook of the object's kind.

Object Prefix
Table t_
View v_
Btree Index idx_bt_
GiST Index idx_gst_
GIN Index idx_gin_
Unique index u_idx_
Primary key pk_
Foreign key fk_
Check chk_
Unique key uk_
Type ty_
Sql function fn_sql_
PlPgsql function fn_plpg_
PlPython function fn_plpy_
PlPerl function fn_plpr_
Trigger trg_
rule rul_

A similar approach can be used for the column names, making the data type immediately recognisable.

Type Prefix
Character c_
Character varying v_
Integer i_
Text t_
Bytea by_
Numeric n_
Timestamp ts_
Date d_
Double precision dp_
Hstore hs_
Custom data type ty_

Query formatting

Having a properly formatted query helps to understand which objects are involved in the data retrieval and the relations between them. Even with a very simple query a careless writing can make it very difficult to understand.

A query like this have many issues .

  • using lowercase keywords it makes difficult to spot them
  • the wildcard * hides which columns are effectively needed; it retrieves all the columns consuming more bandwidth than required; it prevents the index only scans
  • the meaningless aliases like a and b it make difficult to understand which relations are involved.
  • writing a statement with no indenting it makes difficult to understand the query's logic.
Despite existence of tools capable to prettify such queries, their usage doesn't solve the root problem. It's better to write immediately decent SQL following those simple rules.

  • All SQL keywords should be in upper case
  • All the identifiers and kewrords should be grouped by line break and indented at the same level
  • In the SELECT list specify all and only the columns required by the query
  • Avoid the auto join in order to make clear the relation's logic
  • Adopt meaningful aliases
And that's the fixed SQL.

Sunday 7 September 2014

Chapter 6 final parts. Foreign, check and null constraints

Foreign keys

A foreign key is a constraint enforced using the values another table's field. The classical example is the tables storing the addresses and cities. We can store the addresses with the city field, inline.

Being the city a duplicated value over many addresses, this will cause the table bloat by storing long strings, duplicated many and many times, alongside with the the address. Defining a table with the cities and referencing the city id in the addresses table will result in a smaller row size.

When dealing with referencing data, the main concern is the data consistency between the tables. In our example, putting an invalid identifier for the city in the t_addresses table will result missing data when joining. The same result will happen if for any reason the city identifier in the t_cities table is updated.

The foreign keys are designed to enforce the referential integrity. In our example will we'll enforce a strong relationship between the tables.

The key is enforced in two ways. When a row with an invalid i_id_city hits the table t_addresses the key is violated and the transaction aborts. Deleting a city from the t_cities table which id is referenced in the t_addresses, will violate the key. The same will updating a i_id_city referenced in the t_addresses.

The enforcement is performed via triggers. The pg_dump or pg_restore option -disable-trigger will permit the the data restore with the schema already in place. For more informations take a look to 9 and 10.

The FOREIGN KEYS have been enriched with an handful of options which make them very flexible. The referenced table can drive actions on the referencing using the two options ON DELETE and ON UPDATE. By default the behaviour is to take NO ACTION if there are referencing rows until the end of the transaction. This is useful if the key check should be deferred to the end of the transaction. The other two actions are the RESTRICT which does not allow the deferring and the CASCADE which cascade the action to the referred rows.
If we want our foreign key restrict the delete with no deferring and cascade any update, here's the DDL.

Another very useful clause available with the foreign and check constraints is the NOT VALID. When the constraint is created with NOT VALID, the initial check is skipped making the constraint creation instantaneous. This is acceptable if the actual data is consistent. The constraint is then enforced for all the new data. The invalid constraint can be validated later with the command VALIDATE CONSTRAINT.

Check constraints

A check constraint is a user defined check to enforce specific condtions on the rows. The definition can be a condition or a used defined function. In this case the function must return a boolean value. As for the foreign keys, the check accepts the NOT VALID clause to speed up the creation.

The check is satisfied if the condition returns true or NULL. This behaviour can produce unpredictable results if not fully understood. An example will help to clarify. Let's create a CHECK constraint on the v_address table for enforcing a the presence of a value. Even with the check in place the insert without the address completes successfully.

This is possible because the v_address does not have a default value and accepts the NULL values. The check constraint is violated if, for example we'll try to update the v_address with the empty string.

Our check constraint will work as expected if we set for the v_address field a fallback default value.

Please note the existing rows are not affected by the default value change.
The message for the update and the insert is exactly the same because PostgreSQL attempts to create a new row in both cases. When the constraint fails the transaction is rolled back leaving the dead row in place. We'll take a better look to the MVCC in 7.6.

Not null

For people approaching the database universe the NULL value can be quite confusing. A NULL value is an empty object without any type or meaning. Actually when a field is NULL it doesn't consumes physical space. By default when defining a field this is NULLable. Those fields are quite useful to omit some columns, for example, at insert time.
The NULLable fields can be enforced with an unique constraint as the NULL are not considered as duplicates. When dealing with the NULL it's important to remind the NULL acts like the mathematical zero. When evaluating an expression with a NULL, the entire expression becomes NULL.
The NOT NULL is a column constraint which forbids the presence of NULL in the affected field. Combining a NOT NULL with a unique constraint is exactly like having enforced a PRIMARY KEY. When altering a field the table is scanned for validation and the constraint's creation is aborted if any NULL value is present on the affected field.

For example, if we want to add the NOT NULL constraint to the field v_address in the t_addresses table the command is just.

In this case the alter fails because the column v_address contains NULL values from the example seen in 6.4. The fix can be performed in a single query using the coalesce function. This function returns the first not null value from the left. This way is possible to change the v_address on the fly to a fixed placeholder .

Adding new columns with NULL is quick. PostgreSQL simply adds the new attribute in the system catalogue and manages the new tuple structure considering the new field as empty space. Adding a field with NOT NULL requires the presence of the DEFAULT value as well. This is an operation to consider carefully when dealing with big amount of data. This way the table will be locked in exclusive mode and a complete relation's rewrite will happen. A far better way is to add a NOT NULL field is to add at first as NULLable field. A second alter will add the default value to have the new rows correctly set. An update will then fix the NULL values without locking the table. Finally the NOT NULL could be enforced without hassle.