Saturday 25 August 2012

Danger Master! Danger!

to configure the PITR you should save the $PGDATA with tar or in alternative you can use pg_dumpall
PostgreSQL trainer describing the point in time recovery

An oid type can carry on 2Gb
PostgreSQL guru explaining the binary data type

Maybe you can use pg_resetxlog to solve your problem
A clueless user trying to solve a mayor version incompatibility data directory issue

Danger, man at work

This is a small excerpt of what I've seen browsing the tecnical forums and community websites.
My first reaction reading all this nonsense was angry as these guys have good reputation in the PostgreSQL community and their ignorance can cause data loss, damages and, definitely, harm PostgreSQL.
Obviously it's impossible to cover anything so I'm writing this post as warning do not trust people only by the reputation. You should investigate any suggested solution and absolutely DO NOT TRY ON THE PRODUCTION, NEVER!.

It's an airplane? It's an ufo? It's a dumb a**

The point in time recovery is a wonderful feature introduced in the version 8.1 and enable the capabilty to restore the entire database cluster at the exact date and time you want.
In this post I'm not describing how to configure it as the documentation is well written. I show how it works to make clear the danger in the trainer's quote.

The PITR is a physical backup where you actually copy the data files meanwhile the database is running. You can do it safely because the write ahead logs generated during the copy are archived and all the changes can be applied to the inconsistent data files later when you need to restore the cluster.
This is possible because the blocks stored in the archived WAL carry on the information where the block is located on disk. In the restore process the physical data block in the data file is overwritten by the WAL data block during the startup process when the database performs the instance recovery.
The pg_dumpall is a logical backup wrapper for the general purpose pg_dump. If you look the pg_dump's source code you will see a collection of SELECT command within the strict SERIALIZABLE transaction isolation level.

The result is an output file with the SQL commands to run against a different cluster to recreate the databases, the tables and the data.

The restore with this tool require an empty cluster generated with initdb up and running. pg_dumpall does not require instance recovery but a full running instance, that means all the archived WAL files are useless as they are usable only during the recovery.

So, if you follow the procedure suggested by the trainer and choose the pg_dumpall you will illude to have a PITR because you are using a logical tool to restore a physical backup and you will loose any database change between your pg_dumpall and the moment the live database server was lost.

OID is not a TARDIS

PostgreSQL have the bytea built in data type for binary objects. As is a TOASTable type with variable lenght the maximum allowed size is 1Gb. The binary data in this way is stored in line as any table column and you can read and write the content using the conventional SQL statements.
If the binary data doesn't fit in 1 Gb PostgreSQL allow another strategy to store up to 2 Gb as large object.
The built in functions lo_import and lo_export can read and write files from the disk and store in a system table called pg_largeobjects.
This table have three columns,

  • loid TYPE oid
  • pageno TYPE integer
  • data TYPE bytea

Whe the function lo_import is called PostgreSQL stores the binary data in the pg_largeobjects splitting the extra space in two pages identified by the same loid value and different pageno. The function return the OID value to reference the stored large object.
The function lo_export does the reverse, read the binary chunks for the given OID and return the binary stream on disk.
So, the second quote show how lack of attention and simple logical deduction have the PostgreSQL Guru.

I'll burn my house to light a candle

The next three seconds after reading this quote I remained speechless, the suggestion is probably the most dangerous thing, and far away the most brutish, you can do on your data directory.
I discovered this idiocy by browsing the forum and probably too late to avoid the disaster as the post was months old.
As any PostgreSQL DBA should know the WAL files are used to save the data block changes on non volatile memory before the write actually reach the data file as, if any crash happens, the data block can be applied to the datafile reading from the WAL.
The process is the same used for PITR, the only difference is no archived wal is used and only the files in the pg_xlog are used to reply the blocks on the data files.
The global directory contains a 8k file called pg_control where the last checkpoint location is stored and tells the database which WAL to start the replay during the instance recovery.
Any corruption in the pg_control or in the WAL files results in an instance unable to start. If any, and I repeat any solution is useless is still possible to start the instance using the program pg_resetxlog.
This program does one simple thing. Delete anything in the pg_xlog and generate a new pg_control restarting the XID count from the beginning.
The instance can start again but any boundary between the data blocks and the WAL is lost and any attempt to run read write queries can result in data corruption.
The on line manual is absolutely clear on this.
After running pg_resetxlog the database must start without user access , the entire content must be dumped using pg_dumpall, the data directory must be dropped and recreated from scratch using initdb and then the dump file restored using psql or pg_restore
After the scaring part let me explain the problem, a common one indeed, and the solution.
If you use PostgreSQL as the packaged version shipped with the most common Linux distributions can happen, if you do the distribution upgrade, PostgreSQL version jump a major release, for example from 8.4 to 9.1.
When this happens the new binaries refuse to start on the data directory initialized with the previous major version.
The solution is quite simple. You can compile a matching PostgreSQL version, start the instance and dump the contents to restore in the new major version data directory or, if you have an enterprise amount and you are in hurry to return on line, you can use pg_upgrade, a powerful tool to do the in place upgrade.
I hope these three examples have scared you enough to test and check the documents before doing anything.

Saturday 18 August 2012

The angry dba

Before anything on database administration let me introduce some best practice I acquired during my experience in the consulting company I mentioned before.
This company offer Oracle services to important enterprises in Italy and, despite the size, is well structured and organized.
The first thing I've learned was how to write good documents, not too much verbose but complete in any part.
Then I was inducted to the company's best practice in SQL coding as Oracle have some limitations in the size and characters allowed in the identifiers.
PostgreSQL does not have this limitations giving freedom of choice for any format and structure, with unicode characters (if the database supports UTF8 of course) and a 64 characters maximum length for the identifiers.
This is a good thing but, believe me, sometimes it can be a bloody nightmare.
I've written some guidelines to get things sorted out.

Rule 1 – No camel case or reserved keywords in the identifier name

Let start from the beginning.
Should you need to create a table to store debit notes for the year 2002, which name will you choose?
My old school development habits suggest me something like debit_notes_2002.
A new school developer or maybe used to work with AGILE, SCRUM,ITIL and S**** will probably use the name DebitNotes2002.
This in PostgreSQL, is a problem because all the names are converted in lower case making the camel case useless.
It's possible to keep the case preserved enclosing the identifiers between double quotes “.
That mean that ANY query involving your beautiful identifiers will contain the quotation and this affects the readability of the query.
This happens for any identifier named after reserved keyword as well.
Same story for the column names .

Rule 2 - self explaining schema  

In the rule 1 I declared I would name the table for debit notes as debit_notes_2002.
I lied.
I like to put prefix in any identifier to keep track which relation I'm are dealing with.
PostgreSQL stores any relation in one system table named pg_class and give you system views like pg_tables and pg_indexes to query.
But why I should bother in system catalogue queries or people inquiry if I can let the database describe itself?
The way to do this is quite simple.
I put a prefix on any identifier to tell what kind it is or which data carries.
Adding a t_ prefix in the table name will tell to any developer and, most important to any dba/performance tuner, that the relation carries physical data.
Adding a v_ prefix to any view will tell that we are dealing with a short cut to a query and, hopefully, this will avoid the bad habit to join views resulting in the query planner going nuts.
On the same way giving a prefix to the columns to tell the data type will reduce to the developers the risk to get a wrong behaviour for wrong typecast, and will give to the dba precious informations about the space and storage used by the table.
A good structure for building self explaining names can be this.

Unique index
Primary key
Foreign key
Unique key
Sql function
Plpgsql function
Plpython function
Plperl function
Table 1: Object prefix

The same for the data type used for the column.
An example for the most common data type is shown in table 2

Character varying

Rule number 4 – format the SQL queries

Sometime I receive requests to debug or optimize queries written in this way
select * from debitnoteshead a join debitnoteslines b on debnotid where a.datnot=b. datnot and b.deblin>1;
Frustrating isn't it?
Despite my legendary bad temper when I'm in service I'm absolutely professional and probably too much kind because I don't bounce the request and I fix the query.
I know I know, I'm doing wrong.
The company doesn't pay my salary to hit tab, enter and ctrl+u, but to keep the company's systems up, running and sound.
So, if you don't want to waste your time formatting when you receive something like this, I suggest to simply bounce back the request asking for a pretty format.
Some simple rules can enhance the readability.
  1. SQL keywords in upper case
  2. all the identifiers must be indented on the same level
  3. All the same level SQL keywords must be indented at same level
  4. Avoid the SELECT * as is a waste of memory and doesn't tell anything useful
  5. If possible inner join the tables explicitally in the WHERE condition
  6. In the join use the keyword ON to make clear which column is used for joining
  7. In the table list use self explaining aliasing
Using those rules the previous query looks like this.
               debitnoteshead hrd,
               debitnoteslines lns
               AND hdr.datnot=lns.datnot
               AND lns.deblin>1
Now look beauty.

With this last advice ends my first blog post.
Just small afterword before ending.
Probably someone will think this guidelines are too strict to be accepted.
Obviously for the code already written nothing can be done, but can still teach the developers to write decent code.
I think this is a goal can be achieved and believe me, it worth it.
For the new projects this guidelines give you a powerful vision for the database structure without wasting your time through data dictionaries or asking people about the data.
IMHO our job is a war where the casualties are downtime and data loss.
I consider the data the most precious thing a company own and definitely our mission is to guarantee the smallest reaction time to fix any problem minimizing any possible loss of money.

Thursday 16 August 2012


What is blue, bigger in the inside and with time travel capabilities?
 If your answer is the TARDIS, then, yes you're close but the right one is PostgreSQL.

 Welcome to my little quiet space in the noisy and confusing PostgreSQL universe.

 I've started working seriously with databases in the 2004, before I toyed with small web applications MySQL driven, then my life changed when I was employed by a big consulting company in Florence. I became member of what I call the Oracle dba friar.

I discovered how big and challenging can be working with serious amount of data and how important is to have clear, defined and well written informations, possibly in one single place and not scattered around the time and space.

I left the Oracle world in the 2007 and I created one of the biggest PostgreSQL conference in Europe, if I well think, before me the name PGDay was unknown.

Now I moved from Italy to the UK, actually I'm Italian so please accept my apologies for my english, and I had confirmed my feelings.
The pure  PostgreSQL dba is a rare animal, personally I know only two of them, myself and a clever guy from the east midlands I had the pleasure to work with.

I tried to understand why, despite the wide use by great companies PostgreSQL have so few real dba and so many  people pretend to be.

Imho I think the reason can be the lack of good training on this complex argument.

Yes, you can get training from great companies, directly from the developers, but let me ask you a question.

Should you want to learn how to drive a Formula 1, who do you ask?
The engineer or the driver?

I am the driver.