Monday 28 November 2016

pg chameleon 1.0 alpha1

As mentioned in the presentation I gave at the Brighton PostgreSQL Meetup, pg_chameleon is a commute project: in general I work on it when I travel. And thanks to the pgconf and other commitments, I have travelled a lot recently.

This is the presentation's recording (sorry, the audio is suboptimal).

So I was able to build a minimum viable product which seems to work properly.

The alpha1 release is available for download and testing here

The system comes with the following limitations.

Installation in virtualenv

To work properly, the requirements should be installed via pip inside a virtualenv. I'm planning to make the package available via pip once it reaches the RC stage.
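A minimal sketch of such a setup (the environment path and requirements file name are assumptions, not the project's documented procedure):

```shell
# create an isolated environment and install the dependencies into it
virtualenv ~/venvs/pg_chameleon           # hypothetical location
. ~/venvs/pg_chameleon/bin/activate
pip install -r requirements.txt           # run from the pg_chameleon checkout
```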

No daemon yet

The script should be executed in a screen session to keep it running. Currently there's no respawning of the process on failure, nor any failure detection.
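Until then, a poor man's respawn can be sketched like this (the script name is an assumption):

```shell
# run the replica script in a detached screen session named "chameleon",
# restarting it a few seconds after any crash
screen -dmS chameleon bash -c \
  'while true; do python pg_chameleon.py; echo "exited, restarting in 5s"; sleep 5; done'

screen -r chameleon   # reattach later to check on the process
```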

psycopg2 requires Python and PostgreSQL dev files

Installing psycopg2 via pip requires the Python development files and the PostgreSQL development headers, because pip builds it from source.
Please refer to your distribution's packages to fulfil those requirements.
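For example, on a Debian/Ubuntu box the build dependencies could be satisfied like this (the package names are an assumption and differ between distributions):

```shell
# headers needed to build psycopg2 from source;
# libpq-dev also provides the pg_config binary that pip looks for
sudo apt-get install python-dev libpq-dev
pip install psycopg2
```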

DDL replica limitations

DDL and DML statements mixed in the same transaction are not decoded in the correct order. If the DML changes data in the same table modified by the DDL, this can break the replica because of a wrong jsonb descriptor. I'm aware of the issue and I'm working on a solution.

Test please!

Please submit the issues you find.
Bear in mind this is an alpha release. If you use the software in production, keep an eye on the process to ensure the data is correctly replicated.

Saturday 19 November 2016

Goodbye to Ubuntu, hello Slackware

When I first read about systemd I truly believed the Linux community would be able to improve this init system, or at least to stop it if it wasn't good enough to replace sysvinit.

I was wrong. After a fierce and poisonous debate, Debian switched to systemd in Jessie. Several contributors left Debian and organised a fork called Devuan, which is pushing for init freedom.

Several distributions switched to this accessory kernel, which lost the qualification of an init system long ago. I didn't realise I already had a small part of systemd on my home PC with Ubuntu 14.04 until the day I upgraded the release to 14.04.4.
The kernel for the old LTS switched to the 4.x branch and I wanted to upgrade. After running the usual apt-get update; apt-get dist-upgrade I rebooted the PC and found that I was no longer able to log in to KDE.

Opening a console and trying to log in manually did not help. The login prompt asked for the username again immediately after I entered the correct one.

For some reason systemd-logind had a nervous breakdown after the upgrade.

Searching for that particular issue on the internet did not return any useful documents, only people with similar issues on different distributions, whose workarounds never worked for me.

After wasting two days I decided to reinstall Ubuntu 14.04.4 from scratch.
Luckily I had all my personal data on a different device, and I do regular backups.
But what if this happened on a production server, where maintenance windows are tightly controlled? If you screw up the timeline you get your ass grilled by the users, and if you cannot fix things quickly it can affect your career.

After this happened I became more sensitive about systemd taking over the Linux ecosystem. The reports all around me were quite scary: friends with servers unable to mount filesystems after a reboot, user processes killed on logout, and yes, my bad experience with systemd-logind. I needed to find a different distribution.

For the VMs I use for personal projects I switched to FreeBSD. Simple, clear, robust. It was like rejuvenating by ten years. For the desktop I didn't feel confident enough to switch to that operating system. I started my search, but nothing completely satisfied my needs.

Then I remembered a tweet from the Debian systemd maintainer.

"If you like obsolete software, I have heard that Slackware is looking for users"

Well, I can say everything about Slackware except that it's obsolete.
I'm writing from a shiny Slackware 14.2 with kernel 4.4.29.
The system comes with KDE 4.14.21 and I was able to reinstall it from scratch in a few minutes.
The Slackware approach favours super stable software. The slackbuilds repository covers everything that's missing, with the SBo scripts working in a similar way to the ports on FreeBSD.

With Slackware I got the same feeling I had with FreeBSD:
stable, simple to understand and simple to debug if any issue happens.

And all this on Linux, an operating system I'm still in love with.

So, goodbye Ubuntu, hello Slackware. May your days be long and without systemd.

Saturday 12 November 2016

Presenting pg_chameleon at the Brighton PostgreSQL Meetup

Back from Estonia after the amazing European PGConf 2016, I'm about to give a talk on the MySQL to PostgreSQL replication project I'm developing.

But first things first. I want to thank all the organisers of the PGConf Europe. The talks were super interesting, the location absolutely stunning, and the old and new friends I met reminded me how amazing the PostgreSQL community is. Unfortunately I missed the social events (did somebody say a DBA can have a social life? :P).

The meetup will be in Brighton, UK, on Friday 18th of November. There will be a chat starting at 19.00, with pizza and beer offered by our sponsor.
The talk will start around 19.30 and will be streamed on youtube here.

Here's the talk description.

pg_chameleon is a lightweight replication system written in Python. The tool connects to the MySQL replication protocol and replicates the data into PostgreSQL.
The author will talk about the tool's history and the logic behind the available functions, and will give an interactive usage example.

If you are in Brighton and want to join us please RSVP here.

Saturday 22 October 2016

If you close your eyes you can smell Estonia

It's almost time for the pgconf 2016. This year the conference is in Tallinn, the capital of Estonia. The conference is packed with super interesting talks which I really look forward to hearing.

I'll also present the talk "life on a rollercoaster", which tells the story of the last four years of a DBA (just guess who he is :P) dealing with large PostgreSQL installations.

I decided to give the talk a narrative and, hopefully, entertaining form, in order to avoid boring the audience to death.

I will also submit a lightning talk for the pg_chameleon tool I'm developing.

I'm pretty excited to meet old friends from the PostgreSQL community.

Well, that's all for now.

See you in Tallinn very soon :)

Wednesday 5 October 2016

An evening with Psycopg2

The next meetup at Brighton will be Friday 14th October at 19.30.

Daniele Varrazzo will kindly talk about the most popular PostgreSQL driver in the python universe.
Psycopg is the most used PostgreSQL driver for the Python programming language.

As a driver it stays at a relatively low level, allowing access to all the features offered by the database. Behind the scenes, it does its best to convert the rich Python data model into the equally rich PostgreSQL data model:
we will take a look at the adaptation mechanism and how to customise it to add new data types or change the default behaviour.

We will also take a look at how psycopg allows Python to tap into the PostgreSQL publish/subscribe mechanism and how this can be used to build reactive and distributed applications.
The presentation's live stream will be available at this link.

Please RSVP here.

See you in Brighton!

Tuesday 20 September 2016

Mighty morphin power elephant

Back in 2013 I started playing with sqlalchemy to create a simple extractor from heterogeneous systems, to be pushed into PostgreSQL.
I decided to give the project a name which recalled the transformation, and I called it pg_chameleon.

To be honest, I didn't like sqlalchemy. Like any other ORM, it adds an interface to the data layer with its own mental approach to the data itself. I lost interest in developing a migrator very soon; after all, there are thousands of similar tools, thousands of times better than mine (e.g. the awesome pgloader).

However, I recently revamped the project after discovering a Python library capable of reading the MySQL replication protocol. In a few weeks I removed all the sqlalchemy code, rebuilt the metadata extraction using the information_schema, and finally had a usable tool to replicate the data across the two systems.

I've also changed the license from GPL to the 2-clause BSD.

The tool requires testing. I'm absolutely sure it's full of bugs and issues, but it seems to work quite nicely.

Some key aspects:

  • It is developed in Python 2.7. Bear with me, I'll build a port to Python 3.x when/if the project gets to an initial release.
  • I use tabs (4-space tabs). Bear with me again. I tried to use spaces and I almost threw my laptop out of the window.
  • is not working. I'll fix this as soon as I do a release.
  • Yes, the SQL part uses the "hungarian notation" and the keywords are uppercase, with strange indentation on the statements.
  • The DDL statements are not yet replicated. I'm thinking of a clever approach to the problem.

That's it. If you want to test it, please do, and try to break the tool :)

The tool is on github here:

Friday 2 September 2016

News from the outer ring

After the summer break the Brighton PostgreSQL meetup restarts with the monthly technical talks.

This time it's my round again. I'll speak about how to scale backup and recovery on large Postgres installations.

Actually, this is the talk I've submitted to the European pgconf.
I gave the talk a storytelling form in order to avoid boring the audience to death. The talk should be quite entertaining, with explanations of the issues solved by the DBA over the years.

As Google is removing Hangouts on Air, I'm using YouTube Live and OBS to stream this event. It's the first time I've tried this and I cannot guarantee it will work.

I'll record the presentation just in case the stream is broken.

Event details

PostgreSQL - backup and recovery with large databases


Friday 9th September 19.00 London Time Zone

Location: Brandwatch - 1st Floor Sovereign House, Church St, Brighton, East Sussex BN1 1UJ, Brighton

Dealing with large databases is always a challenge.
The backup and HA procedures evolve as the database installation grows over time.
The talk will cover the problems solved by the DBA in four years of working with large databases, whose size increased from a 1.7 TB single cluster up to 40 TB in a multi-shard environment.
The talk will cover both the disaster recovery with pg_dump and the high availability with log shipping/streaming replication.
The presentation is based on a real story. The names have been changed in order to protect the innocent.

RSVP here

Live stream (hopefully) here

Monday 6 June 2016

Another evening with PostgreSQL

Thanks to the University of Ferrara's Department of Engineering, I'll attend another evening with PostgreSQL.

This time the event will be a seminar with all the talks given by me.

The first talk will cover our beloved elephant's history.

The second talk will look at the engine, with particular attention to MVCC and the memory manager.

The final talk will cover the backup and recovery with pg_dump and the streaming replication.

The talks will be in Italian and, hopefully, I'll be able to stream them via google hangout.

Here is the event page in Italian.

The schedule is

15:00 Don’t panic! - Introduzione a PostgreSQL
15:40 Coffee break
15:50 The Ravenous Bugblatter Beast of Traal - PostgreSQL: uno sguardo dentro al motore
16:50 Coffee break
17:00 Mostly harmless - Backup & recovery in PostgreSQL
17:50 Closing remarks - Q&A

There is also a Facebook event to RSVP.

The event is free of charge.
See you in Ferrara!

Thursday 11 February 2016

Brighton PostgreSQL Meetup news

I've been busy recently and I failed to post an update with the latest meetup news.
I apologise for that.

We had a very interesting meetup in January.

Alexey Bashtanov explained how grouping works in Postgres and how to improve, or even re-implement in C, the grouping functions.
On the meetup page there are the pictures from the meeting.
The presentation's recording is available there, and the slides are free to download on slideshare there.

We are already preparing the next meetup announced for 18th of March.
Here's the abstract.

The next postgresql meetup will be a technical talk on the ways in which PostgreSQL can communicate with the outside world.

This talk will cover PostgreSQL Foreign Data Wrappers, which expose external services as tables that can be read from and written to.

Efficient ways to run queries from applications will also be covered.

About the speaker:
Matthew Franglen works as a team lead at Brandwatch. He has a long history of development, with over 10 years of experience.

As usual the talk will be preceded by pizza and a nice chat. We'll also do a recording.

If you are in Brighton please join us here.

Saturday 2 January 2016

It's bigger on the inside

I'll be at the University of Ferrara Saturday 9th of January for a PostgreSQL afternoon.

This is the confirmed schedule.

15:00 - Federico Campoli: PostgreSQL, the big the fast and the (NOSQL on) Acid
15:40 - Michele Finelli: PostgreSQL's transactional system
16:20 - Coffee break / general chat
16:40 - Federico Campoli: Streaming replication
17:30 - Federico Campoli: Query tuning in PostgreSQL
18:00 - Michele Finelli: A horror fairy tale: how we lost a database
18:20 - Wrap up and general chat

All talks are in Italian.
The conference is free.
All times are European time zone (GMT+1).

The venue's address is:
Via Guercino, 47

Please RSVP here

See you Saturday then.