Tuesday 20 August 2013

the chameleon circuit

I decided to create a brand new project to migrate schema and data from MySQL (for now) to PostgreSQL.

I've named it pg_chameleon because the source database can be anything. As I said, I'm starting with MySQL, but once I have a usable product I will write libraries for SQLite and other DBMSs.

I'm using SQLAlchemy, quick and dirty.

I'm not a great fan of database abstraction layers; IMHO they keep developers ignorant of a wonderful language called SQL and they generate monsters.

Anyway, the metadata package is idiot-proof. I've written a connect-read-transform library in barely 120 lines, and now I'm able to dump the DDL to a file or write it directly over a PostgreSQL connection.
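To give an idea of what the transform step involves, here is a minimal sketch, not the actual pg_chameleon code. It skips the connect and read steps and does not use SQLAlchemy at all: the column descriptions are written by hand, and TYPE_MAP, pg_type and create_table_ddl are hypothetical names with a deliberately incomplete type mapping.

```python
# Hypothetical MySQL -> PostgreSQL type map; illustrative, not exhaustive.
TYPE_MAP = {
    "tinyint": "smallint",
    "int": "integer",
    "bigint": "bigint",
    "datetime": "timestamp without time zone",
    "double": "double precision",
    "varchar": "character varying",
    "text": "text",
    "blob": "bytea",
}


def pg_type(mysql_type):
    """Translate e.g. 'varchar(50)' or 'int(11)' into a PostgreSQL type."""
    base, _, rest = mysql_type.lower().partition("(")
    base = base.strip()
    # Unknown types are passed through unchanged (an assumption of this sketch).
    target = TYPE_MAP.get(base, base)
    # Only varchar keeps its length; the int(11) display width is dropped.
    if base == "varchar" and rest:
        return "%s(%s" % (target, rest)
    return target


def create_table_ddl(table, columns):
    """Build a CREATE TABLE statement.

    columns is a list of (name, mysql_type, nullable) tuples.
    """
    lines = []
    for name, mysql_type, nullable in columns:
        null_sql = "" if nullable else " NOT NULL"
        lines.append('    "%s" %s%s' % (name, pg_type(mysql_type), null_sql))
    return 'CREATE TABLE "%s" (\n%s\n);' % (table, ",\n".join(lines))


ddl = create_table_ddl(
    "users",
    [("id", "int(11)", False), ("name", "varchar(50)", True)],
)
print(ddl)
```

The resulting string can be written to a file or executed over a PostgreSQL connection, which are the two outputs mentioned above.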

The next step is to move the data across the databases. I'll give freedom of choice: INSERT statements, or a dump and reload using COPY.
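The two strategies can be sketched roughly as follows. This is only an illustration under my own assumptions, not code from the project: insert_statement, copy_escape and rows_to_copy_buffer are hypothetical helpers, and the placeholder style (%s) is the one used by psycopg2. The COPY side follows PostgreSQL's text format: tab-separated fields, \N for NULL, and backslash escapes for special characters.

```python
import io


def insert_statement(table, columns):
    """Strategy 1: one parameterised INSERT, executed once per row."""
    cols = ", ".join('"%s"' % c for c in columns)
    marks = ", ".join(["%s"] * len(columns))
    return 'INSERT INTO "%s" (%s) VALUES (%s)' % (table, cols, marks)


def copy_escape(value):
    """Render one value in PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"
    text = str(value)
    # Backslash must be escaped first, or the later escapes get doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_copy_buffer(rows):
    """Strategy 2: dump all rows into a file-like object ready for COPY."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row) + "\n")
    buf.seek(0)
    return buf


rows = [(1, "first\tline", None), (2, "plain", "x")]
print(insert_statement("t", ["id", "label", "note"]))
print(rows_to_copy_buffer(rows).read())
```

With psycopg2 the buffer could then be handed to cursor.copy_expert("COPY t FROM STDIN", buf), while the INSERT string would go through cursor.execute or cursor.executemany with the row tuples. COPY avoids the per-statement overhead, which is why it is the usual choice for large tables.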

My old script used statements and, incredibly, worked fine even for medium-sized databases. But if I want my new library to be usable for the new buzzword, big data, I need a high-performance strategy or this will remain a nice toy.

The project can be pulled from here, under the GPL v3 licence.
https://github.com/the4thdoctor/pg_chameleon