Monday, October 27, 2014

My first postgres extension

After attending this year's pgconf.eu I decided to write my first postgres extension. I did this with two goals in mind: first, I wanted the extension itself; second, I wanted to document how long it takes and how complicated it is.
A few words about my background: I am mainly a Perl programmer, with a few years of experience programming in C before I switched to Perl. Knowing C, I am not afraid to write Perl extensions in XS. I am also a contributor to the Apache mod_perl project, which is likewise written in C. I have known Postgres since version 6.4, but I didn't really delve into it until last year. Since then, I have learned a lot and even found a bug in postgres. The conference in Madrid was my second event of that type; before that, I attended the PG conference in Dublin last year.

The idea

At work, we replicate databases over transcontinental links. Monitoring the replication lag is essential for us. With a stock PG 9.3, one can use the pg_stat_replication view on the master to get a picture of how far the replicas lag behind. Usually, I use a command like this:

SELECT client_addr,
       application_name,
       state,
       flush_location,
       pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),
                                            flush_location)) as lag
  FROM pg_stat_replication
 ORDER BY 2, 1;

There are two points to watch out for. First, the query must run on the master. This may or may not be a problem. Second, the lag is reported in bytes. If you have a guess of the average bandwidth between master and replica, you might be able to derive an educated guess of how long it will take for the lag to dissolve (say, a lag of 300 MB over a link that sustains 10 MB/s needs at least 30 seconds to catch up).
Anyway, at the conference I listened to a talk by Petr Jelinek about Uni Directional Replication. He mentioned that he had measured the replication lag in units of time. Later I asked him how. It turned out he simply updated a special table with a timestamp, periodically. That way, records containing the current time on the master are written to the WAL stream and replicated to the slave. Provided both master and slave are time-synchronized by means of NTP or similar, one can simply compare the current clock time with the time in that table on the slave. That tells you at what time the WAL currently being replayed on the slave was generated on the master and, thus, how far the slave lags behind in units of time. After a bit of pondering I decided that a custom background worker, a feature that comes with PG 9.3, was the right way to implement the idea.
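
To make the replica-side half of the idea concrete, here is a minimal sketch in C using libpq. The heartbeat table name streaming_lag and its timestamptz column ts are my invention for this example, not anything mandated by the extension:

/* lag_check.c -- compile with:
 *   cc lag_check.c -I$(pg_config --includedir) \
 *      -L$(pg_config --libdir) -lpq -o lag_check */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    /* connection parameters come from the usual PG* environment */
    PGconn   *conn = PQconnectdb("");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return EXIT_FAILURE;
    }

    /* now() on the replica minus the timestamp the master wrote */
    res = PQexec(conn, "SELECT now() - ts AS lag FROM streaming_lag");
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
        printf("replication lag: %s\n", PQgetvalue(res, 0, 0));

    PQclear(res);
    PQfinish(conn);
    return EXIT_SUCCESS;
}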

Ready, steady, ...

It is the Sunday after the conference. At 12:45 I started my little project. First, I tried to find a template, so I asked aunt Google for postgres extension. One of the first links that came up was http://www.postgresql.org/docs/9.3/static/extend-extensions.html. Here I read for a while about what the constituents of an extension are. The next link also looked promising: http://pgxn.org/. First, I browsed a few extensions. Then I spotted the custom background worker link in the tag cloud on the home page. It led me to a page with two projects, config_log and worker_spark. The latter has a repository on GitHub with exactly one commit, and that contained only an almost empty README.md.
So, I went for config_log. After reading the README I made sure I had pg_config and that it was found in $PATH.
Then:

git clone git://github.com/ibarwick/config_log.git
cd config_log
make &&
sudo make install

That went smoothly!

I created the extension in my database and played a bit with it. It seemed to work as expected. By now, it's 13:45.

Next, I studied the code and decided to first create a streaming_lag.control file:

# streaming_lag extension
comment = 'streaming lag in seconds instead of bytes'
default_version = '0.0.1'
module_pathname = '$libdir/streaming_lag'
relocatable = true
superuser = true

and a Makefile for my extension:

MODULES = streaming_lag
OBJS = streaming_lag.o

EXTENSION = streaming_lag
EXVERSION = $(shell sed -n \
              "/^default_version[[:space:]]*=/s/.*'\(.*\)'.*/\1/p" \
              streaming_lag.control)

DATA = streaming_lag--$(EXVERSION).sql

PG_CONFIG = pg_config

# verify version is 9.3 or later

PG93 = $(shell $(PG_CONFIG) --version | \
         (IFS="$${IFS}." read x v m s; expr $$v \* 100 + $$m \>= 903))

ifneq ($(PG93),1)
$(error Requires PostgreSQL 9.3 or later)
endif

PGXS := $(shell $(PG_CONFIG) --pgxs)

include $(PGXS)

The clock shows 14:30 and I start to work on streaming_lag.c, where the actual code resides. With a two-hour break for a late lunch I worked until midnight. By then I had a completely functional version of my extension on GitHub, including documentation. Also by midnight, I had figured out how to get an account on pgxn.org and requested one.
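
I won't walk through the whole file here, but to give a rough idea of the shape such a module takes, here is a minimal sketch of a 9.3 background worker that periodically writes a timestamp. This is my illustration, not the actual streaming_lag code: the database name, the table name, and the simple 200 ms WaitLatch() timeout (instead of the interval timer I discuss under Remarks below) are placeholders:

#include "postgres.h"

#include "access/xact.h"
#include "executor/spi.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "postmaster/bgworker.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
#include "utils/snapmgr.h"

PG_MODULE_MAGIC;

void _PG_init(void);

static volatile sig_atomic_t got_sigterm = false;

/* SIGTERM handler: remember the signal and wake up the main loop */
static void
lag_sigterm(SIGNAL_ARGS)
{
    int save_errno = errno;

    got_sigterm = true;
    if (MyProc)
        SetLatch(&MyProc->procLatch);
    errno = save_errno;
}

static void
lag_main(Datum main_arg)
{
    pqsignal(SIGTERM, lag_sigterm);
    BackgroundWorkerUnblockSignals();
    BackgroundWorkerInitializeConnection("postgres", NULL);

    while (!got_sigterm)
    {
        int rc = WaitLatch(&MyProc->procLatch,
                           WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                           200L);   /* wake up every 200 ms */

        ResetLatch(&MyProc->procLatch);
        if (rc & WL_POSTMASTER_DEATH)
            proc_exit(1);

        /* write the master's clock time; it travels to the
         * replicas via the WAL stream */
        StartTransactionCommand();
        SPI_connect();
        PushActiveSnapshot(GetTransactionSnapshot());
        SPI_execute("UPDATE streaming_lag SET ts = now()", false, 0);
        SPI_finish();
        PopActiveSnapshot();
        CommitTransactionCommand();
    }
    proc_exit(0);
}

void
_PG_init(void)
{
    BackgroundWorker worker;

    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION;
    /* start only once normal read-write operation is reached,
     * i.e. never on a standby that stays in recovery */
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    worker.bgw_main = lag_main;
    worker.bgw_main_arg = (Datum) 0;
    snprintf(worker.bgw_name, BGW_MAXLEN, "streaming_lag writer");
    RegisterBackgroundWorker(&worker);
}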

Remarks

Besides the config_log code I studied several postgres header files and C source files. The documentation in there is amazing. A great source of information was also the worker_spi extension, which I later found in contrib/ in the postgres source.
In my extension I need a constantly ticking clock. At first I tried to use the timer mechanism provided on Linux by timer_create() because, according to POSIX, this is the way to go in new programs. But I wasn't able to link my shared library with -lrt in a way that made it load correctly into the backend process afterwards. So, I switched back to the older setitimer() technique. I searched the postgres code a bit to see if it provides an interval timer. I found a place where it uses setitimer(), but the it_interval part that makes the timer recurring was not usable through the provided interface. As this was my first encounter with the postgres source code, maybe I have overlooked something.
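
For illustration, this is what the recurring variant looks like in a standalone program; the it_interval member is what re-arms the timer after every expiry. A bare-bones sketch, nothing postgres-specific:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t ticks = 0;

static void on_alarm(int signo)
{
    (void) signo;
    ticks++;                        /* only async-signal-safe work here */
}

int main(void)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigaction(SIGALRM, &sa, NULL);

    tv.it_value.tv_sec = 0;         /* first tick after 200 ms ... */
    tv.it_value.tv_usec = 200 * 1000;
    tv.it_interval = tv.it_value;   /* ... and then every 200 ms, forever */
    setitimer(ITIMER_REAL, &tv, NULL);

    for (;;) {
        pause();                    /* wait for the next SIGALRM */
        printf("tick %d\n", (int) ticks);
    }
}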
At first, I implemented the extension similarly to worker_spi, which creates all the necessary tables when the background worker starts up. Then I read about relocatable extensions and decided to go that way. Now I even had something to put into the third constituent of an extension, the SQL file.

The next day

First, I found a reply in my inbox confirming my new account on pgxn.org. Now, the extension can also be found there.
So far I was quite convinced that my extension worked, but a good piece of software requires automated tests. However, I couldn't find a simple and reliable way to write them. And since the main goal of this effort was to get a glimpse of how to create postgres extensions, I decided to manually set up a test scenario with a postgres instance on my notebook as replica and, as master, a database on a nearby server connected via WiFi.
The extension was installed in the default configuration except for the precision GUC, which I configured to write a timestamp every 200 ms. Then I created a simple table with one column and 10 million rows. To generate WAL, I simply updated all rows. I figured that, over WiFi, this should produce a measurable lag.
And it did. Using psql's \watch command and redirecting the output into a file, I was able to produce the following diagram.

[Diagram: replication lag in seconds over time]
I spent another 3 hours doing that. Then I sat down to write this blog post. BTW, this is also the first time I have used blogger.com.

Lessons learned

  • writing postgres extensions, even in C, is easy
  • starting from scratch, it took me about 2.5 hours to find all the resources required to implement the project
  • the whole effort, including the real test on the next day, took about 12 hours. I had expected much more!