The Future of Perl in PostgreSQL
by Andrew Dunstan05/25/2006
My first article about PL/Perl discussed using triggers written in PL/Perl, and the second showed how to exchange data with the database and running database commands from inside PL/Perl. This final article in the series explores the improvements in PostgreSQL versions 8.0 and 8.1 and some of the possible improvements coming down the track.
Updates
PostgreSQL 8.0 and 8.1 have added and improved several features. Most of them make life easier for developers.
Communication Between Functions
Before version 8.0, each trusted function received its own Safe container. There was no effective way to share data between trusted functions, or between trusted and untrusted functions. Untrusted functions could share data with each other, as they shared a common global namespace, but having to use untrusted functions just to be able to share data was not very satisfactory. Starting with version 8.0, all trusted PL/Perl functions share a single Safe container, and there is a special shared hash called %_SHARED that is available to all PL/Perl functions, both trusted and untrusted, to allow them to share data. This can also be very useful if you need to save some state between function calls.
Suppose that you want to compute the cumulative sum of a certain column. You could do it with a set returning function as shown in the previous article, or you could do it far more simply using shared data and a pair of tiny PL/Perl functions:
CREATE FUNCTION init_csum()
RETURNS void
LANGUAGE plperl AS $$
$_SHARED{csum} = 0.0;
$$;
CREATE OR REPLACE FUNCTION csum(float)
RETURNS float
LANGUAGE plperl AS $$
my $observation = shift;
return $_SHARED{csum} += $observation;
$$;
Be sure to call init-csum before each statement where you call csum, to reset the accumulator.
To use this, make sure that the data is in the right order before it gets passed to the csum function. The easiest way to do this is with a subquery:
SELECT init_csum();
SELECT date,
to_char(rain,'99D99') as rainfall,
to_char(csum(rain),'999G999D99') as "Cumulative Rainfall"
FROM (SELECT date,
rain
FROM daily_rain
WHERE date BETWEEN '2004-01-01' and '2004-01-09'
ORDER BY date
) as inner_query;
The result (which looks like it comes from a rainforest) looks like:
date | rainfall | Cumulative Rainfall
------------+----------+---------------------
2004-01-01 | 8.40 | 8.40
2004-01-02 | 6.35 | 14.75
2004-01-03 | 9.47 | 24.22
2004-01-04 | 4.91 | 29.13
2004-01-05 | 1.14 | 30.28
2004-01-06 | 7.68 | 37.96
2004-01-07 | 2.03 | 39.98
2004-01-08 | .25 | 40.23
2004-01-09 | 4.01 | 44.24
(9 rows)
UTF-8 Support
From version 8.1 of PostgreSQL, the PL/Perl glue code detects if the database uses UTF-8 encoding. If so, then it encodes strings for PL/Perl as UTF-8 so that Perl treats the strings the right way. The code detects at compile time whether the Perl version is modern enough to support UTF-8 encoded strings, and only takes this action if so.
There is nothing that the programmer should need to do to get UTF-8 support--if it's available, it just happens automatically.
use strict;
You might think that this is something that programmers could choose to do. However this is not so--calling use is a forbidden operation inside of the Safe containers. Yet as all good Perl programmers know, strict mode is very nearly the greatest thing since the birth of Larry Wall. PostgreSQL version 8.1 has some switches that let you turn strict mode on (and off, if you really must).
The best way to do this is globally in the database server config file postgresql.conf. Two simple lines there will turn on strict mode and you can not worry any more about it:
custom_variable_classes = 'plperl'
plperl.use_strict = true
The first line allows a prefix (or list of prefixes) for a configuration settings that the PostgreSQL core code doesn't know about. (This facility originated for use by PL/Java, but is usable by any loadable modules--it's all part of the PostgreSQL extension philosophy.) The second line sets a flag that the PL/Perl code checks.
To allow strict mode, but not to force it, specify the custom_variable_classes setting.
Another way to turn on strict mode in trusted code is to prepend some code to your function:
BEGIN { strict->import(); }
Finally, if you need to turn on strict mode temporarily, and you have set the custom_variable_classes properly, use:
SET plperl.use_strict = 'true';
Doing this will not last past the current session, and will not affect already-compiled PL/Perl functions. I recommend to turn it on once in your postgresql.conf file and then live happily ever after.
Validator Function and Logged Non-Fatal Warnings
From version 8.1, PL/Perl has a validator function that is called when you create a function. Previously, PL/Perl did not compile the function when it was created, preferring to wait until its use. That was something of a nuisance, so PostgreSQL now compiles it using the validator when it is created. Of course, there are many Perl errors that you can't catch until the code runs, and this is no different. Nevertheless, it is it still useful--just as Perl's -c command line flag is useful.
Version 8.1 also helps developers by no longer dropping non-fatal warnings. They are now logged nicely at level NOTICE.
Pages: 1, 2 |



