Batch Updates with PL/pgSQL
by David Wheeler
The previous article in this series, Practical PL/pgSQL: Managing Ordered Sets, created four functions to simplify the management of ordered collections as many-to-many relationships. The two more complex functions, including entry_coll_tag_add(), take an iterative approach to managing those relationships. By iterative, I mean that they loop over an array of IDs in order to do the right thing for each.
The downside to this approach is that the runtime of those functions is directly proportional to the number of IDs in the array (O(n)). It would be ideal to make the runtime of the functions constant, regardless of the number of IDs in the array (O(1)).
Fortunately, there is a way to do just that in PostgreSQL. Before getting to it, however, think back to the Fibonacci examples from the first article in this series, Introduction to PostgreSQL PL/pgSQL. Returning to those examples lets me introduce some new concepts in a simpler setting than the collection functions allow.
As I mentioned in the first article, PostgreSQL functions can return a value of any supported data type. I didn't mention that they can also return sets of a particular type. A set is a list of values of a particular type, but rather than returning those values as a list or an array (as you might expect in a dynamic programming language), PostgreSQL functions return them as rows of data.
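To make the distinction between a list-like value and a set concrete, here is a minimal sketch that uses PostgreSQL's built-in generate_series() function, which is itself set-returning. The column aliases as_array and as_rows are only illustrative names, not anything from this series.

```sql
-- An array is a single value, so this query returns one row.
SELECT ARRAY[1, 2, 3] AS as_array;

-- A set comes back as rows, so this query returns three rows.
SELECT * FROM generate_series(1, 3) AS as_rows;
```

The first query hands back all three numbers packed into one column of one row; the second returns each number as its own row, which is how a set-returning function delivers its values.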
Suppose that you need to get a list of Fibonacci numbers up to a particular place in the Fibonacci sequence. Writing such a function in PL/pgSQL is as simple as modifying the fib_fast() function to return each Fibonacci number as it's calculated. It does so using the PL/pgSQL RETURN NEXT statement. Here are fib_fast() and the new set-returning function, fibs_to():
     1 CREATE OR REPLACE FUNCTION fib_fast(
     2     fib_for integer
     3 ) RETURNS integer AS $$
     4 DECLARE
     5     ret integer := 0;
     6     nxt integer := 1;
     7     tmp integer;
     8 BEGIN
     9     FOR num IN 1..fib_for LOOP
    10
    11         tmp := ret;
    12         ret := nxt;
    13         nxt := tmp + nxt;
    14     END LOOP;
    15
    16     RETURN ret;
    17 END;
    18 $$ LANGUAGE plpgsql;

     1 CREATE OR REPLACE FUNCTION fibs_to(
     2     max_num integer
     3 ) RETURNS SETOF integer AS $$
     4 DECLARE
     5     ret integer := 0;
     6     nxt integer := 1;
     7     tmp integer;
     8 BEGIN
     9     FOR num IN 1..max_num LOOP
    10         RETURN NEXT ret;
    11         tmp := ret;
    12         ret := nxt;
    13         nxt := tmp + nxt;
    14     END LOOP;
    15
    16     RETURN NEXT ret;
    17 END;
    18 $$ LANGUAGE plpgsql;
There are really only three differences aside from the function and parameter names; they are on lines 3, 10, and 16 of fibs_to(). The first difference is on line three, where the fibs_to() declaration indicates that it returns a SETOF integer instead of simply an integer. The SETOF keyword tells PostgreSQL that this function returns a set.
The other differences are that, rather than simply returning the value of the ret integer variable, fibs_to() uses the RETURN NEXT statement to return each Fibonacci number after its calculation in the loop. The final RETURN NEXT statement returns the final Fibonacci number in the sequence.
Those are the only changes necessary to create a set-returning function. As such a function, fibs_to() must also be called in a different context. While you can call fib_fast() directly in a SELECT list:

    try=% select fib_fast(8);
     fib_fast
    ----------
           21
    (1 row)
fibs_to() essentially behaves like a table, and you must treat it as such by using it in a FROM clause:

    try=% select * from fibs_to(8);
     fibs_to
    ---------
           0
           1
           1
           2
           3
           5
           8
          13
          21
    (9 rows)
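Because the function stands in for a table, the usual query machinery applies to its rows. The following is a small sketch of that idea rather than anything from the original listings: the alias f(n) and the column names how_many and largest are my own illustrative choices. The definition of fibs_to() is repeated without line numbers so that the block can be pasted into psql as is.

```sql
-- fibs_to() as defined in this article, repeated without line numbers.
CREATE OR REPLACE FUNCTION fibs_to(
    max_num integer
) RETURNS SETOF integer AS $$
DECLARE
    ret integer := 0;
    nxt integer := 1;
    tmp integer;
BEGIN
    FOR num IN 1..max_num LOOP
        RETURN NEXT ret;
        tmp := ret;
        ret := nxt;
        nxt := tmp + nxt;
    END LOOP;

    RETURN NEXT ret;
END;
$$ LANGUAGE plpgsql;

-- Filter the returned rows with WHERE, just as with a table.
-- f(n) is an illustrative alias: f names the row source, n its one column.
SELECT f.n
  FROM fibs_to(8) AS f(n)
 WHERE f.n % 2 = 0;

-- Aggregate over the returned rows.
SELECT count(*) AS how_many, max(f.n) AS largest
  FROM fibs_to(8) AS f(n);
```

Given the nine rows shown above, the first query keeps only the even values (0, 2, and 8), and the second reports nine rows with 21 as the largest.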
Be warned, however, that while it looks like fibs_to() behaves like a continuation (and, for most practical purposes, is treatable like one), PostgreSQL actually buffers all of the values returned by RETURN NEXT and only returns them to the calling context after the function has calculated them all. That means that if you write a set-returning function that returns a lot of values, you need to make sure that your server's memory can handle it.
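One way to observe the buffering is to have a function announce each value as it queues it. The sketch below uses a hypothetical helper, noisy_series(), that is not part of this series; it exists only to make the behavior visible through RAISE NOTICE.

```sql
-- Hypothetical helper: emits a notice for every value it queues.
CREATE OR REPLACE FUNCTION noisy_series(
    max_num integer
) RETURNS SETOF integer AS $$
BEGIN
    FOR num IN 1..max_num LOOP
        RAISE NOTICE 'queuing %', num;
        RETURN NEXT num;  -- adds a row to the result; does not exit
    END LOOP;

    RETURN;  -- no argument: simply ends the function
END;
$$ LANGUAGE plpgsql;

-- Ask for only two rows. The function still runs its whole loop
-- before the query sees any of them.
SELECT * FROM noisy_series(5) LIMIT 2;
```

Run in psql, this should print a notice for each of the five values even though the query returns only two rows, since the LIMIT is applied to the buffered result rather than stopping the function early.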
That caveat aside, set-returning functions can be extremely useful.