O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Print Subscribe to Databases Subscribe to Newsletters

Batch Updates with PL/pgSQL
Pages: 1, 2, 3, 4, 5

Batch-Update Syntax

PostgreSQL supports inserting multiple rows at once using its special INSERT INTO expression syntax, which essentially allows you to select a series of rows and then insert them elsewhere, all in a single statement. For example, suppose that you want to copy all of the rows in table foo into table bar (that is, all rows that aren't already in table bar). Here's how to do it in a single INSERT statement:



INSERT INTO bar (a, b, c)
SELECT d, e, f FROM foo
WHERE  foo.d not in (SELECT a FROM bar);

That's it. By using a SELECT statement in the expression part of the INSERT statement, you can insert multiple rows in a single statement, and even limit the rows through the use of the WHERE clause. Of course, I'm sure that you don't duplicate data like this unless you're deprecating table foo in favor of table bar, right? This syntax does have its uses, though.

As for batch updates, you likely already know that you can update multiple rows at once using a WHERE clause in an UPDATE statement. As with INSERT, UDPATE can also update a series of rows from an expression, generally a SELECT statement. This is possible because, in PostgreSQL, a FROM clause can be an expression, and UPDATE supports the FROM clause. For example, if you want to update all of the rows in a bar table with values from the foo table, where bar.a has the same value as foo.d, you can write:

UPDATE bar
SET    b = foo.e, c = foo.f
FROM   (SELECT d, e, f FROM foo) AS foo
WHERE  a = foo.d;

Pretty simple, right? Not only can you use an expression in the FROM clause, but you can use AS to name it so that the remaining SQL code can treat it exactly like a table. Now how do you construct such SELECT statements within your functions?

From Arrays to Sets

To take advantage of bulk updates in collection functions, you need some way to create a SELECT statement that generates a row for every value in the array passed to the function. Then you can use that SELECT statement as part of the INSERT and UPDATE statements to update the collections with the same number of queries, regardless of how many IDs are in the array.

When I initially tackled this problem, my first attempt was to create a function that converts an array into a set. Fortunately, this turned out to be pretty easy, given how set-returning functions work:

CREATE OR REPLACE FUNCTION array_to_set(
    arr anyarray
) RETURNS SETOF anyelement AS $$
BEGIN
    FOR idx IN array_lower(arr, 1)..array_upper(arr, 1) LOOP
        RETURN NEXT arr[idx];
    END LOOP;
END;
$$ LANGUAGE plpgsql;

PostgreSQL provides several pseudo-types, which are convenient for contexts that need to handle any number of different data types. Because this example code doesn't really care about the data type of the array it converts, I have declared that array_to_set() can take any kind of array--denoted as anyarray--as an argument, and that it returns a SETOF anyelement. Indeed, this function works pretty well:

try=% select * from array_to_set( ARRAY[3,4,10,56,2] );
 array_to_set 
--------------
            3
            4
           10
           56
            2
(5 rows)

Then I realized that I needed not only each value in the array, but also its position in the array so that I could populate the ordering column in the collection table. That would be trickier to write, given the limited return values of a function. Fortunately, there is another approach.

Series Generation

PostgreSQL comes with a very nice set-returning function called generate_series(). This function takes three arguments: a beginning number, an ending number, and an optional step number, and returns a set of numbers from the beginning number to the ending number, each incremented by the step number. It's easiest to understand when you see it in action:

try=% select * from generate_series(2, 4);
 generate_series 
-----------------
               2
               3
               4
(3 rows)

try=% select * from generate_series(1, 10, 2);
 generate_series 
-----------------
               1
               3
               5
               7
               9
(5 rows)

Nice, eh? It's a powerful solution to the problem of creating a SELECT statement that generates rows for each number in the series and each value in an array--without having to call the array_to_set() function at all:

try=% SELECT gs.ser, coll.ids[gs.ser] as id
try-% FROM   (SELECT ARRAY[ 2,4,6,7,8 ]) AS coll(ids),
try-%        generate_series(1, 5) AS gs(ser);
 ser | id
-----+----
   1 |  2
   2 |  4
   3 |  6
   4 |  7
   5 |  8
(5 rows)

Because PostgreSQL FROM clauses may be SQL expressions, you can call generate_series() in the FROM clause and use its values to iterate over the array for each row. The key to being able to reference the series is the AS gs(ser) clause, which gives the call to generate_series() the table name gs and its lone column the name ser. Then simply reference the appropriate value in each row as gs.ser.

Thus, using the generate_series() set-returning function, you can get at each element in the array for each row, while also outputting the series number, all in a single query. This is exactly what you need in order to use batch updates in the collection-management functions.

Pages: 1, 2, 3, 4, 5

Next Pagearrow




Tagged Articles

Post to del.icio.us

This article has been tagged:

postgresql

Articles that share the tag postgresql:

Managing Many-to-Many Relationships with PL/pgSQL (17 tags)

Writing PostgreSQL Functions with PL/pgSQL (11 tags)

Datamining Apache Logs with PostgreSQL (8 tags)

Introducing Slony (8 tags)

Batch Updates with PL/pgSQL (8 tags)

View All

sql

Articles that share the tag sql:

How to Misuse SQL's FROM Clause (74 tags)

Configuring Database Access in Eclipse 3.0 with SQLExplorer (28 tags)

Managing Many-to-Many Relationships with PL/pgSQL (18 tags)

Hierarchical SQL (16 tags)

The Outer Limits of SQL JOINs (14 tags)

View All

plpgsql

Articles that share the tag plpgsql:

Managing Many-to-Many Relationships with PL/pgSQL (6 tags)

Batch Updates with PL/pgSQL (5 tags)

View All

programming

Articles that share the tag programming:

Rolling with Ruby on Rails (1374 tags)

Very Dynamic Web Interfaces (279 tags)

Ajax on Rails (231 tags)

Understanding MVC in PHP (202 tags)

A Simpler Ajax Path (186 tags)

View All

database

Articles that share the tag database:

MySQL FULLTEXT Searching (54 tags)

Live Backups of MySQL Using Replication (53 tags)

Advanced MySQL Replication Techniques (53 tags)

Dreaming of an Atom Store: A Database for the Web (49 tags)

How to Misuse SQL's FROM Clause (38 tags)

View All

Sponsored Resources

  • Inside Lightroom

Related to this Article

Access 2013 For Dummies Access 2013 For Dummies
March 2013
$24.99 USD

PHP, MySQL, JavaScript & HTML5 All-in-One For Dummies PHP, MySQL, JavaScript & HTML5 All-in-One For Dummies
by Steve Suehring
March 2013
$44.99 USD

Advertisement
Sign up today to receive special discounts,
product alerts, and news from O'Reilly.
Privacy Policy >
View Sample Newsletter >
  • Youtube
  • http://www.youtube.com/OreillyMedia
  • Twitter
  • Subscribe
  • View All RSS Feeds >
O'Reilly Media

800-889-8969 or 707-827-7019
Monday-Friday 7:30am-5pm PT
©2011, O'Reilly Media, Inc.
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
  • About O'Reilly
  • Academic Solutions
  • Contacts
  • Customer Service
  • Careers
  • Press Room
  • Privacy Policy
  • Terms of Service
  • Writing for O'Reilly
  • Community
  • Authors
  • Forums
  • Membership
  • Newsletters
  • RSS Feeds
  • User Groups
  • Partner Sites
  • makezine.com
  • makerfaire.com
  • craftzine.com
  • igniteshow.com
  • PayPal Developer Zone
  • O'Reilly Insights on Forbes.com