Improved Tag-Setting Function
entry_coll_tag_set()? Remember the original:
1 CREATE OR REPLACE FUNCTION entry_coll_tag_set ( 2 obj_id integer, 3 coll_ids integer 4 ) RETURNS VOID AS $$ 5 BEGIN 6 PERFORM true FROM entry WHERE id = obj_id FOR UPDATE; 7 8 UPDATE entry_coll_tag 9 SET tag_order = -tag_order 10 WHERE entry_id = obj_id 11 12 FOR iloop IN 1..array_upper(coll_ids, 1) LOOP 13 IF coll_ids[iloop] IS NULL THEN 14 CONTINUE; 15 END IF; 16 17 UPDATE entry_coll_tag 18 SET tag_order = iloop 19 WHERE entry_id = obj_id 20 AND tag_id = coll_ids[iloop]; 21 22 IF FOUND IS false THEN 23 INSERT INTO entry_coll_tag (entry_id, tag_id, tag_order) 24 VALUES (obj_id, coll_ids[iloop], iloop); 25 END IF; 26 END LOOP; 27 28 DELETE FROM entry_coll_tag 29 WHERE entry_id = obj_id AND tag_order < 0; 30 END; 31 $$ LANGUAGE plpgsql;
The number of queries run by this function is even more variable than the number run by
entry_coll_tag_add(), as each ID in coll_ids will trigger the execution of either one or two queries: one if the
UPDATE statement updates a row, and two if it doesn't. Here is the batch-updating version:
1 CREATE OR REPLACE FUNCTION entry_coll_tag_set ( 2 obj_id integer, 3 coll_ids integer 4 ) RETURNS VOID AS $$ 5 BEGIN 6 PERFORM true FROM entry WHERE id = obj_id FOR UPDATE; 7 8 UPDATE entry_coll_tag 9 SET ord = -ord 10 WHERE entry_id = obj_id; 11 12 IF FOUND IS false THEN 13 INSERT INTO entry_coll_tag (entry_id, tag_id, ord) 14 SELECT obj_id, coll_ids[gs.ser], gs.ser 15 FROM generate_series(1, array_upper(coll_ids, 1)) 16 AS gs(ser) 17 WHERE coll_ids[gs.ser] IS NOT NULL; 18 ELSE 19 UPDATE entry_coll_tag SET ord = ser 20 FROM ( 21 SELECT gs.ser, coll_ids[gs.ser] as move_tag 22 FROM generate_series(1, array_upper(coll_ids, 1)) AS gs(ser) 23 WHERE coll_ids[gs.ser] IS NOT NULL 24 ) AS expansion 25 WHERE move_tag = entry_coll_tag.tag_id 26 AND entry_id = obj_id; 27 28 INSERT INTO entry_coll_tag (entry_id, tag_id, ord ) 29 SELECT obj_id, coll_ids[gs.ser], gs.ser 30 FROM generate_series(1, array_upper(coll_ids, 1)) AS gs(ser) 31 WHERE coll_ids[gs.ser] NOT IN ( 32 SELECT tag_id FROM entry_coll_tag 33 WHERE entry_id = obj_id 34 ); 35 36 DELETE FROM entry_coll_tag 37 WHERE entry_id = obj_id AND ord < 0; 38 END IF; 39 END; 40 $$ LANGUAGE plpgsql;
I admit that, at first glance, this looks more complicated than the original, but that's only because I added an optimization that I could have included in the original, had I noticed it before. Here is the function piece-by-piece.
- Once again, the first two statements in the function body are the same as before. The function retains the entry row locking
PERFORMstatement and the
UPDATEstatement that negates the values for the tag_order column for all records associated with obj_id.
- Line 12 is where the changes begin, and also where I added the optimization missing in the original. If the globally available PL/pgSQL Boolean variable
false, it means that the
UPDATEstatement in lines 8-10 updated no records, and there is no need for any fancy tricks to update previously existing rows, because there are none.
- Lines 13-17 use
generate_series()in a batch
INSERTstatement to simply insert all of the tag IDs in coll_ids. This is about as clear as it gets, with the gs.ser
columnboth accessing individual array values and supplying the corresponding value for tag_order. The
WHERE coll_ids[gs.ser] IS NOT NULLexpression skips over any
NULLvalues that might be in coll_ids.
- Line 18 starts the block that handles the case where the
UPDATEstatement at lines 8-10 did update one or more rows. The upshot is that any tag IDs in coll_ids that are already in the table must be updated to their appropriate tag_order values, those that don't exist must be inserted, and any that are left over must be deleted.
- Accordingly, lines 19-26 use a batch
UPDATEto update any existing records with new tag_order values. It uses
FROMclause to create the appropriate rows to be updated, with a
WHEREclause in the expression that creates the gs table to skip
NULLvalues and a
WHEREclause for the
UPDATEstatement itself to map the
UPDATEto the appropriate rows, where the entry ID and tag ID match.
- Rows 28-34 then insert the new tags into the collection table, once again using a batch
generate_series()to create the new rows to be added. As in the revamped
NULLs and prevents inserting existing tags, which would conflict with the unique index.
- Finally, rows 36-37 delete any leftover associations, just as before, by removing those with a tag_order less than zero.
Phew! That was a lot to cover, but the result is that this function now also runs a constant number of queries for each execution. Or nearly constant. If there are no existing rows in the collection table, it will run only three queries. Otherwise, it will run five. In neither case will it ever run any more than five.