|
Oracle 9i Release 2 Developments for PL/SQL Collectionsby Steven Feuerstein, coauthor of Oracle PL/SQL Programming, 3rd Edition and Bryn Llewellyn04/01/2003 |
Editor's note: In the previous article in their continuing series on new Oracle 9i features, Steven Feuerstein and Bryn Llewellyn looked at Utl_Http and showed how you can use it in an Oracle 9i database to implement a requestor in a B2B implementation. In this article, the pair explores PL/SQL collections in Oracle 9i Release 2, with particular emphasis on associative arrays. Find out why extending the flexibility of the collection syntax, storage, and access, makes it possible to write much simpler, more efficient code than was possible in earlier versions.
For the first several years after Steven published the first edition of Oracle PL/SQL Programming in 1995, he evangelized the use of packages as a fundamental building block of PL/SQL-based applications. This seemed to be a critical message for a number of years, as relatively few developers knew about and used packages. Lately, it appears that the "package story" has caught on; most developers do deploy the majority of their functionality from within packages. In the course of querying students and presentation attendees about their programming habits, however, Steven has discovered another very helpful aspect of PL/SQL that's being drastically under-utilized: the collection.
A PL/SQL collection is, in its essence, very similar to a single-dimensioned array. A
collection allows you to maintain lists of information, and it can be used to improve query
performance and also simplify the code you write to manage data within a PL/SQL
program. These collection data structures come in three flavors: nested table, varying
array (a.k.a. VARRAY), and associative array. We can't, within the scope of this article,
offer a complete introduction to collections. You can obtain such coverage from any
number of PL/SQL texts and the Oracle documentation. Our intention in this article is,
instead, to let you know about some very interesting developments in Oracle 9i Release 2
regarding PL/SQL collections.
|
Related Reading
|
We will, in fact, focus on one particular type of collection: the associative array. For
those of you who have worked with collections over the years, this will be an unfamiliar
term. Some of you will remember back in Oracle 7 when collections were first introduced;
at that point, they were called "PL/SQL tables," since they were similar to very simple
relational tables (consisting of a single column) but were declared and manipulated only
within PL/SQL programs. Then in Oracle 8, Oracle introduced two other kinds of single-
dimensioned lists (VARRAYS and nested tables). In the process, they changed the name
of the original collection type from "PL/SQL table" to "index-by table" (reflecting its
mandatory INDEX BY BINARY_INTEGER clause).
VARRAYS and nested tables can be used both in schema-level declarations
(especially, for example, for the type of a column of a relational table) and in PL/SQL
declarations. Index-by tables can be used only in PL/SQL declarations. Now, in Oracle 9i Release 2, Oracle has once again renamed this collection type, this time from "index-by table" to "associative array." Why another change? The term associative array is the name commonly used in other programming languages (including Perl, C++, JavaScript, and Cymbal, to name a few) to refer to a data structure that stores pairs of keys and values. By making this change, PL/SQL becomes more
consistent with the nomenclature of much of the programming world, and therefore
makes PL/SQL a bit more accessible to developers who are new to PL/SQL, but have
experience in other languages.
Associative arrays can still be used only in PL/SQL declarations. The wonderful development regarding collections in Oracle 9i Release 2 is, however, not in the name change, but in some significant new functionality. Let's take a look.
In the bad old days, there was just one way to declare an associative array:
DECLARE
TYPE names_list_t IS
TABLE OF employee.last_name%TYPE
INDEX BY BINARY_INTEGER;
The "INDEX BY BINARY_INTEGER" clause was fixed and unchangeable. This
meant that the only index allowed on an associative array was the row number, and the
row number had to have been declared as BINARY_INTEGER.
There were several drawbacks to this scheme. First, it required reliance on an
outmoded datatype, since BINARY_INTEGER has since been superseded by
PLS_INTEGER as a more efficient integer datatype. Second, it meant that if the list being
manipulated had a non-integer key, the developer had to write some very complex and/or
compute-intensive logic (namely, to perform full collection scans or create alternative
indexes via hashing) to take advantage of collections.
These restrictions have now been lifted. You can now declare associative arrays to be
indexed by BINARY_INTEGER, PLS_INTEGER, VARCHAR2 and even anchored
declarations of those types using %TYPE. All of the following statements are valid
declarations of associative array types with integer indexes:
DECLARE
TYPE array_t1 IS
TABLE OF NUMBER
INDEX BY BINARY_INTEGER;
TYPE array_t2 IS
TABLE OF NUMBER
INDEX BY PLS_INTEGER;
TYPE array_t3 IS
TABLE OF NUMBER
INDEX BY POSITIVE;
TYPE array_t4 IS
TABLE OF NUMBER
INDEX BY NATURAL;
BEGIN
...
END;
And, very interestingly, if you do declare a type based on a constrained
BINARY_INTEGER subtype, such as POSITIVE, then, if you try to reference a negative
row number, you'll get an error, as shown here:
SQL> DECLARE
2 TYPE pos_only_t IS
3 TABLE OF NUMBER
4 INDEX BY POSITIVE;
5
6 pos_only pos_only_t;
7 BEGIN
8 pos_only (-9) := 1;
9 END;
10 /
DECLARE
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at line 8
Note: Even at Oracle 8i Version 8.1.7, the syntax to use subtypes of
BINARY_INTEGER for an index-by table was allowed, but the implied constraint wasn't
enforced.
You can even use a user-defined subtype, thus:
DECLARE
SUBTYPE my_integer IS PLS_INTEGER NOT NULL;
TYPE array_t4 IS
TABLE OF NUMBER
INDEX BY my_integer;
BEGIN
...
END;
So PL/SQL is now much more flexible than it used to be when it comes to declaring integer-indexed collections. (Note: There are still restrictions. See the section titled "Invalid declarations" for reminders of which syntax is still not deemed acceptable.)
Much more exciting, however, is the fact that you can now declare associative arrays
to have VARCHAR2 or string index values! Here are some examples of such
declarations:
DECLARE
TYPE array_t1 IS
TABLE OF NUMBER
INDEX BY VARCHAR2(64);
TYPE array_t3 IS
TABLE OF NUMBER
INDEX BY VARCHAR2(32767);
TYPE array_t4 IS
TABLE OF NUMBER
INDEX BY employee.last_name%TYPE;
BEGIN
...
END;
/
It's especially impressive that Oracle now lets us use %TYPE to declare an associative
array with a VARCHAR2 index. This allows us to avoid hard-coding a VARCHAR2
maximum length in the TYPE statement.
|
There are still many INDEX BY clauses that aren't valid, even if the datatype is,
ostensibly, consistent or can be converted to something consistent with VARCHAR2 or
BINARY_INTEGER. You won't be able to declare an associative array type based on any
of the following clauses:
INDEX BY NUMBER
INDEX BY INTEGER
INDEX BY DATE
INDEX BY VARCHAR2
INDEX BY CHAR(n)
INDEX BY <some table>.<some column>%TYPE
where <some column> isn't of a type that can by used explicitly as the target of INDEX BY.
We'll finish up this article by examining a scenario in which VARCHAR2-indexed
collections are put to use. First, however, let's walk through the example of declaring and
using such a collection shown in Example 1.
| Description | |
| 2 | Associative array type declaration. Each row of a collection based on this type contains a string of up to 64 characters. |
| 4-5 | Declarations of two collections based on the population_type. The first list contains the populations of countries. The second list contains the populations of continents. |
| 10-18 | Populate individual rows in both the country and continent population lists. Notice that the row "numbers" aren't numbers, but are instead the names of countries and continents. Notice that we assign a value in line 15 to the "Antarctica" row in the continents collection and then we override that value on line 17. |
| 20-21 | Obtain and display the number of rows in the collection. All the traditional
collection methods may be used with VARCHAR2-indexed collections.
COUNT still returns the number of rows in the collection |
| 23-31 | Obtain and display information about the first and last defined rows in the
continent collection. This takes some getting used to. The FIRST and LAST
methods don't return integer values; instead, they return the string value that
is the lowest or highest in the sort order defined for the current character set
in the database. |
| 33-38 | Iterate through all the defined rows in the collection, using the NEXT method. |
1 DECLARE
2 idx VARCHAR2(64);
3 TYPE population_type IS TABLE OF NUMBER INDEX BY idx%TYPE;
4
5 country_population population_type;
6 continent_population population_type;
7
8 howmany PLS_INTEGER;
9 BEGIN
10 country_population('Norway') := 4000000;
11 country_population('Greenland') := 100000;
12 country_population('Iceland') := 750000;
13
14 continent_population('Australia') := 30000000;
15
16 continent_population('Antarctica') := 1000;
17
18 continent_population('Antarctica') := 1001;
19
20 howmany := country_population.COUNT;
21 DBMS_OUTPUT.PUT_LINE ('COUNT = ' || howmany);
22
23 idx := continent_population.FIRST;
24 DBMS_OUTPUT.PUT_LINE ('FIRST row = ' || idx);
25 DBMS_OUTPUT.PUT_LINE (
26 'FIRST value = ' || continent_population(idx));
27
28 idx := continent_population.LAST;
29 DBMS_OUTPUT.PUT_LINE ('LAST row = ' || idx);
30 DBMS_OUTPUT.PUT_LINE (
31 'LAST value = ' || continent_population(idx));
32
33 idx := country_population.FIRST;
34 WHILE idx IS NOT NULL
35 LOOP
36 DBMS_OUTPUT.PUT_LINE ( idx || ' = ' || country_population(idx) );
37 idx := country_population.NEXT(idx);
38 END LOOP;
39 END;
Here's the output one would see in SQL*Plus when running this script with
SERVEROUTPUT turned ON:
COUNT = 3
FIRST row = Antarctica
FIRST value = 1001
LAST row = Australia
LAST value = 30000000
Greenland = 100000
Iceland = 750000
Norway = 4000000
So why would a developer care about the fact that you can now index by VARCHAR2 in
addition to PLS_INTEGER? First of all, in general, you'll want to take advantage of
associative arrays when you need to maintain any sort of lists of data in your PL/SQL
programs. Sure, you can use relational tables to manage lists, but they involve much more
programming and CPU overhead. The code you write for associative arrays is lean and
mean.
|
Also In This Series Using PL/SQL Records in SQL Statements HTTP Communication from Within the Oracle Database Multi-Level Collections in Oracle 9i |
The following scenarios generally indicate a need for collections:
Repeated access to the same, static database information. If, during execution of your program (or during a session, since your collection can be declared as package data and thereby persist with all its rows for the entire session), you need to read the same data more than once, load it into a collection. Multiple scannings of the collection will be much more efficient than multiple executions of a SQL query.
Management of program-only lists. You may build and manipulate lists of data that exist only within your program, never touching a database table. In this case, collections-and, specifically, associative arrays-will be the way to go.
Let's now look at a specific scenario in which a VARCHAR2-indexed array would be
ideal. The requirement to look up a value via a unique non-numeric key is a generic
computational problem. Of course, the Oracle 9i Database provides a solution with SQL
and a B*-tree index. But there's a set of scenarios where considerable performance
improvement can be obtained by instead using an explicit PL/SQL implementation. This
was true even before the new features discussed in this article were available. These
scenarios are characterized by very frequent lookups in a relatively small set of values,
usually in connection with flattening a relational representation for reporting or for UI
presentation.
For this article, we'll work with an English-French vocabulary and translation mechanism. Suppose we have a set of English-French vocabulary pairs stored persistently in the most obvious way in a schema-level table:
-- translations.sql
CREATE TABLE translations (
english varchar2(200),
french varchar2(200));
and we have data in the table as follows (populated by the translations.sql file):
SELECT * FROM translations;
ENGLISH FRENCH
-------------------- ----------
computer ordinateur
tree arbre
book livre
cabbage chou
country pays
vehicle voiture
garlic ail
apple pomme
desk ‚scritoire
furniture meubles
Our task is to allow lookup from French to English, and to allow efficient addition of new vocabulary pairs. We'll immediately turn to the package construct to provide a clean interface to this functionality, as shown in Example 2.
CREATE OR REPLACE PACKAGE vocab
IS
FUNCTION lookup (p_english IN VARCHAR2)
RETURN VARCHAR2;
PROCEDURE new_pair (
p_english IN VARCHAR2, p_french IN VARCHAR2);
END vocab1;
/
The vocab.new_pair procedure performs a straightforward insert into the table:
PROCEDURE new_pair (
p_english IN VARCHAR2,
p_french IN VARCHAR2)
IS
BEGIN
INSERT INTO translations
(english, french)
VALUES (p_english, p_french);
END new_pair;
This vocabulary is, furthermore, static during the user's session (that is, during the time when the user application accesses the translation table, no changes are being made to this table). Our challenge then becomes: What's the most efficient way to implement the lookup procedure?
We certainly have a wide set of choices, including:
Pure SQL approach: Simply query the English word for the French each time it's needed.
Full collection scan, a.k.a. "linear search": Use the "traditional" INDEX BY
BINARY_INTEGER collection to cache all the French-English pairs. Search the
entire collection for a match each time a lookup is needed.
Hash-based indexing: Build our own VARCHAR2-based index using Oracle's
hashing algorithm.
VARCHAR2-indexed associative array: Cache all French-English pairs using the
French word as the key, allowing direct lookup of the English word, all within
PL/SQL.
In the following sections, we'll examine each of these approaches, implemented in
distinct packages (vocab1 through vocab4). Then we'll execute a test script that compares
the performance of these implementations.
|
The vocab1 package offers the traditional, pure SQL approach to solving this problem:
CREATE OR REPLACE PACKAGE BODY vocab1
IS
FUNCTION lookup (
p_english IN VARCHAR2)
RETURN VARCHAR2
IS
v_french
translations.french%TYPE;
BEGIN
SELECT french
INTO v_french
FROM translations
WHERE english = p_english;
RETURN v_french;
END lookup;
...
END;
This is a wonderfully simple implementation and demonstrates the elegance of the SQL language. Each time Lookup is invoked, we make a roundtrip between PL/SQL and SQL, involving a "context switch." Oracle has been working hard to reduce the overhead of this switch, but you still pay a price. It would also be very convenient if this most straightforward of approaches is also the most efficient. We'll soon find out!
index-by-binary_integer TableIf we identify the context switch between PL/SQL and SQL as a potential performance problem, then we should seek ways to avoid that switch. One way to do this is to cache all the data in program memory, thereby removing the need for repeated SQL access. To do this, we must:
Retrieve all the data from our static vocabulary table.
Store the data in a session-persistent, PL/SQL data structure (the most appropriate in this case being the index-by table).
Scan through that collection to find the French word and then return the associated English word.
This technique has been available ever since Oracle 7.3.4. The implementation is contained in the vocab2.sql script file and is shown in Example 3. Here are high-level explanations of the major areas of functionality:
| Description | |
| 3-10 | Declare the index-by table type, the collection itself, and a query used to initialize the collection. |
| 12-23 | Use a FOR loop to scan through the collection linearly. As soon as a match
is found, RETURN that value. For those of you who prefer a more structured
approach using a WHILE loop, you'll find such an implementation in the
vocab2.sql script file. It's slightly less efficient than the FOR loop. |
| 26-35 | Initialize the collection by transferring the contents of each row to the collection. This happens only the first time the lookup function (or any other program in the package) is called. Then the data persists in the collection. |
1 CREATE OR REPLACE PACKAGE BODY vocab2
2 IS
3 TYPE word_list IS TABLE OF translations%ROWTYPE
4 INDEX BY BINARY_INTEGER /* can't use pls_integer pre-9.2 */;
5
6 g_english_french word_list;
7
8 CURSOR trans_cur
9 IS
10 SELECT english, french FROM translations;
11
12 FUNCTION lookup (p_english IN VARCHAR2)
13 RETURN VARCHAR2
14 IS
15 BEGIN
16 FOR indx IN 1 .. g_english_french.LAST ()
17 LOOP
18 IF g_english_french (indx).english = p_english
19 THEN
20 RETURN g_english_french (indx).french;
21 END IF;
22 END LOOP;
23 END lookup;
24
25 BEGIN /* package initialization */
26 DECLARE
27 indx PLS_INTEGER := 0;
28 BEGIN
29 FOR rec IN trans_cur
30 LOOP
31 indx := indx + 1;
32 g_english_french (indx).english := rec.english;
33 g_english_french (indx).french := rec.french;
34 END LOOP;
35 END;
36 END vocab2;
As you can see, the algorithm is trivial, but does require a fair amount of coding. In
Oracle 9i Release 2, we can simplify and improve the performance of the initialization
section, by the way, by using the BULK SELECT to directly populate the collection, as is
shown here:
BEGIN
OPEN trans_cur;
FETCH trans_cur BULK COLLECT INTO g_english_french;
CLOSE trans_cur;
This depends on the exciting new functionality in Oracle 9i that allows much wider
use of RECORD binds in SQL statements than previously was possible. (Here,
g_english_french is a table of RECORDs of the same shape as the translations table.) This will be the subject of our next article.
The linear search algorithm suffers from the well-known disadvantage that on average we'll examine half the elements in the index-by table before we find a match. A possible improvement is to maintain the elements in lexical sort order and to use a binary chop algorithm: Compare the search target to the half-way element to determine which half it's in; repeat this test recursively on the relevant half. This requires much more elaborate coding-and testing-especially if you also need to write update and insert procedures that modify the database table and the cached collection. This is the sort of code that one writes as an exercise in a computer science class. One shouldn't have to go through such efforts when writing production code with a language as robust and mature as PL/SQL.
And, in fact, we don't-even prior to Oracle 9i Release 2. An alternative path is
available to us by taking advantage of the Oracle hashing algorithm provided by the
DBMS_UTILITY.GET_HASH_VALUE. Without going into all the details, a hash
algorithm transforms a string into a number. If you use the algorithm correctly (give it
enough distinct integer values to choose from), there's a very good chance (but no
guarantee; see the end of this section for details on "conflict resolution") that every
distinct string will convert to a distinct integer value. It's possible, therefore, to use
hashing to construct our own string-based index, which can then be used to look up the
French word from the English.
You'll find the hash-based implementation in the vocab3.sql script, the key elements of which are shown in Example 4. While it's beyond the scope of this article to offer a thorough explanation of the technique, the following table provides an overview of those key elements.
| Description | |
| 3-8 | Declarations of variables and constants used for consistent hashing, as well as the index-by table holding the French translations. In this case, the row number of the collection will be the hashed value of the English word. |
| 12-18 | The lookup function takes the English word, hashes it to an integer, and then returns the French word found in that row. This makes more sense when you look at the initialization section. |
| 21-26 | The initialization section. In this case, for each row retrieved from the table, we hash the English word into an integer and then deposit the corresponding French word into that row in the collection. |
1 CREATE OR REPLACE PACKAGE BODY vocab3
2 IS
3 hash BINARY_INTEGER;
4 g_hash_base CONSTANT NUMBER := 1;
5 g_hash_size CONSTANT NUMBER := 1000000 ;
6
7 TYPE word_list IS TABLE OF translations.french%TYPE
8 INDEX BY BINARY_INTEGER;
9
10 g_english_french word_list;
11
12 FUNCTION lookup (p_english IN VARCHAR2) RETURN VARCHAR2
13 IS
14 BEGIN
15 hash := DBMS_UTILITY.get_hash_value (
16 p_english, g_hash_base, g_hash_size);
17 RETURN g_english_french (hash);
18 END lookup;
19
20 BEGIN /* package initialization */
21 FOR rec IN (SELECT english, french FROM translations)
22 LOOP
23 hash := DBMS_UTILITY.get_hash_value (
24 rec.english, g_hash_base, g_hash_size);
25 g_english_french (hash) := rec.french;
26 END LOOP;
27 END vocab3;
What a neat and clean algorithm! Unfortunately, it's a bit naive and definitely not
suited for real-world use. The sad fact of the matter is that no one has yet come up with a
hashing algorithm that can guarantee that two distinct values for the name IN parameter
to get_hash_value will always produce distinct hash or integer values. For hashing to be
"ready for prime time," one has to implement a "conflict resolution" algorithm (in other
words, if "TABLE" and "CHAIR" both hash to the same number, you have to detect this and store the information in separate, traceable rows).
Oracle doesn't provide built-in conflict resolution. On the other hand, it's not all that difficult to implement this logic; one of the simplest techniques is called "linear probe." The altind.pkg script contains an implementation of such conflict resolution logic; you'll only need to make minimal changes to adapt it to your own circumstances.
|
Of course, a yet more elaborate ambition pre-Oracle 9i Database Release 2 would be- after studying the relevant computer science textbooks-to implement a B*-tree structure in PL/SQL, horror of wheel re-invention notwithstanding!
The datastructure might look like this:
type Node_t is record (
value varchar2(20),
left_child binary_integer /* refer to the array... */,
right_child binary_integer /* ...index of the... */,
parent binary_integer /* ...relevant element. */ );
type Btree_t is table of Node_t index by binary_integer;
the_tree Btree_t;
However, the implementation would be very far from trivial, and is certainly too long and complex for inclusion in this article.
A far better approach-newly possible in Oracle 9i Database Release 2-is to use
precisely the B*-tree organization of the values but to do so implicitly via a language
feature, the index-by-varchar2 table.
The index-by-varchar2 table is optimized for efficiency of lookup on a non-numeric
key, where the notion of sparseness isn't really applicable. The index-by-*_integer table
(where now *_integer can be either pls_integer or binary_integer), in contrast, is optimized for compactness of storage on the assumption that the data is dense.
This implies that there might even be cases where, even though the key is inherently
numeric, it's better to represent it as an index-by-varchar2 table via a To_Char
conversion.
You can think of the index-by-varchar2 table as the in-memory PL/SQL version of the
schema-level index organized table. Using this approach, we can produce a simple,
intuitive package for both populating and looking up the French translations of English
words. See Example 5 and the following table for an explanation of this virtually
transparent code.
| Description | |
| 3-5 | Declare a VARCHAR2-indexed collection type, and an instance of that type.
In this case, each row of the collection contains a word in French, and the
index into that row is the word in English. |
| 7-12 | The lookup function simply returns the value found in the row for that English word. |
| 15-18 | The initialization section of the package deposits the French word in to the row indicated by the English word. |
1 CREATE OR REPLACE PACKAGE BODY vocab4
2 IS
3 TYPE word_list IS TABLE OF translations.french%TYPE
4 INDEX BY translations.english%type;
5 g_english_french word_list;
6
7 FUNCTION lookup (p_english IN VARCHAR2)
8 RETURN VARCHAR2
9 IS
10 BEGIN
11 RETURN g_english_french (p_english);
12 END lookup;
13
14 BEGIN /* package initialization */
15 FOR j IN (SELECT english, french FROM translations)
16 LOOP
17 g_english_french (j.english) := j.french;
18 END LOOP;
19* END vocab4;
You can't really get much more straightforward than that! In fact, the body of function Lookup is now so trivial that you may decide to dispense with it altogether (suppressing a mild qualm about violating some rule of best practice, since you're exposing the collection in the package specification) and use this stripped-down implementation (see vocab5.sql):
CREATE OR REPLACE PACKAGE vocab5
IS
TYPE word_list IS
TABLE OF translations.french%TYPE
INDEX BY translations.english%type;
lookup word_list;
END vocab5;
/
CREATE OR REPLACE PACKAGE BODY vocab5
IS
BEGIN /* package initialization */
FOR indx IN (
SELECT english, french
FROM translations)
LOOP
lookup (indx.english) :=
indx.french;
END LOOP;
END vocab5;
/
Note that it's not yet possible to use an index-by-varchar2 table as the target or source
of a bulk-binding construct. We can't, in other words, take advantage of the BULK
COLLECT syntax to deposit all the rows of translations directly into the lookup
collection in a single roundtrip.
And the Winner Is...
Whew. OK. So we have now a total of five different implementations we can test for optimal performance:
FOR loop (vocab2.sql)VARCHAR2 index) via a function call (vocab4.sql)VARCHAR2 index) via direct access (vocab5.sql) To compare performance, we take advantage of the DBMS_UTILITY.GET_TIME
function, which helps us calculate elapsed time down to the hundredth of a second. We've
encapsulated this function inside a "timer object," which is created in the tmr.ot script.
This object type allows us to easily declare, start, and stop multiple virtual timers inside
our PL/SQL code, as shown here (a fragment of the vocab.tst timing script):
DECLARE
v translations.french%TYPE;
db_tmr tmr_t :=
tmr_t.make ('DB lookup', 10000);
...
BEGIN
db_tmr.go;
FOR indx IN 1 .. 10000
LOOP
v := vocab1.lookup ('computer');
END LOOP;
db_tmr.STOP;
When we run the vocab.tst script for 5,000, 7,500, and 10,000 iterations of the FOR loops (as shown earlier), we get results that are typified by Table 1. Here are some
observations on these results:
Linear searches of collections should always be avoided, especially since, as of Oracle 9i Release 2, the alternatives are so easy to write. Not only is the linear search very much slower than any of the alternatives for a given number of iterations, but also (and even more to its detriment) it doesn't scale linearly as the number of iterations increases.
Note: In the test as constructed here, the time for the linear lookup scales as "sum of 1-to-N" for N iterations. It's easy to show that our reported numbers follow this pattern, either by doing some algebra or by writing a trivial program to sum 1-to- 5000, 1-to-7500, and 1-to-10000. This is left as an exercise for the reader!
Database lookups supported by an appropriate index are very efficient (10,000 executions of the query-based lookup function took less than two seconds)-they just aren't the most efficient path to take.
The associative array and hash-based lookups are all dramatically faster (an order of magnitude or higher) than the database lookup. Their improvement over linear search is far greater.
Removal of the function interface for the associative array (VARCHAR2 index) didn't
appreciably improve performance. You should avoid direct access to data structures
and instead hide them behind functions to improve overall maintainability of your
code.
Perhaps most impressive, the elapsed time for the index-supported database lookup and both the hash-based and associative array lookups scaled linearly as the number of iterations increased by an order of magnitude. These are all very stable algorithms.
These measurements were made on a stand-alone, high-end laptop running Windows 2000. The times are in seconds, and are the raw results of a single run of the test. Of course, they vary from run to run. Forgive us for not taking the next step and quoting our figures with appropriate precision and standard deviations!
| Method | 5000 Iterations | 7500 Iterations | 10000 Iterations |
| DB lookup | 0.80 | 1.13 | 1.52 |
| Linear search | 18.91 | 43.13 | 78.45 |
| Hash Index | 0.09 | 0.13 | 0.17 |
| Assoc Array | 0.05 | 0.09 | 0.11 |
| Assoc Array (without function) | 0.04 | 0.06 | 0.08 |
The bottom line for PL/SQL developers: By extending the flexibility of the collection syntax, storage, and access, we're able to write much simpler code that is much more efficient than implementations possible in earlier versions. It's incumbent upon all of us to become aware of these impressive new features, and then figure out how to best integrate them into existing and new application development projects.
This article was originally published in the July 2002 issue of Oracle Professional. The material in Feuerstein's articles--and those he cowrote with Bryn Llewellyn--are based on Oracle Corporation white papers originally prepared by Llewellyn for Oracle OpenWorld 2001 in San Francisco and OracleWorld Copenhagen in June 2002, and Feuerstein's book, Oracle PL/SQL Programming, 3rd Edition.
Steven Feuerstein is considered one of the world's leading experts on the Oracle PL/SQL language.
Bryn Llewellyn is PL/SQL Product Manager, Database and Application Server Technologies Development Group, at Oracle Corporation Headquarters.
O'Reilly & Associates recently released (September 2002) Oracle PL/SQL Programming, 3rd Edition.
Sample Chapter 10, Dates and Timestamps, is available free online.
You can also look at the Table of Contents, the Index, and the Full Description of the book.
For more information, or to order the book, click here.
Return to the O'Reilly Network.
Copyright © 2007 O'Reilly Media, Inc.