O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Listen Print Discuss Subscribe to Databases Subscribe to Newsletters

Using Regular Expressions in PostgreSQL
Pages: 1, 2, 3

Matching a list of characters

Unlike the previous examples, these next regular expressions carry out queries using a pattern defined in a "class," a list of characters enclosed by brackets, [ ]. Think of a class as a simplified way of expressing logical ORs within a SQL statement.



Study the results returned by the queries:

SELECT record FROM myrecords WHERE record ~ '[a]';
SELECT record FROM myrecords WHERE record ~ '[A]';
SELECT record FROM myrecords WHERE record ~* '[a]';
SELECT record FROM myrecords WHERE record ~ '[ac]';
SELECT record FROM myrecords WHERE record ~ '[ac7]';
SELECT record FROM myrecords WHERE record ~ '[a7A]';
SELECT record FROM myrecords WHERE record ~* '[ac7]';

What happens when you search for a character, the letter z, that is not present in any string in the table?

SELECT record FROM myrecords WHERE record ~ '[z]';
SELECT record FROM myrecords WHERE record ~ '[z7]';

This statement excludes all strings with the characters 4 OR a:

SELECT record FROM myrecords WHERE record !~ '[4a]';

The use of the dash, -, in the class implies a search range--that is, to return all strings having any digits between 1 and 4:

SELECT record FROM myrecords WHERE record ~ '[1-4]';

Here are a few more range examples:

SELECT record FROM myrecords WHERE record ~ '[a-c5]';
SELECT record FROM myrecords WHERE record ~* '[a-c5]';
SELECT record FROM myrecords WHERE record ~ '[a-cA-C5-7]';

Matching two or more characters in a pattern

The previous examples made pattern searches one character or class at a time. However, you can use many classes in a pattern. These three statements return the same string from the example table:

SELECT record FROM myrecords WHERE record ~ '3[a]';
SELECT record FROM myrecords WHERE record ~ '[3][a]';
SELECT record FROM myrecords WHERE record ~ '[1-3]3[a]';

These two statements also return the same records:

SELECT record FROM myrecords WHERE record ~ '[23][a]';
SELECT record FROM myrecords WHERE record ~ '[2-3][a]';

Always keep in mind the potential for confusion. This next returns nothing because there is no such character string, ac, in any of the records in the example table:

SELECT record FROM myrecords WHERE record ~ '[a][c]';

However, introducing a range for each class returns the records:

SELECT record FROM myrecords WHERE record ~ '[a-b][b-c]';

Excluding strings in a class

To create a class of characters to exclude, insert the caret, ^, immediately after the left bracket of a class. Inserting at any other point other than as the first character in the class means you're looking for the caret in a string rather than excluding the class' character pattern.

Notice that queries can still return strings containing those excluded characters. Class exclusions prevent explicit searches using those characters.

This next statement excludes all digits from 0 to 9 from the target search. In other words, this expression returns strings that don't include digits.

SELECT record FROM myrecords WHERE record ~ '[^0-9]';

Making choices

The pipe | denotes alternation. In other words, it's a logical OR for pattern searches. Suppose you want to return all records with strings that begin with a or end with c:

SELECT record FROM myrecords WHERE record ~ '^a|c$';

This statement returns strings beginning either with a or 5, or ending with c:

SELECT record FROM myrecords WHERE record ~ '^a|c$|^5';

This next example performs a targeted search excluding digits and all lowercase letters. Control characters, spaces, and uppercase letters can all appear in the search pattern:

SELECT record FROM myrecords WHERE record ~ '[^0-9|^a-z]';

Repeating characters (quantifiers)

Sometimes you need to find strings that have repeated sets of the same character(s). You can find them with quantifiers--represented by the characters *, ?, and + as well as by digits enclosed within braces { }.

To find a sequence of 0 or more matches, use *:

SELECT record FROM myrecords WHERE record ~ 'a*'';

To find a sequence of one or more matches, use +:

SELECT record FROM myrecords WHERE record ~ 'b+';

To find a sequences of zero or one occurrence, use ?:

SELECT record FROM myrecords WHERE record ~ 'a?';

To find a sequences of exactly # matches, use {#}:

SELECT record FROM myrecords WHERE record ~ '[0-9]{3}';

To find a sequences of exactly # or more matches, use {#,}:

SELECT record FROM myrecords WHERE record ~ '[0-9]{4,}';

To find a sequences of # through ## (inclusive) matches, where # does not exceed ##, use {#, ##}:

SELECT record FROM myrecords WHERE record ~ '[a-c0-9]{2,3}';

Pages: 1, 2, 3

Next Pagearrow




Tagged Articles

Be the first to post this article to del.icio.us

Sponsored Resources

  • Inside Lightroom

Related to this Article

Understanding Oracle Clinical Understanding Oracle Clinical
by Joan M. Johnson
June 2009
$9.99 USD

New Features in Oracle 9i New Features in Oracle 9i
by Howard J. Rogers
June 2009
$5.95 USD

Advertisement
O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com