JAVA HACK
Avoid the dreaded replaceAll method
The replaceAll method of the java.lang.String class has an unexpected behavior. Escaped double quotes (\") are unescaped. A sound replacement is org.apache.commons.lang.StringUtils#replace(String, String, String).

Contributed by: Unknown User anonymous2
[06/29/04]
The java.lang.String#replaceAll(String, String) method has an unexpected behavior: escaped double quotes are unescaped. So:
"a string".replaceAll("string", "\"TEST\"")
will not produce a \"TEST\". Instead, the escaped double quotes will be unescaped and thus the output will be a "TEST".
A sound replacement for the replaceAll method is org.apache.commons.lang.StringUtils#replace(String, String, String). This method will not unescape the double quotes.
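To make the behavior concrete, here is a minimal sketch that compares replaceAll with a literal replacement. It uses java.lang.String#replace(CharSequence, CharSequence) as a standard-library stand-in for the Commons Lang method, which assumes Java 5 or later; the class and method names are hypothetical. Note that the source literal "\"TEST\"" holds no backslash at runtime, so the difference only shows up once the replacement string really contains a backslash.

```java
public class ReplaceAllQuotes {

    // Regex-based replacement: backslashes and dollar signs in the
    // replacement string are interpreted, not copied verbatim.
    static String viaReplaceAll(String input, String regex, String replacement) {
        return input.replaceAll(regex, replacement);
    }

    // Literal replacement (assumption: Java 5+, where
    // String#replace(CharSequence, CharSequence) is available).
    static String viaReplace(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    public static void main(String[] args) {
        // Source literal "\"TEST\"" is the runtime string "TEST" with quotes
        // and no backslashes.
        String quoted = "\"TEST\"";
        System.out.println(viaReplaceAll("a string", "string", quoted));      // a "TEST"

        // Source literal "\\\"TEST\\\"" is the runtime string \"TEST\".
        String backslashed = "\\\"TEST\\\"";
        System.out.println(viaReplaceAll("a string", "string", backslashed)); // a "TEST"
        System.out.println(viaReplace("a string", "string", backslashed));    // a \"TEST\"
    }
}
```

The last two lines show the point of the hack: with a replacement that contains real backslashes, replaceAll drops them while the literal replace keeps them.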
Hack submitted by Steven Devijver.
Showing messages 1 through 4 of 4.
-
RE: Avoid the dreaded replaceAll method
2005-03-07 10:31:48
dustbort
-
RE: Avoid the dreaded replaceAll method
2006-08-07 09:25:07
npgall
-
It's really helpful about the replaceAll method
2005-07-13 23:52:26
madhujava
-
Escape?
2004-11-08 04:03:34
eckes

© 2007 O'Reilly Media, Inc.
"Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string."
The fact that dollar-sign references are available in the replacement String indicates that the replacement string is processed by the regex engine before the replacement is made. As with all Java String literals that are parsed by the regex engine, backslashes must be doubly-escaped, once for the Java compiler and again for the regex engine.
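As a rough illustration of the dollar-sign references mentioned in the quoted documentation, the sketch below uses $1 and $2 to refer to captured groups, and a backslash to get a literal dollar sign. The class name, method names, and sample inputs are made up for this example.

```java
public class GroupReferenceDemo {

    // Swaps two words separated by a single space; $1 and $2 refer to the
    // first and second captured groups of the search pattern.
    static String swapWords(String input) {
        return input.replaceAll("(\\w+) (\\w+)", "$2 $1");
    }

    // Prefixes each run of digits with a literal dollar sign. The Java
    // literal "\\$$1" is the runtime string \$$1: an escaped "$" followed
    // by a reference to group 1.
    static String addDollar(String input) {
        return input.replaceAll("(\\d+)", "\\$$1");
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world")); // world hello
        System.out.println(addDollar("price 42"));    // price $42
    }
}
```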
(I'll omit the literal-delimiting double quotes for clarity.) In your example, the literal \"TEST\" becomes "TEST" when compiled by Java. The double quote is not a special regex character, so it is unchanged by the regex engine before the replacement is made.
If you want the final result to be \"TEST\", your replacement string must be \\\\\"TEST\\\\\". When Java compiles the String, it is stored internally as \\"TEST\\". After the regex engine parses the string, it becomes \"TEST\" before it is appended as the replacement in the output string.
Users should know that this is not a bug. If all you need is character-for-character String matches, the Apache method may be for you. But if you need more powerful regex matches in the search string, or $ references in the replacement string, the Apache method won't cut it.
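The double escaping described above can be checked with a small sketch. It also shows java.util.regex.Matcher.quoteReplacement (available since Java 5), which is not mentioned in the original discussion but produces the same escaping automatically; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;

public class BackslashReplacementDemo {

    // Hand-escaped: the Java literal below is the runtime string \\"TEST\\",
    // which the regex engine reduces to \"TEST\" before substituting it.
    static String handEscaped() {
        return "a string".replaceAll("string", "\\\\\"TEST\\\\\"");
    }

    // Same result, letting quoteReplacement escape backslashes (and dollar
    // signs) in the desired output text.
    static String autoEscaped() {
        String wanted = "\\\"TEST\\\""; // runtime string: \"TEST\"
        return "a string".replaceAll("string", Matcher.quoteReplacement(wanted));
    }

    public static void main(String[] args) {
        System.out.println(handEscaped()); // a \"TEST\"
        System.out.println(autoEscaped()); // a \"TEST\"
    }
}
```

With quoteReplacement the search side keeps its full regex power, while the replacement side is treated as plain text, so $ references are no longer available in it.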