HTTP Communication from Within the Oracle Database
Pages: 1, 2, 3, 4
Encoding of Character Data
In the classical client/server architecture, the database and the client may use different
encoding schemes to represent character data. For example, in a Japanese application, the
database might use (a variety of) the EUC character set, and the client might use (a variety
of) the SJIS character set. Thus character-encoding conversion is required. The solution is
well known and long established: Oracle Net transparently handles the conversion (as
specified by the database character set and the NLS_LANG client environment variable).
A corresponding issue exists for Utl_Http. When a request is sent, it may need to be
encoded differently than the database character set (because the requestor knows that the
target URL requires this). And when a response is received, it may again be encoded
differently than the database character set (because that's the non-negotiable behavior of the
target URL).
There are two areas of concern when sending a request: the URL and the request body.
When sending by the "GET" method, all request parameterization is via the URL itself,
typically after the ? delimiter. Search terms, for example, are normally handled this way.
HTTP defines no convention for specifying different character sets for the URL and
expects that everything is 7-bit ASCII. Other character-encoding schemes should be
represented as the hex codes of their bytes using the %nn notation. (The sender of the
request must know from documentation which character set the URL expects to decode
from the hex representation.) Oracle 9i introduces the Utl_Url package, which has
functions to convert from the database character set to a hex-coded representation of a
specified character set, and vice versa. In addition, these functions handle the conversion
of the reserved symbols: percent (%), semi-colon (;), slash (/), question mark (?), colon (:),
at sign (@), ampersand (&), equals sign (=), plus sign (+), dollar sign ($), and comma (,).
|
Web Services The world of B2B is changing rapidly. Obviously, a business that exposes itself on the Web as a provider will want as many requestors to access it as possible. This will be more likely if the protocol for their service is highly standardized. Similarly, a business in the role of consumer-looking, for example, for a provider to supply a certain kind of paint in large quantity at a favorable discount-will want to be able to discover quickly and easily who the potential providers are and how to access their service. This is where Web Services, SOAP, UDDI, WSDL-and, of course, WC3-come into play.
You can read more at http://www.w3.org/2002/ws. And you can also view an example of the WSDL for a stock quote Web Service exposed via SOAP. All this is way outside the scope of this article, but it helps you to see the important role of |
When sending by the "PUT" method, the character set of the request body should be
set via the charset attribute of the Content-Type in the request header, using the new
Utl_Http.Set_Header procedure. If this is done, it gives Oracle sufficient information to
transform appropriately when sending a character request body (by using
Utl_Http.Write_Text). If the charset attribute isn't set in the request header, then no
character set conversion takes place unless the user has catered for it via the overloaded
procedure Utl_Http.Set_Body_Charset. The variant Set_Body_Charset(charset
varchar2)-a.k.a. the global variant-allows the user to set a fallback character set, to be
assumed, if no other information is provided, for both requests and responses for the
session. The variant Set_Body_Charset(r Utl_Http.Req, charset varchar2)-a.k.a. the
request variant-allows the user to insist on a character set for the body for this request.
(A record of PL/SQL type Utl_Http.Req is returned when the HTTP request is begun with
Utl_Http.Begin_Request.) The choice made via the request variant won't only override that
made via the global variant but will also override that made via the charset attribute of the
request header. For this reason, the recommended way to specify the character set
conversion for the request body is via the charset attribute of the header. Only if the user
has a special reason for leaving this unspecified in the request header would he use the
request variant of Set_Body_Charset.
There's just one area of concern when receiving the response: the response body. If the
implementation of the URL is well-mannered, then the character set of the response body
will be specified correctly in the charset attribute of the Content-Type in the response
header, accessible to the user via the procedure Utl_Http.Get_Header. Oracle will
implicitly perform the appropriate conversion in connection with calling
Utl_Http.Read_Text. However, this is often not set. In this case, the user can use the
global variant of Set_Body_Charset to determine the character set conversion. However,
the charset attribute of the response header is sometimes set wrong. (This is likely when
pages in different character sets are served up as files from the file system seen by the
Web server, since the Content-Type header information will often be set globally for the
server with no mechanism to make it file-specific.) For this reason, a third overloaded
variant Set_Body_Charset(r Utl_Http.Resp,charset varchar2) is provided-a.k.a. the
response variant. (A record of PL/SQL type Utl_Http.Resp is returned when the HTTP
response is received with Utl_Http.Get_Response.) The choice made via the response
variant will override that made via the global variant and that expressed via the charset
attribute of the response header.
Note: From Oracle 8i v8.1.6 and pre-Oracle 9i, Oracle detected the charset of the response body (if this was specified) and used the information to do the character set conversion. And if the charset attribute of the response body wasn't specified, then no conversion took place, and no overriding or fallback mechanism was provided. Under special circumstances (for example, fetching a SJIS Japanese response where the charset attribute isn't specified into an EUC database), problems arose pre- Oracle 9i.
Thus the user now has full control over all character set conversion issues. In an
extreme case, where the response body is Content-Type text/HTML and where the HTML
<meta> tag is used to specify the character set, the user can retrieve the response body into a PL/SQL RAW with Utl_Http.Read_Raw and then write custom code to parse the
HTML and to convert to the database character set in a PL/SQL VARCHAR2 once the
response character set is discovered.
A Closer Look at Utl_Http Calls
Example 1 shows: how to send an HTTP request-setting the proxy information, setting the method to "GET," providing username/password authentication information, and setting the request header-and how to get the response-retrieving the status code, the header information, and the response body. The "GET" method is suitable for non-parameterized URLs or for URLs with a manageable volume of parameter name-value pairs. The maximum length of the URL string is limited by the capacity of the PL/SQL VARCHAR2 variable used to pass it. The "POST" method is suitable for parameterizing the request with an arbitrarily large volume of data, especially, for example, as might be the case when the request is expressed as an XML document.
Example 1. A simple demonstration of Utl_Http's basic features.
DECLARE
req Utl_Http.req;
resp Utl_Http.resp;
NAME VARCHAR2 (255);
VALUE VARCHAR2 (1023);
v_msg VARCHAR2 (80);
v_url VARCHAR2 (32767) := 'http://otn.oracle.com/';
BEGIN
/* request that exceptions are raised for error Status Codes */
Utl_Http.set_response_error_check (ENABLE => TRUE );
/* allow testing for exceptions like Utl_Http.Http_Server_Error */
Utl_Http.set_detailed_excp_support (ENABLE => TRUE );
Utl_Http.set_proxy (
proxy => 'www-proxy.us.oracle.com',
no_proxy_domains => 'us.oracle.com'
);
req := Utl_Http.begin_request (url => v_url, method => 'GET');
/*
Alternatively use method => 'POST' and Utl_Http.Write_Text to
build an arbitrarily long message
*/
Utl_Http.set_authentication (
r => req,
username => 'SomeUser',
PASSWORD => 'SomePassword',
scheme => 'Basic',
for_proxy => FALSE /* this info is for the target Web server */
);
Utl_Http.set_header (r => req, NAME => 'User-Agent', VALUE => 'Mozilla/4.0');
resp := Utl_Http.get_response (r => req);
DBMS_OUTPUT.put_line ('Status code: ' || resp.status_code);
DBMS_OUTPUT.put_line ('Reason phrase: ' || resp.reason_phrase);
FOR i IN 1 .. Utl_Http.get_header_count (r => resp)
LOOP
Utl_Http.get_header (r => resp, n => i, NAME => NAME, VALUE => VALUE);
DBMS_OUTPUT.put_line (NAME || ': ' || VALUE);
END LOOP;
BEGIN
LOOP
Utl_Http.read_text (r => resp, DATA => v_msg);
DBMS_OUTPUT.put_line (v_msg);
END LOOP;
EXCEPTION
WHEN Utl_Http.end_of_body
THEN
NULL;
END;
Utl_Http.end_response (r => resp);
EXCEPTION
/*
The exception handling illustrates the use of "pragma-ed" exceptions
like Utl_Http.Http_Client_Error. In a realistic example, the program
would use these when it coded explicit recovery actions.
Request_Failed is raised for all exceptions after calling
Utl_Http.Set_Detailed_Excp_Support ( ENABLE=>FALSE )
And it is NEVER raised after calling with ENABLE=>TRUE
*/
WHEN Utl_Http.request_failed
THEN
DBMS_OUTPUT.put_line (
'Request_Failed: ' || Utl_Http.get_detailed_sqlerrm
);
/* raised by URL http://xxx.oracle.com/ */
WHEN Utl_Http.http_server_error
THEN
DBMS_OUTPUT.put_line (
'Http_Server_Error: ' || Utl_Http.get_detailed_sqlerrm
);
/* raised by URL http://otn.oracle.com/xxx */
WHEN Utl_Http.http_client_error
THEN
DBMS_OUTPUT.put_line (
'Http_Client_Error: ' || Utl_Http.get_detailed_sqlerrm
);
/* code for all the other defined exceptions you can recover from */
WHEN OTHERS
THEN
DBMS_OUTPUT.put_line (SQLERRM);
END;



