Writing Filters for Apache 2.0
08/23/2001
When the Apache developers first began talking about Apache 2.0, one of the major goals was for one module to be able to modify the output of another. This goal was realized earlier this year with the sixth alpha version. The mechanisms used to make these modifications are called filters. Originally it was difficult to write filters, but over the past few releases the developers have improved the interface so that filters are much easier to create.
This article will cover some of the basic concepts of Apache filters. In my next column, I'll walk you through creating a filter. In the column after that, I will apply the same concepts toward writing an input filter.
Standard Filters
Filters work because the Apache developers consider Web pages as chunks of information. In general, we don't care what those chunks look like or how they are stored on the server. In Apache filter terminology, each chunk is stored in a bucket, and lists of buckets form brigades. Lists of brigades can then create a Web document. Filters operate on one brigade at a time, and are called upon repeatedly until the entire document has been processed. This allows the server to stream information to the client.
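To make the bucket and brigade vocabulary concrete, here is a minimal toy model in C. It is only a sketch: the toy_bucket, toy_brigade, toy_filter_fn, toy_count_bytes and toy_stream names are invented for illustration and are not Apache's real data structures or API. It shows a filter being called once per brigade and carrying its state between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical and are NOT Apache's
 * real bucket and brigade structures. */
typedef struct toy_bucket {
    const char *data;          /* the chunk of content */
    size_t len;                /* number of bytes in the chunk */
    struct toy_bucket *next;   /* next bucket in the same brigade */
} toy_bucket;

typedef struct {
    toy_bucket *head;          /* first bucket, or NULL if empty */
} toy_brigade;

/* A filter sees one brigade per call and keeps its state in ctx. */
typedef void (*toy_filter_fn)(void *ctx, const toy_brigade *bb);

/* Example filter: adds up the bytes in every brigade it is handed. */
static void toy_count_bytes(void *ctx, const toy_brigade *bb)
{
    size_t *total = ctx;
    for (const toy_bucket *b = bb->head; b != NULL; b = b->next) {
        *total += b->len;
    }
}

/* Stream a document: call the filter once per brigade, in order. */
static void toy_stream(toy_filter_fn f, void *ctx,
                       const toy_brigade *brigades, size_t n)
{
    assert(f != NULL);
    for (size_t i = 0; i < n; i++) {
        f(ctx, &brigades[i]);
    }
}
```

Because the running total lives in the context rather than in the filter function, the same function can be called repeatedly as each brigade arrives, without ever holding the whole document in memory.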
The basic Apache distribution includes several standard filters.
The first is the content_length_filter. This filter computes the content length of the response if possible. If the response is not fully available when this filter is first called and the protocol allows the server to send the response without a content-length header, then this filter just passes data to the next filter. It continues to count bytes, however, for logging purposes.
The second standard filter is the header_filter. The first time this filter is called, it formats the header table and sends all of the headers to the next filter before sending the current page. This is important, because if your filter wants to modify headers, it must be inserted before the header_filter and it must buffer the entire page until it has made all of its modifications to the headers. Once your filter passes data to the next filter in the stack, you have effectively told Apache that you are done with that data, and it can be sent to the client.
The final filter is always the core_output_filter. This filter is responsible for writing all data to the network. To provide optimal usage of the available network bandwidth, Apache will buffer as much as 9KB of data before sending it to the client. However, filters can force Apache to send data immediately by flushing the current filter stack.
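The buffer-then-flush behavior described above can be sketched with a small simulation. This is a toy model, not the core_output_filter itself: the toy_out structure and the toy_write and toy_flush functions are hypothetical names, and the only detail taken from the text is the 9KB limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only. The 9KB limit mirrors the figure quoted above; the
 * names below are hypothetical and are not Apache identifiers. */
#define TOY_BUF_LIMIT (9 * 1024)

typedef struct {
    char buf[TOY_BUF_LIMIT];
    size_t used;        /* bytes currently held back */
    size_t sent;        /* bytes handed to the simulated network so far */
    unsigned writes;    /* number of simulated network writes */
} toy_out;

/* Hand everything buffered so far to the simulated network. */
static void toy_flush(toy_out *o)
{
    assert(o != NULL);
    if (o->used > 0) {
        o->sent += o->used;
        o->writes++;
        o->used = 0;
    }
}

/* Queue data, writing only when the buffer fills up. */
static void toy_write(toy_out *o, const char *data, size_t len)
{
    assert(o != NULL);
    while (len > 0) {
        size_t room = TOY_BUF_LIMIT - o->used;
        size_t n = len < room ? len : room;
        memcpy(o->buf + o->used, data, n);
        o->used += n;
        data += n;
        len -= n;
        if (o->used == TOY_BUF_LIMIT) {
            toy_flush(o);
        }
    }
}
```

In this sketch a short write stays in the buffer until either more data fills it or the caller invokes toy_flush, which is the same trade-off the text describes: fewer, larger network writes by default, with an explicit flush available when data must go out immediately.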
Filter Types and Their Meanings
Before a filter can be enabled for a given request, it must be registered with the server. This is done using the ap_register_output_filter function. This function is invoked with three arguments: the filter name, the filter function pointer and the filter type, such as:
ap_register_output_filter("CONTENT_LENGTH",
                          ap_content_length_filter,
                          AP_FTYPE_HTTP_HEADER);
The filter name is a server-wide unique identifier for this filter. No two filters can use the same string as their filter_name. For this reason, it is recommended that filter names have some sort of namespace protection unique to each module. The filter function is the function that should be added to the filter stack whenever this filter is specified. Next month, we will cover this function in more detail. Finally, a filter type must be specified. All filters have a type associated with them; this helps Apache to order filters correctly. The following is a list of filter types with their associated meanings.
AP_FTYPE_CONTENT
This filter type specifies that the filter will be used to modify the content of the Web page itself. Examples of this type of filter are SSI or PHP.
AP_FTYPE_HTTP_HEADER
This is a special filter type that gives modules that want to modify headers an opportunity to do so. All filters of this type are run after all AP_FTYPE_CONTENT filters are run. Examples of this filter type are the content_length and http_header filters.
AP_FTYPE_TRANSCODE
This filter type represents filters that will modify how a response is sent to the client, but not the content itself. An example of this type of filter is the chunking_filter, which breaks a response into chunks for the client to interpret. All filters of this type are run after AP_FTYPE_HTTP_HEADER filters.
AP_FTYPE_CONNECTION
This filter type is used to modify how the server interprets HTTP data. These filters should not be used to modify the data in the request or response itself, because they are called after AP_FTYPE_TRANSCODE filters, so by this time the server has already created the headers for the request or response. An example of this type of filter is the http_in filter, which parses multiple requests on the same connection into individual requests for processing by the server.
AP_FTYPE_NETWORK
This is the final filter type, and it is always the last filter type to run. This filter type is responsible for reading and writing data to and from the network.
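The two rules covered so far, unique filter names and ordering by filter type, can be sketched with a small stand-alone model. Everything here is hypothetical (toy_ftype, toy_list, toy_register, toy_add); it is not how Apache implements registration, only an illustration of the behavior described above. Keeping filters of the same type in the order they were added is this sketch's own choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not Apache's API. The enum order
 * follows the list above, from first to run to last to run. */
typedef enum {
    TOY_CONTENT, TOY_HTTP_HEADER, TOY_TRANSCODE, TOY_CONNECTION, TOY_NETWORK
} toy_ftype;

typedef struct {
    const char *name;
    toy_ftype type;
} toy_filter;

#define TOY_MAX 16

typedef struct {
    toy_filter items[TOY_MAX];
    size_t count;
} toy_list;

/* Register a filter; returns 0 on success, -1 if the name is already
 * taken or the table is full. */
static int toy_register(toy_list *reg, const char *name, toy_ftype type)
{
    assert(reg != NULL && name != NULL);
    for (size_t i = 0; i < reg->count; i++) {
        if (strcmp(reg->items[i].name, name) == 0) return -1;
    }
    if (reg->count == TOY_MAX) return -1;
    reg->items[reg->count].name = name;
    reg->items[reg->count].type = type;
    reg->count++;
    return 0;
}

/* Add a registered filter to a stack, keeping the stack sorted by type.
 * Returns -1 if the name was never registered or the stack is full. */
static int toy_add(const toy_list *reg, toy_list *stack, const char *name)
{
    const toy_filter *f = NULL;
    assert(reg != NULL && stack != NULL && name != NULL);
    for (size_t i = 0; i < reg->count; i++) {
        if (strcmp(reg->items[i].name, name) == 0) f = &reg->items[i];
    }
    if (f == NULL || stack->count == TOY_MAX) return -1;

    /* Shift later-running filters up to make room for the new one. */
    size_t pos = stack->count;
    while (pos > 0 && stack->items[pos - 1].type > f->type) {
        stack->items[pos] = stack->items[pos - 1];
        pos--;
    }
    stack->items[pos] = *f;
    stack->count++;
    return 0;
}
```

With this model, a content filter added after the network filter still ends up ahead of it in the stack, which is the point of attaching a type to every filter.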
Most filter writers will focus exclusively on AP_FTYPE_CONTENT filters. Once a filter is registered with the server, it can be added for a request. This is done using the ap_add_output_filter function, and is usually specified with the SetFilter directive in the httpd.conf file. The ap_add_output_filter function accepts four arguments:
ap_add_output_filter(const char *name, void *ctx,
                     request_rec *r, conn_rec *c);
The first argument is the name that was registered with ap_register_output_filter.
The ctx argument is an arbitrary pointer that is passed to the filter each time that it is called. This is useful when a single function implements multiple filters. The final two arguments are a request_rec and conn_rec that the filter uses each time it is called. If a request_rec is not available, that field can safely be NULL. If the request_rec is NULL, the conn_rec must be provided. This allows a single filter chain to be used on both a request and sub-request, without requiring Apache to determine which request goes with which filter. Associating a request with a filter is done when adding the filter to the filter stack.
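As a rough illustration of why a context pointer lets one function serve as several filters, consider the following sketch. The toy_case_ctx structure and toy_case_filter function are hypothetical, and the signature is deliberately simplified; it is not the signature Apache uses for filter functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Apache's filter signature. */
typedef struct {
    int to_upper;   /* nonzero: upper-case the data; zero: lower-case it */
} toy_case_ctx;

/* One function, two behaviors: the ctx decides which one runs. */
static void toy_case_filter(void *ctx, char *data, size_t len)
{
    const toy_case_ctx *c = ctx;
    assert(c != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)data[i];
        data[i] = (char)(c->to_upper ? toupper(ch) : tolower(ch));
    }
}
```

Adding this function to a stack twice, once with a context whose to_upper field is 1 and once with a context whose field is 0, would yield two differently behaving filters backed by the same code.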
This article has just barely scratched the surface of filters, and we will take the next two months to delve into this topic. Writing filters is a complex topic, but taken slowly, filters can become a powerful way to enhance a Web server.
Ryan Bloom is a member of the Apache Software Foundation, and the Vice President of the Apache Portable Run-time project.
-
Streaming of bulk data through a processing filter
2005-03-11 18:50:18 nanyerri
Excuse my ignorance but I have not previously written Web services.
However, I am wanting to set up a Web service where client nominated files of indeterminate length (possibly > 10MBs) can be streamed into a data processing service. The service would process the data on the fly and stream the results back to the client.
The service is currently a Unix filter written in C that performs sophisticated data manipulations on a record-by-record basis. At the moment, clients send their data electronically (FTP or CD), which is then processed in batch and returned.
The intent is to have the Web service act like a standard Unix filter that can be incorporated transparently into a client's own normal data processing logic.
For example:
<Client routines to extract data for processing> |
Call to Remote Web Data Processing Service |
<Client routines to store results of Web Service>
Is it possible with Apache filters (or any other Web Service) to support this type of processing? or, do the client's file and generated results have to continue to be FTP'ed between machines?
-
New Apache module
2002-08-24 04:43:21 frouni
I'm writing a new module in C that
communicates with a mysql database and grab geographic data according to the
webclient's IP address. for the moment all the functions are in php code but
my goal is to make these functions available and easy to use whatever the
Imagine that you write a new apache module and you implement a function that
returns geographic data (city, latitude, longitude...) of the current user. the goal is to make this
function profitable for users who program in php, asp, perl or java by using
something like "HTTP_LOCATION", "HTTP_CITY" headers
Or for users who build simple html
websites by using something like <LOCATION> or <CITY IP"11.11.11.11"> in
their html code.
any help please ?
thx
-
New Apache module
2006-01-16 18:56:03 viswa_pm
Dear Frouni,
This is Viswa. I am also planning to do the same, but my requirement is somewhat different. My module runs on Server A and my Apache runs on Server B. When the web client connects to Server B, my Apache should connect to Server A (passing the web client's IP address to Server A at this point). Server A will then return the country name to Server B. I need to check that country name: if it is INDIA, I should allow the client to view my web page; otherwise I should display an error.
Any help please ?
Regards,
Viswa
-
Apache Input Filter
2002-08-19 02:46:16 amish
I am writing an input filter for Apache 2.0. I have built the filter as a filter.so file and added an entry to the httpd.conf file with LoadModule, but the Apache server gives an error at startup.
Can I have a sample filter program, with the procedure for loading it in the conf file?
Thanks,
-Amish
-
I have a project in which I have to write an input filter that transforms an HTTP GET request into an HTTP POST request before that request is handled by the Apache HTTP proxy.
My input filter works now, but I don't know how to use Apache directives to select requests for a particular URI before they are passed to my filter.
Can anyone please help me?
Thanks.