If you have ever needed to parse form-based multipart data, you are probably aware of the Jakarta Commons FileUpload library. While this library is primarily intended to parse data directly from HttpServletRequest objects, and thus handle file uploads in your web application, it can also be used to parse an arbitrary InputStream and extract multiparts from it.
As you can probably guess, the part of the library that handles file uploads is well documented, but for the low-level API you can only find the broken example in the JavaDocs. I know that this functionality is rarely needed, but it still deserves a working example, so here it is.
The class used for this low-level parsing is org.apache.commons.fileupload.MultipartStream. Its constructor accepts two parameters: an input stream containing the data that we want to parse, and a byte array containing the boundary of the multipart block.
Now let’s proceed to the example. Let’s suppose that we have two variables that we can get from the HTTP request:
- buffer, which holds the data, and
- contentType, which holds the request header describing the content type of the request
Let’s not dig deeper into the multipart format (you can find more information on it here). The one thing that is important is what our contentType variable looks like, so that we can create a valid boundary for parsing. The following snippet shows a typical content type header:
Content-Type: multipart/form-data; boundary=---------------------------223702928913614
So, the first thing that we need to do is to create a valid byte array that will be used for parsing:
int boundaryIndex = contentType.indexOf("boundary=");
byte[] boundary = (contentType.substring(boundaryIndex + 9)).getBytes();
The above code is straightforward; we find the “boundary=” substring and take the text that follows it.
Now that we have a valid boundary, we can create the input stream (in this example we will assume that the data is kept in a String - the buffer variable) and instantiate a MultipartStream object.
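The substring approach works for the header shown above, but it would pick up extra text if another parameter followed the boundary or if the value were quoted. Below is a minimal sketch of a more defensive extraction that uses only the JDK. The class name BoundaryExtractor and the method extractBoundary are hypothetical names introduced for illustration, and the choice of ISO-8859-1 for the byte conversion is an assumption (boundary characters are plain ASCII, so any ASCII-compatible charset should give the same bytes).

```java
import java.nio.charset.StandardCharsets;

public class BoundaryExtractor {

    /**
     * Returns the boundary parameter of a multipart Content-Type value as bytes,
     * or null when the header carries no boundary parameter.
     * (Hypothetical helper, not part of Commons FileUpload.)
     */
    public static byte[] extractBoundary(String contentType) {
        // Parameters are separated by ';'. The first token is the media type itself.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            // Parameter names are case-insensitive, so ignore case when matching.
            if (param.regionMatches(true, 0, "boundary=", 0, 9)) {
                String value = param.substring(9).trim();
                // The value may be wrapped in double quotes; strip them if present.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.getBytes(StandardCharsets.ISO_8859_1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "multipart/form-data; boundary=---------------------------223702928913614";
        byte[] boundary = extractBoundary(header);
        System.out.println(new String(boundary, StandardCharsets.ISO_8859_1));
    }
}
```

The returned array can be passed to MultipartStream in the same way as the boundary variable in the snippet above. A null result means the header is not a usable multipart content type and should be handled before parsing.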
ByteArrayInputStream input = new ByteArrayInputStream(buffer.getBytes());
MultipartStream multipartStream = new MultipartStream(input, boundary);
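To try the parsing without a real request, it helps to have a sample value for buffer. The sketch below builds one by hand using only the JDK. It relies on two facts about the multipart format: inside the body every delimiter line is the boundary from the header prefixed with two extra hyphens, and the last delimiter also ends with two hyphens. The class SampleMultipartBody, its methods and the field values are made up for illustration. The explicit ISO-8859-1 charset is an assumption that keeps a one-to-one mapping between characters and bytes; if your String was decoded from raw bytes with another charset, use that one instead of the platform default that a bare getBytes() call picks.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class SampleMultipartBody {

    // Same boundary value as in the Content-Type header shown earlier.
    public static final String BOUNDARY = "---------------------------223702928913614";

    /** Builds a two-part multipart/form-data body (illustrative values only). */
    public static String buildSampleBody() {
        String crlf = "\r\n";
        return "--" + BOUNDARY + crlf
                + "Content-Disposition: form-data; name=\"title\"" + crlf
                + crlf
                + "hello" + crlf
                + "--" + BOUNDARY + crlf
                + "Content-Disposition: form-data; name=\"file\"; filename=\"a.txt\"" + crlf
                + "Content-Type: text/plain" + crlf
                + crlf
                + "file contents" + crlf
                // The closing delimiter carries two trailing hyphens.
                + "--" + BOUNDARY + "--" + crlf;
    }

    /** Converts the body to bytes with an explicit, byte-preserving charset. */
    public static byte[] toBytes(String body) {
        return body.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String buffer = buildSampleBody();
        ByteArrayInputStream input = new ByteArrayInputStream(toBytes(buffer));
        System.out.println("Sample body has " + input.available() + " bytes");
    }
}
```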
Now we can extract multiparts from the document:
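// Position the stream at the first part; false means no part was found
boolean nextPart = multipartStream.skipPreamble();
while (nextPart) {
    // Headers of the current part, returned as a single string
    String headers = multipartStream.readHeaders();
    System.out.println("Headers: " + headers);
    // The body of the current part is written to the supplied OutputStream
    ByteArrayOutputStream data = new ByteArrayOutputStream();
    multipartStream.readBodyData(data);
    System.out.println(new String(data.toByteArray()));
    // Move on to the next part; false once there are no more parts
    nextPart = multipartStream.readBoundary();
}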
Briefly, the skipPreamble() method ignores data until it reaches the first boundary. Then we can use readHeaders() and readBodyData() to read the headers and the data of each part respectively (note that you must pass an OutputStream object to readBodyData()). For every part you must consume both the headers and the data before calling readBoundary(). If you are only interested in the headers, you can use the discardBodyData() method instead of the readBodyData() call used in this example.
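Since readHeaders() hands back all headers of a part as one string, you will usually want to split it before using it. Here is a minimal JDK-only sketch of that step. The class PartHeaders and its parse method are hypothetical names, the sketch assumes one header per CRLF-terminated line, and it does not handle headers folded across several lines.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PartHeaders {

    /**
     * Splits a raw header block into a map of lower-cased header names to values.
     * (Hypothetical helper, not part of Commons FileUpload.)
     */
    public static Map<String, String> parse(String headers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : headers.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // skip blank or malformed lines
            }
            // Header names are case-insensitive, so normalise them to lower case.
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String raw = "Content-Disposition: form-data; name=\"title\"\r\n"
                + "Content-Type: text/plain\r\n\r\n";
        for (Map.Entry<String, String> e : parse(raw).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Inside the parsing loop you could call PartHeaders.parse(headers) right after readHeaders() and, for example, decide from the content-disposition entry whether to read the body or discard it.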


Hello,
I tried to use this method but I could not find a working library. I've downloaded the latest nightly build from Apache, but the MultipartStream class appears to be incomplete. Any suggestions? Thanks.