One of the new APIs available in the .NET Framework 2.0 is the set of compression classes located in the System.IO.Compression namespace. The two new classes in this namespace are GZipStream and DeflateStream. Using these classes, you can add compression and decompression to your .NET applications using the well-known GZip and Deflate algorithms.
One of the compelling applications of data compression is to reduce the size of data transmitted over the network. This is especially important when the cost of bandwidth is a major concern. Consider the case of a web service returning the content of a database table via a dataset. The content of a dataset is transmitted as XML when exposed as a web service. When the service is consumed by a client connected through an expensive medium (such as GPRS), every byte of data exchanged is billable. To make it less expensive for clients to consume these types of web services, it makes sense to compress the data on the server end and decompress it when it is received on the client side.
In this article, I will show you how to use the compression classes in .NET 2.0 in a web service environment. You will see the benefits of using compression in your application so that you can decide if you want to use it for your own applications.
Using Visual Studio 2005, let's first build a web service. Name the project C:\DatasetWS. In the Service.vb file, import the following namespaces:
    Imports System.Data
    Imports System.Data.SqlClient
    Imports System.Diagnostics
    Imports System.IO
    Imports System.IO.Compression
    Imports System.Text.Encoding
Next, define the getRecords() web method so that it returns the Employees table from the Northwind database as a dataset (exposed in a byte array).
Note: This example uses SQL Server 2005 Express edition with the Northwind sample database. Since SQL Server 2005 Express does not come with any sample databases, you need to install the sample databases yourself. You can install the pubs and Northwind sample databases by downloading their installation scripts. Once the scripts are installed on your system, go to the Visual Studio 2005 command prompt (Start -> Programs -> Microsoft Visual Studio 2005 -> Visual Studio Tools -> Visual Studio 2005 Command Prompt) and change to the directory containing your installation scripts. Type in the following to install the pubs and Northwind databases:
    C:\SQL Server 2000 Sample Databases>sqlcmd -S .\SQLEXPRESS -i instpubs.sql
    C:\SQL Server 2000 Sample Databases>sqlcmd -S .\SQLEXPRESS -i instnwnd.sql

    <WebMethod()> _
    Public Function getRecords() As Byte()
        Dim connStr As String = _
            "Data Source=.\SQLEXPRESS;Initial Catalog=Northwind;" & _
            "Integrated Security=True"
        Dim sql As String = "SELECT * FROM Employees"
        Dim conn As SqlConnection = New SqlConnection(connStr)
        Dim comm As SqlCommand = New SqlCommand(sql, conn)
        Dim dataadapter As SqlDataAdapter = New SqlDataAdapter(comm)
        Dim ds As DataSet = New DataSet()

        '---open the connection and fill the dataset---
        conn.Open()
        dataadapter.Fill(ds, "Employees_table")
        conn.Close()

        '---convert the dataset to XML---
        Dim datadoc As System.Xml.XmlDataDocument = _
            New System.Xml.XmlDataDocument(ds)
        Dim dsXML As String = datadoc.InnerXml

        Return ASCII.GetBytes(dsXML)
    End Function
Note that instead of returning the dataset as a DataSet object, I have chosen to return it as a byte array. This makes it easy to add compression later on.
That's all you need for the web service. To test the web service, simply press F5 and note the URL for the web service. You should see something like this:
11496 is a random port number that Visual Studio 2005 uses to launch my web service. You should have a different number on your computer.
Let's now add a Windows application project to the solution. Go to File -> New -> "Project..." and add a new Windows Application project to the current solution. Name the project C:\DatasetWSConsumer.
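Populate the default Form1 with the following controls (see Figure 1). The code in this article refers to a Button named btnLoad, three Label controls (Label1, Label2, and Label3), and a DataGridView named DataGridView1.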
Figure 1. Populating the default Form1
Add a web reference to the web service created earlier. Name the web reference "dataWS" (see Figure 2). Click Add Reference.
Figure 2. Adding a web reference to the web service
Switch to the code-behind of Form1 and import the following namespaces:
    Imports System.IO
    Imports System.IO.Compression
    Imports System.Text.Encoding
Double-click on the Load button to switch to its event handler. Code the following:
    Private Sub btnLoad_Click( _
        ByVal sender As System.Object, _
        ByVal e As System.EventArgs) _
        Handles btnLoad.Click

        '---create a proxy obj to the web service---
        Dim ws As New dataWS.Service
        '---create a dataset obj---
        Dim ds As New DataSet
        '---create a stopwatch obj---
        Dim sw1, sw2 As New Stopwatch

        '---time the download---
        sw1.Start()
        '---connect to the web service---
        Dim dsBytes As Byte() = ws.getRecords
        Label1.Text = "Size of download: " & dsBytes.Length

        '---convert the byte array into string and
        ' then read it into the dataset obj---
        ds.ReadXml(New IO.StringReader(ASCII.GetString(dsBytes)))
        sw1.Stop()
        Label2.Text = "Time spent: " & sw1.ElapsedMilliseconds & "ms"

        '---bind it to the DataGridView control---
        DataGridView1.DataSource = ds
        DataGridView1.DataMember = "Employees_table"
    End Sub
Essentially, this connects to the web service to fetch the dataset and then binds it to the DataGridView control. Press F5 to test the application. After clicking the Load button, the DataGridView control will be populated. Figure 3 shows the output.
Figure 3. Binding the dataset to the DataGridView control
The first time you click the Load button, you will notice that it takes a while for the DataGridView control to be loaded. Subsequent loads are much faster because the data from the database is cached on the web service end. Observe the size of the data downloaded and the time taken for the download. Click the Load button a few times so that you can obtain the average time required to download the data. In this case, it took about 65ms (on average) to download 266330 bytes (roughly 260KB).
To see how compression will improve the application we have built, let's modify the project so that it supports compression. On the web service's end, add the Compress() function in Service.vb as follows:
    Public Function Compress(ByVal data() As Byte) As Byte()
        Try
            '---the ms is used for storing the compressed data---
            Dim ms As New MemoryStream()
            Dim zipStream As Stream = Nothing
            zipStream = New GZipStream(ms, _
                CompressionMode.Compress, True)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '    CompressionMode.Compress, True)

            '---compressing using the info stored in data---
            zipStream.Write(data, 0, data.Length)
            '---closing the stream flushes the remaining compressed bytes---
            zipStream.Close()

            ms.Position = 0
            '---used to store the compressed data (byte array)---
            Dim compressed_data(CInt(ms.Length) - 1) As Byte
            '---read the content of the memory stream into
            ' the byte array---
            ms.Read(compressed_data, 0, CInt(ms.Length))
            Return compressed_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
Basically, this function compresses the data stored in a byte array using the GZipStream class and writes the compressed data to a MemoryStream object. The compressed data is then read back out of the stream and returned as a byte array.
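To get a feel for what this kind of function does to repetitive XML, here is a minimal standalone sketch. It is not the article's Compress() function: the module name CompressSketch, the helper GZipBytes, and the sample XML-like string are all assumptions made for illustration, and it uses MemoryStream.ToArray() rather than reading the stream back by hand.

```vb
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module CompressSketch
    ' Hypothetical helper: compresses a byte array with GZipStream.
    Function GZipBytes(ByVal data() As Byte) As Byte()
        Using ms As New MemoryStream()
            ' The third argument (True) leaves ms open after the GZipStream closes
            Using gz As New GZipStream(ms, CompressionMode.Compress, True)
                gz.Write(data, 0, data.Length)
            End Using ' closing the GZipStream flushes the remaining compressed bytes
            Return ms.ToArray()
        End Using
    End Function

    Sub Main()
        ' Made-up sample data: 500 copies of a short XML-like row
        Dim sb As New StringBuilder()
        For i As Integer = 1 To 500
            sb.Append("<Employee><City>Seattle</City></Employee>")
        Next
        Dim raw() As Byte = Encoding.UTF8.GetBytes(sb.ToString())
        Dim packed() As Byte = GZipBytes(raw)
        Console.WriteLine("Original:   " & raw.Length & " bytes")
        Console.WriteLine("Compressed: " & packed.Length & " bytes")
    End Sub
End Module
```

Highly repetitive input like this compresses far better than real table data would, so treat the numbers it prints as an upper bound on what to expect.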
To use this Compress() function, modify the getRecords() web method as follows:
    <WebMethod()> _
    Public Function getRecords() As Byte()
        Dim connStr As String = _
            "Data Source=.\SQLEXPRESS;Initial Catalog=Northwind;" & _
            "Integrated Security=True"
        Dim sql As String = "SELECT * FROM Employees"
        Dim conn As SqlConnection = New SqlConnection(connStr)
        Dim comm As SqlCommand = New SqlCommand(sql, conn)
        Dim dataadapter As SqlDataAdapter = New SqlDataAdapter(comm)
        Dim ds As DataSet = New DataSet()

        '---open the connection and fill the dataset---
        conn.Open()
        dataadapter.Fill(ds, "Employees_table")
        conn.Close()

        '---convert the dataset to XML---
        Dim datadoc As System.Xml.XmlDataDocument = _
            New System.Xml.XmlDataDocument(ds)
        Dim dsXML As String = datadoc.InnerXml

        '---perform compression---
        Dim compressedDS() As Byte
        compressedDS = Compress(UTF8.GetBytes(dsXML))
        Return compressedDS
        '-------------------------
    End Function
On the client's side, add the Decompress() function to the code-behind of Form1:
    Public Function Decompress(ByVal data() As Byte) As Byte()
        Try
            '---copy the data (compressed) into ms---
            Dim ms As New MemoryStream(data)
            Dim zipStream As Stream = Nothing

            '---decompressing using data stored in ms---
            zipStream = New GZipStream(ms, CompressionMode.Decompress)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '    CompressionMode.Decompress, True)

            '---used to store the decompressed data---
            Dim dc_data() As Byte

            '---the decompressed data is stored in zipStream;
            ' extract them out into a byte array---
            dc_data = ExtractBytesFromStream(zipStream, data.Length)
            Return dc_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
The compressed data is copied into a memory stream object and then decompressed using the GZipStream class. The decompressed data is extracted into a byte array using the ExtractBytesFromStream() method, which is defined next:
    Public Function ExtractBytesFromStream( _
        ByVal stream As Stream, _
        ByVal dataBlock As Integer) _
        As Byte()

        '---extract the bytes from a stream object---
        Dim data() As Byte
        Dim totalBytesRead As Integer = 0
        Try
            While True
                '---progressively increase the size
                ' of the data byte array---
                ReDim Preserve data(totalBytesRead + dataBlock)
                Dim bytesRead As Integer = _
                    stream.Read(data, totalBytesRead, dataBlock)
                If bytesRead = 0 Then
                    Exit While
                End If
                totalBytesRead += bytesRead
            End While
            '---make sure the byte array contains exactly the number
            ' of bytes extracted---
            ReDim Preserve data(totalBytesRead - 1)
            Return data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
Because you do not know the actual size of the decompressed data, you have to progressively increase the size of the data array used to store the decompressed data. The dataBlock parameter suggests the number of bytes to copy at a time. A good rule of thumb is to use the size of the compressed data as the block size, such as:
    '---data is the array containing the compressed data
    dc_data = ExtractBytesFromStream(zipStream, data.Length)
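As a point of comparison, the same "read until the stream is exhausted" idea can be written with a fixed-size buffer that is drained into a MemoryStream, which avoids repeatedly resizing an array. This is only a sketch of an alternative, not the article's method: the names DecompressSketch, PackBytes, and UnpackBytes and the 4096-byte buffer size are assumptions chosen for illustration.

```vb
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module DecompressSketch
    ' Hypothetical helper: GZip-compresses a byte array.
    Function PackBytes(ByVal data() As Byte) As Byte()
        Using ms As New MemoryStream()
            Using gz As New GZipStream(ms, CompressionMode.Compress, True)
                gz.Write(data, 0, data.Length)
            End Using
            Return ms.ToArray()
        End Using
    End Function

    ' Hypothetical helper: decompresses GZip data of unknown final size.
    Function UnpackBytes(ByVal data() As Byte) As Byte()
        Using input As New MemoryStream(data)
            Using gz As New GZipStream(input, CompressionMode.Decompress)
                Using output As New MemoryStream()
                    Dim buffer(4095) As Byte ' 4096-byte block; an arbitrary choice
                    Dim bytesRead As Integer = gz.Read(buffer, 0, buffer.Length)
                    ' Read() returns 0 once the compressed stream is exhausted
                    While bytesRead > 0
                        output.Write(buffer, 0, bytesRead)
                        bytesRead = gz.Read(buffer, 0, buffer.Length)
                    End While
                    Return output.ToArray()
                End Using
            End Using
        End Using
    End Function

    Sub Main()
        Dim original() As Byte = Encoding.UTF8.GetBytes("<row>abc</row><row>abc</row>")
        Dim restored() As Byte = UnpackBytes(PackBytes(original))
        ' Prints the original text if the round trip preserved every byte
        Console.WriteLine(Encoding.UTF8.GetString(restored))
    End Sub
End Module
```

The MemoryStream grows its internal buffer as needed, so the caller does not have to guess a block size from the length of the compressed data.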
Since the data returned by the getRecords() web method is now compressed, you need to decompress it before it can be loaded into a dataset object. Modify the Load button event handler as follows:
    Private Sub btnLoad_Click( _
        ByVal sender As System.Object, _
        ByVal e As System.EventArgs) _
        Handles btnLoad.Click

        '---create a proxy obj to the web service---
        Dim ws As New dataWS.Service
        '---create a dataset obj---
        Dim ds As New DataSet
        '---create a stopwatch obj---
        Dim sw1, sw2 As New Stopwatch

        sw1.Start()
        Dim dsBytes As Byte() = ws.getRecords
        Label1.Text = "Size of download: " & dsBytes.Length

        '---perform decompression---
        Dim decompressed_dsBytes() As Byte
        sw2.Start()
        decompressed_dsBytes = Decompress(dsBytes)
        sw2.Stop()
        Label3.Text = "Decompression took: " & _
            sw2.ElapsedMilliseconds & "ms"

        '---decode with UTF8 to match the UTF8.GetBytes() call
        ' on the web service side---
        ds.ReadXml(New _
            IO.StringReader(UTF8.GetString(decompressed_dsBytes)))
        '---------------------------
        sw1.Stop()
        Label2.Text = "Time spent: " & sw1.ElapsedMilliseconds & "ms"

        DataGridView1.DataSource = ds
        DataGridView1.DataMember = "Employees_table"
    End Sub
I have also timed how long it takes to perform the decompression so that you have a good idea of how much time is actually spent on performing the decompression.
That's it! Press F5 to test the program. As usual, click the Load button a few times and observe the size of the data as well as the time taken for each task. Figure 4 shows you the average time I observed.
Figure 4. Using compression
It is good to examine the numbers that you have obtained so that you can understand the usefulness of using compression in your application.
Table 1 summarizes the data that you have obtained before using compression and after using it.
| Tasks | Size of Download (bytes) | Time taken (ms) | Decompression Time (ms) |
| Without compression | 266330 | 65 | N/A |
| With compression | 124148 | 83 | 13 |
Table 1. Data obtained before and after using compression
First, observe that with compression, the data is reduced from 266330 to 124148 bytes, yielding a compression ratio of about 46.6 percent (the compressed size as a percentage of the original size). Although I measured only the decompression time (about 13ms), the compression time is about the same. The decompression time of 13ms is small compared to the time required to transmit and load the data (83ms). Overall, with compression, the time needed to populate the DataGridView control is slightly increased.
In experimenting with different data sizes, I observed that the compression and decompression times are more or less constant. For example, instead of loading the Employees table, I loaded the [Order Details] table. The uncompressed data size for the table is 343902 bytes, and after compression the size is 24987 bytes, giving a compression ratio of 7.3 percent. However, the decompression time is almost the same as that for the smaller data block.
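The ratios quoted here are the compressed size expressed as a percentage of the original size. As a quick check of the arithmetic, the following sketch recomputes them from the byte counts reported in this article; the module name RatioSketch and the function RatioPercent are hypothetical names introduced for this example.

```vb
Imports System.Globalization

Module RatioSketch
    ' Compressed size as a percentage of the original size (lower is better).
    Function RatioPercent(ByVal originalSize As Long, _
                          ByVal compressedSize As Long) As Double
        Return compressedSize * 100.0 / originalSize
    End Function

    Sub Main()
        ' InvariantCulture keeps the decimal separator a period on every machine
        Dim inv As CultureInfo = CultureInfo.InvariantCulture
        ' Employees table: 266330 -> 124148 bytes
        Console.WriteLine(RatioPercent(266330, 124148).ToString("0.0", inv)) ' prints 46.6
        ' [Order Details] table: 343902 -> 24987 bytes
        Console.WriteLine(RatioPercent(343902, 24987).ToString("0.0", inv))  ' prints 7.3
    End Sub
End Module
```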
Interestingly, two blocks of data with the same size but different contents might yield very different compression ratios (the lower the number, the better), and text files are much more receptive to compression than binary files such as .exe and .jpg files. For compression to be effective, the block of data to be compressed should be large; compressing a small block of data can actually inflate its size, and it wastes time on compressing and decompressing.
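The inflation of small blocks is easy to reproduce. The GZip format adds a fixed header and trailer (18 bytes in total) around the compressed payload, so a very short input always comes out larger than it went in. The sketch below shows this with a five-byte input; the names SmallBlockSketch and GZipSmall are assumptions made for illustration.

```vb
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module SmallBlockSketch
    ' Hypothetical helper: GZip-compresses a byte array.
    Function GZipSmall(ByVal data() As Byte) As Byte()
        Using ms As New MemoryStream()
            Using gz As New GZipStream(ms, CompressionMode.Compress, True)
                gz.Write(data, 0, data.Length)
            End Using
            Return ms.ToArray()
        End Using
    End Function

    Sub Main()
        Dim tiny() As Byte = Encoding.UTF8.GetBytes("Hello")
        Dim packed() As Byte = GZipSmall(tiny)
        ' The "compressed" size is larger than the 5-byte input
        Console.WriteLine(tiny.Length & " bytes in, " & packed.Length & " bytes out")
    End Sub
End Module
```

A simple guard, if you adopt compression in your own service, would be to skip it for payloads below some threshold; where that threshold sits is something to measure rather than assume.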
You should also note that using compression on the web service side will increase the workload of the web server, and hence you need to factor this into your consideration of whether to use compression or not.
In this article you have seen how to use the new compression classes in .NET 2.0. While the implementation of these classes is not as efficient as the compression utilities on the market (which Microsoft has admitted), the classes are nevertheless very useful in cases where you need to reduce your data size. What's more, they are free, and hence I have no complaints!
Wei-Meng Lee (Microsoft MVP) http://weimenglee.blogspot.com is a technologist and founder of Developer Learning Solutions http://www.developerlearningsolutions.com, a technology company specializing in hands-on training on the latest Microsoft technologies.
Copyright © 2009 O'Reilly Media, Inc.