Published on ONDotNet.com (http://www.ondotnet.com/)

Improving Typed DataSets

by Shawn Wildermuth

The first time I used a typed DataSet, it was much like the beginning of a relationship. After dealing with raw DataSets, typed DataSets seemed elegant and perfect. Soon the cracks in the facade appeared. I knew that typed DataSets were much easier to work with than raw DataSets, but I still longed to change some of the ways the code was generated. Unlike relationships, we have some limited control over how typed DataSets work. In this article, I will show you typed DataSet annotations and how they can change the way that typed DataSets are generated.

The Typed DataSet Rationale

For the uninitiated, typed DataSets are a way of creating classes that derive from the standard ADO.NET classes of DataSet, DataTable, DataRow, etc. For example, if you were to access the CustomerID value from the first row in the Customers table within an untyped DataSet, the code would look something like this:

DataSet dataSet = new DataSet();

// Fill the DataSet (Code omitted for brevity)

string customerID = (string)dataSet.Tables["Customers"].Rows[0]["CustomerID"];

There are three problems with this code. First, the syntax depends on string and index lookups, which makes it muddled and not immediately clear to the reader. Second, any misspelling of "Customers" or "CustomerID" shows up only as a run-time error, not as a compile error, where we would catch the bug sooner. Lastly, we have to know that the CustomerID field is in fact a string and not an int, Guid, or other type. In a perfect world, the access would look more like a class hierarchy:

MyTypedDataSet dataSet = new MyTypedDataSet();

// Fill the DataSet (Code omitted for brevity)

string customerID = dataSet.Customers[0].CustomerID;

This syntax is much cleaner, don't you think? It is a leap forward in productivity as well, since IntelliSense now lets us view the typed members more easily. The problem is that we do not have much control over how the objects are named. The table is exposed as Customers, but the individual row class is called CustomersRow. This is clear, but not necessarily consistent with the naming conventions used throughout your enterprise. Though Microsoft has not given us full control over the code generation, they did add typed DataSet annotations to help solve some of the more common issues.

Overview of Annotations

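Annotations are simply a set of extensions to the raw XSD file that is used by .NET to generate the typed DataSet. In general, I like the code the typed DataSet generates, but by using annotations, I can solve some common problems: renaming the generated classes and properties, renaming relationship accessors, and controlling how database nulls are handled.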

In order to use annotations, you need to modify the raw XSD file to include a new namespace:

<xs:schema id="MyTypedDataSet" 

Once you have added the namespace, you're ready to start annotating the typed DataSet!
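As a rough sketch, the opening schema element might look like the following once the namespace is declared. The codegen prefix is bound to urn:schemas-microsoft-com:xml-msprop, the namespace the typed DataSet generator reads annotations from. The other namespace declarations and the elided body are assumptions for illustration; a designer-generated file will carry more attributes than shown here.

```xml
<!-- Sketch only: the attribute set is an assumption; a generated XSD has more. -->
<xs:schema id="MyTypedDataSet"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
           xmlns:codegen="urn:schemas-microsoft-com:xml-msprop">
  <!-- table, column, and keyref definitions go here -->
</xs:schema>
```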

Renaming Classes and Properties

Probably the most common use of annotations is to rename classes and properties in the typed DataSet to something more friendly to your development team. By default, the generated classes are named after the table element:

DataSet Element      Default Naming             Annotation to Modify
DataTable            TableNameDataTable         typedPlural
DataTable methods    NewTableNameRow
DataRowCollection    TableName                  typedPlural
DataRow              TableNameRow               typedName
DataSet Events       TableNameRowChangeEvent

For example, if your table is named Customers, the DataTable class will be named CustomersDataTable, the DataRowCollection will be named Customers, and the method to create a new DataRow will be called NewCustomersRow. To change these names, add codegen annotations that set the typedPlural and typedName of the table element:

<xs:element name="Customers" 
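Written out more fully, an annotated table element might look like the sketch below. The typedName and typedPlural values are inferred from the generated names reported in this section (MyCustomersDataTable, MyCustomers, NewMyCustomerRow). The surrounding structure is an assumption, and the column definitions are elided.

```xml
<!-- Assumes the codegen prefix is bound to urn:schemas-microsoft-com:xml-msprop. -->
<xs:element name="Customers"
            codegen:typedName="MyCustomer"
            codegen:typedPlural="MyCustomers">
  <xs:complexType>
    <xs:sequence>
      <!-- column elements (CustomerID, CompanyName, ...) go here -->
    </xs:sequence>
  </xs:complexType>
</xs:element>
```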

Once this change is made, the DataTable class is called MyCustomersDataTable, the DataRowCollection is called MyCustomers, and the method to create a new DataRow is called NewMyCustomerRow. You can also use the typedName annotation on individual DataColumns to change the name of the generated property:

<xs:element name="Customers" 
      <xs:element name="CustomerID" type="xs:string" />
      <xs:element name="CompanyName" type="xs:string" />
      <xs:element name="ContactName" type="xs:string" minOccurs="0" />
      <xs:element name="ContactTitle" 
      <xs:element name="Address" type="xs:string" minOccurs="0" />
      <xs:element name="City" type="xs:string" minOccurs="0" />
      <xs:element name="Region" type="xs:string" minOccurs="0" />
      <xs:element name="PostalCode" type="xs:string" minOccurs="0" />
      <xs:element name="Country" type="xs:string" minOccurs="0" />
      <xs:element name="Phone" type="xs:string" minOccurs="0" />
      <xs:element name="Fax" type="xs:string" minOccurs="0" />
      <xs:element name="BirthDate" type="xs:dateTime" minOccurs="0" />
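For a single column, the annotation sits on the column's own element. In the sketch below, the choice of ContactTitle as the annotated column and the replacement name Title are both hypothetical, picked only to show where the attribute goes.

```xml
<!-- Hypothetical: exposes the ContactTitle column as a property named Title. -->
<xs:element name="ContactTitle"
            codegen:typedName="Title"
            type="xs:string"
            minOccurs="0" />
```

With an annotation like this, the generated row property would be named Title, while the underlying column name in the schema stays ContactTitle.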

Renaming Relationship Accessors

When you have set up a typed DataSet with relationships between tables, the generated code allows you to navigate up and down each relationship using a method that returns the matching rows in the child table and a property that returns the parent row. By default, the method to get the child rows is called GetTableNameRows and the property for getting the parent row is named TableName. In this case, we need to annotate the relationship itself (the keyref in the XSD file):

<xs:keyref name="CustomersOrders" 
  <xs:selector xpath=".//mstns:Orders" />
  <xs:field xpath="mstns:CustomerID" />
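A complete keyref with both accessors renamed might look like the sketch below. Here typedParent names the property that returns the parent row and typedChildren names the method that returns the child rows; the values are inferred from the TheCustomer and TheOrders names used in this section. The refer target is a hypothetical key name, since the key definition is not shown in the article.

```xml
<!-- The refer target is hypothetical; point it at the xs:key or xs:unique
     your schema defines on the Customers table. -->
<xs:keyref name="CustomersOrders"
           refer="mstns:CustomersKey"
           codegen:typedParent="TheCustomer"
           codegen:typedChildren="TheOrders">
  <xs:selector xpath=".//mstns:Orders" />
  <xs:field xpath="mstns:CustomerID" />
</xs:keyref>
```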

In the CustomerRow class, we now have a method called TheOrders that returns the orders for a particular customer. Conversely, in the OrdersRow class, we now have a property called TheCustomer that returns the CustomerRow that owns a particular order. Simple, huh?

Dealing with Database Nulls

By default, in DataSets (and typed DataSets as well), when you try to access a value in a row that is null in the database, an exception is thrown. Typed DataSets make this easier by providing IsFieldNameNull() methods that let you determine whether a field is null before you access it. Sometimes it would be nice to have a null do something other than throw an exception or force us to test for it. Annotations come to the rescue again. On each field in a typed DataSet, you can specify a nullValue annotation to tell the typed DataSet how to react when the underlying field is DBNull. The possible values are:

Value               Behavior
_throw              Throws an exception. (This is what happens when you do not specify an annotation.)
_null               Returns a null reference if the field type is a reference type, or throws an exception if the field is a value type (e.g. strings return null, ints throw an exception).
_empty              Returns String.Empty for strings; for all other reference types, returns an object created with an empty constructor. Still throws an exception if the field is a value type.
Replacement Value   Specifies a default value to be returned when the field is null. The replacement must be compatible with the field's type (e.g. nullValue="0" for an int, but nullValue="Hi There" for a string).

You can use these annotations like so:

<xs:element name="ShipName" 
<xs:element name="ShipAddress" 
<xs:element name="ShipVia" 
<xs:element name="ShippedDate" 
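Spelled out, the four annotated fields might look like the sketch below. The nullValue settings follow the behavior described in this section; the type and minOccurs attributes are assumptions about the underlying Orders schema, and the date is written in the XSD dateTime format.

```xml
<!-- Types and minOccurs are assumptions; nullValue settings follow the
     behavior described in the text. -->
<xs:element name="ShipName" type="xs:string" minOccurs="0" codegen:nullValue="_empty" />
<xs:element name="ShipAddress" type="xs:string" minOccurs="0" codegen:nullValue="_empty" />
<xs:element name="ShipVia" type="xs:int" minOccurs="0" codegen:nullValue="0" />
<xs:element name="ShippedDate" type="xs:dateTime" minOccurs="0" codegen:nullValue="1980-01-01T00:00:00" />
```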

By annotating these fields, we can control the way nulls are handled. The first two fields (ShipName and ShipAddress) return empty strings when a DBNull is encountered. The third field (ShipVia) defaults to zero, and the last field (ShippedDate) defaults to January 1st, 1980.


While annotations will not fix every issue we have with typed DataSets, they do give us some flexibility over how naming and null behaviors are handled.

Shawn Wildermuth is the founder of ADOGuy.com and is the author of "Pragmatic ADO.NET" for Addison-Wesley.


Copyright © 2009 O'Reilly Media, Inc.