About Author

Shabir has 12 yrs of exp in I.T using Asp, Asp.NET, Vb6/.NET/C#, MOSS/WMI/html/DOM, Ajax, XML,xsl.SharePoint

Name:Shabir Hakim
Country: India
Gender: Male

What is a DataSet and How it works

DataSet is the core of the ADO.NET disconnected architecture and is used to store data in a disconnected state. It was designed to fully support the use of XML so an XML document can be read into a DataSet or a DataSet can be exported to XML. This allows a DataSet to be easily transported across a network so it can be used as a return from a Web service or other type of remote call. Changes to a DataSet can be propagated back to the data source from where the data originated. A DataSet is fully navigable forward or backward and the contents can be filtered, sorted, or searched as desired, making all records accessible at any given point.

The DataSet, part of the System.Data namespace, is not specific to any one data provider, but rather is independent of them. A DataSet relies on a DataAdapter specific to each provider to be the intermediary between the DataSet and the data store. The DataAdapter uses a DataReader for the specific provider to fill the contents of the DataSet.
Version 1.0 of the Microsoft .NET Framework includes a SqlDataAdapter specific to SQL Server and an OleDbDataAdapter that is more generic for OLEDB Providers. Version 1.1 of the .NET Framework introduced the OdbcDataAdapter for ODBC data sources, and OracleDataAdapter for connecting to Oracle databases. In addition, each database vendor often has its own .NET provider. In addition, you can implement your own DataAdapter through the IDataAdapter interface. In prior implementations of ADO, a Recordset could access multiple query results, but was not aware of relationships between them. A DataSet can expose a hierarchical model of tables similar to a relational database. The object model can be used to retrieve records related to the current record. A DataSet can contain one or more DataTable objects that have primary key, foreign key, and constraints between them and can enforce constraints such as unique or be configured to ignore them. Rather than using a join to combine tables of data, you typically use a DataAdapter to retrieve each table separately, populate the DataSet contents with the multiple adapters, and then link them together in the DataSet through DataRelation objects.

Sample DataSet C# code


The C# sample code in   Example 2 demonstrates how to use a DataSet to retrieve the same data as the example above. This time a connection does not have to be opened or closed because the DataAdapter will automatically handle it.

You create an instance of a DataSet by calling the DataSet constructor. Optionally specify a name argument. If you do not specify a name for the DataSet, the name is set to "NewDataSet".

You can also create a new DataSet based on an existing DataSet. The new DataSet can be an exact copy of the existing DataSet; a clone of the DataSet that copies the relational structure or schema but that does not contain any of the data from the existing DataSet; or a subset of the DataSet, containing only the modified rows from the existing DataSet using the GetChanges method.

The following code example demonstrates how to construct an instance of a DataSet.

Dim customerOrders As DataSet = New DataSet("CustomerOrders")


DataSet customerOrders = new DataSet("CustomerOrders");

Populating a DataSet:

The ADO.NET DataSet is a memory-resident representation of data that provides a consistent relational programming model independent of the data source. The DataSet represents a complete set of data including tables, constraints, and relationships among the tables. Because the DataSet is independent of the data source, a DataSet can include data local to the application, as well as data from multiple data sources. Interaction with existing data sources is controlled through the DataAdapter. The SelectCommand property of the DataAdapter is a Command object that retrieves data from the data source. The InsertCommand, UpdateCommand, and DeleteCommand properties of the DataAdapter are Command objects that manage updates to the data in the data source according to modifications made to the data in the DataSet. These properties are covered in more detail in Updating Data Sources with DataAdapters. The Fill method of the DataAdapter is used to populate a DataSet with the results of the SelectCommand of the DataAdapter. Fill takes as its arguments a DataSet to be populated, and a DataTable object, or the name of the DataTable to be filled with the rows returned from the SelectCommand. The Fill method uses the DataReader object implicitly to return the column names and types used to create the tables in the DataSet, as well as the data to populate the rows of the tables in the DataSet. Tables and columns are only created if they do not already exist; otherwise Fill uses the existing DataSet schema. Column types are created as .NET Framework types according to the tables in Mapping .NET Data Provider Data Types to .NET Framework Data Types. Primary keys are not created unless they exist in the data source and DataAdapter.MissingSchemaAction is set to MissingSchemaAction.AddWithKey. If Fill finds that a primary key exists for a table, it will overwrite data in the DataSet with data from the data source for rows where the primary key column values match those of the row returned from the data source. If no primary key is found, the data is appended to the tables in the DataSet. Fill uses any mappings that may exist when populating the DataSet (see Setting Up DataTable and DataColumn Mappings).

A dataset can be populated in multiple number of ways. Manually constructed Rows and Tables Data from an XML File Data from a database Manually constructed Rows and Tables:    This is usually preferred when we need to process an existing set of data such as calculations on a database, reading from an unconventional data source etc., If we need data from such inconsistent data sources to be rendered into an application written on our .Net framework (including asp .net, c#, vb .net) etc., then we can go for manipulating the data and then creating the Dataset manually.     Before constructing a dataset in .net manually, it should be prepared by adding datacolumn types to it. The DataTable thus constructed should be added to the DataSet using DataSet.Tables.Add(DataTable variablename) function. This will add the DataTable with the specific definition of Column types into the DataSet. If needed more DataTables can be added to the DataSet as it is capable of holding multiple Database objects at a single point of time.
    After preparing the DataSet and DataTable to contain the Correct Column Types, the following is the code snippet for adding DataRow into the Dataset.

dtrow = myDataTable.NewRow();
dtrow["Name"] = "Pervez";
dtrow["Age"] = 27;
myDataTable.Rows.Add(dtrow);  . 

The above code snippet will add a row of data as DataRow to the DataTable. Any number of rows can be added in a similar fashion with either Processed or Unprocessed data. This can finally be rendered by a manual iteration or Data binding to a DataGrid. Reading Data from XML File into a DataSet: This is a very straight forward operation and the code involved is very simple. The following Code snippet can pull the data from an XML File into the Data set. DataSet ds = new DataSet(); ds.ReadXml("XML File Path"); This DataSet can then be used to manipulate data for rendering on the screen. Reading Data from a Database into a DataSet: This needs the SqlConnection object of .Net to connect to the database first

c#

CSharp Source Code

string strDBConnection ="server=(local);database=DatabaseName;user id=UserName;password=Pwd;
  connection reset=false;connection lifetime=5;Trusted_Connection=Yes;"
SqlConnection ObjConn;
ObjConn = new SqlConnection(strDBConnection);            
string strSQL = "Select * from [DatabaseName].[OwnerName].[TableName] order by FieldName"; //Open the connection ObjConn.Open(); //Create a command ObjCmd = new SqlCommand(strSQL,ObjConn); SqlDataAdapter sqlData = new SqlDataAdapter(ObjCmd); DataSet dsSelectData = new DataSet(); sqlData.Fill(dsSelectData);

vb.net

Vb.net Source Code

Dim strDBConnection As String = "server=(local);database=DatabaseName;user
id=UserName;password=Pwd; connection reset=false;connection lifetime=5;Trusted_Connection=Yes;" 
Dim ObjConn As SqlConnection  ObjConn = New SqlConnection(strDBConnection) 
Dim strSelectSql As String= "Select * from [DatabaseName].[OwnerName].[TableName] order by FieldName" 

'Open the connection
ObjConn.Open() 'Create a command Dim ObjCmd As New SqlCommand(strSQL,ObjConn) Dim sqlData As New SqlDataAdapter(ObjCmd) Dim dsSelectData As New DataSet() sqlData.Fill(dsSelectData)

Populating a DataSet from Multiple DataAdapters

Any number of DataAdapter objects can be used with a DataSet. Each DataAdapter can be used to fill one or more DataTable objects and resolve updates back to the relevant data source. DataRelation and Constraint objects can be added to the DataSet locally, enabling you to relate data from dissimilar data sources. For example, a DataSet can contain data from a Microsoft SQL Server database, an IBM DB2 database exposed via OLE DB, and a data source that streams XML. One or more DataAdapter objects can handle communication to each data source.

Example
The following code example populates a list of customers from the Northwind database on Microsoft SQL Server 2000, and a list of orders from the Northwind database stored in Microsoft Access 2000. The filled tables are related with a DataRelation, and the list of customers is then displayed with the orders for that customer. For more information about DataRelation objects, see Adding a Relationship Between Tables and Navigating a Relationship Between Tables.

VB.NET

 ' Assumes that customerConnection is a valid SqlConnection object. 
' Assumes that orderConnection is a valid OleDbConnection object.
   Dim custAdapter As SqlDataAdapter
         = New SqlDataAdapter("SELECT * FROM dbo.Customers", customerConnection)
  Dim ordAdapter As OleDbDataAdapter = New OleDbDataAdapter("SELECT * FROM Orders", orderConnection)
  Dim customerOrdersAs DataSet = New DataSet()
  custAdapter.Fill(customerOrders, "Customers")
  ordAdapter.Fill(customerOrders, "Orders")  
  Dim relation As DataRelation =  customerOrders.Relations.Add("CustOrders",
  customerOrders.Tables("Customers").Columns("CustomerID"), _ 
  customerOrders.Tables("Orders").Columns("CustomerID"))  
  Dim pRow, cRow As DataRow   For Each pRow In customerOrders.Tables(
             "Customers").Rows
  Console.WriteLine(pRow("CustomerID").ToString())  
  For Each cRow In pRow.GetChildRows(relation)
  Console.WriteLine(vbTab & cRow("OrderID").ToString())
  Next Next 

C#

 // Assumes that customerConnection is a valid SqlConnection object.
 // Assumes that orderConnection is a valid OleDbConnection object.
  SqlDataAdapter custAdapter = new SqlDataAdapter("SELECT * FROM dbo.Customers", customerConnection);
  OleDbDataAdapter ordAdapter = new OleDbDataAdapter("SELECT * FROM Orders", orderConnection);
  DataSet customerOrders = new DataSet();
  custAdapter.Fill(customerOrders, "Customers");
  ordAdapter.Fill(customerOrders, "Orders");
  DataRelation relation = customerOrders.Relations.Add("CustOrders",
 customerOrders.Tables["Customers"].Columns["CustomerID"],
  customerOrders.Tables["Orders"].Columns["CustomerID"]);
  foreach (DataRow pRow in customerOrders.Tables["Customers"].Rows)
  {
  Console.WriteLine(pRow["CustomerID"]);
  foreach (DataRow cRow in pRow.GetChildRows(relation))
  Console.WriteLine("\t" + cRow["OrderID"]);
  }

Features of DataSets

1 - Working with Disconnected Data

The data in DataSet is disconnected from database. Once you fetch the results of a query into a DataSet using a DataAdapter object, there is no longer a connection between DataSet and database. Changes you make to the contents of the DataSet will not affect the database. If other users modify data in the database that corresponds to the data in DataSet, you will not see those changes in your DataSet.

Working with disconnected data structures definitely has its benefits. The first major benefit of working with disconnected data is that it does not require a live connection to your database. Once you’ve fetched the results of your query into a DataSet object, you can close the connection to your database and continue to work with the data in your DataSet.

Disconnected data structures such as DataSets are also helpful when you build multi-tiered applications. If your application uses business objects running on a middle-tier server to access database, business object needs to pass disconnected data structures to client application. The DataSet object is designed for use in such situations. You can pass the contents of a DataSet from one component to another. The component that receives the data can work with the information as a DataSet (if the component is built using the Microsoft .NET Framework) or as an XML document.

2 - Scrolling, Sorting, Searching, and Filtering

The DataSet object lets you examine the contents of any row in your DataSet at any time. You can loop back and forth through the results of your query as often as you like. This makes the DataSet object ideal for scenarios in which your code needs to loop through data, such as in reporting routines. You can also easily build an application that allows a user to scroll back and forth through the results of a query.

DataSet objects also let you change the way you view the results of queries. You can sort the data in a DataSet based on a column or a series of columns. You can search for a row of data based on simple search criteria. You can also apply a filter to the data in your DataSet so that only rows that satisfy the desired criteria are visible.

3 - Working with Hierarchical Data

DataSet objects are designed to work with hierarchical data. The DataSet object lets you define relationships between the tables of data stored in the DataSet.

4 - Caching Changes

Working with read-only data is easy. One of the biggest challenges in building a database application is to transform the user’s input into changes to the contents of your database. Building such logic into a multi-tiered application can present an even greater challenge if your application needs to cache changes and submit them to your database all at once.

The DataSet object lets you cache changes to a row of data so that you can submit the changes to your database using a DataAdapter. You can also examine modified rows in your DataSet to determine how the row has changed (inserted, modified, or deleted) as well as to compare both the original and current values for each row.

5 - XML Integration

The ADO.NET DataSet was built from the ground up to work with XML. You can save and load the contents of a DataSet to and from files as XML documents. The DataSet also lets you separate the schema information (table, column, and constraint information) into an XML schema file.

In ADO.NET, DataSet objects and XML documents are almost interchangeable. It’s easy to move from one data structure to the other. This duality allows developers to use the interfaces they’re most comfortable with. XML programmers can work with DataSet objects as XML documents, and database programmers can work with XML documents as DataSet objects.

6 - Uniform Functionality

Developers who have worked with ADO might be aware that the Recordset object has features similar to those of the DataSet. The ADO Recordset object supports features such as filtering, searching, sorting, and caching updates. However, the manner in which you open a Recordset plays a large part in determining what functionality is available in the Recordset.

For example, if you use just the default settings for the ADO Recordset and Connection objects, you cannot get an accurate count of the number of rows in the Recordset. The Recordset object has a Supports method that developers often use to determine the functionality available: Can I modify the contents of the Recordset? If I update a row, will the Recordset send the change to the database immediately or will it be cached? Can I bind my Recordset to a grid? Can I move to the previous row?

The reason that not all Recordset objects support the same functionality is that the Recordset object tries to be everything to everyone. Whether you’re working with a firehose cursor, a server-side cursor, or disconnected data in ADO, you’re using a Recordset object

Creating a DataSet

Visual Basic

Dim ds As New 
DataSet("DataSetName")
Console.WriteLine(ds.DataSetName)       

Using C#.Net

DataSet ds = new 
DataSet("DataSetName");
Console.WriteLine(ds.DataSetName);
                              

Filling the DataSet Object

Filling the data from database into dataset object is a very easy process. Here we can use either SQL query or a stored procedure. Below is the example of how to fill data using query.

Visual Basic

Dim strConn, strSQL As 
String strConn = "Provider=SQLOLEDB;Data Source=(local);"Initial Catalog=Northwind;Trusted_Connection=Yes;"
strSQL
= "SELECT CustomerID,CompanyName, ContactName, Phone FROM Customers"
Dim da
As New OleDbDataAdapter(strSQL, strConn) Dim ds As New DataSet() da.Fill(ds,"Customers"<)

C#

string strConnstrSQL;
strConn
= "Provider=SQLOLEDB;Data Source=(local);"Initial Catalog=Northwind;Trusted_Connection=Yes;";
strSQL
= "SELECT CustomerID,CompanyName, ContactName, Phone FROM Customers"; OleDbDataAdapter da = new OleDbDataAdapter(strSQL,strConn); DataSet ds = new DataSet(); da.Fill(ds,"Customers");

The following example consists of several methods that, combined, create and fill a DataSet from the Northwind database.

Visual Basic

Option Explicit On
Option Strict On
Imports System.Data
Imports system.Data.SqlClient
Public Class NorthwindDataSet
    Public Shared Sub Main()
       Dim connectionString 
          As String =  GetConnectionString()
        ConnectToData(connectionString)
   End Sub
    Private Shared Sub ConnectToData( ByVal connectionString As String)
        ' Create a SqlConnection to the Northwind database.        
 Using connection As SqlConnection = New SqlConnection( connectionString)
            ' Create a SqlDataAdapter for the Suppliers table.        
  Dim suppliersAdapter As SqlDataAdapter =  New SqlDataAdapter()
            ' A table mapping names the DataTable.     
        suppliersAdapter.TableMappings.Add("Table", "Suppliers")
            ' Open the connection.           
  connection.Open()
            Console.WriteLine("The
                SqlConnection is open.")
          ' Create a SqlCommand to retrieve Suppliers data.          
   Dim suppliersCommand As SqlCommand
             = New SqlCommand( "SELECT
                 SupplierID, CompanyName FROM dbo.Suppliers;", connection)
            suppliersCommand.CommandType = CommandType.Text
          ' Set the SqlDataAdapter's SelectCommand.         
    suppliersAdapter.SelectCommand = suppliersCommand
           ' Fill the DataSet.             
               Dim  dataSet AsDataSet = 
                   New DataSet("Suppliers")
            suppliersAdapter.Fill(dataSet)
          ' Create a second SqlDataAdapter and SqlCommand to get
             ' the Products table, a child table of Suppliers.             
                 Dim  productsAdapter As SqlDataAdapter =   New SqlDataAdapter()
            productsAdapter.TableMappings.Add("Table", "Products")
             Dim  productsCommand As SqlCommand = 
New SqlCommand("SELECT ProductID, SupplierID FROM dbo.Products;", connection)
            productsAdapter.SelectCommand = productsCommand
            ' Fill the DataSet.             productsAdapter.Fill(dataSet)
          ' Close the connection.             connection.Close()
           Console.WriteLine("The
               SqlConnection is closed.")
           ' Create a DataRelation to link the two tables ' based
              on the SupplierID.            Dim  parentColumn AsDataColumn = _
               dataSet.Tables("Suppliers").Columns("SupplierID")
            Dim  childColumn As 
               DataColumn = _
               dataSet.Tables("Products").Columns("SupplierID")
             Dim  relation As DataRelation
            = New  System.Data.DataRelation("SuppliersProducts",  parentColumn, childColumn)
            dataSet.Relations.Add(relation)
           Console.WriteLine( "The
              {0} DataRelation has been created.",  relation.RelationName)
        End Using
   End Sub
    Private Shared Function GetConnectionString() As String       
' To avoid storing the connection string in your code, '
           you can retrieve it from a configuration file.        
 Return "Data Source=(local);Initial Catalog=Northwind;"
                & "Integrated Security=SSPI;"
   End Function End Class 

c#

using System;
using  System.Data;
using  System.Data.SqlClient;
namespace Microsoft.AdoNet.DataSetDemo
{
    class NorthwindDataSet
    {
       static void Main()
        {
            string connectionString = GetConnectionString();
            ConnectToData(connectionString);
        }
        private static void ConnectToData(
           string connectionString)
        {
            //Create a SqlConnection to the Northwind database.           
 using (SqlConnection connection = new SqlConnection(connectionString))
            {
                //Create a SqlDataAdapter for the Suppliers table.                
 SqlDataAdapter adapter =new SqlDataAdapter();            
    // A table mapping names the DataTable.               
  adapter.TableMappings.Add("Table", "Suppliers");
                // Open the connection.                
 connection.Open();
                Console.WriteLine("The
                    SqlConnection is open.");
                // Create a SqlCommand to retrieve Suppliers data.                
SqlCommand command = new SqlCommand("SELECT
                        SupplierID, CompanyName FROM dbo.Suppliers;",connection);
                command.CommandType = CommandType.Text;
                // Set the SqlDataAdapter's SelectCommand.                
 adapter.SelectCommand = command;
                // Fill the DataSet.                
                    DataSet dataSet = new DataSet("Suppliers");
                adapter.Fill(dataSet);
                // Create a second Adapter and Command to get
                // the Products table, a child table of Suppliers. 
                
                    SqlDataAdapter productsAdapter = new SqlDataAdapter();
                productsAdapter.TableMappings.Add("Table", "Products");
                SqlCommand productsCommand = new SqlCommand("SELECT ProductID,
                        SupplierID FROM dbo.Products;", connection);
                productsAdapter.SelectCommand = productsCommand;
              // Fill the DataSet.                 productsAdapter.Fill(dataSet);
           // Close the connection.                 connection.Close();
                Console.WriteLine("The
                    SqlConnection is closed.");
             // Create a DataRelation to link the two tables // based
                on the SupplierID.                
 DataColumn parentColumn = dataSet.Tables["Suppliers"].Columns["SupplierID"];
                DataColumn childColumn =dataSet.Tables["Products"].Columns["SupplierID"];
                DataRelation relation = new System.Data.DataRelation("SuppliersProducts"
,parentColumn, childColumn);
                dataSet.Relations.Add(relation);
                Console.WriteLine(  "The
                    {0} DataRelation has been created.",  relation.RelationName);
            }
        }
        static private string GetConnectionString()
        {
            // To avoid storing the connection string in your code,
                // you can retrieve it from a configuration file.           
          return "Data Source=(local);Initial Catalog=Northwind;"+ "Integrated Security=SSPI";
        }
    }
}

Databases offer different mechanisms that you can use to ensure that the data in your database is valid. The sample Northwind database has many rules and constraints defined. The CustomerID column in the Customers table must be populated with a string of up to five characters, and that value must be unique within the table. The Orders table generates a new OrderID value for each row and requires that the CustomerID value for each row refer to an existing entry in the Customers table.

Sometimes you’ll want to apply similar rules to validate data in your application before submitting changes to your database. For example, let’s say you’re shopping on line and reach the page where you purchase the items in your basket. Most Web sites will make sure you’ve entered information into each of the required fields before they submit your order information to the appropriate database.

This type of logic might seem redundant because the database probably has similar validation rules defined. However, adding validation rules to your application can improve its performance. If a user fails to enter a credit card number, either by accident or in the hope that the system programmers were extremely lazy, the code for the Web page can easily determine that it can’t successfully submit the order without having to contact the database. The other benefits of this approach are a slight reduction of network traffic and a lighter load on your database.

The ADO.NET DataSet offers many of the same data validation mechanisms available in database systems. You can separate these validation mechanisms, also called constraints, into two categories—column-level restrictions and table-level restrictions.

The DataColumn object exposes a number of properties that you can use to validate your data.

  1. ReadOnly The simplest way to ensure that your data is valid is to not let users modify it. If you want to make the data in a DataColumn read-only, set the ReadOnly property of the DataColumn to True.

  2. AllowDBNull Some database columns require values, while others accept empty, or null, values. The DataColumn object exposes an AllowDBNull property that you can set to control whether the column in your DataSet accepts null values.

  3. MaxLength Many databases place restrictions on the size of a string in a column. In the Customers table, for example, the CustomerID column accepts a string of up to 5 characters and the CompanyName column accepts up to 40 characters. You can place similar restrictions on a DataColumn using the MaxLength property.

  4. Unique The DataColumn lets you specify which values in a column are unique using the Unique property. When you set this property to True on a DataColumn, ADO.NET will examine the value stored in this column of each row in your DataTable. If you add or modify a row in your DataTable to create a duplicate value in a unique column, ADO.NET will throw a ConstraintException.

You can also validate data in your DataSet by setting properties of the DataTable object. The ADO.NET object model includes two classes that you can use to define constraints in a DataTable. These classes, UniqueConstraint and ForeignKeyConstraint, are derived from the Constraint class. The DataTable exposes a Constraints property that you can use to add to, modify, or examine the constraints on the DataTable.

  1. UniqueConstraints
    If you set the Unique property of a DataColumn to True, you’ve defined a unique constraint in the DataTable that contains that column. At the same time, you’ve also added a UniqueConstraint object to the DataTable object’s Constraints collection. Setting the Unique property of a DataColumn is simpler than creating a new UniqueConstraint in a DataTable object’s Constraints collection. However, there are times when you’ll want to explicitly create a UniqueConstraint, such as when you need to make sure that the combinations of values from multiple columns are unique.

  2. PrimaryKey
    A primary key is a special type of unique constraint. The ADO.NET DataRowCollection object has a Find method that you can use to locate a row in your DataTable by the value or values in its primary key column, as shown here.

row MyTable Rows Find("something")


A DataTable can have multiple unique constraints but can contain at most one primary key. You can set or examine a DataTable object’s primary key using its PrimaryKey property.

  1. ForeignKeyConstraint
    You can also add foreign constraints to a DataTable. I described an example of a foreign key constraint just a couple pages back. Each order in the Northwind database’s Orders table must have a value for its CustomerID column that is used in the Customers table. You can place similar restrictions on the data in your DataSet by creating a ForeignKeyConstraint and adding it to the table whose rows you want to validate.

You generally won’t need to explicitly create a ForeignKeyConstraint. Creating a DataRelation between two DataTable objects within your DataSet creates a ForeignKeyConstraint in the procss.