About Author

Shabir has 12 yrs of exp in I.T using Asp, Asp.NET, Vb6/.NET/C#, MOSS/WMI/html/DOM, Ajax, XML,xsl.SharePoint

Name:Shabir Hakim
Country: India
Gender: Male

Language-Integrated Query (LINQ) Introduction to LINQ Queries

A query is an expression that retrieves data from a data source. Queries are usually expressed in a specialized query language. Different languages have been developed over time for the various types of data sources, for example SQL for relational databases and XQuery for XML. Therefore, developers have had to learn a new query language for each type of data source or data format that they must support. LINQ simplifies this situation by offering a consistent model for working with data across various kinds of data sources and formats. In a LINQ query, you are always working with objects. You use the same basic coding patterns to query and transform data in XML documents, SQL databases, ADO.NET Datasets, .NET collections, and any other format for which a LINQ provider is available.

All LINQ query operations consist of three distinct actions:

  1. Obtain the data source.

  2. Create the query.

  3. Execute the query.

The following example shows how the three parts of a query operation are expressed in source code. The example uses an integer array as a data source for convenience; however, the same concepts apply to other data sources also. This example is referred to throughout the rest of this topic.

class IntroToLINQ
{        
    static void Main()
    {
        // The Three Parts of a LINQ Query:
        //  1. Data source.
        int[] numbers = new int[7] { 0, 1, 2, 3, 4, 5, 6 };

        // 2. Query creation.
        // numQuery is an IEnumerable<int>
        var numQuery =
            from num in numbers
            where (num % 2) == 0
            select num;

        // 3. Query execution.
        foreach (int num in numQuery)
        {
            Console.Write("{0,1} ", num);
        }
    }
}

The following illustration shows the complete query operation. In LINQ the execution of the query is distinct from the query itself; in other words you have not retrieved any data just by creating a query variable.

Complete LINQ Query Operation

In the previous example, because the data source is an array, it implicitly supports the generic IEnumerable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/9eekhta0(printer).aspx ] interface. This fact means it can be queried with LINQ. A query is executed in a foreach statement, and foreach requires IEnumerable [ http://msdn2.microsoft.com/en-us/library/h1x9x1b1(printer).aspx ] or IEnumerable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/9eekhta0(printer).aspx ] . Types that support IEnumerable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/9eekhta0(printer).aspx ] or a derived interface such as the generic IQueryable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/bb351562(printer).aspx ] are called queryable types.

A queryable type requires no modification or special treatment to serve as a LINQ data source. If the source data is not already in memory as a queryable type, the LINQ provider must represent it as such. For example, LINQ to XML loads an XML document into a queryable XElement [ http://msdn2.microsoft.com/en-us/library/bb340098(printer).aspx ] type:

// Create a data source from an XML document.
// using System.Xml.Linq;
XElement contacts = XElement.Load(@"c:\myContactList.xml");

With LINQ to SQL, you first create an object-relational mapping at design time either manually or by using the Object Relational Designer (O/R Designer) [ http://msdn2.microsoft.com/en-us/library/bb384429(printer).aspx ] . You write your queries against the objects, and at run-time LINQ to SQL handles the communication with the database. In the following example, Customer represents a specific table in the database, and

Table<Customer> supports generic IQueryable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/bb351562(printer).aspx ] , which derives from IEnumerable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/9eekhta0(printer).aspx ] .

// Create a data source from a SQL Server database.
// using System.Data.Linq;
DataContext db = new DataContext(@"c:\northwind\northwnd.mdf");

For more information about how to create specific types of data sources, see the documentation for the various LINQ providers. However, the basic rule is very simple: a LINQ data source is any object that supports the generic IEnumerable<(Of <(T>)>) [ http://msdn2.microsoft.com/en-us/library/9eekhta0(printer).aspx ] interface, or an interface that inherits from it.

The query specifies what information to retrieve from the data source or sources. Optionally, a query also specifies how that information should be sorted, grouped, and shaped before it is returned. A query is stored in a query variable and initialized with a query expression. To make it easier to write queries, C# has introduced new query syntax.

The query in the previous example returns all the even numbers from the integer array. The query expression contains three clauses: from, where and select. (If you are familiar with SQL, you will have noticed that the ordering of the clauses is reversed from the order in SQL.) The from clause specifies the data source, the where clause applies the filter, and the select clause specifies the type of the returned elements. These and the other query clauses are discussed in detail in the LINQ Query Expressions (C# Programming Guide) [ http://msdn2.microsoft.com/en-us/library/bb397676(printer).aspx ] section. For now, the important point is that in LINQ, the query variable itself takes no action and returns no data. It just stores the information that is required to produce the results when the query is executed at some later point. For more information about how queries are constructed behind the scenes, see Standard Query Operators Overview [ http://msdn2.microsoft.com/en-us/library/bb397896(printer).aspx ] .

Deferred Execution

As stated previously, the query variable itself only stores the query commands. The actual execution of the query is deferred until you iterate over the query variable in a foreach statement. This concept is referred to as deferred execution and is demonstrated in the following example:

//  Query execution. 
foreach (int num in numQuery)
{
    Console.Write("{0,1} ", num);
}

The foreach statement is also where the query results are retrieved. For example, in the previous query, the iteration variable num holds each value (one at a time) in the returned sequence.

Because the query variable itself never holds the query results, you can execute it as often as you like. For example, you may have a database that is being updated continually by a separate application. In your application, you could create one query that retrieves the latest data, and you could execute it repeatedly at some interval to retrieve different results every time.

Forcing Immediate Execution

Queries that perform aggregation functions over a range of source elements must first iterate over those elements. Examples of such queries are Count, Max, Average, and First. These execute without an explicit foreach statement because the query itself must use foreach in order to return a result. Note also that these types of queries return a single value, not an IEnumerable collection. The following query returns a count of the even numbers in the source array:

var evenNumQuery = 
    from num in numbers
    where (num % 2) == 0
    select num;

int evenNumCount = evenNumQuery.Count();

To force immediate execution of any query and cache its results, you can call the ToList<(Of <(TSource>)>) [ http://msdn2.microsoft.com/en-us/library/bb342261(printer).aspx ] or ToArray<(Of <(TSource>)>) [ http://msdn2.microsoft.com/en-us/library/bb298736(printer).aspx ] methods.

List<int> numQuery2 =
    (from num in numbers
     where (num % 2) == 0
     select num).ToList();

// or like this:
// numQuery3 is still an int[]

var numQuery3 =
    (from num in numbers
     where (num % 2) == 0
     select num).ToArray();

You can also force execution by putting the foreach loop immediately after the query expression. However, by calling ToList or ToArray you also cache all the data in a single collection object.

In a LINQ query, the first step is to specify the data source. In C# as in most programming languages a variable must be declared before it can be used. In a LINQ query, the from clause comes first in order to introduce the data source (customers) and the range variable (cust).

//queryAllCustomers is an IEnumerable<Customer>
var queryAllCustomers = from cust in customers
                        select cust;

The range variable is like the iteration variable in a foreach loop except that no actual iteration occurs in a query expression. When the query is executed, the range variable will serve as a reference to each successive element in customers. Because the compiler can infer the type of

cust, you do not have to specify it explicitly. Additional range variables can be introduced by a let clause. For more information, see let clause (C# Reference) [ http://msdn2.microsoft.com/en-us/library/bb383976(printer).aspx ] .

Probably the most common query operation is to apply a filter in the form of a Boolean expression. The filter causes the query to return only those elements for which the expression is true. The result is produced by using the where clause. The filter in effect specifies which elements to exclude from the source sequence. In the following example, only those customers who have an address in London are returned.

var queryLondonCustomers = from cust in customers
                           where cust.City == "London"
                           select cust;

You can use the familiar C# logical AND and OR operators to apply as many filter expressions as necessary in the where clause. For example, to return only customers from "London" AND whose name is "Devon" you would write the following code:

where cust.City=="London" && cust.Name == "Devon"

To return customers from London or Paris, you would write the following code:

where cust.City == "London" || cust.City == "Paris"

For more information, see where clause (C# Reference) [ http://msdn2.microsoft.com/en-us/library/bb311043(printer).aspx ] .

Often it is convenient to sort the returned data. The orderby clause will cause the elements in the returned sequence to be sorted according to the default comparer for the type being sorted. For example, the following query can be extended to sort the results based on the Name property. Because Name is a string, the default comparer performs an alphabetical sort from A to Z.

var queryLondonCustomers3 = 
    from cust in customers
    where cust.City == "London"
    orderby cust.Name ascending
    select cust;

To order the results in reverse order, from Z to A, use the orderby…descending clause.

For more information, see orderby clause (C# Reference) [ http://msdn2.microsoft.com/en-us/library/bb383982(printer).aspx ] .

The group clause enables you to group your results based on a key that you specify. For example you could specify that the results should be grouped by the City so that all customers from London or Paris are in individual groups. In this case, cust.City is the key.

// queryCustomersByCity is an IEnumerable<IGrouping<string, Customer>>
  var queryCustomersByCity =
      from cust in customers
      group cust by cust.City;

  // customerGroup is an IGrouping<string, Customer>
  foreach (var customerGroup in queryCustomersByCity)
  {
      Console.WriteLine(customerGroup.Key);
      foreach (Customer customer in customerGroup)
      {
          Console.WriteLine("    {0}", customer.Name);
      }
  }

When you end a query with a group clause, your results take the form of a list of lists. Each element in the list is an object that has a Key member and a list of elements that are grouped under that key. When you iterate over a query that produces a sequence of groups, you must use a nested foreach loop. The outer loop iterates over each group, and the inner loop iterates over each group's members.

If you must refer to the results of a group operation, you can use the into keyword to create an identifier that can be queried further. The following query returns only those groups that contain more than two customers:

// custQuery is an IEnumerable<IGrouping<string, Customer>>
var custQuery =
    from cust in customers
    group cust by cust.City into custGroup
    where custGroup.Count() > 2
    orderby custGroup.Key
    select custGroup;

For more information, see group clause (C# Reference) [ http://msdn2.microsoft.com/en-us/library/bb384063(printer).aspx ] .

Join operations create associations between sequences that are not explicitly modeled in the data sources. For example you can perform a join to find all the customers in London who order products from suppliers who are in Paris. In LINQ the join clause always works against object collections instead of database tables directly. In LINQ you do not have to use join as often as you do in SQL because foreign keys in LINQ are represented in the object model as properties that hold a collection of items. For example, a Customer object contains a collection of Order objects. Rather than performing a join, you access the orders by using dot notation:

from order in Customer.Orders...

For more information, see join clause (C# Reference) [ http://msdn2.microsoft.com/en-us/library/bb311040(printer).aspx ] .

The select clause produces the results of the query and specifies the "shape" or type of each returned element. For example, you can specify whether your results will consist of complete

 Customer objects, just one member, a subset of members, or some completely different result type based on a computation or new object creation. When the select clause produces something other than a copy of the source element, the operation is called a projection. The use of projections to transform data is a powerful capability of LINQ query expressions. For more information