OakLeaf Systems: September 2007

Friday, September 28, 2007

LINQ and Entity Framework Posts for 9/28/2007+

Julie Lerman Demonstrates Detached Updates with Entity Framework

The Entity Framework (EF) and LINQ to SQL teams have been talking about about sample projects to demonstrate detached updates and WCF implementations for months. Talk is cheap, but sample projects (like whiskey) cost money, which translates to time for developers.

Julie Lerman's More on Disconnected Entity Framework of October 1, 2007 shows you the code required to achieve her objective with detached and re-attached entities: "Update only relevant properties and have the option to deal with concurrency on a granular level."

Julie starts by recommending that you add a LastModified DateTime property to each entity and field to each persistence table (similar to that required by Sync Services for SQL Server Compact Edition), rather than a timestamp, and have the client update the value when modifying property values. She then shows how to handle entity additions and deletions (no concurrency management) with LINQ queries. Finally, Julie uses the largely undocumented the EF's MetadataWorkspace object and reflection to modify entity property values programmatically for updates. At this point, the ObjectContext.SaveChanges() method handles all persistence chores.

Most of Julie's techniques apply to LINQ to SQL, although detaching and reattaching the same entity to a different DataContext is problematic, as Dinesh Kulkarni notes in his Beta2 to RTM Changes in Attach() Behavior post to the LINQ Project General forum.

Julie promises to blog about handling state in ASP.NET projects and XML serialization of entities. I'm anxious to read her posts about both projects.

From the "Talk Is Cheap" Department:

On June 2, 2007, EF architect Mike Pizzo said in response to request #10 of my Defining the Direction of LINQ to Entities/EDM post of May 29, 2007, "Define the support for n-tier architectures and provide non-trivial, loosely-coupled sample projects with WCF:"

I would love to see us do some more in-depth sample projects with WCF.

Me, too, Mike. But Julie's doing this now.

Danny Simmons says in his September 15, 2007 Concepts Part I: Getting an entity model up and running post about his proposed blog series:

[A]s we go through the series [of blog posts ... explaining various Entity Framework concepts] we probably will also spend some time exploring other app architectures like web services and web apps.

I'm not holding my breath for a WCF sample.

Finally, in his October 1, 2007 Where was I hiding (aka lame excuse for not posting before)? post (see below), Dinesh Kulkarni promises to write a blog post about "Attaching detached objects (if only I can detach them first)" as his second priority (after "Stored Procedures."

Hopefully, Dinesh will include a non-trivial WCF sample project or two.

Considering Microsoft's emphasis on service architectures, it's surprising that more forethought wasn't devoted to detached and, especially WCF, scenarios by both the EF and LINQ to SQL teams.

Dinesh Kulkarni Plans Series of LINQ to SQL Posts

Dinesh's Where was I hiding (aka lame excuse for not posting before)? post of October 1, 2007 discusses why he's been among the missing LINQ to SQL team bloggers and lists seven topics he plans to cover in future posts:

Sprocs and LINQ
Attaching detached objects (if only I can detach them first)
Transacting in LINQ to SQL
Joins vs. n queries for eager loading (why can't you do them joins)
Top 3 bugs that we fixed
What am I doing after Orcas (VS 2008, .NET Fx 3.5) ships
Tintin in LINQ land

I've about exhausted the first "Sprocs and LINQ" topic and am starting on the second "Attaching detached objects" topic in conjunction with WCF. I've already dealt with #4, which I call "JOIN Queries from Hell" because of their poor performance.

However, I've recommended that Dinesh replace topic #5 with "LINQ to SQL vs. DataAdapter Peformance" and some code that proves Soma's claim in his September 27, 2008 VS 2008 Performance Improvements post that :

In our testing, LINQ does in fact out-perform SqlDataAdaptor on almost every test case we tried, and in many of the exceptions, it is no more than 10% slower than using a SqlDataReader to accomplish the same task. Given the power of LINQ, we feel this is a very reasonable trade-off.

I'm not seeing such results in my tests; I can't even come close to them with my test harnesses.

Save Server Round Trips by Preloading LINQ to SQL EntityRefs

shows you how to minimize the number of server round trips required to populate EntityRef associations with stored procedures. The technique executes a custom stored procedure that retrieves all the EntityRef objects that might be expected to appear on a set of Web pages or in a Windows form session. The eager loading process for the association detects that the entities are already loaded and doesn't invoke the corresponding stored procedure call.

This article uses the same ASP.NET 3.5 test harness as the earlier Problems Using Stored Procedures for LINQ to SQL Data Retrieval (Updated) post.

Added: 10/1/2007

Problems Using Stored Procedures for LINQ to SQL Data Retrieval (Updated)

finally became available on September 30, 2007. This post summarizes the issues expected to remain at RTM when you replace LINQ to SQL's autogenerated, dynamic T-SQL prepared statements for hydrating objects with stored procedures. The post covers issues common to ADO.NET 3.5 and ASP.NET 3.5 projects, as well as those that affect only ASP.NET projects that use the LinqDataSource control.

Here's a screen capture of the OrdersLinqDataSourceSP.sln test harness for pre-loading EntityRef objects and server-side sorting (by Employee.LastName) with a LinqDataSource control:

Click image for full-size screen capture.

The above is one of six test harnesses used to verify workarounds for and analyze comparative performance of stored procedures and dynamically generated T-SQL prepared statements.

Added: 9/20/2007

David Hayden Comments on Soma's LINQ to SQL Performance Post

David Hayden's LINQ To SQL and Visual Studio 2008 Performance Update post of September 28, 2007 mentions that he "would love to hear more about improvements on the way the team is dealing with the chattiness of LINQ To SQL when you are prefetching a graph of objects." So would I, but I'd also like to hear more about the methods used to gather the data for Soma's comparison of LINQ to SQL and DataSet performance.

David's sample query includes two 1:n associations (Customer:Orders and Order:Order_Details) populated by eager loading the Orders collection for each Customer entity and the Order_Details collection for each Order entity:

using (NorthwindDataContext context =
           new NorthwindDataContext())
{
    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Customer>(c => c.Orders);
    options.LoadWith<Order>(o => o.Order_Details);

    context.LoadOptions = options;

    IEnumerable<Customer> customers =
           context.Customers.ToList<Customer>();
}

The preceding code executes 92 queries to load all rows from the three tables into entities and populates my test harness's three bound DataGridView controls in 1.23 seconds. Each query opens and closes a pooled connection.

The reason for 92 queries? According to Scott Guthrie, only one 1:n association is permitted in a single JOIN query to preclude data explosions, as noted in my Clarification of the Object Tracking Problem with LINQ to SQL's Eager Loading Feature post of August 29, 2007.

Note: Deferred loading (the default), more commonly called lazy loading, executes 922 queries: 1 to retrieve all Customer entities, 91 to retrieve Orders for each Customer, and 830 to retrieve Orders with their Order_Details.

Rewriting the sample query as Scott suggested to substitute a m:1 Customer:Order association for the 1:n Customer:Orders association executes two queries to retrieve all entities:

using (NorthwindDataContext context =
           new NorthwindDataContext())
{
    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Order>(c => c.Customer);
    options.LoadWith<Order>(o => o.Order_Details);

    context.LoadOptions = options;

    IEnumerable<Order> orders =
           context.Orders.ToList<Order>();
}

The first query is a simple SELECT query that returns all Customers. The second is a JOIN Query from Hell that returns a row for each Order Details item. This query takes 3.21 seconds to execute and fill the three DataGridViews, which includes a few milliseconds to convert the List<Order> to a filterable BindingListView<Order> that can sync the contents of the Orders DataGridView for the selected Customer entity. (See Support Sorting and Filtering for List<T> Collections of 9/7/2007 for more details on the BindingListView<Order> wrapper.) My code uses ToBindingList<Customer> and ToBindingList<Order> for autogenerated dynamic SQL queries to make all three DataGridViews sortable, which is equivalent to DataSet functionality.

The corresponding typed DataSet loads into the three DataGridViews with three queries in 0.21 seconds.

David also suggests:

Obviously when you get into these scenarios where LINQ To SQL is not providing optimized queries, you can use LINQ To SQL's support for Stored Procedures.

This might be true for very simple queries after RTM when SProcs return the same Table<TEntity> data type as autogenerated dynamic SQL queries but, as Matt Warren noted in response to my Problem Populating a LinqDataSource's EntityRef Association with a Stored Proc issue in the LINQ Project General forum:

There is currently no means by which you can use a single stored procedure to load more than a single type of entity or association at a time. For example, there is no way to write a stored procedure to retrieve a customer and all its orders and map that result to a heirarchy of customer and order objects. You would have one stored procedure for retrieving a customer and a separate procedure for retrieving orders for that customer's ID.

This means that LINQ to SQL executes an SProc for each entity, which generates 923 queries for the three Northwind tables, regardless of whether you eager-load or lazy-load associations.

Until RTM or later CTP bits arrive, SProcs for data retrieval deliver a second-class ISingleResult<T> data type instead of the Table<TEntity> data type with Attach() and Detach() methods. The latter also offers a ToBidingList() method so your DataGridViews are sortable (but not filterable). See Upgrade to LINQ to SQL's WinForm Databinding with Stored Procedures Scheduled for RTM for more details on the forthcoming data type change.

Stored procedures to support server-side paging and correct sorting of GridViews bound to LinqDataSource controls contain ugly as sin T-SQL statements, to be charitable. See Dynamic ORDER BY Clauses for ASP.NET Server-Side Paging with Stored Procedures of 9/18, 2007 for examples.

P.S. See Frans Bouma's comment of 9/29/2007 below about the issues raised here.

Added: 9/28/2007 1500 PST Updated: 9/29/2007 Minor edits, additions and clarifications

Mike Taulty Says "Grouping in LINQ is Weird"

His Grouping in LINQ is weird (IGrouping<K,T> is your friend) post of September 28, 2007 explains the non-public GroupedEnumerable type and the IGrouping<K,T> enumeration to those who haven't yet figured out LINQ's grouping story. (The attempt at "an attention-grabbing headline" would only interest a LINQ junkie.)

Added: 9/28/2007 1530 PST

Chris Buckett Sets Up Unit Tests for an Entity Framework Project

Entity Framework - Part 2 - Setting up a project and getting unit tests working of September 28, 2007 includes a downloadable EF Research project that contains a simple Entity Data Model (EDM) generated from the Northwind Customers table.

The most interesting aspect of Chris's article is his use of Visual Studio 2008 Professional Edition's unit test feature provided by the New Project\Test\Test Project template, which you can substitute for NUnit or TestDriven.net.

As Chris learned in this Unit testing / mocking Entity Framework post in the ADO.NET Orcas forum, "there is no planned way to mock the database access in EF," so he's working on a "mock entity data provider."

Frans Bouma's Heavy into His LINQ to SQL Provider at Day 4

In his Developing Linq to LLBLGen Pro, Day 4 post of September 28, 2007, Frans discusses the hazards of retrofitting commercial code to accommodate a new feature: support for derived tables, which Frans defines in Developing Linq to LLBLGen Pro, Day 2 as the [in-memory] result of execution of SELECT queries in FROM clauses. (This is a narrow view of derived tables, which the SQL 2003 standard defines indirectly as the result of any table subquery.) All modern relational database management systems (RDBMSs) now support derived tables, which are especially useful with aggregate functions. Frans includes sample code for generating a derived table.

Frans next tackles issues with the visitor pattern and observes that Microsoft should have provided a generic visitor class that provider writers could adapt to their database or O/RM tool. It appears that Frans intends to write a giant "switch statement which calls handler routines" instead of a custom visitor class.

Matt Warren, the LINQ to SQL team's chief architect, discusses the visitor pattern and expression trees in LINQ: Building an IQueryable Provider - Part II (Where and Reusable Expression Tree Visitor) and LINQ: Building an IQueryable Provider - Part V (Improved Column Binding).

Note: GoF is the accepted abbreviation for the "Gang of Four," authors of Design Patterns: Elements of Reusable Object-Oriented Software, the seminal work on application of design patterns to software design.

Pablo Castro and Opening Astoria's Door to Additional Data Sources

Pablo's Astoria data sources and system layering post of September 27, 2007 starts by asking and answering, at least in part, "Why does Astoria build on top of the EDM and the Entity Framework?" I'd say that demonstrating the practicality and scalability of the EDM and Entity Framework (EF) in a large-scale data service implementation provided much if not the majority of the initial impetus.

He then goes on to define Astoria's requirements for all data sources and proposes LINQ's IQueryable interface and the use of expression trees for translating Astoria's URI-based HTTP interface into composable (IQueryable) LINQ queries. Pablo cite's Matt Warren's LINQ: Building an IQueryable Provider series as "provid[ing] a great detailed reference" for expression trees. I agree wholeheartedly.

The team also plans to:

[D]efine an interface, something like IUpdatable or whatever names works, that has the basic operations we need to perform in order to handle updates. The interface would have primitive operations for adding a new resource, remove an existing resource, applying modifications to resources and also handle linking/unlinking of resources.

On the whole, the idea of using IQueryable and IUpdatable as the interfaces sounds good to me (and much better than substituting Web3S for POX as a data format, as noted in Deciding Astoria's Final URI Addressing Scheme.)

Soma Somasegar Claims Major Performance Gains for LINQ

Soma Somasegar says the following in his September 27, 2008 VS 2008 Performance Improvements post:

In the VS 2008, we have made some good strides in improving performance for some of the common scenarios that developers care about. We set explicit performance goals both for new product features and for improving the performance of existing features.

For example, with the new LINQ facility we set a goal that LINQ performance be significantly better than using a SqlDataAdaptor for the same query and competitive with using a SqlDataReader, the lightest weight interface available for retrieving data from SQL Server. In our testing, LINQ does in fact out-perform SqlDataAdaptor on almost every test case we tried, and in many of the exceptions, it is no more than 10% slower than using a SqlDataReader to accomplish the same task. Given the power of LINQ, we feel this is a very reasonable trade-off.

My tests with Beta 2 bits don't show that LINQ to SQL outperforms the SqlDataAdapter on any case I tried. In fact, my LINQ to SQL test harnesses show far slower data retrieval time with dynamic SQL statements or stored procedures than TableAdapters. I have the feeling that Microsoft's test results didn't populate associations while hydrating objects. On the other hand, there might have been a miraculous performance improvement post Beta 2.

Watch for my "Optimizing LINQ to SQL Performance" cover article for the November 2007 issue of Visual Studio Magazine where I provide the execution times for a wide variety of tests with bound DataGridViews for Windows forms and GridView/LinqDataSource combinations for Web pages. You'll be able to download the sample code for the four LINQ to SQL test harnesses that use Northwind as the persistence store. The tests weren't intended to compare LINQ to SQL and DataSet performance, but I built a comparable Windows form test harness for typed data sets and included the load time as a reference value. You can drag and drop a similar DataSet test harness in less than 5 minutes.

Note: My "Layer Data Access with LINQ to SQL" article from the July 2007 issue covered the basics of LINQ to SQL with Beta 1 code.

I also plan to rerun my timing tests for conventional and compiled LINQ to SQL queries with Beta 2 bits to see if I show the gains reported by Rico Mariani in his and my previous posts:

Stay tuned for updates.

Update 9/29/2007: Minor fixes and additions.

Mike Griffen Explains Why He's Lukewarm about Support for LINQ in EntitySpaces: Performance

The .NET Rocks! team released on September 27, 2007 a one-hour interview with Mike Griffen, the creator of MyGeneration and EntitySpaces. MyGeneration is an open-source (BSD licensed) codegen and O/RM tool for .NET that's available for download from SourceForge. EntitySpaces is a commercial persistence layer and business object generator/manager for .NET that supports seven popular relational database management systems. According to Mike, EntitySpaces competes with LLBLGen, NHibernate, and other popular O/RM tools for .NET.

The EntitySpaces Team Blog EntitySpaces and LINQ - An Introduction - Part 1 post of June 18, 2006 (updated September 30, 2006) states: "As LINQ nears its official release we will look at translating the LINQ IQueryable<T> expression trees into our query API." Subsequent EntitySpaces versions' LINQ support was limited to querying previously loaded ObjecNameCollection lists, which implement IEnumerable.

Mike claims that his dynamic query API is very similar to LINQ (16:28). Here's his explanation (18:00):

LINQ is very similar to what we're doing. They have a different syntax than we do. I've played around with it quite a bit, actually. I find our syntax more intuitive, especially when were doing joining and stuff like that. LINQ is very powerful, though. You can LINQ into our collections; we've supported that for a long time.

But we, basically, feel that we're not going to spend a lot of time right now trying to create LINQ providers for EntitySpaces, because LINQ is basically really supported on SQL Server [only]; some other databases are working towards that, but there aren't [other] LINQ providers or good LINQ implementation guidelines for actually creating a LINQ provider.

However, LINQ's performance seems to be Mike's primary concern. Mike says in the interview at 49:20:

From what I gather, [LINQ] uses a lot of reflection. I hear a lot about performance in ADO.NET, the Entity Framework and stuff. So I hear a lot about binding performance, about slowness. ...

From an EntitySpaces perspective I'm not panicked or really worried about where LINQ goes. We're going to support a lot of the LINQ stuff, we'll take advantage of it where we can.

It sounds to me as if Mike won't be following Frans Bouma's lead in translating LINQ expression trees into his O/RM tool's native query language after all.

Added: 9/29/2007 Updated: 10/24/2007 (Fixed Mike Griffen's name in the title)

Ambition Attempts LINQ for Ruby

Java has Quaere so Ruby must deserve a LINQ clone that enables writing queries in Ruby instead of SQL. Welcome to Chris Wanstrath's Ambition project. Ambition queries behave like LINQ queries, that is, they defer execution until you invoke a kicker method, such as detect, each, each_with_index, map and first (with no argument). Examples in the initial Full of Ambition blog of August 30, 2007 use the ActiveRecord "logging hack" to display the SQL for simple User.select, User.first, User.sort_by, User.detect, etc., expressions. There's some similarity to EntitySpaces' string-based query syntax.

As of September 11, 2007's Even More Ambitious post, Ambition's developers had "moved our sights from Rack to LINQ." Ambition has the ambition "to be a query language for SQL, LDAP, XPath, the works."

Anders Noras, the developer of Quaere, describes Ambition's similarities and differences in Ambition, Quaere, Closures and Other Random Thoughts of September 27, 2007, from which I learned of Ambition's existence.

Added: 9/28/2007 1630 PST

Technorati Tags: Astoria Project, LINQ to SQL, LINQ, Entity Data Model, Entity Framework, MyGeneration, EntitySpaces, DataSets, Unit Testing, Visual Studio Magazine, Quaere, Ambition, LINQ for Java, LINQ for Ruby

Sunday, September 23, 2007

Deciding Astoria's Final URI Addressing Scheme

Mike Flasko and Pablo Castro of the Astoria Project team posted URI Format - Part 1 - Addressing resources using URI path segments on September 21, 2007. This post, which was converted from a subitem in the LINQ and Entity Framework Posts for 9/21/2007+ post covers the first of three goals:

Provide a mechanism to point to every resource or member of a resource in the system. That is, every piece of data is addressable, and the URI used to address it needs to be derivable from the service metadata which describes the conceptual model of the system

Future Astoria Team posts will tackle new syntax for simple queries and DML.

The Original Addressing Scheme

Julie Lerman explained the current URI scheme for retrieving entity sets, entities, and members in her What the heck is Astoria? post as similar to the following:

http://localhost:1544/Northwind.svc retrieves the list of the service's entities
http://localhost:1544/Northwind.svc/Customers retrieves a list of the members of the Customer entity with their key value(s) enclosed by square brackets
http://localhost:1544/Northwind.svc/Customers[ALFKI] retrieves the member whose key value is 'ALFKI'
http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders retrieves the Orders entity set for 'ALFKI'
http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643] retrieves Order number 10643
http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Shippers retrieves the Shipper for Order 10643, Shippers[3] (Federal Shipping)
http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Shippers[3]/Orders retrieves all Orders sent by Federal Shipping (equivalent to http://localhost:1544/Northwind.svc/Shippers[3]/Orders

and so on. I attribute the failure of http://localhost:1544/Northwind.svc/Customers[ALFKI]/Orders[10643]/Order_Details to retrieve records to an early lack of a defined syntax for composite keys, which might logically be [10643][52] or the like for OrderID/ProductID. (Early Entity Framework CTPs couldn't handle composite primary keys).

Proposed Changes to the Original Syntax

Square brackets: I'm not even close to being an expert in designing URI addressing schemes but the original URI path segments looked good to me because of their simplicity and consistency. Square brackets correspond to array indices in SOAP encoding (and C#, of course).

However, Mike and Pablo said:

The May 2007 CTP used values in square-brackets (e.g. “…/Customers[ALFKI]”). We got “generous” feedback saying that square-brackets were a bad choice.

My Google search on 'Microsoft Astoria "square brackets"' didn't turn up obvious complaints and Mike/Pablo didn't explain why "square-brackets were a bad choice."

Web3S vs. POX: Replacing POX with Web3S, which doesn't ring my bell, requires syntax that supports heterogeneous sets that Astoria doesn't deliver. Thus Mike and Pablo propose two approaches to retrieve the same entity:

Full-form (for Web3S compatibility): ...Customers/Customer('ALFKI'), which specifies a Customer entity within the Customers entity set.
Short-form (Astoria standard syntax): ...Customers!'ALFKI', which infers the Customer entity.

Comments:

The syntax provides two methods of addressing for a particular entity so is "a redundant method for getting to it." This was rejected for specifying the default container in the URI.
Unless there are plans for Astoria to somehow deliver heterogeneous sets, the long form should be abandoned as unnecessarily verbose.
Quotes around key strings are an SQL artifact that don't belong in the URI. If meaningful characters (such as /) are to be escaped in literals, there's no reason for quoted-string alphanumeric key values.
Parentheses imply an array index (to VB coders), while the bang (!) operator traditionally separates the names of a collection and its members. If there's a valid reason not to use square brackets to set off key values, parentheses are better than the bang operator.

If it's absolutely necessary to support heterogeneous set syntax, reserve the bang operator for it's traditional use, as in ...Customers!Customer(ALFKI) or substitute a virgule (/) and allow qualifying the default container name.

Literal forms of composite keys: If you require escaping meaningful characters within literals, commas (, = %2C) are meaningful within URIs and composite keys so escape them like you would whacks (/ = %2F) or spaces ( = %20). (Who would include a comma in a key value anyway?) If you prohibit spaces in key values, why not prohibit commas or other URI-reserved characters?

Names for key values: The names might be significant if the key values represent natural keys, but if they're surrogate keys I see no reason to name them. Is there a reason key names can't be made optional if position or name is significant?

Comment: The appearance of composite key names and values indicates to me that current EF bits must support surfacing foreign key values, which I requested some time ago. The need to eager- or lazy-load an entity to retrieve a natural key value only is an unnecessary waste of resources.

Persistence Ignorance is Good but Syntax Ignorance is Bad

"Should we assume that consumers of Astoria URIs understand their syntax?" Yes. Consumers are required to know the container name and the .svc extension. If so, they should have some inkling of the remainder of the URI. How about $syntax as a substitute for /? for text-encoded help?

Syntax for Self-Joins

Currently the link syntax to a many:one association resulting from a self-join, such as the Employees table's Employees:ReportsTo relationship is:

which doesn't express the relationship in a meaningful way.

Web3S Redux

I've just reread Sam Ruby's and Tim Bray's June 2007 posts on Web3S. Both Sam and Tim point out that Web3S is intended to be “the central data store in Windows Live for address book information. All Hotmail contacts, Messenger buddies and Spaces’ friends are recorded in Live Contacts." Astoria has no association whatsoever with Live Contacts and should not be saddled with its redefinition of the XML Infoset and lack of a schema.

Web3S is not an IETF or W3C standard and probably never will be. Astoria lends no credence to Web3S as an "industry standard." Web3S requires support for a new HTTP verb, UPDATE. The chance of UPDATE being added to the HTTP protocol and of any network admin allowing it through a corporate firewall is infinitesimal. How many business networks permit PUT and DELETE from the outside (or inside)? Not many.

Tim suggests LDAP as a more appropriate protocol. Having had some experience with LDAP in the DSMLv2 and DSfW Beta 1 era, I don't see how anyone could call LDAP RESTful.

Technorati Tags: Astoria Project, Entity Framework, Entity Data Model, Entity SQL, LINQ to Entities, LINQ, REST, Data in the Cloud

Friday, September 21, 2007

Problems Using Stored Procedures for LINQ to SQL Data Retrieval (Updated)

This is an updated summary of my experiences when using stored procedures for retrieving data to populate (hydrate) LINQ to SQL entities with stored procedures (SProcs).

Many issues reported in the original Problems Using Stored Procedures for LINQ to SQL Data Retrieval post of September 8, 2007 have workarounds or are scheduled for a fix before RTM. Remaining problems and workarounds have been reconfirmed in Visual Studio 2008 Beta 2 with one or more of six ADO.NET and ASP.NET test harnesses. The test harnesses accompany my forthcoming Leverage LINQ in ASP.NET Projects report for Wrox/Wiley (downloadable content, ISBN: 978-0-470-25910-8) and "Optimize LINQ to SQL Performance" article for the November 2007 issue of Visual Studio Magazine. The test harnesses will be available for download upon purchase and publication, respectively.

Why Use Stored Procedures?

There are two primary reasons for using SProcs to hydrate LINQ to SQL (or Entity Data Model) entities:

Your organization's DBAs won't give users or applications SELECT privileges on production SQL Server tables.
You want to control the number of records to populate EntitySet associations with a Top(n) limit or date-based criteria using eager or lazy loading. This option isn't available directly with autogenerated dynamic SQL.

Stored procedures offer no significant performance or security advantages over the parameterized prepared statements generated by LINQ to SQL (or the Entity Framework) and executed by the sp_executesql SProc.

Problems with Stored Procedures in ADO.NET and ASP.NET Projects

The following issues relating to use of SProcs for data retrieval and updates affect both .NET 3.5 Windows and Web forms:

You can't inhibit sending T-SQL statements to the server when you execute LINQ queries against Table<TEntity> (DataContext.TableName) objects. However, LINQ queries against BindingList<T>, collections created from Table<TEntity> objects operate on in-memory instances so they don't send T-SQL statements.
You must execute a stored procedure to retrieve each EntitySet or EntityRef object with eager or lazy loading. However, it's possible to preload EntitySets as described in the Save Server Round Trips by Preloading LINQ to SQL EntityRefs post.
You must override in a partial class the default functions that retrieve EntitySet and EntityRef objects to populate associations. The syntax for the required LoadEntitySet and LoadEntityRef overloads is undocumented as of this post's date.
You don't gain the benefit of minimal network traffic for updates by sending only the data for column(s) that change plus that required to support optional optimistic concurrency control (preferably a single timestamp field.)
Lazy-loading specified properties by setting their Delay Loaded property value to True in the O/R Designer doesn't work with SProcs.
The graphical O/R Designer doesn't support stored procedures that return multiple resultsets, so they require you to write a substantial amount of custom code.

In Beta 2 and earlier, SProcs return ISingleResult<T> instead of Table<TEntity> objects. ISingleResult<T> objects are second-class objects that don't provide important Table<TEntity> methods, such as Attach(), Detach() and ToBindingList(). Applying the ToList() method to data-bind BindingSource components or DataGridView controls results in non-sortable data. The LINQ to SQL team has committed to a fix for this problem by RTM (see Upgrade to LINQ to SQL's WinForm Databinding with Stored Procedures Scheduled for RTM.) It's not known if the change will have any affect on the ASP.NET LinqDataSource control.

Problems with Stored Procedures That Affect ASP.NET Projects Only

SProc issues affecting ASP.NET projects that use GridView controls bound to LinqDataSource controls primarily relate to server-side paging and sorting:

The LinqDataSource control supports server-side paging with ad-hoc sort column sequence and direction when you retrieve data with dynamic SQL. Server-side paging requires SQL Server 2005 or later. Substituting SPprocs requires writing complex CASE statements to create dynamic ORDER BY clauses.
Server-side sorting by EntityRef values leads to exceedingly complicated logic in the SProcs(see Dynamic ORDER BY Clauses for ASP.NET Server-Side Paging with Stored Procedures of 9/18/2007.)

Here's a screen capture of the OrdersLinqDataSourceSP.sln test harness for pre-loading EntityRef objects and server-side sorting (by Employee.LastName) with a LinqDataSource control:

Click image for full-size screen capture.

The number of issues is greatly reduced from that of September 8 because of the forthcoming ISingleResult<T> fix and validation of workarounds for other initial problems.

Updated 10/1/2007: Added item #5 to "Problems with Stored Procedures in ADO.NET and ASP.NET Projects" topic.

Technorati Tags: LINQ to SQL, LINQ, DataGridView, GridView, LinqDataSource, Stored Procedures, Dynamic Queries

LINQ and Entity Framework Posts for 9/21/2007+

Note: Items about the Frans Bouma and Pablo Castro articles that were entered on 9/28/2007 have been moved to LINQ and Entity Framework Posts for 9/28/2007+.

Guy Burstein Compares LINQ to SQL's Attribute- and XML-Based Mapping

Guy's September 23, 2007 Linq to SQL Attribute Based Mapping post avoids the O/R Designer and leads you through the process of writing your own LINQ to SQL class files with [Table(Name="TableName")], and [Column(Name="ColumnName"), IsPrimaryKey=true, IsDbGenerated=true), CanBeNull=false] attributes.

His September 27, 2007 Linq to SQL XML Based Mapping post eliminates the need for decorating classes with attributes by writing an XML mapping schema document and loading an XmlMappingSource with instructions like:

XmlMappingSource mapping = XmlMappingSource.FromUrl("NorthwindMap.xml")
DataContext ctx = new DataContext(connectionString, mapping);
var query = from order in ctx.GetTable<Order>()
where order.CustomerId == "ALFKI"
select order;
foreach (Order order in query)
{
Console.WriteLine(order.Id + " " + order.OrderDate + " " + order.CustomerId);
}

XML mapping files skip business object classes altogether and let you execute LINQ to SQL queries against temporary Table<TEntity> objects directly.

Charlie Calvert Tackles Anonymous Types from C# LINQ Queries

Charlie's September 26, 2007 Anonymous Types in Query Expressions is an elementary discussion of anonymous types with samples from C# query expressions with the following topics:

Using Anonymous Types
Returning a Particular Class from a Query Expression
Querying a Class You Can't Modify

Rick Strahl Designs Business Objects Over LINQ to SQL

Rick's September 26, 2007 LINQ to SQL Queries out of a Business Layer post describes a simple business object with methods that return the IQueryable<EntryEntity> type for execution in the LINQ to SQL data layer. This enables the UI layer to qualify the query with Where and/or OrderyBy operators that will be reflected in the IEnumerable<EntryEntity> result.

Danny Simmons and Persistence Ignorance Redux

If there's an Ignorance Quotient for Persistence Ignorance (PIQ?), you'd assume that 100% would indicate an object/relational mapping (O/RM) tool supporting business objects with no knowledge of or interest in their persistence mechanism. In that case, 0% would represent total dependence on a particular database for persistence, proprietary object attributes tied to the O/RM tool, and onerous base classes from which all objects must inherit.

Danny Simmons' EF Persistence Ignorance Recap post of September 26, 2007 provides a summary the current state of the Entity Frameworks PI for version 1.0, which the team expects to release in the first half of 2008. I'd give it a PIQ of about 20.

Danny also includes links to his previous articles on the topic. The primary OakLeaf PI post is Persistence Ignorance Is Bliss, but Is It Missing from the Entity Framework? of March 7, 2007. This post explains the origin and definition of PI and the controversy that erupted from the "NHibernate Mafia" at Microsoft's Most Valuable Professionals Summit held in Seattle and Redmond on March 12-15, 2007.

Chris Buckett Starts an Entity Framework Series

UK Developer Chris Buckett has started the series with an illustrated Entity Framework, Part 1 - The high level overview of September 25, 2007.

Chris also is a proponent of VistaDB, as he notes in this earlier VistaDB - .net managed database item. VistaDB has < 1MB file system footprint and is compatible with Visual Studio 2008 Beta 2.

Four More Astoria Posts by Julie Lerman

Julie is proving to be a fountain of information on little-known and underdocumented aspects of Astoria. Following are her last three posts of Saturday, September 22, 2007, in chronological order:

More fun with Astoria - random queries in the browser

Astoria is sick (as in SLICK) when it comes to DML!

Trying to see what Astoria messages look like in the pipe

In her Help the Astoria team decide on URI syntax DevLife post of the same day, Julie encourages folks to contribute to the Astoria Team's search for a new URI syntax compatible with Web3S.

Deciding Astoria's Final URI Addressing Scheme

Taking Julie up on her request, I expanded the Deciding Astoria's Final URI Addressing Scheme item and moved it from here to its own URI.

Astoria Quick-Start from [Guess Who?]

Julie just posted a lavishly illustrated step-by-step QuickStart for building an Astoria data service from Northwind. She ran into a problem with ASPNETCompatibiltyEnabled in the web.config file, so her demo runs under Windows XP SP2. I'm trying it in Vista to see if I run into the same snag. [Unfortunately, my current Vista VMs are contaminated by previous Jasper installs.]

Julie Lerman to Present Sessions on Astoria at REMIX Boston and New England Code Camp

Julie's What the heck is Astoria? post of September 21, 2007 shows the basics of RESTful, URI-based data retrieval from a local Astoria Web Data Service. As she notes, the Web Data Service looks like a WCF Service and uses the Entities class from your Entity Data Model as its data source. Julie's presenting a session on Astoria at REMIX Boston 07 (10/8-9/2007) and at the New England Code Camp (9/29-30/2007).

Her resultsets are formatted as plain-old-XML (POX), which will be replaced by Microsoft's new Web3S XML format in a later CTP. Silverlight 1.1 has an Astoria client library; XML-formatted data is one reason that a future Silverlight version will include LINQ to XML.

Astoria also provides RESTful update semantics with HTTP PUT and POST operations. Here's a copy of Leonard Richardson's table of HTTP verbs as they apply to Astoria entities:

Resource	GET	POST	PUT	DELETE
EntitySet List	X
EntitySet (Collection)	X	Create a New Entity
Virtual EntitySet	X
Entity (Member)	X		X	X
Scoped EntitySet	X	Create an Association between Two Entities
Association	X		X	X

The preceding LINQ and Entity Framework Posts for 9/14/2007+ post has links to additional Astoria content. My original post on Astoria, "Astoria" Enables RESTful Data Services of April 30, 2007, has links to several related articles and blog posts from MIX07.

RTM Change to LINQ to SQL's Default XML Property Type Mapping

Dinesh Kulkarni's Beta2 to RTM change: XML column default mapping changed to XElement instead of XDocument post of September 20, 2007, the default data type for entity properties mapped to SQL Server 2005+ columns of the xml data type will be XElement. Dinesh's post explains:

Reason: due to a late change in LINQ to XML, XDocument cannot be serialized with the DataContract serializer. XElement can still be serialized. We wanted to make sure that if you choose the serialization option in SqlMetal or designer, the resulting classes are indeed serializable. Hence, we have changed the default to XElement.
However, you can still choose to use XDocument explicitly in the designer or by editing the dbml (or in your favorite editor or separate mapping tool). It is just not the default anymore.

This is a breaking change. The following two forum posts noted in the LINQ and Entity Framework Posts for 9/7/2008+ item also describe breaking changes at RTM:

LINQ to SQL Beta2 to RTM Key Changes (August 27, 2007)
Beta2 to RTM Changes in Attach() Behavior (September 11, 2007)

Moving to LINQ to SQL RTM Will Require Design-Time Class Refresh

In the same September 20, 2007 post, Dinesh warns:

When moving from beta2 to RTM, please make sure that you regen the classes from your database/dbml using designer or SqlMetal. While we would have liked to maintain "generated source compatibility", that has not been possible in going from beta2 to RTM. I expect problems if code generated with beta2 is used against RTM version of System.Data.Linq.dll.

Note: Code added to accommodate differences between ISingleResult<T> and Table<TEntity> types returned when retrieving persisted data with stored procedures (see below), such as ToList<T> invocations, probably will continue to work but should be removed.

Changes in Store by RTM for LINQ to SQL's WinForm Databinding with Stored Procedures

My Upgrade to LINQ to SQL's WinForm Databinding with Stored Procedures Scheduled for RTM of September 19, 2007 describes the change from the ISingleResult<T> to the DataQuery<TEntity> type returned to the DataContext object by stored procedures that retrieve data. This means that populating DataContexts with stored procs will be closer to on par with dynamic SQL when Microsoft releases Visual Studio 2008. You'll probably hear more about this change from Dinesh Kulkarni in the near future. In the meantime, I'm anxious to hear from Scott Guthrie how the change will affect the LinqDataSource.

You can expect updates to (or a rewrite of) this and my earlier Problems Using Stored Procedures for LINQ to SQL Data Retrieval post shortly. The first update to this post, an added "DataQuery<T> and Table<TEntity> vs. ISingleResult<T> and List<T> Collections" section, is ready now.

This does not appear to be a breaking change but it will affect the LinqDataSource. (Updated 9/19/2007. Repeated because of its importance.)

Early Version of LINQ Provider for the db4o Object Database

Luciano Di Cocco has written "very primitive and limited implementation of a LINQ provider for db4o" called, unsurprisingly, LINQ to db40. db4o is an open-source object database that has Java and .NET implementations. including the .NET Compact Framework.

Technorati Tags: Astoria Project, LINQ to SQL, LINQ to XML, LinqDataSource, LINQ, SQL Server 2005, VS 2008 Beta 2, VS 2008 RTM, Entity Framework, Entity Data Model, VistDB, Persistence Ignorance

Wednesday, September 19, 2007

Upgrade to LINQ to SQL's WinForm Databinding with Stored Procedures Scheduled for RTM

According to an email I received late last night from Dinesh Kulkarni, LINQ to SQL WinForm databinding will change by RTM for data retrieval with stored procedures. Dinesh slipped the upgrade into the Visual Studio 2008 development schedule one day before change lockdown.

What the change will do:

Databinding with sprocs will become like that with the DataQuery<T> implementation for dynamic SQL when you specify the sproc's Return Type property value as an entity in the O/R Designer's Methods pane. (Or drag the sproc from Server Explorer and drop on the entity widget in the main Designer pane.)
Your DataContext objects will have Table<TEntity> properties that behave like those from data retrieval with dynamic SQL.
You'll be able to sort DataGridView columns without using a third-party IBindingListView implementation like that I described in my Support Sorting and Filtering for List<T> Collections post of September 7, 2007. Table<TEntity> has a GetNewBindingList() method.

What the change won't do:

Dinesh says the requirement to override dynamic T-SQL for INSERT, UPDATE, and DELETE operations with the Configuration Behavior dialog won't change.
He also says the need to write LoadEntityName method overrides to easy- or eager-load EntityRefs and EntitySetss, as described in the "Specific Problems with ASP.NET Server-Side Paging and EntityRefs" section of my Problems Using Stored Procedures for LINQ to SQL Data Retrieval post, will remain.
The upgrade won't relieve the need to make a server round-trip for each EntityRef and EntitySet, as noted by Matt Warren in his last answer to my Problem Populating a LinqDataSource's EntityRef Association with a Stored Proc post to the LINQ Project General forum.
You'll still need to write custom stored procedures to handle ad-hoc sorting of GridView columns with the LinqDataSource's server-side paging feature.
Applying a filter to or enabling advanced (multi-column) sorting a DataGridView bound to a BindingSource component will still require an IBindingListView wrapper.

What remains to be resolved and/or disclosed:

It's not clear at this point what effect the upgrade will have on ASP.NET's LinqDataSource. I've sent a message to Scott Guthrie requesting clarification.
LINQ queries executed against Table<TEntity> instances generate dynamic SQL, as Matt Warren notes in his two answers in the Re: SPROCS, multiple results, and associations thread of 9/12/2007. It would be preferable to direct LINQ queries to the Table<TEntity> instances rather than the data store when the data source is a sproc. Hopefully, Dinesh will address this later.

I'm sure I'll have other questions as I recreate my Problems Using Stored Procedures for LINQ to SQL Data Retrieval post to account for the forthcoming changes.

DataQuery<T> and Table<TEntity> vs. ISingleResult<T> and List<T> Collections

DataQuery<T> is an internal sealed class (Friend NotInheritable Class in VB), so you won't see it in Object Browser. Lutz Roeder's .NET Reflector app shows that DataQuery<T> implements IOrderedQueryable<T>, IQueryable<T>, IQueryProvider, IEnumerable<T>, IOrderedQueryable, IQueryable, IEnumerable, and IListSource:

Executing a DataQuery<T> instance, which represents an expression tree, attaches a Table<TEntity> type to the active DataContext object. Only Table<TEntity> types support EntityRef<TEntity> and EntityType<TEntity> associations.

Note: Frans Bouma, creator of the LLBLGenPro commercial Object/Relational Mapping (O/RM) tool, made the wise decision on Developing Linq to LLBLGen Pro, Day 1 to implement DataQuery<T> as LLBLGenProQuery<T> for his LINQ to LLBLGenPro implementation. (Frans is a proponent of dynamic SQL for O/RM applications, as you can see by reading his Stored procedures are bad, m'kay? post. Frans' Yay! A new Stored Proc vs. Dyn. Sql battle! item also has links to other posts, both pro and con.)

It's important to note that LINQ to SQL's IdentityManager recognizes Table<TEntity> types while they're attached to the active DataContext object. SPs return either ISingleResult<T> or IMultipleResult<T> collections, which are lightweight types that implement IEnumerable<T> but differ considerably from Table<TEntity> collections. IdentityManager has no knowledge of objects contained in ISingleResult<T> or elements of IMultipleResult<T> collections.

Table<TEntity> collections implement IQueryable<T>, IQueryProvider, IEnumerable<T>, ITable, IQueryable, IEnumerable, and IListSource. ITable exposes the Attach, AttachAll, GetModifiedMembers, and GetOriginalEntityState methods that ISingleResult<T> doesn't define. The GetModifiedMembers and GetOriginalEntityState methods are needed to support change tracking and optimistic concurrency checking. The GetNewBindingList method returns an IBindingList that supports sorting and change notification by bound DataGridView controls.

ISingleResult<T> collections only mplement IEnumerable<T>, IEnumerable, and IFunctionResult and IDisposable. Databinding features of IEnumerable<T> types are very limited; if the LINQ query returns a projection (an anonymous type), no data binding occurs; if the query returns the concrete source data type, such as Customer, grids can't add or delete records. So it's a common practice to apply the ToList<T> method to the ISingleResult<T> object to enable data binding to a List<T> collection.

List<T> generic collections implement IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable. IList<T> is a decendant of ICollection<T>, which provides Add, Remove, Clear, Contains and CopyTo methods and the Count property to improve the data binding experience. However, the List<T> collection is no match for the IBindingList type from the Table<TEntity> collection that supports sorting and change notification by bound DataGridView controls. (Updated 9/19/2007 17:00 PDT)

Technorati Tags: LINQ to SQL, LINQ, Store Procedures, WinForms, LinqDataSource, DataBinding, DataGridView

Tuesday, September 18, 2007

Dynamic ORDER BY Clauses for ASP.NET Server-Side Paging with Stored Procedures

My "compulsive obsession" to bring SQL Server 2005 stored procedures to the user-experience equivalent of dynamic T-SQL queries with a GridView bound to a LinqDataSource—regardless of the development effort required—continues.

When hydrating LINQ to SQL entities, I finally got server-side (incremental) paging to work properly with a LinqDataSource, bound DataGrid view and a stored procedure derived from a sample query. (See the introduction and "Attempts to Fix the Server-Side Paging Problems" sections of Problems Using Stored Procedures for LINQ to SQL Data Retrieval from 9/8/2007, updated 9/17/2007.) Here's a screen capture of the Orders test harness displaying 10 of the 122 Northwind Orders from customers in the USA:

Click capture to display full-size image in another window.

It was obvious that ad-hoc sorting wasn't behaving as expected for properties other than OrderID. For example, the preceding screen capture shows a descending sort on the Customer.CompanyName property. The sort's scope is the 10 rows displayed; the GridView sorts the rows. The same is true for ad-hoc sorts on all other properties except OrderID.

A quick review of a typical dynamic T-SQL query for page 2 with the simplest case, an ascending sort on the Freight property, showed the syntax required to increase the scope of the sort to all entities in the set:

exec sp_executesql N'SELECT TOP 10 [t2].[OrderID], [t2].[CustomerID],
    [t2].[EmployeeID], [t2].[OrderDate], [t2].[RequiredDate],
    [t2].[ShippedDate], [t2].[ShipVia], [t2].[Freight], [t2].[ShipName],
    [t2].[ShipAddress], [t2].[ShipCity], [t2].[ShipRegion],
    [t2].[ShipPostalCode], [t2].[ShipCountry]
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [t0].[Freight], [t0].[OrderID]
        DESC) AS [ROW_NUMBER], [t0].[OrderID], [t0].[CustomerID],
        [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate],
        [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName],
        [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion],
        [t0].[ShipPostalCode], [t0].[ShipCountry]
    FROM [dbo].[Orders] AS [t0]
    LEFT OUTER JOIN [dbo].[Customers] AS [t1]
        ON [t1].[CustomerID] = [t0].[CustomerID]
    WHERE [t1].[Country] = @p0
    ) AS [t2]
WHERE [t2].[ROW_NUMBER] > @p1
ORDER BY [t2].[Freight], [t2].[OrderID] DESC',N'@p0 nvarchar(3),
    @p1 int',@p0=N'USA',@p1=10

The red ORDER BY expression for the ROW_NUMBER() function shows the syntax for creating the rowset from which page 2 (or greater) is obtained. Notice that both Freight and OrderID fields are included in the sort expression. The green expression is a holdover from the page 1 code and isn't required by the preceding example.

It's a better approach to use an explicit query than run sp_executesql to conditionally execute string-based queries from within the stored proc because of possible problems with caching the query execution plan and the brittleness of unchecked query syntax.

Garth Wells' demonstrated how to write CAST expressions to customize ORDER BY expressions in his 2001 Dynamic ORDER BY article; a fix by "Wildthing" in the comments thread accommodates different datatypes. Here's the stored proc code to handle various single-parameter sorts:

CREATE PROCEDURE dbo.usp_GetOrdersByCountryPagedAndSorted(
    @Country nvarchar(15) = 'USA',
    @MaximumRows int = 20,
    @StartRowIndex int = 0,
    @SortCol nvarchar(128) = 'OrderID',
    @SortDir nvarchar(4) = 'DESC')
AS
SET NOCOUNT ON
SELECT TOP(@MaximumRows) t2.OrderID, t2.CustomerID, t2.EmployeeID,
    t2.OrderDate,     t2.RequiredDate, t2.ShippedDate, t2.ShipVia,
    t2.Freight, t2.ShipName, t2.ShipAddress, t2.ShipCity, t2.ShipRegion,
    t2.ShipPostalCode, t2.ShipCountry
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY
        CASE -- Numeric types ascending
            WHEN @SortCol = 'OrderID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'EmployeeID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'ShipperID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'Freight' AND @SortDir = 'ASC'
                THEN t0.Freight
            WHEN @SortCol = 'Employee.LastName' AND @SortDir = 'ASC'
                THEN t0.EmployeeID
            WHEN @SortCol = 'Shipper.CompanyName' AND @SortDir = 'ASC'
                THEN t0.ShipVia
        END ASC,
        CASE -- Numeric types descending
            WHEN @SortCol = 'OrderID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'EmployeeID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'ShipperID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'Freight' AND @SortDir = 'DESC'
               THEN t0.Freight
        END DESC,
        CASE -- Character types ascending
            WHEN @SortCol = 'CustomerID' AND @SortDir = 'ASC'
                THEN t0.CustomerID
            WHEN @SortCol = 'ShipName' AND @SortDir = 'ASC'
                THEN t0.ShipName
            WHEN @SortCol = 'Customer.CompanyName' AND @SortDir = 'ASC'
                THEN t0.CustomerID
        END ASC,
        CASE -- Character types descending
            WHEN @SortCol = 'CustomerID' AND @SortDir = 'DESC'
                THEN t0.CustomerID
            WHEN @SortCol = 'ShipName' AND @SortDir = 'DESC'
                THEN t0.ShipName
            WHEN @SortCol = 'Customer.CompanyName' AND @SortDir = 'DESC'
                THEN t0.CustomerID
        END DESC,
        CASE -- Date types ascending
            WHEN @SortCol = 'OrderDate' AND @SortDir = 'ASC'
                THEN t0.OrderDate
            WHEN @SortCol = 'RequiredDate' AND @SortDir = 'ASC'
                THEN t0.RequiredDate
            WHEN @SortCol = 'ShippedDate' AND @SortDir = 'ASC'
                THEN t0.ShippedDate
        END ASC,
        CASE -- Date types ascending
            WHEN @SortCol = 'OrderDate' AND @SortDir = 'DESC'
                THEN t0.OrderDate
            WHEN @SortCol = 'RequiredDate' AND @SortDir = 'DESC'
                THEN t0.RequiredDate
            WHEN @SortCol = 'ShippedDate' AND @SortDir = 'DESC'
                THEN t0.ShippedDate
        END DESC
        ) AS ROW_NUMBER, t0.OrderID,
        t0.CustomerID, t0.EmployeeID, t0.OrderDate, t0.RequiredDate,
        t0.ShippedDate, t0.ShipVia, t0.Freight, t0.ShipName, t0.ShipAddress,
        t0.ShipCity, t0.ShipRegion, t0.ShipPostalCode, t0.ShipCountry
    FROM dbo.Orders AS t0
    LEFT OUTER JOIN dbo.Customers AS t1 ON t1.CustomerID = t0.CustomerID
    WHERE Country = @Country
    ) AS t2
WHERE t2.ROW_NUMBER > @StartRowIndex
SET NOCOUNT OFF

Here's a screen capture of a later version of the test harness that sorts all columns with the preceding stored procedure. The ORDER BY clause is CustomerID DESC, OrderID DESC with many instances of ASC, ASC, ASC, ... DESC, DESC, DESC, which the query parser appears to ignore.

Click capture to display full-size image in another window.

The basic CAST approach gives the same sort sequence for CustomerID vs. Customer.CompanyName, but different sequences for EmployeeID vs. Employee.LastName and ShipVia vs. Shipper.CompanyName. Here's the modified stored proc code (red) to sort the entire data set on EntityRef values:

ALTER PROCEDURE dbo.usp_GetOrdersByCountryPagedAndSortedExt(
    @Country nvarchar(15) = 'USA',
    @MaximumRows int = 20,
    @StartRowIndex int = 0,
    @SortCol nvarchar(128) = 'OrderID',
    @SortDir nvarchar(4) = 'DESC')
AS
SET NOCOUNT ON
SELECT TOP(@MaximumRows) t2.OrderID, t2.CustomerID, t2.EmployeeID,
    t2.OrderDate,     t2.RequiredDate, t2.ShippedDate, t2.ShipVia,
    t2.Freight, t2.ShipName, t2.ShipAddress, t2.ShipCity, t2.ShipRegion,
    t2.ShipPostalCode, t2.ShipCountry
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY
        CASE -- Numeric types ascending
            WHEN @SortCol = 'OrderID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'EmployeeID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'ShipperID' AND @SortDir = 'ASC'
                THEN t0.OrderID
            WHEN @SortCol = 'Freight' AND @SortDir = 'ASC'
                THEN t0.Freight
        END ASC,
        CASE -- Numeric types descending
            WHEN @SortCol = 'OrderID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'EmployeeID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'ShipperID' AND @SortDir = 'DESC'
               THEN t0.OrderID
            WHEN @SortCol = 'Freight' AND @SortDir = 'DESC'
               THEN t0.Freight
        END DESC,
        CASE -- Character types ascending
            WHEN @SortCol = 'CustomerID' AND @SortDir = 'ASC'
                THEN t0.CustomerID
            WHEN @SortCol = 'ShipName' AND @SortDir = 'ASC'
                THEN t0.ShipName
            WHEN @SortCol = 'Customer.CompanyName' AND @SortDir = 'ASC'
                THEN Customer.CompanyName
            WHEN @SortCol = 'Employee.LastName' AND @SortDir = 'ASC'
                THEN Employee.LastName
            WHEN @SortCol = 'Shipper.CompanyName' AND @SortDir = 'ASC'
                THEN Shipper.CompanyName
        END ASC,
        CASE -- Character types descending
            WHEN @SortCol = 'CustomerID' AND @SortDir = 'DESC'
                THEN t0.CustomerID
            WHEN @SortCol = 'ShipName' AND @SortDir = 'DESC'
                THEN t0.ShipName
            WHEN @SortCol = 'Customer.CompanyName' AND @SortDir = 'DESC'
                THEN Customer.CompanyName
            WHEN @SortCol = 'Employee.LastName' AND @SortDir = 'DESC'
                THEN Employee.LastName
            WHEN @SortCol = 'Shipper.CompanyName' AND @SortDir = 'DESC'
                THEN Shipper.CompanyName
        END DESC,
        CASE -- Date types ascending
            WHEN @SortCol = 'OrderDate' AND @SortDir = 'ASC'
                THEN t0.OrderDate
            WHEN @SortCol = 'RequiredDate' AND @SortDir = 'ASC'
                THEN t0.RequiredDate
            WHEN @SortCol = 'ShippedDate' AND @SortDir = 'ASC'
                THEN t0.ShippedDate
        END ASC,
        CASE -- Date types ascending
            WHEN @SortCol = 'OrderDate' AND @SortDir = 'DESC'
                THEN t0.OrderDate
            WHEN @SortCol = 'RequiredDate' AND @SortDir = 'DESC'
                THEN t0.RequiredDate
            WHEN @SortCol = 'ShippedDate' AND @SortDir = 'DESC'
                THEN t0.ShippedDate
        END DESC
        ) AS ROW_NUMBER, t0.OrderID,
        t0.CustomerID, t0.EmployeeID, t0.OrderDate, t0.RequiredDate,
        t0.ShippedDate, t0.ShipVia, t0.Freight, t0.ShipName, t0.ShipAddress,
        t0.ShipCity, t0.ShipRegion, t0.ShipPostalCode, t0.ShipCountry
    FROM dbo.Orders AS t0
    LEFT OUTER JOIN dbo.Customers AS Customer
        ON Customer.CustomerID = t0.CustomerID
    LEFT OUTER JOIN dbo.Employees AS Employee
        ON Employee.EmployeeID = t0.EmployeeID
    LEFT OUTER JOIN dbo.Shippers AS Shipper
        ON Shipper.ShipperID = t0.ShipVia
    WHERE Customer.Country = @Country
    ) AS t2
WHERE t2.ROW_NUMBER > @StartRowIndex
SET NOCOUNT OFF

Here's the test harness with the Employee column sorted ascending and an Ext. check box to select the extended sort capability:

Click capture to display full-size image in another window.

The preceding code doesn't attempt to sort multiple columns. It might be tempting to write a stored proc for two or three successive sort sequences with EXEC or sp_executesql and dynamic T-SQL. However, LINQ to SQL won't codegen function calls for stored procs that execute dynamic SQL. Also considering the security and query plan issues, it probably makes more sense to pass two or three pairs of @SortCol/@SortDir values and stack a set of CASE statements for each.

Update 9/19/2007: Minor edits and typos corrected .

Technorati Tags: LINQ to SQL, Entity Data Model Sets, Entity Refs, LINQ, LinqDataSource, Sorting Grids, SQL Server 2005, Stored Procedures, T-SQL, CAST