Selecting Static Results with Dynamic LINQ

Dynamic LINQ (DLINQ) is a LINQ extension provided in the VS 2008 Samples. Scott Guthrie provides a good overview here: Dynamic LINQ (Part 1: Using the LINQ Dynamic Query Library), but the executive summary is that it implements certain query operations on IQueryable (the non-generic variety), with filtering, grouping and projection specified with strings rather than statically-typed expressions.

I’ve never had a use for it, but a question on Stack Overflow caused me to take a second look…

…the selected groupbyvalue (Group) will always be a string, and the sum will always be a double, so I want to be able to cast into something like a List, where Result is an object with properties Group (string) and TotalValue (double).

Before we can solve the problem, let’s take a closer look at why it is being asked…

DynamicExpression.CreateClass

We can use the simplest of dynamic queries to explore a bit:

[Test]
public void DLINQ_IdentityProjection_ReturnsDynamicClass()
{
    IQueryable nums = Enumerable.Range(1, 5).AsQueryable();
    IQueryable q = nums.Select("new (it as Value)");
    Type elementType = q.ElementType;

    Assert.AreEqual("DynamicClass1", elementType.Name);
    CollectionAssert.AreEqual(new[] { typeof(int) },
        elementType.GetProperties().Select(p => p.PropertyType).ToArray());
}

DLINQ defines a special expression syntax for projection that is used to specify what values should be returned and how. it refers to the current element, which in our case is an int.

The result in question comes from DynamicQueryable.Select():

public static IQueryable Select(this IQueryable source, string selector, params object[] values)
{
    LambdaExpression lambda = DynamicExpression.ParseLambda(source.ElementType, null, selector, values);
    return source.Provider.CreateQuery(
        Expression.Call(
            typeof(Queryable), "Select",
            new Type[] { source.ElementType, lambda.Body.Type },
            source.Expression, Expression.Quote(lambda)));
}

The non-generic return type suggest that the type of the values returned is unknown at compile time. If we check an element’s type at runtime, we’ll see something like DynamicClass1. Tracing down the stack from DynamicExpression.ParseLambda(), we eventually find that DynamicClass1 is generated by a call to DynamicExpression.CreateClass() in ExpressionParser.ParseNew(). CreateClass() in turn delegates to a static ClassFactory which manages a dynamic assembly in the current AppDomain to hold the new classes, each generated by Reflection.Emit. The resulting type is then used to generate the MemberInit expression that constructs the object.

Dynamic to Static

While dynamic objects are useful in some situations (thus support in C# 4), in this case we want to use static typing. Let’s specify our result type with a generic method:

IQueryable<TResult> Select<TResult>(this IQueryable source, string selector, params object[] values);

We just need a mechanism to insert our result type into DLINQ to supersede the dynamic result. This is surprisingly easy to implement, as ParseLambda() already accepts a resultType argument. We just need to capture it…

private Type resultType;
public Expression Parse(Type resultType)
{
    this.resultType = resultType;
    int exprPos = token.pos;
    // ...

…and then update ParseNew() to use the specified type:

Expression ParseNew()
{
    // ...
    NextToken();
    Type type = this.resultType ?? DynamicExpression.CreateClass(properties);
    MemberBinding[] bindings = new MemberBinding[properties.Count];
    for (int i = 0; i < bindings.Length; i++)
        bindings[i] = Expression.Bind(type.GetProperty(properties[i].Name), expressions[i]);
    return Expression.MemberInit(Expression.New(type), bindings);
}

If resultType is null, as it is in the existing Select() implementation, a DynamicClass is used instead.

The generic Select<TResult> is then completed by referencing TResult as appropriate:

public static IQueryable<TResult> Select<TResult>(this IQueryable source, string selector, params object[] values)
{
    LambdaExpression lambda = DynamicExpression.ParseLambda(source.ElementType, typeof(TResult), selector, values);
    return source.Provider.CreateQuery<TResult>(
        Expression.Call(
            typeof(Queryable), "Select",
            new Type[] { source.ElementType, typeof(TResult) },
            source.Expression, Expression.Quote(lambda)));
}

With the following usage:

public class ValueClass { public int Value { get; set; } }

[Test]
public void DLINQ_IdentityProjection_ReturnsStaticClass()
{
    IQueryable nums = Enumerable.Range(1, 5).AsQueryable();
    IQueryable<ValueClass> q = nums.Select<ValueClass>("new (it as Value)");
    Type elementType = q.ElementType;

    Assert.AreEqual("ValueClass", elementType.Name);
    CollectionAssert.AreEqual(nums.ToArray(), q.Select(v => v.Value).ToArray());
}

Note that the property names in TResult must match those in the Select query exactly. Changing the query to “new (it as value)” results in an unhandled ArgumentNullException in the Expression.Bind() call seen in the for loop of ParseNew() above, as the “value” property cannot be found.

Selecting Anonymous Types

So we can select dynamic types or existing named types, but what if we want to have the benefits of static typing without having to declare a dedicated ValueClass, as we can with anonymous types and normal static LINQ? As a variation on techniques used elsewhere, let’s can define an overload of Select() that accepts an instance of the anonymous type whose values we will ignore but using its type to infer the desired return type. The overload is trivial:

public static IQueryable<TResult> Select<TResult>(this IQueryable source, TResult template, string selector, params object[] values)
{
    return source.Select<TResult>(selector, values);
}

With usage looking like this (note the required switch to var q):

[Test]
public void DLINQ_IdentityProjection_ReturnsStaticClass()
{
    IQueryable nums = Enumerable.Range(1, 5).AsQueryable();
    var q = nums.Select(new { Value = 0 }, "new (it as Value)");
    Type elementType = q.ElementType;

    Assert.IsTrue(elementType.Name.Contains("AnonymousType"));
    CollectionAssert.AreEqual(nums.ToArray(), q.Select(v => v.Value).ToArray());
}

However, if we try the above we encounter an unfortunate error:

The property ‘Int32 Value’ has no ‘set’ accessor

As you may or may not know, anonymous types in C# are immutable (modulo changes to objects they reference), with their values set through a compiler-generated constructor. (I’m not sure if this is true in VB.) With this knowledge in hand, we can update ParseNew() to check if resultType has such a constructor that we could use instead:

    // ...
    Type type = this.resultType ?? DynamicExpression.CreateClass(properties);

    var propertyTypes = type.GetProperties().Select(p => p.PropertyType).ToArray();
    var ctor = type.GetConstructor(propertyTypes);
    if (ctor != null)
        return Expression.New(ctor, expressions);

    MemberBinding[] bindings = new MemberBinding[properties.Count];
    for (int i = 0; i < bindings.Length; i++)
        bindings[i] = Expression.Bind(type.GetProperty(properties[i].Name), expressions[i]);
    return Expression.MemberInit(Expression.New(type), bindings);
}

And with that we can now project from a dynamic query onto static types, both named and anonymous, with a reasonably natural interface.

Due to licensing I can’t post the full example, but if you’re at all curious about Reflection.Emit or how DLINQ works I would encourage you to dive in and let us know what else you come up with. Things will get even more interesting with the combination of LINQ, the DLR and C# 4’s dynamic in the coming months.

Advertisements

Hacking LINQ Expressions: Join With Comparer

In this installment of my Hacking LINQ series we’ll take a look at providing an IEqualityComparer for use in a LINQ join clause.

The Problem

Many of the Standard Query Operators require comparing sequence elements and the default query providers are kind enough to give us overloads that accept a suitable comparer. Among these operators, Join and GroupJoin have perhaps the most useful query syntax:

 var res = from s in States
          join a in AreaCodes
            on s.Abbr equals a.StateAbbr
          select new { s.Name, a.AreaCode };

While a bit more verbose, I find the intent much easier to read then the method equivalent:

var res = States.Join(AreaCodes,
                      s => s.Abbr, a => a.StateAbbr,
                      (s, a) => new { s.Name, a.AreaCode });

Or maybe I’ve just spent too much time in SQL. Either way, I thought it would be useful to support joins by a comparer.

The Goal

We will use another extension method to specify how the join should be performed:

var res = from s in States
          join a in AreaCodes.WithComparer(StringComparer.OrdinalIgnoreCase)
            on s.Abbr equals a.StateAbbr
          select new { s.Name, a.AreaCode };

We can also support the same syntax for group joins:

var res = from s in States
          join a in AreaCodes.WithComparer(StringComparer.OrdinalIgnoreCase)
            on s.Abbr equals a.StateAbbr into j
          select new { s.Name, Count = j.Count() };

The Hack

As with most LINQ hacks, we’re going to use the result of WithComparer to call a specialized version of Join or GroupJoin, in this case by providing a replacement for the join’s inner sequence:

var res = States.Join(AreaCodes.WithComparer(StringComparer.OrdinalIgnoreCase),
                      s => s.Abbr, a => a.StateAbbr,
                      (s, a) => new { s.Name, a.AreaCode });

Eventually leading to this method call:

var res = States.Join(AreaCodes,
                      s => s.Abbr, a => a.StateAbbr,
                      (s, a) => new { s.Name, a.AreaCode },
                      StringComparer.OrdinalIgnoreCase);

Since we need both the inner collection we’re extending and the comparer, we can guess our extension method will be implemented something like this:

public static JoinComparerProvider<T, TKey> WithComparer<T, TKey>(
    this IEnumerable<T> inner, IEqualityComparer<TKey> comparer)
{
    return new JoinComparerProvider<T, TKey>(inner, comparer);
}

With a trivial provider implementation:

public sealed class JoinComparerProvider<T, TKey>
{
    internal JoinComparerProvider(IEnumerable<T> inner, IEqualityComparer<TKey> comparer)
    {
        Inner = inner;
        Comparer = comparer;
    }

    public IEqualityComparer<TKey> Comparer { get; private set; }
    public IEnumerable<T> Inner { get; private set; }
}

The final piece is our Join overload:

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
    this IEnumerable<TOuter> outer,
    JoinComparerProvider<TInner, TKey> inner,
    Func<TOuter, TKey> outerKeySelector,
    Func<TInner, TKey> innerKeySelector,
    Func<TOuter, TInner, TResult> resultSelector)
{
    return outer.Join(inner.Inner, outerKeySelector, innerKeySelector,
                      resultSelector, inner.Comparer);
}

Implementations of GroupJoin and their IQueryable counterparts are similarly trivial.

Posted in .NET, LINQ. Tags: , , . Comments Off on Hacking LINQ Expressions: Join With Comparer

Hacking LINQ Expressions: Select With Index

First, a point of clarification: I use LINQ Expressions to mean (Language-INtegrated) Query Expressions (the language feature) rather than Expression Trees (the .NET 3.5 library in System.Linq.Expressions).

So what do I mean by “Hacking LINQ Expressions”? Quite simply, I’m not content with the rather limited set of operations that query expressions allow me to represent. By understanding how queries are translated, we can use various techniques to broaden our expressive reach. I have already documented one such hack for managing IDisposable objects with LINQ, so I guess we can call this the second in an unbounded series.

The Problem

In thinking over use cases for functional construction of web control trees, I paused to think through how I would express alternate row styling. My mind immediately jumped to the overload of Select() that exposes the current element’s index:

Controls.Add(
    new Table().WithControls(
        data.Select((x, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(x)
            )
        )
    )
);

This works fine for simple cases, but breaks down for more complex queries:

Controls.Add(
    new Table().WithControls((
        from x in Xs
        join y in Ys on x.Key equals y.Key
        select new { x, y }
        ).Select((z, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(z.x.ValueX, z.y.ValueY)
            )
        )
    )
);

The Goal

Instead, I propose a simple extension method to retrieve an index at arbitrary points in a query:

var res = from x in data
          from i in x.GetIndex()
          select new { x, i };

Or our control examples:

Controls.Add(
    new Table().WithControls(
        from x in data
        from i in x.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x)
        )
    )
);

Controls.Add(
    new Table().WithControls(
        from x in Xs
        join y in Ys on x.Key equals y.Key
        from i in y.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x.ValueX, y.ValueY)
        )
    )
);

Much like in the IDisposable solution, we use a from clause to act as an intermediate assignment. But in this case our hack is a bit trickier than a simple iterator.

The Hack

For this solution we’re going to take advantage of how multiple from clauses are translated:

var res = data.SelectMany(x => x.GetIndex(), (x, i) => new { x, i });

Looking at the parameter list, we see that our collectionSelector should return the result of x.GetIndex() and our resultSelector‘s second argument needs to be an int:

public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, SelectIndexProvider> collectionSelector,
    Func<TSource, int, TResult> resultSelector)

The astute observer will notice that the signature of this resultSelector exactly matches the selector used by Select‘s with-index overload, trivializing the method implementation:

{
    return source.Select(resultSelector);
}

Note that we’re not even using collectionSelector! We’re just using its return type as a flag to force the compiler to use this version of SelectMany(). The rest of the pieces are incredibly simple now that we know the actual SelectIndexProvider value is never used:

public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

public static SelectIndexProvider GetIndex<T>(this T element)
{
    return null;
}

And for good measure, an equivalent version to extend IQueryable<>:

public static IQueryable<TResult> SelectMany<TSource, TResult>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, SelectIndexProvider>> collectionSelector,
    Expression<Func<TSource, int, TResult>> resultSelector)
{
    return source.Select(resultSelector);
}

Because we’re just calling Select(), the query expression isn’t even aware of the call to GetIndex():

System.Linq.Enumerable+<RangeIterator>d__b1.Select((x, i) => (x * i))

We’re essentially providing our own syntactic sugar over the sugar already provided by query expressions. Pretty sweet, eh?

As a final exercise for the reader, what would this print?

var res = from x in Enumerable.Range(1, 5)
          from i in x.GetIndex()
          from y in Enumerable.Repeat(i, x)
          where y % 2 == 1
          from j in 0.GetIndex()
          select i+j;

foreach (var r in res)
    Console.WriteLine(r);
Posted in .NET, LINQ. Tags: . 2 Comments »

Simplifying LazyLinq

This is the fourth in a series of posts on LazyLinq, a wrapper to support lazy initialization and deferred disposal of a LINQ query context:

  1. Introducing LazyLinq: Overview
  2. Introducing LazyLinq: Internals
  3. Introducing LazyLinq: Queryability
  4. Simplifying LazyLinq
  5. Introducing LazyLinq: Lazy DataContext

As I was iterating on the proof of concept for LazyLinq, I always wanted to get rid of the TQuery type parameter. I thought I needed it to distinguish between ordered and unordered wrapped queries, but it just felt messy. The underlying provider mechanism didn’t need it, so why should I?

Well after taking a closer look at the SQL query provider, I figured out how to eliminate it. The object of inspiration is System.Data.Linq.DataQuery<T>, defined as follows:

internal sealed class DataQuery<T> :
    IOrderedQueryable<T>, IQueryable<T>,
    IOrderedQueryable, IQueryable,
    IEnumerable<T>, IEnumerable,
    IQueryProvider,
    IListSource

The key was realizing that IOrderedQueryable<> and ILazyOrderedQueryable<> don’t actually do anything. Implementation-wise, they’re just IQueryable<> or ILazyQueryable<> with an extra interface on top. It’s only on the design side that it actually matters, essentially providing a hook for additional ordering with ThenBy. In LINQ to SQL’s case, that means supporting orderability is as simple as specifying that the query object is both IQueryable<> and IOrderedQueryable<>.

So how does this revelation simplify Lazy LINQ? First, it allows us to remove TQuery from the interfaces:

    public interface ILazyContext<TContext> : IDisposable
    {
        TContext Context { get; }
        ILazyQueryable<TContext, TResult>
            CreateQuery<TResult>(Func<TContext, IQueryable<TResult>> queryBuilder);
        TResult Execute<TResult>(Func<TContext, TResult> action);
    }

    public interface ILazyQueryable<TContext, TSource> : IQueryable<TSource>
    {
        ILazyContext<TContext> Context { get; }
        Func<TContext, IQueryable<TSource>> QueryBuilder { get; }
    }

    public interface ILazyOrderedQueryable<TContext, TSource>
        : ILazyQueryable<TContext, TSource>, IOrderedQueryable<TSource>
    { }

Note that we can also eliminate ILazyContext.CreateOrderedQuery(), instead assuming that CreateQuery() will return something that can be treated as ILazyOrderedQueryable<> as necessary.

For the concrete implementations, we take the cue from DataQuery<T>, letting LazyQueryableImpl implement ILazyOrderedQueryable<> so we can eliminate LazyOrderedQueryableImpl:

    class LazyQueryableImpl<TContext, TSource>
        : ILazyQueryable<TContext, TSource>, ILazyOrderedQueryable<TContext, TSource>
    {
        // Implementation doesn't change
    }

Finally, our sorting query operations will look more like their counterparts in System.Linq.Queryable, casting the result of CreateQuery() to ILazyOrderedQueryable<>. To keep things readable, we’ll split our CreateOrderedQuery<> helper into separate versions for OrderBy and ThenBy. Note how the types of queryOperation map to the usage of OrderBy (unordered to ordered) and ThenBy (ordered to ordered):

        private static ILazyOrderedQueryable<TContext, TResult>
            CreateOrderByQuery<TSource, TContext, TResult>(
                this ILazyQueryable<TContext, TSource> source,
                Func<IQueryable<TSource>, IOrderedQueryable<TResult>> queryOperation
            )
        {
            return (ILazyOrderedQueryable<TContext, TResult>) source.Context.CreateQuery<TResult>(
               context => queryOperation(source.QueryBuilder(context)));
        }

        private static ILazyOrderedQueryable<TContext, TResult>
            CreateThenByQuery<TSource, TContext, TResult>(
                this ILazyQueryable<TContext, TSource> source,
                Func<IOrderedQueryable<TSource>, IOrderedQueryable<TResult>> queryOperation
            )
        {
            return (ILazyOrderedQueryable<TContext, TResult>) source.Context.CreateQuery<TResult>(
               context => queryOperation((IOrderedQueryable<TSource>) source.QueryBuilder(context)));
        }

Removing TQuery from the query operators is left as an exercise for the reader. Or you can just get the source on CodePlex.

A Note on IOrderedEnumerable<>

Having taken advantage of how LINQ to IQueryable handles orderability, it’s worth pointing out that LINQ to Objects uses a different approach, specifying new behavior in IOrderedEnumerable<> that is used to support multiple sort criteria:

public interface IOrderedEnumerable<TElement> : IEnumerable<TElement>, IEnumerable
{
    IOrderedEnumerable<TElement>
        CreateOrderedEnumerable<TKey>(
            Func<TElement, TKey> keySelector,
            IComparer<TKey> comparer, bool descending);
}

Introducing LazyLinq: Queryability

This is the third in a series of posts on LazyLinq, a wrapper to support lazy initialization and deferred disposal of a LINQ query context:

  1. Introducing LazyLinq: Overview
  2. Introducing LazyLinq: Internals
  3. Introducing LazyLinq: Queryability
  4. Simplifying LazyLinq
  5. Introducing LazyLinq: Lazy DataContext

Having defined the LazyLinq interfaces and provided concrete implementations, we’re left to provide support for the standard query operators.

Learning from Queryable

Before we try to query ILazyQueryable, it’s instructive to look at how System.Linq.Queryable works. There are essentially three types of operators on IQueryable<>:

  • Deferred queries returning IQueryable<>: Select, Where, etc.
  • Deferred query returning IOrderedQueryable<>: OrderBy, OrderByDescending, ThenBy, ThenByDescending
  • Everything else: Aggregate, Count, First, etc.

Reflecting Queryable.Select, modulo error checking, we see the following:

public static IQueryable<TResult> Select<TSource, TResult>(
    this IQueryable<TSource> source, Expression<Func<TSource, TResult>> selector)
{
    return source.Provider
        .CreateQuery<TResult>(
            Expression.Call(null,
                ((MethodInfo) MethodBase.GetCurrentMethod())
                    .MakeGenericMethod(new Type[] { typeof(TSource), typeof(TResult) }),
                new Expression[] { source.Expression, Expression.Quote(selector) }));
}

The source‘s Provider is used to construct a new query whose expression includes the call to Select with the given parameters. An ordered query follows a similar pattern, trusting that the query provider will return an IOrderedQueryable<> as appropriate:

public static IOrderedQueryable<TSource> OrderBy<TSource, TKey>(
    this IQueryable<TSource> source, Expression<Func<TSource, TKey>> keySelector)
{
    return (IOrderedQueryable<TSource>) source.Provider
        .CreateQuery<TSource>(
            Expression.Call(null,
                ((MethodInfo) MethodBase.GetCurrentMethod())
                    .MakeGenericMethod(new Type[] { typeof(TSource), typeof(TKey) }),
                new Expression[] { source.Expression, Expression.Quote(keySelector) }));
}

And finally, everything that’s not a query is handled by the provider’s Execute method:

public static TSource First<TSource>(this IQueryable<TSource> source)
{
    return source.Provider
        .Execute<TSource>(
            Expression.Call(null,
                ((MethodInfo) MethodBase.GetCurrentMethod())
                    .MakeGenericMethod(new Type[] { typeof(TSource) }),
                new Expression[] { source.Expression }));
}

Querying ILazyQueryable

You may have noticed that the above scenarios map rather closely to the methods provided by ILazyContext:

  • ILazyQueryable<TContext, TResult, TQuery>
        CreateQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
        where TQuery : IQueryable<TResult>;
  • ILazyOrderedQueryable<TContext, TResult, TQuery>
        CreateOrderedQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
        where TQuery : IOrderedQueryable<TResult>;
  • TResult Execute<TResult>(Func<TContext, TResult> action);

However, rather than expression trees we’re pushing around delegates. Execute seems pretty simple, so let’s start there:

public static TSource First<TSource, TContext, TQuery>(
        this ILazyQueryable<TContext, TSource, TQuery> source
    ) where TQuery : IQueryable<TSource>
{
    Func<TContext, TResult> action = context => ???;
    return source.Context.Execute(action);
}

So our source has a Context, which knows how to Execute an action from context to some result. To find that result, we need to leverage the other property of source: QueryBuilder. Recalling that QueryBuilder is a function from TContext to TQuery, and that TQuery is an IQueryable<TSource>, we see something on which we can execute Queryable.First:

public static TSource First<TSource, TContext, TQuery>(
        this ILazyQueryable<TContext, TSource, TQuery> source
    ) where TQuery : IQueryable<TSource>
{
    Func<TContext, TResult> action =
        context => source.QueryBuilder(context).First();
    return source.Context.Execute(action);
}

Now seeing as we have dozens of methods to implement like this, it seems an opportune time for a bit of eager refactoring. Recognizing that the only variance is the method call on the IQueryable, let’s extract an extension method that does everything else:

private static TResult Execute<TSource, TContext, TResult, TQuery>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        Func<TQuery, TResult> queryOperation
    ) where TQuery : IQueryable<TSource>
{
    return source.Context.Execute(
        context => queryOperation(source.QueryBuilder(context)));
}

From there, additional lazy operators are just a lambda expression away:

public static TSource First<TSource, TContext, TQuery>(
        this ILazyQueryable<TContext, TSource, TQuery> source
    ) where TQuery : IQueryable<TSource>
{
    return source.Execute(q => q.First());
}

public static TAccumulate Aggregate<TContext, TSource, TQuery, TAccumulate>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        TAccumulate seed, Expression<Func<TAccumulate, TSource, TAccumulate>> func
    ) where TQuery : IQueryable<TSource>
{
    return source.Execute(q => q.Aggregate(seed, func));
}

And now having done the hard part (finding an IQueryable), we can translate that understanding to make similar helpers for queries:

private static ILazyQueryable<TContext, TResult, IQueryable<TResult>>
    CreateQuery<TSource, TContext, TQuery, TResult>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        Func<TQuery, IQueryable<TResult>> queryOperation
    ) where TQuery : IQueryable<TSource>
{
    return source.Context.CreateQuery<TResult, IQueryable<TResult>>(
        context => queryOperation(source.QueryBuilder(context)));
}

private static ILazyOrderedQueryable<TContext, TResult, IOrderedQueryable<TResult>>
    CreateOrderedQuery<TSource, TContext, TQuery, TResult>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        Func<TQuery, IOrderedQueryable<TResult>> queryOperation
    ) where TQuery : IQueryable<TSource>
{
    return source.Context.CreateOrderedQuery<TResult, IOrderedQueryable<TResult>>(
        context => queryOperation(source.QueryBuilder(context)));
}

With similarly trivial query operator implementations:

public static ILazyQueryable<TContext, TResult, IQueryable<TResult>>
    Select<TContext, TSource, TQuery, TResult>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        Expression<Func<TSource, TResult>> selector
    ) where TQuery : IQueryable<TSource>
{
    return source.CreateQuery(q => q.Select(selector));
}

public static ILazyOrderedQueryable<TContext, TSource, IOrderedQueryable<TSource>>
    OrderBy<TContext, TSource, TQuery, TKey>(
        this ILazyQueryable<TContext, TSource, TQuery> source,
        Expression<Func<TSource, TKey>> keySelector
    ) where TQuery : IQueryable<TSource>
{
    return source.CreateOrderedQuery(q => q.OrderBy(keySelector));
}

And the end result:

LazyLinq.First

Introducing LazyLinq: Internals

This is the second in a series of posts on LazyLinq, a wrapper to support lazy initialization and deferred disposal of a LINQ query context:

  1. Introducing LazyLinq: Overview
  2. Introducing LazyLinq: Internals
  3. Introducing LazyLinq: Queryability
  4. Simplifying LazyLinq
  5. Introducing LazyLinq: Lazy DataContext

My first post introduced the three interfaces that LazyLinq provides. Next, we get to implement them.

Implementing ILazyQueryable

First, the interface:

    public interface ILazyQueryable<TContext, TSource, TQuery>
        : IQueryable<TSource>
        where TQuery : IQueryable<TSource>
    {
        ILazyContext<TContext> Context { get; }
        Func<TContext, TQuery> QueryBuilder { get; }
    }

We’ll start simple with an implicit implementation of the interface and a trivial constructor:

    class LazyQueryableImpl<TContext, TSource, TQuery>
            : ILazyQueryable<TContext, TSource, TQuery>
            where TQuery : IQueryable<TSource>
    {
            public ILazyContext<TContext> Context { get; private set; }
            public Func<TContext, TQuery> QueryBuilder { get; private set; }

            internal LazyQueryableImpl(ILazyContext<TContext> deferredContext, Func<TContext, TQuery> queryBuilder)
            {
                if (deferredContext == null) throw new ArgumentNullException("deferredContext");
                if (queryBuilder == null) throw new ArgumentNullException("queryBuilder");

                Context = deferredContext;
                QueryBuilder = queryBuilder;
            }

Next, a lazy-loaded query built from our lazy context:

            protected TQuery Query
            {
                get
                {
                    if (query == null)
                    {
                        query = QueryBuilder(Context.Context);
                        if (query == null)
                            throw new InvalidOperationException("Query built as null.");
                    }
                    return query;
                }
            }

And the internals of managing Context, which implements IDisposable:

            private void Dispose()
            {
                Context.Dispose();
                query = default(TQuery);
            }

            private IEnumerator<TSource> GetEnumerator()
            {
                try
                {
                    foreach (var i in Query)
                        yield return i;
                }
                finally
                {
                    Dispose();
                }
            }

Since Query depends on Context, once Context is disposed we need to reset Query so a new one can be built (if possible). Note that we use an iterator here to return an IEnumerator<TSource>, rather than the usual IEnumerable<>.

Finally, we’ll close out by explicitly implementing IQueryable:

            IEnumerator<TSource> IEnumerable<TSource>.GetEnumerator()
            {
                return GetEnumerator();
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }

            Type IQueryable.ElementType
            {
                get { return Query.ElementType; }
            }

            Expression IQueryable.Expression
            {
                get { return Query.Expression; }
            }

            IQueryProvider IQueryable.Provider
            {
                get { return Query.Provider; }
            }
        }

If this seemed relatively simple, you’re right. We’re just building a lazy-loaded Query proxy, with a bit of plumbing to clean up our Context.

Implementing ILazyOrderedQueryable

Not very exciting, but for completeness:

        public interface ILazyOrderedQueryable<TContext, TSource, TQuery>
            : ILazyQueryable<TContext, TSource, TQuery>, IOrderedQueryable<TSource>
            where TQuery : IOrderedQueryable<TSource>
        { }

        class LazyOrderedQueryableImpl<TContext, TSource, TQuery>
                : LazyQueryableImpl<TContext, TSource, TQuery>, ILazyOrderedQueryable<TContext, TSource, TQuery>
                where TQuery : IOrderedQueryable<TSource>
        {
            internal LazyOrderedQueryableImpl(ILazyContext<TContext> lazyContext, Func<TContext, TQuery> queryBuilder)
                : base(lazyContext, queryBuilder)
            {
            }
        }

LazyQueryable Factory

Consumers of this API should never need to know about these implementation details, so we can hide them behind a factory class:

        public static class LazyQueryable
        {
            public static ILazyQueryable<TContext, TResult, TQuery> CreateQuery<TContext, TResult, TQuery>(
                ILazyContext<TContext> context, Func<TContext, TQuery> queryBuilder)
                where TQuery : IQueryable<TResult>
            {
                return new LazyQueryableImpl<TContext, TResult, TQuery>(context, queryBuilder);
            }
            public static ILazyOrderedQueryable<TContext, TResult, TQuery> CreateOrderedQuery<TContext, TResult, TQuery>(
                ILazyContext<TContext> context, Func<TContext, TQuery> queryBuilder)
                where TQuery : IOrderedQueryable<TResult>
            {
                return new LazyOrderedQueryableImpl<TContext, TResult, TQuery>(context, queryBuilder);
            }
        }

Implementing ILazyContext

Again, we’ll start with the interface:

        public interface ILazyContext<TContext> : IDisposable
        {
            TContext Context { get; }

            ILazyQueryable<TContext, TResult, TQuery>
                CreateQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
                where TQuery : IQueryable<TResult>;

            ILazyOrderedQueryable<TContext, TResult, TQuery>
                CreateOrderedQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
                where TQuery : IOrderedQueryable<TResult>;

            TResult Execute<TResult>(Func<TContext, TResult> action);
        }

Now we can start fulfilling our requirements:

1. Lazily expose the Context.

        class LazyContextImpl<TContext> : ILazyContext<TContext>, IDisposable
        {
            public Func<TContext> ContextBuilder { get; private set; }

            private TContext context;
            public TContext Context
            {
                get
                {
                    if (context == null)
                    {
                        context = ContextBuilder();
                        if (context == null)
                            throw new InvalidOperationException("Context built as null.");
                    }
                    return context;
                }
            }

2. Produce lazy wrappers to represent queries retrieved from a context by a delegate.

            public ILazyQueryable<TContext, TResult, TQuery> CreateQuery<TResult, TQuery>(
                Func<TContext, TQuery> queryBuilder)
                where TQuery : IQueryable<TResult>
            {
                return LazyQueryable.CreateQuery<TContext, TResult, TQuery>(this, queryBuilder);
            }

            public ILazyOrderedQueryable<TContext, TResult, TQuery> CreateOrderedQuery<TResult, TQuery>(
                Func<TContext, TQuery> queryBuilder)
                where TQuery : IOrderedQueryable<TResult>
            {
                return LazyQueryable.CreateOrderedQuery<TContext, TResult, TQuery>(this, queryBuilder);
            }

3. Execute an action on the context.

There are two ways to “complete” a query, and we need to clean up context after each. The first was after enumeration, implemented above. The second is on execute, implemented here:

            public TResult Execute<TResult>(Func<TContext, TResult> expression)
            {
                try
                {
                    return expression(Context);
                }
                finally
                {
                    Dispose();
                }
            }

4. Ensure the context is disposed as necessary.

We don’t require that TContext is IDisposable, but we need to handle if it is. We also clear context to support reuse.

            public void Dispose()
            {
                var disposable = context as IDisposable;
                if (disposable != null)
                    disposable.Dispose();

                context = default(TContext);
            }

Constructors

With our requirements met, we just need a way to create our context. We provide two options:

            internal LazyContextImpl(TContext context) : this(() => context) { }

            internal LazyContextImpl(Func<TContext> contextBuilder)
            {
                if (contextBuilder == null) throw new ArgumentNullException("contextBuilder");

                ContextBuilder = contextBuilder;
            }

The former wraps an existing TContext instance in a closure, meaning every time ContextBuilder is called it returns the same instance. The latter accepts any delegate that returns a TContext. The most common such delegate would be a simple instantiation: () => new MyDataContext().

It should be clear now why we would want to clear our context on dispose. If ContextBuilder returns a new context instance each time, it’s perfectly safe to discard of the old (disposed) context to trigger the creation of a new one. Conversely, if the builder returns a single instance, using the context after disposal would trigger an ObjectDisposedException or something similar.

LazyContext Factory

For consistency, we should also provide factory methods to hide this specific implementation:

            public static class LazyContext
            {
                public static ILazyContext<T> Create<T>(T context)
                {
                    return new LazyContextImpl<T>(context);
                }
                public static ILazyContext<T> Create<T>(Func<T> contextBuilder)
                {
                    return new LazyContextImpl<T>(contextBuilder);
                }
            }

Lazy Extensions

And last, but certainly not least, we’re ready to reimplement our Use() extension methods:

    public static class Lazy
    {
        public static ILazyContext<TContext> Use<TContext>(this TContext @this)
        {
            return LazyContext.Create<TContext>(@this);
        }

        public static ILazyContext<TContext> Use<TContext>(this Func<TContext> @this)
        {
            return LazyContext.Create<TContext>(@this);
        }
    }

With several usage possibilities:

    var r1 = from x in new MyDataContext().Use() ...;

    Func<MyDataContext> f1 = () => new MyDataContext();
    var r2 = from x in f1.Use() ...;

    var r3 = from x in new Func<MyDataContext>(() => new MyDataContext()).Use() ...;

    var r4 = from x in Lazy.Use(() => new MyDataContext()) ...;

Or maybe we can make it even easier. Maybe…

Introducing LazyLinq: Overview

This is the first in a series of posts on LazyLinq, a wrapper to support lazy initialization and deferred disposal of a LINQ query context, including LINQ to SQL’s DataContext:

  1. Introducing LazyLinq: Overview
  2. Introducing LazyLinq: Internals
  3. Introducing LazyLinq: Queryability
  4. Simplifying LazyLinq
  5. Introducing LazyLinq: Lazy DataContext

Motivation

I recently posted an approach to dealing with IDisposable objects and LINQ. In the comments at LosTechies, Steve Gentile mentioned that my IQueryable example didn’t actually compile:

IQueryable<MyType> MyFunc(string myValue)
{
    return from dc in new MyDataContext().Use()
           from row in dc.MyTable
           where row.MyField == myValue
           select row;
}

Steve suggested using AsQueryable() on the result of the query, which does indeed fix the build. However, the purpose of returning an IQueryable is that it would allow us to perform additional query operations using the original query provider. Since the query result isn’t IQueryable, AsQueryable() will use a query provider based on LINQ to Objects, with an additional performance penalty to compile the expression trees into IL.

Even worse, because Use() is returning an IEnumerable<T> the entire query is actually executed with LINQ to Objects. Even though dc.MyTable is IQueryable, the translated query treats it as a simple IEnumerable, essentially performing a SELECT * before executing all query operations on entity objects in memory. It should go without saying that this is less than ideal.

Introducing LazyLinq

After several iterations, I think I have a better solution. In this post I’ll review the architecture of the solution, with posts to follow detailing the implementation.

LazyLinq is implemented around three interfaces. The first serves as a deferred query context provider:

    public interface ILazyContext<TContext> : IDisposable
    {
        TContext Context { get; }

        ILazyQueryable<TContext, TResult, TQuery>
            CreateQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
            where TQuery : IQueryable<TResult>;

        ILazyOrderedQueryable<TContext, TResult, TQuery>
            CreateOrderedQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
            where TQuery : IOrderedQueryable<TResult>;

        TResult Execute<TResult>(Func<TContext, TResult> action);
    }

An implementer of ILazyContext has four responsibilities:

  1. Lazily expose the Context.
  2. Produce lazy wrappers to represent queries retrieved from a context by a delegate.
  3. Execute an action on the context.
  4. Ensure the context is disposed as necessary.

The remaining interfaces serve as lazy query wrappers, corresponding to IQueryable<T> and IOrderedQueryable<T>:

    public interface ILazyQueryable<TContext, TSource, TQuery>
        : IQueryable<TSource>
        where TQuery : IQueryable<TSource>
    {
        ILazyContext<TContext> Context { get; }
        Func<TContext, TQuery> QueryBuilder { get; }
    }
    public interface ILazyOrderedQueryable<TContext, TSource, TQuery>
        : ILazyQueryable<TContext, TSource, TQuery>, IOrderedQueryable<TSource>
        where TQuery : IOrderedQueryable<TSource>
    { }

An implementer of ILazyQueryable has four responsibilities:

  1. Expose the Context from which it was created.
  2. Expose a delegate that represents how the deferred query is built from Context.
  3. Implement IQueryable for the deferred query.
  4. Ensure the context is disposed after the query is enumerated.

If it seems like these interfaces don’t do much, you’re absolutely correct. As we’ll see later, the light footprint gives us considerable flexibility.

LINQ to ILazyContext

Defining a few interfaces is all well and good, but the real goal is to simplify working with our disposable context. What if I told you that our original use case didn’t need to change at all (other than the lazy return type)?

ILazyQueryable<MyType> MyFunc(string myValue)
{
    return from dc in new MyDataContext().Use()
           from row in dc.MyTable
           where row.MyField == myValue
           select row;
}

We can’t implement it yet, but our new Use() extension method will have this signature:

    public static ILazyContext<TContext> Use<TContext>(this TContext @this) { ... }

This is where we really start to bend LINQ against its will. As the first step in the query translation process, the compiler will translate our from clauses into a call to SelectMany. All we need to do is provide a SelectMany method for ILazyContext that the compiler will find acceptable:

    public static ILazyQueryable<TContext, TResult, IQueryable<TResult>> SelectMany<TContext, TCollection, TResult>(
        this ILazyContext<TContext> lazyContext,
        Expression<Func<TContext, IQueryable<TCollection>>> collectionSelector,
        Expression<Func<TContext, TCollection, TResult>> resultSelector)
    {

The method signature is a slight variation from the corresponding overload of Queryable.SelectMany(), changed to require that collectionSelector returns an IQueryable that we can defer. If it doesn’t, the compiler will complain:

An expression of type ‘System.Collections.Generic.IEnumerable<int>’ is not allowed in a subsequent from clause in a query expression with source type ‘Solutionizing.Linq.Test.MyDataContext<Solutionizing.Linq.Test.MyDataContext>’. Type inference failed in the call to ‘SelectMany’.

Now that we’ve hijacked the query, we can control the rest of the translation process with the returned ILazyQueryable. Recalling that our ILazyContext knows how to make an ILazyQueryable, we just need to give it a QueryBuilder delegate:

        return lazyContext.CreateQuery<TResult, IQueryable<TResult>>(context =>
            {
                Func<TContext, IQueryable<TCollection>> getQueryFromContext = collectionSelector.Compile();
                IQueryable<TCollection> query = getQueryFromContext(context);

                ParameterExpression rangeParameter = resultSelector.Parameters[1];
                InvocationExpression invoke = Expression.Invoke(resultSelector, Expression.Constant(context), rangeParameter);
                Expression<Func<TCollection, TResult>> selector = Expression.Lambda<Func<TCollection, TResult>>(invoke, rangeParameter);

                return query.Select(selector);
            });
    }

This is pretty dense, so let’s walk through it:

  1. Our lambda expression’s context parameter represents the MyDataContext that will be passed in eventually.
  2. We’re going to manipulate the expression trees passed into the method, which will look something like this:
    • collectionSelector: dc => dc.MyTable
    • resultSelector: (dc, row) => new { dc = dc, row = row }
  3. Compiling collectionSelector produces a delegate we can invoke on context to get an IQueryable<TCollection>context.MyTable, in this case.
  4. Before we can use resultSelector on MyTable, we need to wrap it in a lambda expression to eliminate its first parameter.:
    1. Save the second parameter (row) to use later.
    2. Create a new invocation expression that will represent calling resultSelector with the current context and our saved row parameter.
    3. Create a new lambda expression that will accept that same row parameter and return the invocation expression.
  5. The resulting selector, of type Expression<Func<TCollection, TResult>>, can then be passed to query.Select() which happily returns the desired IQueryable<TResult>.

Essentially we’re pretending that the SelectMany call is just a Select call on the IQueryable<TCollection> generated by collectionSelector, all wrapped in a lazy query.

Hopefully this overview has piqued your interest. Next time we’ll look at a base implementation of the interfaces.