Hacking LINQ Expressions: Select With Index

First, a point of clarification: I use LINQ Expressions to mean (Language-INtegrated) Query Expressions (the language feature) rather than Expression Trees (the .NET 3.5 library in System.Linq.Expressions).

So what do I mean by “Hacking LINQ Expressions”? Quite simply, I’m not content with the rather limited set of operations that query expressions allow me to represent. By understanding how queries are translated, we can use various techniques to broaden our expressive reach. I have already documented one such hack for managing IDisposable objects with LINQ, so I guess we can call this the second in an unbounded series.

The Problem

In thinking over use cases for functional construction of web control trees, I paused to think through how I would express alternate row styling. My mind immediately jumped to the overload of Select() that exposes the current element’s index:

Controls.Add(
    new Table().WithControls(
        data.Select((x, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(x)
            )
        )
    )
);

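This works fine for simple cases, but gets clumsy for more complex queries, which have to be wrapped in parentheses and projected into a temporary anonymous type before Select() can be chained on: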

Controls.Add(
    new Table().WithControls((
        from x in Xs
        join y in Ys on x.Key equals y.Key
        select new { x, y }
        ).Select((z, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(z.x.ValueX, z.y.ValueY)
            )
        )
    )
);

The Goal

Instead, I propose a simple extension method to retrieve an index at arbitrary points in a query:

var res = from x in data
          from i in x.GetIndex()
          select new { x, i };

Or our control examples:

Controls.Add(
    new Table().WithControls(
        from x in data
        from i in x.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x)
        )
    )
);

Controls.Add(
    new Table().WithControls(
        from x in Xs
        join y in Ys on x.Key equals y.Key
        from i in y.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x.ValueX, y.ValueY)
        )
    )
);

Much like in the IDisposable solution, we use a from clause to act as an intermediate assignment. But in this case our hack is a bit trickier than a simple iterator.

The Hack

For this solution we’re going to take advantage of how multiple from clauses are translated:

var res = data.SelectMany(x => x.GetIndex(), (x, i) => new { x, i });

Looking at the parameter list, we see that our collectionSelector should return the result of x.GetIndex() and our resultSelector’s second argument needs to be an int:

public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, SelectIndexProvider> collectionSelector,
    Func<TSource, int, TResult> resultSelector)

The astute observer will notice that the signature of this resultSelector exactly matches the selector used by Select’s with-index overload, trivializing the method implementation:

{
    return source.Select(resultSelector);
}

Note that we’re not even using collectionSelector! We’re just using its return type as a flag to force the compiler to use this version of SelectMany(). The rest of the pieces are incredibly simple now that we know the actual SelectIndexProvider value is never used:

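// Marker type: never instantiated; it exists only so the compiler
// picks our SelectMany() overload.
public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

// Like any extension method, this must be declared inside a static class.
// The returned value is never inspected, so null is sufficient.
public static SelectIndexProvider GetIndex<T>(this T element)
{
    return null;
}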
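
To see the pieces working together, here is a minimal console sketch of the IEnumerable<> version. The class names SelectIndexExtensions and Program and the sample data are my own placeholders, not part of the original code. It runs a simple query and a join query, and compares the join query against a hand-written method chain that I believe matches what the compiler generates:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

// Hypothetical container class; any static class in scope will do.
public static class SelectIndexExtensions
{
    // The value is never used; only the return type matters.
    public static SelectIndexProvider GetIndex<T>(this T element)
    {
        return null;
    }

    // Chosen by the compiler whenever a "from i in x.GetIndex()" clause appears.
    public static IEnumerable<TResult> SelectMany<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, SelectIndexProvider> collectionSelector,
        Func<TSource, int, TResult> resultSelector)
    {
        return source.Select(resultSelector);
    }
}

public static class Program
{
    public static void Main()
    {
        // Simple case: index of each element in the source.
        var data = new[] { "a", "b", "c" };
        var simple = from x in data
                     from i in x.GetIndex()
                     select x + i;
        Console.WriteLine(string.Join(", ", simple.ToArray())); // a0, b1, c2

        // Join case: the index counts rows of the joined sequence.
        var xs = new[] { new { Key = 1, ValueX = "x1" }, new { Key = 2, ValueX = "x2" } };
        var ys = new[] { new { Key = 2, ValueY = "y2" }, new { Key = 1, ValueY = "y1" } };
        var joined = from x in xs
                     join y in ys on x.Key equals y.Key
                     from i in y.GetIndex()
                     select i + ":" + x.ValueX + y.ValueY;

        // Hand-written equivalent of the query above.
        var chained = xs
            .Join(ys, x => x.Key, y => y.Key, (x, y) => new { x, y })
            .SelectMany(t => t.y.GetIndex(), (t, i) => i + ":" + t.x.ValueX + t.y.ValueY);

        Console.WriteLine(string.Join(", ", joined.ToArray())); // 0:x1y1, 1:x2y2
        Console.WriteLine(joined.SequenceEqual(chained));       // True
    }
}
```

Note that it makes no difference which range variable GetIndex() is called on (y here, x in the simple case): the index always refers to the position in the sequence flowing into that from clause.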

And for good measure, an equivalent version to extend IQueryable<>:

public static IQueryable<TResult> SelectMany<TSource, TResult>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, SelectIndexProvider>> collectionSelector,
    Expression<Func<TSource, int, TResult>> resultSelector)
{
    return source.Select(resultSelector);
}

Because we’re just calling Select(), the resulting expression tree isn’t even aware of the call to GetIndex(). Printing the Expression of a query over Enumerable.Range() that selects x * i gives:

System.Linq.Enumerable+<RangeIterator>d__b1.Select((x, i) => (x * i))
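
The exact query behind that output isn't shown, so the following is only my guess at how to reproduce something like it. It assumes the IQueryable<> overload above lives in a static class (again called SelectIndexExtensions purely for illustration). The text printed for the source constant depends on the runtime version, so I wouldn't expect it to match character for character, but the tree should contain a Select call and no mention of GetIndex or SelectMany:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

// Hypothetical container class for the extension methods.
public static class SelectIndexExtensions
{
    public static SelectIndexProvider GetIndex<T>(this T element)
    {
        return null;
    }

    // collectionSelector is accepted but never added to the expression tree.
    public static IQueryable<TResult> SelectMany<TSource, TResult>(
        this IQueryable<TSource> source,
        Expression<Func<TSource, SelectIndexProvider>> collectionSelector,
        Expression<Func<TSource, int, TResult>> resultSelector)
    {
        return source.Select(resultSelector);
    }
}

public static class Program
{
    public static void Main()
    {
        var query = from x in Enumerable.Range(1, 5).AsQueryable()
                    from i in x.GetIndex()
                    select x * i;

        // Prints the expression tree; the source part varies by runtime.
        Console.WriteLine(query.Expression);

        // Executing the query yields 0, 2, 6, 12, 20 (each x times its index).
        foreach (var value in query)
            Console.WriteLine(value);
    }
}
```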

We’re essentially providing our own syntactic sugar over the sugar already provided by query expressions. Pretty sweet, eh?

As a final exercise for the reader, what would this print?

var res = from x in Enumerable.Range(1, 5)
          from i in x.GetIndex()
          from y in Enumerable.Repeat(i, x)
          where y % 2 == 1
          from j in 0.GetIndex()
          select i+j;

foreach (var r in res)
    Console.WriteLine(r);
Posted in .NET, LINQ.

2 Responses to “Hacking LINQ Expressions: Select With Index”

  1. Brandon Dimperio Says:

    Is there a way to make this work with LinqBridge in .NET 2.0? The hinge is IQueryable(Of T).

    http://www.albahari.com/nutshell/linqbridge.aspx

    • Keith Dahlby Says:

      I believe LinqBridge includes the Select() overload in question, so it should work. An IEnumerable<T> version can be pieced together from code earlier in the post:

      public static IEnumerable<TResult> SelectMany<TSource, TResult>(
          this IEnumerable<TSource> source,
          Func<TSource, SelectIndexProvider> collectionSelector,
          Func<TSource, int, TResult> resultSelector)
      {
          return source.Select(resultSelector);
      }
