Improve Your Code Golf Game with LINQ

I always enjoy a good coding challenge, and variations of code golf are most common. For the uninitiated, code golf provides a problem with the objective of providing a solution that requires the fewest keystrokes or lines. While production code certainly deserves more white space than games tend to afford, there are still some lessons we can learn from the experience.

This particular post comes on the heels of Scott Hanselman’s casual challenge to clean up some of his code in as few lines as possible. The general requirements are as follows:

  1. Accept a path to a project
  2. For each of a few files in each project…
  3. Back up the file
  4. Perform a few string replacements in the file
  5. Save the updated version of the file

As I was looking over his code for some chances to optimize, it quickly became clear that the bulk of the “hard” stuff could be solved through some LINQ-supported functional programming. I posted a first draft, upon which others iterated, but his spam filter ate this new version so I thought it might be educational to walk through it here instead.

First, let’s define a few arrays that will specify the entirety of our configuration, along with the input array of project paths:

static void Main(string[] args)
{
    if (args.Length == 0) Console.WriteLine("Usage: ASPNETMVCUpgrader pathToProject1 [pathToProject2] [pathToProject3]");

    var configFiles = new[] { "web.config", @"Views\web.config" };
    var changes = new[] {
        new { Regex = new Regex(@"(?<1>System.Web.Mvc, Version=)1.0(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}1.1${2}"},
        new { Regex = new Regex(@"(?<1>System.Web.Routing, Version=)3.5(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}4.0${2}"} };

The regular expressions are based on those provided by commenter Dušan Radovanović. Next, we can use some LINQ to build a list of all our files to update:

    var filesToUpdate = from file in configFiles
                        from projectPath in args
                        let path = Path.Combine(projectPath, file)
                        where File.Exists(path)
                        select new { Path = path, Content = File.ReadAllText(path) };

If you’re not familiar with C# 3.0, here is what the query does, line by line:

  1. Let file be the current item in configFiles.
  2. Let projectPath be the current item in args.
  3. Let path be the combined value of projectPath and file.
  4. Only include path values for files that exist.
  5. Create new anonymous objects with Path and Content properties set to the path and file contents, respectively.

As with most LINQ operations, execution of this code will be deferred until filesToUpdate is enumerated.
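To make the deferral concrete, here is a minimal sketch that counts how often the where predicate runs. The class, method and counter names are made up for illustration and are not part of the upgrader; the practical consequence for the code above is that File.ReadAllText() is not called until the foreach loop starts pulling items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many times the 'where' predicate has actually run.
    public static int PredicateCalls;

    public static IEnumerable<string> BuildQuery(IEnumerable<string> names)
    {
        // Building the query only describes the work; nothing is filtered yet.
        return from name in names
               where Check(name)
               select name.ToUpperInvariant();
    }

    private static bool Check(string name)
    {
        PredicateCalls++;
        return name.EndsWith(".config", StringComparison.OrdinalIgnoreCase);
    }
}
```

Calling BuildQuery() leaves PredicateCalls at zero; only enumerating the result (with foreach, ToList(), First() and so on) runs the predicate once per source item.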

Now we’re ready to update our files. First, I’ll define a sequence of our possible backup file names, which will add “.backup_XX” to the file name.* Since the sequence is lazily evaluated, we can just call LINQ’s First() to find an available backup file name. Note that First() would throw an exception if all 100 files existed, as the backupFileNames sequence would be empty.

    foreach (var file in filesToUpdate)
    {
        var backupFileNames = from n in Enumerable.Range(0, 100)
                              let backupPath = string.Format("{0}.backup_{1:00}", file.Path, n)
                              where !File.Exists(backupPath)
                              select backupPath;

        File.Move(file.Path, backupFileNames.First());

Finally, we need to actually update the file content. To do that, we’ll use LINQ’s Aggregate operator:

        string newContent = changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement));
        File.WriteAllText(file.Path, newContent);
        Console.WriteLine("Done converting: {0}", file.Path);
    }
}

Aggregate takes two parameters: a seed value and a function that defines the aggregation. In our case, the seed value is of type string and the function is of type Func<string, 'a, string>, where 'a is our anonymous type with Regex and Replacement properties. In practice, this call is going to take our original content and apply each of our changes in succession, using the result of one replacement as the input to the next. In functional terminology, Aggregate is known as a left fold; for more on Aggregate and folds, see this awesome post by language guru Bart de Smet.
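If the fold is hard to picture, this small sketch shows the same Aggregate call next to the loop it replaces. It swaps the regular expressions for plain string.Replace() and uses KeyValuePair instead of the anonymous type, purely to keep the example self-contained; the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FoldSketch
{
    // Applies each (old, new) pair in order; the output of one replacement
    // becomes the input of the next, as in a left fold.
    public static string ApplyAll(string seed, IEnumerable<KeyValuePair<string, string>> changes)
    {
        return changes.Aggregate(seed, (s, c) => s.Replace(c.Key, c.Value));
    }

    // The same computation written as an explicit loop with an accumulator.
    public static string ApplyAllLoop(string seed, IEnumerable<KeyValuePair<string, string>> changes)
    {
        string acc = seed;
        foreach (var c in changes)
            acc = acc.Replace(c.Key, c.Value);
        return acc;
    }
}
```

With the changes ("1.0" to "2.0") followed by ("2.0" to "3.0"), the seed "Version=1.0" ends up as "Version=3.0", which shows that the second change sees the result of the first.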

What strikes me about this code is that it’s both terse and expressive. And for the purposes of the challenge, we can rewrite some of the queries in extension method syntax:

static void Main(string[] args)
{
  if (args.Length == 0) Console.WriteLine("Usage: ASPNETMVCUpgrader pathToProject1 [pathToProject2] [pathToProject3]");

  var configFiles = new[] { "web.config", @"Views\web.config" };
  var changes = new[] {
    new { Regex = new Regex(@"(?<1>System.Web.Mvc, Version=)1.0(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}1.1${2}"},
    new { Regex = new Regex(@"(?<1>System.Web.Routing, Version=)3.5(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}4.0${2}"} };

  var files = from path in configFiles.SelectMany(file => args, (file, arg) => Path.Combine(arg, file))
              where File.Exists(path) select new { Path = path, Content = File.ReadAllText(path) };

  foreach (var file in files)
    try
    {
      File.Move(file.Path, Enumerable.Range(0, 100).Select(n => string.Format("{0}.backup_{1:00}", file.Path, n)).First(p => !File.Exists(p)));
      File.WriteAllText(file.Path, changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement)));
      Console.WriteLine("Done converting: {0}", file.Path);
    }
    catch (Exception ex) { Console.WriteLine("Error with: {0}" + Environment.NewLine + "Exception: {1}", file.Path, ex.Message); }
}

* The original code had the most recent backup with extension .mvc10backup, with the next oldest backup called .mvc10backup2. My original version extended this concept to “unlimited” backups with old backups continuously incremented so the lower values were more recent. It could probably be improved, but I thought I’d include the adapted code here for completeness:

  foreach (var file in files)
    try
    {
      var backupPaths = Enumerable.Repeat<int?>(null, 1)
            .Concat(Enumerable.Range(2, int.MaxValue - 2).Select(i => (int?)i))
            .Select(i => Path.ChangeExtension(file.Path, ".mvc10backup" + i));
      string toCopy = file.Path;
      foreach (var f in backupPaths.TakeWhile(_ => toCopy != null))
      {
          string temp = null;
          if (File.Exists(f))
              File.Move(f, temp = f + "TEMP");
          File.Move(toCopy, f);
          toCopy = temp;
      }
      File.WriteAllText(file.Path, changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement)));
      Console.WriteLine("Done converting: {0}", file.Path);
    }
    catch (Exception ex) { Console.WriteLine("Error with: {0}" + Environment.NewLine + "Exception: {1}", file.Path, ex.Message); }
}

Refactoring with LINQ & Iterators: FindDescendantControl and GetDescendantControls

A while back I put together a quick and dirty implementation of a FindControl extension method:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    Control c = root;
    Queue<Control> q = new Queue<Control>();

    if (c == null || c.ID == id)
        return c as T;
    do
    {
        foreach (Control child in c.Controls)
        {
            if (child.ID == id)
                return child as T;
            if (child.HasControls())
                q.Enqueue(child);
        }
        c = q.Dequeue();
    } while (c != null);
    return null;
}

It got the job done (if the control exists!), but I think we can do better.

Refactoring with Iterators

My first concern is that the method is doing too much. Rather than searching for the provided ID, the majority of the code is devoted to navigating the control’s descendants. Let’s factor out that logic into its own method:

public static IEnumerable<Control> GetDescendantControls(this Control root)
{
    var q = new Queue<Control>();

    var current = root;
    while (true)
    {
        if (current != null && current.HasControls())
            foreach (Control child in current.Controls)
                q.Enqueue(child);

        if (q.Count == 0)
            yield break;

        current = q.Dequeue();
        yield return current;
    }
}

The new method is almost as long as the old one, but now satisfies the Single Responsibility Principle. I also added a check to prevent calling Dequeue() on an empty queue. For those that have studied algorithms, note that this is a breadth-first tree traversal.
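To see the breadth-first order without an ASP.NET page, here is a rough sketch of the same queue-based iterator over a hypothetical Node class that stands in for Control. Node, Children and Descendants are illustrative names only.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for System.Web.UI.Control.
public class Node
{
    public string Id;
    public List<Node> Children = new List<Node>();

    public Node(string id, params Node[] children)
    {
        Id = id;
        Children.AddRange(children);
    }
}

public static class TraversalSketch
{
    // Breadth-first: every node at depth n is yielded before any node at depth n + 1.
    public static IEnumerable<Node> Descendants(Node root)
    {
        var queue = new Queue<Node>();
        if (root != null)
            foreach (var child in root.Children)
                queue.Enqueue(child);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            yield return current;
            foreach (var child in current.Children)
                queue.Enqueue(child);
        }
    }
}
```

For a root with children a (which has c and d) and b (which has e), the iterator yields a, b, c, d, e: both direct children come out before any grandchild.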

Now we can update FindControl:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    Control c = root;

    if (c == null || c.ID == id)
        return c as T;

    foreach (Control child in c.GetDescendantControls())
    {
        if (child.ID == id)
            return child as T;
    }
    return null;
}

With the control tree traversal logic extracted, this updated version is already starting to smell better. But we’re not done yet.

DRY? Don’t Repeat Someone Else, Either

My second concern is how we’re checking for the ID in question. It’s not that the equality operator is a bad choice, as it will work in many scenarios, but rather that it’s not consistent with the existing FindControl method. In particular, the existing FindControl understands naming containers (IDs that contain ‘$’ or ‘:’). Rather than implement our own comparison logic, we should just leverage the framework’s existing implementation:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    if (id == null)
        throw new ArgumentNullException("id");

    if (root == null)
        return null;

    Control c = root.FindControl(id);
    if (c != null)
        return c as T;

    foreach (Control child in root.GetDescendantControls())
    {
        c = child.FindControl(id);
        if (c != null)
            return c as T;
    }
    return null;
}

Fun fact: FindControl will throw a NullReferenceException if id is null.

Refactoring with LINQ

So we have extracted the descendant logic and leaned on the framework for finding the controls, but I’m still not quite satisfied. The method just feels too…procedural. Let’s break down what we’re really trying to do:

  1. Look at the current control and all its descendants.
  2. Use FindControl on each with the specified ID.
  3. When we find the control, return it as type T.

As the subheading might suggest, we can express these steps quite nicely with LINQ:

  1. var controls = root.AsSingleton().Concat(root.GetDescendantControls());
  2. var foundControls = from c in controls
                        let found = c.FindControl(id)
                        where found != null
                        select found;
  3. return foundControls.FirstOrDefault() as T;

Behind the scenes, this is how I might have thought through this code:

  1. We use AsSingleton() (my new preferred name, to align with F#’s Seq.singleton, for AsEnumerable(), which I introduced here) and Concat() to prepend root to the list of its descendants, returned as a lazy enumeration.
  2. We use a query over those controls to retrieve matches from FindControl(), again returned as a lazy enumeration.
  3. We grab the first control found, or null if none match, and return it as T.

Because all our enumerations are lazy, we put off traversal of the entire control tree until we know we need to. In fact, if our ID is found in the root control, the body of GetDescendantControls() won’t even run: the call only creates an iterator, and Concat() never asks it for an element. Through just a bit of refactoring, we have a solution that is both efficient and readable.
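The laziness claim is easy to check with a small sketch. It uses strings instead of controls and a flag that is set only when the descendant iterator's body begins executing; all of the names here are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySketch
{
    // Set to true only when the iterator body starts executing.
    public static bool TraversalStarted;

    // Hypothetical stand-in for GetDescendantControls().
    public static IEnumerable<string> Descendants()
    {
        TraversalStarted = true;
        yield return "child1";
        yield return "child2";
    }

    // Stand-in for AsSingleton(): wraps one value in a sequence.
    public static IEnumerable<T> Singleton<T>(T value)
    {
        yield return value;
    }

    public static string FindFirst(string root, Func<string, bool> isMatch)
    {
        var candidates = Singleton(root).Concat(Descendants());
        return candidates.Where(isMatch).FirstOrDefault();
    }
}
```

When the predicate matches the root, FindFirst() returns before Concat() ever touches the second sequence, so TraversalStarted stays false. When the match is further down, the flag flips as soon as the first descendant is requested.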

For completeness, here’s the final version with a more descriptive name to contrast with the existing FindControl():

public static T FindDescendantControl<T>(this Control root, string id) where T : Control
{
    if (id == null)
        throw new ArgumentNullException("id");

    if (root == null)
        return null;

    var controls = root.AsSingleton().Concat(root.GetDescendantControls());

    var foundControls = from c in controls
                        let found = c.FindControl(id)
                        where found != null
                        select found;

    return foundControls.FirstOrDefault() as T;
}

I have added these methods, along with AsSingleton() and a host of others, to the SharePoint Extensions Lib project. Check it out!

TextReader.TryReadLine And Idiomatic Code

Today I reread a post by Eric Lippert on High Maintenance code. It’s an interesting read in general, but some of the comments got me thinking about idioms in code. The discussion started with this bit of C#:

string line;
while ((line = reader.ReadLine()) != null)
    yield return line;

A familiar pattern in C and C++, but it just doesn’t feel right for C#. Eric cites this as a flaw “because it tries to do so much in one line.” His rewrite uses the following instead:

while (true)
{
    string line = reader.ReadLine();
    if (line == null)
        yield break;
    yield return line;
}

While certainly easier to read, it still seems like a lot of work. Commenter Sebastien Lorion suggests this is a flaw in the API: “…we would not need such a hack or the awkward code you posted if the method was better designed.” I’m not sure I agree: ReadLine implies the line read will be returned, and if there’s nothing to read then nothing (null) should be returned. Sebastien’s “better design” doesn’t make sense idiomatically; what he really wants is TryReadLine, which would closely match his suggested signature of bool ReadLine(out string line). This strikes me as a perfect extension candidate:

public static bool TryReadLine(this TextReader reader, out string line)
{
    line = reader.ReadLine();
    return line != null;
}

With which we can write a more C#-idiomatic yet compact version of the original code:

public static IEnumerable<string> GetLines(this TextReader reader)
{
    string line;
    while (reader.TryReadLine(out line))
        yield return line;
}

Now one could argue whether or not TryReadLine() should catch the possible exceptions from ReadLine(), but that’s not the point. The point is that C# has an established pattern for methods that return a boolean indicating success in populating an out parameter, and code written as such is easier to read in a C# context.
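As a quick illustration of how naturally this sits next to the framework's own Try methods, here is a sketch that pairs TryReadLine() with int.TryParse(). The extension is repeated so the block stands alone, and SumNumericLines is a made-up helper for the example.

```csharp
using System.IO;

public static class TextReaderSketch
{
    public static bool TryReadLine(this TextReader reader, out string line)
    {
        line = reader.ReadLine();
        return line != null;
    }

    // Adds up every line that parses as an integer and ignores the rest.
    public static int SumNumericLines(string text)
    {
        int total = 0;
        using (var reader = new StringReader(text))
        {
            string line;
            while (reader.TryReadLine(out line))
            {
                int value;
                if (int.TryParse(line, out value))   // the framework's own Try pattern
                    total += value;
            }
        }
        return total;
    }
}
```

Both calls read the same way: a boolean answers "did it work?" and the out parameter carries the result, so a reader who knows one immediately understands the other.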

It’s also interesting to note how the coding patterns can evolve with language features. For example, F# happens to have an Option type that aligns perfectly with this pattern: it either has Some value or is None. Paired with pattern matching, this presents a superior alternative to booleans and out parameters. So in idiomatic F# we would write TryReadLine like this instead:

namespace System.IO
    [<AutoOpen>]
    module TextReaderExtensions =
        type System.IO.TextReader with
            member r.TryReadLine() =
                match r.ReadLine() with
                | null  -> None
                | line  -> Some line

Which could be used recursively to fetch a sequence of lines:

            member r.Lines =
                let rec lines_core (tr : TextReader) =
                    seq {
                        match tr.TryReadLine() with
                        | None   -> yield! Seq.empty
                        | Some l -> yield l; yield! lines_core tr
                    }
                lines_core r

Eric ends his article with excellent advice: “Whenever you write a method think about the contract of that method.” As a corollary to that, whenever you write a method think about idioms that fit the method contract. An important part of maintainability is writing code that makes sense in the context of other code that has been written. Clever code that nobody can read isn’t particularly clever.


Elegant Inline Debug Tracing

As much fun as it is to step through code with a debugger, I usually prefer to use System.Diagnostics.Debug and Trace with DebugView to see what’s happening in realtime. This is particularly handy to track intermediate results in higher-order functions that you might not be able to step into. However, it’s not always convenient to insert debugging statements amongst the composed expressions of F#, PowerShell or LINQ.

An alternative first came to mind while working in F#:

let dbg x = System.Diagnostics.Debug.WriteLine(x |> sprintf "%A"); x

(Read |> as “as next parameter to”.) We can then use this function anywhere to peek at a value, perhaps an intermediate list in this trivial example:

let data = [1..10]
           |> List.filter (fun i -> i%3 = 0) |> dbg
           |> List.map (fun i -> i*i)

Indeed [3; 6; 9] are traced as multiples of three. Not a particularly convincing example, but it should be pretty easy to imagine a more complex algorithm for which unintrusive tracing would be useful.

This works pretty well with F#’s |> operator to push values forward, but what about C#? Given my posting history, it shouldn’t be hard to guess where I’m going with this…

Extension Methods

So if |> is “as next parameter to”, the . of an extension method call might read “as first parameter to”. So we can implement a roughly equivalent function (sans F#’s nice deep-print formatter "%A") like so:

    public static T Dbg<T>(this T value)
    {
        Debug.WriteLine(value);
        return value;
    }

    public static T Dbg<T>(this T value, string category)
    {
        Debug.WriteLine(value, category);
        return value;
    }

I find the optional category handy to keep different traces separate; Debug.WriteLine already has an overload that accepts one, so the second method simply passes it through. So why might this be useful? Maybe we want to log the value assigned within an object initializer:

var q = new SPQuery() {
  Query = GetMyQuery().Dbg("Query")
};

Rather than store the query string to a temporary variable or retrieve the property after it’s been set, we can just trace the value inline. Or consider a LINQ example:

var items = from SPListItem item in list.GetItems(q)
            let url = new SPFieldUrlValue(item["URL"] as string)
            where url.Url.Dbg("URL").StartsWith(baseUrl, StringComparison.OrdinalIgnoreCase)
            select new
            {
                Title = item.Title.Dbg("Title"),
                Description = url.Description,
            };

Here we log all URLs that pass through, even the ones excluded from the result by the predicate. This would be much harder to implement efficiently without inline logging.

This technique works great for simple objects with a useful ToString(), but what about more complex objects? As has often been the answer lately, we can use higher-order functions:

    public static T Dbg<T, R>(this T value, Func<T, R> selector)
    {
        Debug.WriteLine(selector(value));
        return value;
    }

    public static T Dbg<T, R>(this T value, string category, Func<T, R> selector)
    {
        Debug.WriteLine(selector(value), category);
        return value;
    }

Now we can provide a delegate to trace whatever we want without affecting the object itself. For example, we can easily trace a row count for the DataView being returned:

public DataView GetResults()
{
    var myTable = GetDataTable();
    // Process data...
    return myTable.DefaultView.Dbg("Result Count", v => v.Count);
}

I could go on, but you get the idea.
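For anyone without SharePoint handy, here is a sketch of the earlier F# pipeline translated to LINQ-to-objects. The category/selector overload is repeated so the block compiles on its own, and SquaresOfMultiplesOfThree is a hypothetical name. Keep in mind that Debug.WriteLine is marked [Conditional("DEBUG")], so the trace (including the selector call) disappears from builds that don't define DEBUG; the value still flows through either way.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DbgSketch
{
    public static T Dbg<T, R>(this T value, string category, Func<T, R> selector)
    {
        Debug.WriteLine(selector(value), category);
        return value;   // pass the value through untouched
    }

    // Filter, trace each surviving element, then square it.
    public static List<int> SquaresOfMultiplesOfThree(IEnumerable<int> source)
    {
        return source.Where(i => i % 3 == 0)
                     .Select(i => i.Dbg("multiple of 3", x => x))
                     .Select(i => i * i)
                     .ToList();
    }
}
```

Because the trace sits inside the lazy pipeline, each element is logged as it is pulled through rather than in one batch at the end.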

PowerShell Filter

Finally, we can implement similar functionality in PowerShell using a filter with an optional scriptblock parameter:

filter Debug([scriptblock] $sb = { $_ })
{
  [Diagnostics.Debug]::WriteLine((& $sb))
  $_
}

PS > 1..3 | Debug { $_*2 } | %{ $_*$_ }
1
4
9

Which traces 2, 4, 6, as expected.

Update 4/19/2009: Changed functions to use category overloads. And another point to consider: if the value being traced could be null, selector should be designed accordingly to avoid NullReferenceException. There’s nothing worse than bugs introduced by tracing or logging.

Thinking Functional: Using

In the comments of my last post, Peter Seale pointed me to Matthew Podwysocki’s implementation of GenerateUsing as a functional abstraction of using. I like the idea, but the Generator pattern isn’t particularly useful for common SharePoint tasks. However, I think we can get some value from looking at a more generalized solution.

But first, let me suggest a minor correction to Matt’s version of Generate, at least if it’s going to be used to fulfill an IDisposable contract:

public static IEnumerable<TResult> Generate<T, TResult>(Func<T> opener,
                                                        Func<T, Option<TResult>> generator,
                                                        Action<T> closer)
{
    var openerResult = opener();
    bool stop = false;

    while (true)
    {
        var res = Option<TResult>.None;
        try
        {
            res = generator(openerResult);
        }
        finally
        {
            if (stop = res.IsNone)
                closer(openerResult);
        }
        if (stop)
            yield break;

        yield return res.Value;
    }
}

The stop “hack” is needed because you can’t yield from a finally clause. It seems to me that a closer that might not get called isn’t much of a closer, or am I missing something?

So how else might we use this opener/closer idea? How about something like this:

public static void Process<T>(Func<T> opener,
                              Action<T> action,
                              Action<T> closer)
{
    T openerResult = opener();
    try
    {
        action(openerResult);
    }
    finally
    {
        if (closer != null)
            closer(openerResult);
    }
}

public static void Using<T>(Func<T> opener,
                            Action<T> action
                           ) where T : IDisposable
{
    Process(opener, action, x => x.Dispose());
}

We have now abstracted the idea of a using statement: get the object, do something with it, Dispose(). Abstraction in hand, let’s apply it to SharePoint:

public static void ProcessWeb(this SPSite site,
                              string url,
                              Action<SPWeb> action)
{
    Using(() => site.OpenWeb(url), action);
}

Now, one could argue that we haven’t gained much over the obvious implementation:

    using(SPWeb web = site.OpenWeb(url))
        action(web);

But in truth, the vast majority of functional code has a non-functional alternative. It’s just a different thought process. In the former, we specify what we’re trying to do: use the result of site.OpenWeb() to do action. In the latter, we specify how to do it: use an SPWeb named web, assigned from site.OpenWeb(), to perform action. I’m not saying either approach is more correct, just different means to the same end.
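To convince yourself that the abstraction keeps the guarantee of a using statement, here is a sketch with a hypothetical TrackedResource that records whether Dispose() was called. Using is written out directly here, with a null check to mirror what the using statement does, so the block stands alone.

```csharp
using System;

// Hypothetical resource used only for illustration.
public class TrackedResource : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

public static class UsingSketch
{
    public static void Using<T>(Func<T> opener, Action<T> action) where T : IDisposable
    {
        T resource = opener();
        try
        {
            action(resource);
        }
        finally
        {
            // Runs whether action completes or throws.
            if (resource != null)
                resource.Dispose();
        }
    }
}
```

Whether the action returns normally or throws, Disposed ends up true, and any exception still propagates to the caller.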

Performing actions is all well and good, but we often want to get something back as well:

public static TResult Select<T, TResult>(Func<T> opener, Func<T, TResult> selector, Action<T> closer)
{
    T openerResult = opener();
    try
    {
        return selector(openerResult);
    }
    finally
    {
        if (closer != null)
            closer(openerResult);
    }
}

public static TResult SelectUsing<T, TResult>(Func<T> opener,
                                              Func<T, TResult> selector
                                             ) where T : IDisposable
{
    return Select(opener, selector, x => x.Dispose());
}

public static TResult SelectFromWeb<TResult>(this SPSite site,
                                             string url,
                                             Func<SPWeb, TResult> selector)
{
    return SelectUsing(() => site.OpenWeb(url), selector);
}

What do you think? Useful?

Generic Method Invocation with Expression Trees

It’s against the rules and completely unsupported, but sometimes it’s just so much easier to use a base class’s private/internal members. Reflection has always been an option, but performance is less than ideal. Lightweight Code Generation is an option, but emitting IL isn’t for everyone. Since .NET 3.5 came out, there have been several discussions of using expression trees as a developer-friendly yet efficient alternative. There is an up-front cost to compile the expression into IL, but the resulting delegate can be reused with performance very close to direct invocation.

Alkampfer provides a great overview of expression tree method invocation in this article, which inspired this more general solution.

First, let’s set up a cache to store our compiled delegates. I didn’t put much effort into making it efficiently thread-safe, but suggestions are certainly welcome.

private static Dictionary<string, Delegate> accessors = new Dictionary<string, Delegate>();
private static object accessorLock = new object();
private static D GetCachedAccessor<D>(string key)
                 where D : class // Constraint cannot be special class 'System.Delegate'
{
    D result = null;
    Delegate cachedDelegate;
    lock (accessorLock)
    {
        if (accessors.TryGetValue(key, out cachedDelegate))
        {
            Debug.WriteLine("Found cache entry for " + key);
            result = cachedDelegate as D;
        }
    }
    return result;
}
private static void SetCachedAccessor(string key, Delegate value)
{
    if (value != null)
        lock (accessorLock)
        {
            accessors[key] = value;
        }
}

GetFieldAccessor

Now we can dive into our expression trees. As a warm-up, here’s a relatively simple cached field accessor, inspired by Roger Alsing’s great post:

public static Func<T, R> GetFieldAccessor<T, R>(string fieldName)
{
    Type typeT = typeof(T);

    string key = string.Format("{0}.{1}", typeT.FullName, fieldName);
    Func<T, R> result = GetCachedAccessor<Func<T, R>>(key);

    if (result == null)
    {
        var param = Expression.Parameter(typeT, "obj");
        var member = Expression.PropertyOrField(param, fieldName);
        var lambda = Expression.Lambda<Func<T, R>>(member, param);

        Debug.WriteLine("Caching " + key + " : " + lambda.Body);
        result = lambda.Compile();
        SetCachedAccessor(key, result);
    }
    return result;
}

The method returns a function that will accept an object of type T and return its fieldName property with type R. For example, we can wrap this in an extension method to check if an SPWeb has been disposed:

public static bool GetIsClosed(this SPWeb web)
{
    return GetFieldAccessor<SPWeb, bool>("m_closed")(web);
}

Because the delegate is cached, successive calls of GetFieldAccessor() will immediately return the necessary delegate without recompilation.
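If you want to try the idea outside SharePoint, here is a stripped-down sketch without the cache, using a hypothetical Widget class in place of SPWeb. It assumes, as the code above does, that the runtime lets a compiled expression tree read a non-public field.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type with a private field, standing in for SPWeb and m_closed.
public class Widget
{
    private bool m_closed;
    public void Close() { m_closed = true; }
    public override string ToString() { return m_closed ? "closed" : "open"; }
}

public static class AccessorSketch
{
    // Builds obj => obj.<fieldName> and compiles it to a reusable delegate.
    public static Func<T, R> BuildFieldAccessor<T, R>(string fieldName)
    {
        ParameterExpression param = Expression.Parameter(typeof(T), "obj");
        MemberExpression member = Expression.PropertyOrField(param, fieldName);
        return Expression.Lambda<Func<T, R>>(member, param).Compile();
    }
}
```

BuildFieldAccessor<Widget, bool>("m_closed") gives back a delegate that returns false for a new Widget and true after Close(). Compile() is the expensive step, which is why the version above caches the resulting delegate.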

GetMethodAccessor

Building a method accessor is a bit trickier because of the various combinations of parameter and return types. One option is to explicitly define overloads for various method signatures, as seen in the article referenced earlier. Instead, I figure we can let the caller specify the desired delegate signature and figure out the intended method based on that.

public static D GetMethodAccessor<D>(string methodName, BindingFlags bindingAttr)
                where D : class // Constraint cannot be special class 'System.Delegate'
{
    Type[] args = typeof(D).GetGenericArguments();
    Type objType = args[0];

    Type[] argTypes = args.Skip(1).ToArray();
    string[] argTypesArray = argTypes.Select(t => t.Name).ToArray();
    string key = string.Format("{0}.{1}({2})", objType.FullName, methodName, string.Join(",", argTypesArray));

    D result = GetCachedAccessor<D>(key);
    if (result == null)
    {
        MethodInfo mi = objType.GetMethod(methodName, bindingAttr, null, argTypes, null);

        if (mi == null || mi.ReturnType != typeof(void))
        {
            argTypes = argTypes.Take(argTypesArray.Length - 1).ToArray();
            mi = objType.GetMethod(methodName, bindingAttr, null, argTypes, null);
        }

        if (mi == null)
            throw new ArgumentException("Could not find appropriate overload.", methodName);

        var param = Expression.Parameter(objType, "obj");
        var arguments = argTypes.Select((t, i) => Expression.Parameter(t, "p" + i)).ToArray();
        var invoke = Expression.Call(param, mi, arguments);
        var lambda = Expression.Lambda<D>(invoke, param.AsEnumerable().Concat(arguments));

        Debug.WriteLine("Caching " + key + " : " + lambda.Body);
        result = lambda.Compile();
        SetCachedAccessor(key, result as Delegate);
    }
    return result;
}

As you can see, we depend heavily on the generic arguments of the delegate type. This means passing a non-generic delegate type to this function won’t work, because GetGenericArguments() would return an empty array; it needs to be Func, Action, or something compatible with the expected argument structure. So what is that structure? The processing logic works like this:

  1. Take the first generic argument as the type whose method we are going to invoke.
  2. Fetch an array, skipping the first argument, that we pass to GetMethod as the argument types.
  3. If GetMethod can’t find an appropriate overload, or if the method’s return type is not void, then we shouldn’t have used all of the arguments as parameters.
  4. Redefine our parameter array without the last argument – this is our delegate’s non-void return type.
  5. Try GetMethod again with the trimmed array; throw if we still don’t find a match.

Once we have the details of our method, we can build the expression tree. I use the mapi-style Select overload to build an array of typed parameters named p0, p1, etc., which is then passed to Expression.Call to represent the method invocation. Finally, Expression.Lambda expects a list of all parameters including the instance param. Rather than allocate an intermediate data structure, I use a trick I picked up from Keith Rimington:

public static IEnumerable<T> AsEnumerable<T>(this T obj)
{
    yield return obj;
}

By turning param into a single-element IEnumerable<ParameterExpression>, we can simply Concat the rest of the arguments. Beautiful.
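Here is the expression-building half on its own, as a sketch against a hypothetical Counter class with a private Add method. The delegate type is fixed instead of inferred, there is no cache, and a one-element array plays the role of AsEnumerable(); the point is only to show the instance parameter being placed ahead of the method arguments.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical type with a non-public method.
public class Counter
{
    private int total;
    private int Add(int amount) { total += amount; return total; }
}

public static class MethodCallSketch
{
    // Builds (obj, p0) => obj.Add(p0) as a compiled Func<Counter, int, int>.
    public static Func<Counter, int, int> BuildAdd()
    {
        MethodInfo mi = typeof(Counter).GetMethod(
            "Add", BindingFlags.Instance | BindingFlags.NonPublic,
            null, new[] { typeof(int) }, null);

        ParameterExpression obj = Expression.Parameter(typeof(Counter), "obj");
        ParameterExpression[] arguments = { Expression.Parameter(typeof(int), "p0") };

        MethodCallExpression call = Expression.Call(obj, mi, arguments);

        // Lambda parameters: the instance first, then the method arguments.
        var allParams = new[] { obj }.Concat(arguments);
        return Expression.Lambda<Func<Counter, int, int>>(call, allParams).Compile();
    }
}
```

Calling the returned delegate as add(counter, 2) and then add(counter, 40) returns 2 and then 42, the same results a direct call to the private method would produce.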

GetMethodAccessor Usage

The usage is a bit more complex than GetFieldAccessor, but still quite manageable:

public static bool SetBoolValue(this SPField field, string attrName, bool attrValue)
{
    Func<SPField, string, bool, bool> lambda =
        GetMethodAccessor<Func<SPField, string, bool, bool>>("SetFieldBoolValue",
                                                             BindingFlags.Instance | BindingFlags.NonPublic);
    return lambda(field, attrName, attrValue);
}

The intermediate variable is unnecessary, but makes it easier to see what’s going on. SPField.SetFieldBoolValue is of type Func<string, bool, bool>, so our delegate needs to be Func<SPField, string, bool, bool> to accept the instance variable first. The parameters for GetMethodAccessor are identical to what we would pass to field.GetType().GetMethod() if we were using normal reflection. Then we invoke lambda to effectively call field.SetFieldBoolValue(attrName, attrValue).

For methods that return void, we just pass an Action type instead:

public static void SetHidden(this SPField field, bool value)
{
    GetMethodAccessor<Action<SPField, bool>>("SetHidden", BindingFlags.Instance | BindingFlags.NonPublic)(field, value);
}

And these can be used like any other extension methods:

SPField field = GetField();
field.SetBoolValue("CanToggleHidden", !field.CanToggleHidden);
field.SetBoolValue("CanBeDeleted", !field.CanBeDeleted);
field.SetHidden(!field.Hidden);
field.Update();

Which will show the following in DebugView:

Caching Microsoft.SharePoint.SPField.SetFieldBoolValue(String,Boolean,Boolean) : obj.SetFieldBoolValue(p0, p1)
Found cache entry for Microsoft.SharePoint.SPField.SetFieldBoolValue(String,Boolean,Boolean)
Caching Microsoft.SharePoint.SPField.SetHidden(Boolean) : obj.SetHidden(p0)

Because of the penalty for compilation, this technique is not right for all situations. But for frequent access to inaccessible members, it might be worth a try.

LINQ Tip: Enumerable.OfType

In the past I’ve mentioned LINQ’s Cast<T>() as an efficient way to convert a SharePoint collection into an IEnumerable<T> that has access to LINQ’s various extension methods. Fundamentally, Cast<T>() is implemented like this:

public static IEnumerable<T> Cast<T>(this IEnumerable source)
{
  foreach(object o in source)
    yield return (T) o;
}

Using an explicit cast performs well, but will result in an InvalidCastException if the cast fails. A less efficient yet useful variation on this idea is OfType<T>():

public static IEnumerable<T> OfType<T>(this IEnumerable source)
{
  foreach(object o in source)
    if(o is T)
      yield return (T) o;
}

The returned enumeration will only include elements that can safely be cast to the specified type. Why would this be useful?
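
For a quick look at the difference without a SharePoint farm at hand, the sketch below runs both operators over a mixed object array. OfTypeSketch, StringsOnly and CastThrows are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class contrasting Cast<T>() and OfType<T>().
public static class OfTypeSketch
{
    // Keeps only the elements that really are strings; other types and nulls are skipped.
    public static List<string> StringsOnly(IEnumerable source)
    {
        return source.OfType<string>().ToList();
    }

    // Reports whether Cast<string>() fails while the sequence is being enumerated.
    public static bool CastThrows(IEnumerable source)
    {
        try
        {
            source.Cast<string>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object[] mixed = { "a", 1, "b", null };

        Console.WriteLine(string.Join(",", StringsOnly(mixed).ToArray())); // a,b
        Console.WriteLine(CastThrows(mixed));                              // True
    }
}
```

Note that the exception from Cast<T>() only surfaces once the sequence is enumerated, since both operators are lazy.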

Example 1: SPWindowsServiceInstance

SharePoint, especially with MOSS, has several different services that can run on the various servers in a farm. We know where our web services are running, but where are the various Windows services running?

var winsvc = from svr in SPFarm.Local.Servers
             from inst in svr.ServiceInstances.OfType<SPWindowsServiceInstance>()
             select new
             {
                 Server = svr.Name,
                 ID = inst.Id,
                 ServiceType = inst.Service.GetType().Name
             };

Example 2: SPDocumentLibrary

SharePoint provides a few special subclasses of SPList for specific kinds of lists. These include SPDocumentLibrary, SPPictureLibrary and the essentially obsolete SPIssueList. We can use OfType() to retrieve only lists of a certain type, like this LINQified MSDN sample that enumerates all files in a site collection’s libraries, excluding catalogs and form libraries:

SPSite site = SPContext.Current.Site;
var docs = from web in site.AllWebs.AsSafeEnumerable()
           from lib in web.Lists.OfType<SPDocumentLibrary>()
           from SPListItem doc in lib.Items
           where !lib.IsCatalog && lib.BaseTemplate != SPListTemplateType.XMLForm
           select new { WebTitle = web.Title, ListTitle = lib.Title,
                        ItemTitle = doc.Fields.ContainsField("Title") ? doc.Title : "" };

foreach (var doc in docs)
  Label1.Text += SPEncode.HtmlEncode(doc.WebTitle) + " -- " +
                 SPEncode.HtmlEncode(doc.ListTitle) + " -- " +
                 SPEncode.HtmlEncode(doc.ItemTitle) + "<BR>";

Example 3: SPFieldUser

Finally, let’s pull a list of all user fields attached to lists in the root web. This could also be used to easily find instances of a custom field type.

var userFields = from SPList list in site.RootWeb.Lists
                 from fld in list.Fields.OfType<SPFieldUser>()
                 select new
                 {
                     ListTitle = list.Title,
                     FieldTitle = fld.Title,
                     InternalName = fld.InternalName,
                     PresenceEnabled = fld.Presence
                 };

Contrived examples, perhaps, but potentially useful nonetheless.

Re: Abstracting Away From Exceptions

I was going to comment on this post, but ended up writing way too much so I figure I’ll post my thoughts here instead.

In the comments, Hristo Deshev suggested making the method generic rather than the class. I figure, why not make the exception generic as well?

public static class Adverse
{
  public static TResult Call<TResult, TException>(Func<TResult> attempt,
                                                  Func<TException, TResult> recover
                                                 ) where TException : Exception
  {
    try { return attempt(); }
    catch (TException e) { return recover(e); }
  }
}

Regarding Aaron Powell’s concern about not wanting to handle all exceptions the same, you can always rethrow an exception you don’t recognize:

var res = Adverse.Call(() => { return ""; },
  (ArgumentException ae) =>
  {
    if (ae is ArgumentNullException)
      return "Null";
    else if (ae is ArgumentOutOfRangeException)
      return "OOR";
    else
      throw ae;
  });

And finally, a few comments pointed out that a method that doesn’t accept arguments isn’t really a function. So I set out to add support for an argument. Here is my first attempt:

public static TResult Call<TArg, TResult, TException>(TArg arg,
                                                      Func<TArg, TResult> attempt,
                                                      Func<TArg, TException, TResult> recover
                                                     ) where TException : Exception
{
    try { return attempt(arg); }
    catch (TException e) { return recover(arg, e); }
}

It gets the job done, but I found the usage rather awkward:

var r1 = Call(5,
    i => string.Format("{0} squared is {1}", i, i * i),
    (int i, ArithmeticException e) => string.Format("Could not square {0}", i));

This got me thinking about how F# supports currying and partial function application, in other words functions that return functions with fewer arguments. Applied here, we can write a method that returns a function from TArg to TResult:

public static Func<TArg, TResult> Call<TArg, TResult, TException>(Func<TArg, TResult> attempt,
                                                                  Func<TArg, TException, TResult> recover
                                                                 ) where TException : Exception
{
    return (arg) =>
    {
        try { return attempt(arg); }
        catch (TException e) { return recover(arg, e); }
    };
}

So instead of accepting an argument in Call(), we pass the argument into the function it returns:

var snd = Call(s => string.Format("The second character of {0} is {1}", s, s[1]),
    (string s, IndexOutOfRangeException e) => string.Format("{0} has no 2nd character", s));

foreach (string s in new string[] { "ab", "c", "de" })
    Console.WriteLine(snd(s));

Or an argument can be passed directly:

var str = Call(o => o.ToString(), (object o, NullReferenceException e) => "(null)")(myObj);

This could easily be extended to support functions with multiple arguments.
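
As a rough sketch of what that extension might look like for two arguments, the overload below follows the same shape as the single-argument version. The class name Adverse2 and the division example are assumptions made for illustration only.

```csharp
using System;

// Hypothetical two-argument variant of the Call pattern shown above.
public static class Adverse2
{
    public static Func<T1, T2, TResult> Call<T1, T2, TResult, TException>(
        Func<T1, T2, TResult> attempt,
        Func<T1, T2, TException, TResult> recover) where TException : Exception
    {
        // Return a function that closes over both delegates.
        return (a, b) =>
        {
            try { return attempt(a, b); }
            catch (TException e) { return recover(a, b, e); }
        };
    }
}

public static class Adverse2Demo
{
    public static void Main()
    {
        // Explicit lambda parameter types let the compiler infer every type argument.
        var divide = Adverse2.Call(
            (int x, int y) => x / y,
            (int x, int y, DivideByZeroException e) => -1);

        Console.WriteLine(divide(10, 2)); // 5
        Console.WriteLine(divide(1, 0));  // -1
    }
}
```

Each additional argument needs its own overload, much like the Func family itself.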

I’m not sure if I’ll ever get around to using this pattern, but Vesa Karvonen provides good justification for doing so:

…a try-catch statement has to produce a result via a side-effect, because it doesn’t return any result (it returns unit/void). For example, it has to update a variable declared outside of the try-catch statement. Whether or not both the tried statement and the handler statement update the variable is not apparent at the type level.

Thoughts?

Update 1/11/2009:

Paul suggests using a params array of “recover” delegates to handle different Exception types. I looked into it at first, but without covariance the delegates couldn’t be strongly typed (can’t assign Func<string, ArgumentException, bool> to Func<string, Exception, bool>). And if they’re all of type Func<string, Exception, bool>, how do we pick which one to use?

I don’t have VS 2010, but I would think C# 4.0 could resolve this, assuming Func<T1, T2, TResult> is redefined as Func<in T1, in T2, out TResult>.

With covariance, we could do something like this:

public static Func<TArg, TResult> Call<TArg, TResult>(Func<TArg, TResult> attempt,
                                                      params Func<TArg, Exception, TResult>[] recover)
{
  return (arg) =>
  {
    try { return attempt(arg); }
    catch (Exception e)
    {
      foreach(var r in recover)
      {
        Type rType = r.GetType().GetGenericArguments()[1];
        if (rType.IsInstanceOfType(e))
          return (TResult)r.DynamicInvoke(arg, e);
      }
      throw; // rethrow the original exception without resetting its stack trace
    }
  };
}

As with try…catch, more specific types would need to be listed first. It would be great if someone with access to C# 4.0 could give this a try…
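
One caveat: a type parameter marked "in" is contravariant. That lets a Func<TArg, Exception, TResult> stand in where a Func<TArg, ArgumentException, TResult> is expected, but not the other way around, so variance alone may not be enough here. The sketch below takes a different route and needs no variance or DynamicInvoke: each strongly typed delegate is wrapped together with a type test before it goes into the params array. Handler, AdverseMulti and On are hypothetical names, and this is only one possible design.

```csharp
using System;

// Hypothetical wrapper: pairs a type test with a recover delegate that has
// already been adapted to accept a plain Exception.
public sealed class Handler<TArg, TResult>
{
    public readonly Func<Exception, bool> CanHandle;
    public readonly Func<TArg, Exception, TResult> Recover;

    public Handler(Func<Exception, bool> canHandle,
                   Func<TArg, Exception, TResult> recover)
    {
        CanHandle = canHandle;
        Recover = recover;
    }
}

public static class AdverseMulti
{
    // Captures TException while the delegate is still strongly typed.
    public static Handler<TArg, TResult> On<TArg, TException, TResult>(
        Func<TArg, TException, TResult> recover) where TException : Exception
    {
        return new Handler<TArg, TResult>(
            e => e is TException,
            (arg, e) => recover(arg, (TException)e));
    }

    public static Func<TArg, TResult> Call<TArg, TResult>(
        Func<TArg, TResult> attempt,
        params Handler<TArg, TResult>[] handlers)
    {
        return arg =>
        {
            try { return attempt(arg); }
            catch (Exception e)
            {
                // First matching handler wins, so list more specific types first.
                foreach (var h in handlers)
                    if (h.CanHandle(e))
                        return h.Recover(arg, e);
                throw; // no handler matched: rethrow, keeping the stack trace
            }
        };
    }

    public static void Main()
    {
        var parse = Call(
            (string s) => int.Parse(s),
            On((string s, FormatException e) => -1),
            On((string s, ArgumentNullException e) => -2));

        Console.WriteLine(parse("42"));  // 42
        Console.WriteLine(parse("abc")); // -1
        Console.WriteLine(parse(null));  // -2
    }
}
```

An exception that matches none of the handlers, such as the OverflowException from parsing a number that is too large, propagates to the caller.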

Which StringComparison? Ordinal!

I’ve never had to deal with localization, so I haven’t put much thought into the various .NET internationalization features. In the case of StringComparison, I should have done my homework. I’ve seen code samples that use any combination of CurrentCulture, InvariantCulture and Ordinal, but MSDN is very clear: Ordinal and OrdinalIgnoreCase are almost always the right choice. This is especially true for strings like URLs that should never have non-ASCII characters anyway.

The specific MSDN recommendations are worth repeating:

  • DO: Use StringComparison.Ordinal or OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
  • DO: Use StringComparison.Ordinal and OrdinalIgnoreCase comparisons for increased speed.
  • DO: Use StringComparison.CurrentCulture-based string operations when displaying the output to the user.
  • DO: Switch current use of string operations based on the invariant culture to use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase when the comparison is linguistically irrelevant (symbolic, for example).
  • DO: Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison.
  • DON’T: Use overloads for string operations that don’t explicitly or implicitly specify the string comparison mechanism.
  • DON’T: Use StringComparison.InvariantCulture-based string operations in most cases; one of the few exceptions would be persisting linguistically meaningful but culturally-agnostic data.

First, note that Ordinal comparisons are significantly faster, “essentially a [byte-wise] C runtime strcmp.”

More importantly, note the recommendation to specify the comparison mechanism whenever possible, as different methods have different default behavior. In BCL 2.0, String.Equals is Ordinal by default, but the majority (Compare, IndexOf, StartsWith, etc.) use CurrentCulture. InfoQ recently reported that these defaults will change in .NET 4.0; in fact, the shift has already started with BCL 2.0.5 that shipped with Silverlight 2.0.

For example, in mscorlib, Version=2.0.0.0:

public int IndexOf(string value)
{
    return CultureInfo.CurrentCulture.CompareInfo.IndexOf(this, value);
}

But in Silverlight’s mscorlib, Version=2.0.5.0:

public int IndexOf(string value)
{
    return this.IndexOf(value, StringComparison.Ordinal);
}

So be careful porting string-manipulation code for “linguistically meaningful but culturally-agnostic data,” and other data too, into Silverlight. If you are at all interested in internationalization, the MSDN article is definitely worth a read. And even if you’re not, at least remember to specify StringComparison types.
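
In practice, specifying the comparison usually means reaching for the overloads that take a StringComparison. A minimal sketch, with ComparisonSketch, IsHttps and FindToken as hypothetical helper names:

```csharp
using System;

// Hypothetical helpers that always state the comparison explicitly.
public static class ComparisonSketch
{
    // URL schemes are symbolic, not linguistic, so an ordinal match is appropriate.
    public static bool IsHttps(string url)
    {
        return url.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
    }

    // Ordinal search behaves the same no matter which culture the thread runs under.
    public static int FindToken(string text, string token)
    {
        return text.IndexOf(token, StringComparison.Ordinal);
    }

    public static void Main()
    {
        Console.WriteLine(IsHttps("HTTPS://example.com/")); // True
        Console.WriteLine(FindToken("key=value", "="));      // 3
        Console.WriteLine(string.Equals("FILE", "file",
            StringComparison.OrdinalIgnoreCase));            // True
    }
}
```

The often-cited failure case for culture-sensitive defaults is Turkish, where the uppercase of "i" is a dotted capital İ. Under a Turkish culture, a culture-sensitive, case-insensitive comparison of "file" and "FILE" can therefore report a mismatch, while the ordinal version above does not depend on the current culture.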