Is Functional Abstraction Too Clever?

I received a rather interesting comment on a recent Stack Overflow answer:

This code seems too clever by half. Is it art? – PeterAllenWebb

The code in question was a functional solution to an algorithm described approximately as follows:

Draw n−1 numbers at random, in the range 1 to m−1. Add 0 and m to the list and order these numbers. The difference between each two consecutive numbers gives you a return value.

Which I solved like this, with n = slots and m = max:

static int[] GetSlots(int slots, int max)
{
    return new Random().Values(1, max)
                       .Take(slots - 1)
                       .Append(0, max)
                       .OrderBy(i => i)
                       .Pairwise((x, y) => y - x)
                       .ToArray();
}

Using a few extension methods:

  • Values() returns an infinite sequence of random values within the specified range.
  • Append() takes a params array and appends its arguments to the original sequence.
  • Pairwise() generates a sequence from calculations on pairs of consecutive elements in the original sequence.

I can see how one would think the code is clever; however, I’m not sure what would qualify it as too clever. Every method call has a well-defined purpose and maps directly to part of the original algorithm:

  1. From random numbers in the range 1 to m−1…
  2. …draw n−1.
  3. Add 0 and m to the list…
  4. …and order these numbers.
  5. The difference between each two consecutive numbers…
  6. …gives you a return value [in the array].

As far as I’m concerned, a solution couldn’t get much clearer than this, but that’s easy enough for me to say—what do you think? Is there a better way to express the algorithm? Would an imperative solution with shared state be more readable? How about maintainable?

For example, one could add the requirement that the random numbers not be repeated so that the difference between adjacent numbers is always nonzero. Updating the functional solution is as simple as adding a Distinct() call:

    return new Random().Values(1, max)
                       .Distinct()
                       .Take(slots - 1)
                       ...

To me, this is the value proposition of functional programming. By expressing the algorithm in terms of common operations, we’re able to spend more time thinking about the problem than the details of the solution. A similar change in an imperative implementation would almost certainly have been more involved and prone to error.

For completeness, here are the implementations of the extension methods used:

public static IEnumerable<int> Values(this Random random, int minValue, int maxValue)
{
    while (true)
        yield return random.Next(minValue, maxValue);
}

public static IEnumerable<TResult> Pairwise<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TSource, TResult> resultSelector)
{
    TSource previous = default(TSource);

    using (var it = source.GetEnumerator())
    {
        if (it.MoveNext())
            previous = it.Current;

        while (it.MoveNext())
            yield return resultSelector(previous, previous = it.Current);
    }
}

public static IEnumerable<T> Append<T>(this IEnumerable<T> source, params T[] args)
{
    return source.Concat(args);
}

This also reminds me that Jimmy posted a similar Append() method as part of his latest post on missing LINQ operators. I used to use a version similar to his, but have found the params version to be more flexible (and easier to implement). Its Prepend() counterpart is similarly trivial:

public static IEnumerable<T> Prepend<T>(this IEnumerable<T> source, params T[] args)
{
    return args.Concat(source);
}
Posted in .NET. Tags: . Comments Off on Is Functional Abstraction Too Clever?

Improve Your Code Golf Game with LINQ

I always enjoy a good coding challenge, and variations of code golf are most common. For the uninitiated, code golf provides a problem with the objective of providing a solution that requires the fewest keystrokes or lines. While production code certainly deserves more white space than games tend to afford, there are still some lessons we can learn from the experience.

This particular post comes on the heels of Scott Hanselman‘s casual challenge to clean up some of his code in as few lines as possible. The general requirements are as follows:

  1. Accept a path to a project
  2. For each of a few files in each project…
  3. Back up the file
  4. Perform a few string replacements in the file
  5. Save the updated version of the file

As I was looking over his code for some chances to optimize, it quickly became clear that the bulk of the “hard” stuff could be solved through some LINQ-supported functional programming. I posted a first draft, upon which others iterated, but his spam filter ate this new version so I thought it might be educational to walk through it here instead.

First, let’s define a few arrays that will specify the entirety of our configuration, along with the input array of project paths:

static void Main(string[] args)
{
    if (args.Length == 0) Console.WriteLine("Usage: ASPNETMVCUpgrader pathToProject1 [pathToProject2] [pathToProject3]");

    var configFiles = new[] { "web.config", @"Views\web.config" };
    var changes = new[] {
        new { Regex = new Regex(@"(?<1>System.Web.Mvc, Version=)1.0(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}1.1${2}"},
        new { Regex = new Regex(@"(?<1>System.Web.Routing, Version=)3.5(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}4.0${2}"} };

The regular expressions are based on those provided by commenter Dušan Radovanović. Next, we can use some LINQ to build a list of all our files to update:

    var filesToUpdate = from file in configFiles
                        from projectPath in args
                        let path = Path.Combine(projectPath, file)
                        where File.Exists(path)
                        select new { Path = path, Content = File.ReadAllText(path) };

If you’re not familiar with C# 3.0, by line this does the following:

  1. Let file be the current item in configFiles.
  2. Let projectPath be the current item in args.
  3. Let path be the combined value of projectPath and file.
  4. Only include path values for files that exist.
  5. Create new anonymous objects with Path and Content properties set to the path and file contents, respectively.

As with most LINQ operations, execution of this code will be deferred until filesToUpdate is enumerated.

Now we’re ready to update our files. First, I’ll define a sequence of our possible backup file names, which will add “.backup_XX” to the file name.* Since the sequence is lazily evaluated, we can just call LINQ’s First() to find an available backup file name. Note that First() would throw an exception if all 100 files existed, as the backupFileNames sequence would be empty.

    foreach (var file in filesToUpdate)
    {
        var backupFileNames = from n in Enumerable.Range(0, 100)
                              let backupPath = string.Format("{0}.backup_{1:00}", file.Path, n)
                              where !File.Exists(backupPath)
                              select backupPath;

        File.Move(file.Path, backupFileNames.First());

Finally, we need to actually update the file content. To do that, we’ll use LINQ’s Aggregate operator:

        string newContent = changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement));
        File.WriteAllText(file.Path, newContent);
        Console.WriteLine("Done converting: {0}", file.Path);
    }
}

Aggregate takes two parameters: a seed value and a function that defines the aggregation. In our case, the seed value is of type string and the function is of type Func<string, 'a, string>, where 'a is our anonymous type with Regex and Replacement properties. In practice, this call is going to take our original content and apply each of our changes in succession, using the result of one replacement as the input to the next. In functional terminology, Aggregate is known as a left fold; for more on Aggregate and folds, see this awesome post by language guru Bart de Smet.

What strikes me about this code is that it’s both terse and expressive. And for the purposes of the challenge, we can rewrite some of the queries in extension method syntax:

static void Main(string[] args)
{
  if (args.Length == 0) Console.WriteLine("Usage: ASPNETMVCUpgrader pathToProject1 [pathToProject2] [pathToProject3]");

  var configFiles = new[] { "web.config", @"Views\web.config" };
  var changes = new[] {
    new { Regex = new Regex(@"(?<1>System.Web.Mvc, Version=)1.0(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}1.1${2}"},
    new { Regex = new Regex(@"(?<1>System.Web.Routing, Version=)3.5(?<2>.0.0,)", RegexOptions.Compiled), Replacement = "${1}4.0${2}"} };

  var files = from path in configFiles.SelectMany(file => args, (file, arg) => Path.Combine(arg, file))
              where File.Exists(path) select new { Path = path, Content = File.ReadAllText(path) };

  foreach (var file in files)
    try
    {
      File.Move(file.Path, Enumerable.Range(0, 100).Select(n => string.Format("{0}.backup_{1:00}", file.Path, n)).First(p => !File.Exists(p)));
      File.WriteAllText(file.Path, changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement)));
      Console.WriteLine("Done converting: {0}", file.Path);
    }
    catch (Exception ex) { Console.WriteLine("Error with: {0}" + Environment.NewLine + "Exception: {1}", file.Path, ex.Message); }
}

* The original code had the most recent backup with extension .mvc10backup, with the next oldest backup called .mvc10backup2. My original version extended this concept to “unlimited” backups with old backups continuously incremented so the lower values were more recent. It could probably be improved, but I thought I’d include the adapted code here for completeness:

  foreach (var file in files)
    try
    {
      var backupPaths = Enumerable.Repeat<int?>(null, 1)
            .Concat(Enumerable.Range(2, int.MaxValue - 2).Select(i => (int?)i))
            .Select(i => Path.ChangeExtension(filename, ".mvc10backup" + i));
      string toCopy = file.Path;
      foreach (var f in backupPaths.TakeWhile(_ => toCopy != null))
      {
          string temp = null;
          if (File.Exists(f))
              File.Move(f, temp = f + "TEMP");
          File.Move(toCopy, f);
          toCopy = temp;
      }
      File.WriteAllText(file.Path, changes.Aggregate(file.Content, (s, c) => c.Regex.Replace(s, c.Replacement)));
      Console.WriteLine("Done converting: {0}", file.Path);
    }
    catch (Exception ex) { Console.WriteLine("Error with: {0}" + Environment.NewLine + "Exception: {1}", file.Path, ex.Message); }
}
Posted in .NET. Tags: , . Comments Off on Improve Your Code Golf Game with LINQ

Thinking Functional: Using

In the comments of my last post, Peter Seale pointed me to Matthew Podwysocki‘s implementation of GenerateUsing as functional abstraction of using. I like the idea, but the Generator pattern isn’t particularly useful for common SharePoint tasks. However, I think we can get some value from looking at a more generalized solution.

But first, let me suggest a minor correction to Matt’s version of Generate, at least if it’s going to be used to fulfill an IDisposable contract:

public static IEnumerable<TResult> Generate<T, TResult>(Func<T> opener,
                                                        Func<T, Option<TResult>> generator,
                                                        Action<T> closer)
{
    var openerResult = opener();
    bool stop = false;

    while (true)
    {
        var res = Option<TResult>.None;
        try
        {
            res = generator(openerResult);
        }
        finally
        {
            if (stop = res.IsNone)
                closer(openerResult);
        }
        if (stop)
            yield break;

        yield return res.Value;
    }
}

The stop “hack” is needed because you can’t yield from a finally clause. It seems to me that a closer that might not get called isn’t much of a closer, or am I missing something?

So how else might we use this opener/closer idea? How about something like this:

public static void Process<T>(Func<T> opener,
                              Action<T> action,
                              Action<T> closer)
{
    T openerResult = opener();
    try
    {
        action(openerResult);
    }
    finally
    {
        if (closer != null)
            closer(openerResult);
    }
}

public static void Using<T>(Func<T> opener,
                            Action<T> action
                           ) where T : IDisposable
{
    Process(opener, action, x => x.Dispose());
}

We have now abstracted the idea of a using statement: get the object, do something with it, Dispose(). Abstraction in hand, let’s apply it to SharePoint:

public static void ProcessWeb<TResult>(this SPSite site,
                                       string url,
                                       Action<SPWeb> action)
{
    Using(() => site.OpenWeb(url), action);
}

Now, one could argue that we haven’t gained much over the obvious implementation:

    using(SPWeb web = site.OpenWeb())
        action(web);

But in truth, the vast majority of functional code has a non-functional alternative. It’s just a different thought process. In the former, we specify what we’re trying to do: use the result of site.OpenWeb() to do action. In the latter, we specify how to do it: use an SPWeb named web, assigned from site.OpenWeb(), to perform action. I’m not saying either approach is more correct, just different means to the same end.

Performing actions is all well and good, but we often want to get something back as well:

public TResult Select<T, TResult>(Func<T> opener, Func<T, TResult> selector, Action<T> closer)
{
    T openerResult = opener();
    try
    {
        return selector(openerResult);
    }
    finally
    {

        closer(openerResult);
    }
}

public TResult SelectUsing<T, TResult>(Func<T> opener,
                                       Func<T, TResult> selector,
                                       Action<T> closer
                                      ) where T : IDisposable
{
    return Select(opener, selector, x => x.Dispose());
}
public static TResult SelectFromWeb<TResult>(this SPSite site,
                                             string url,
                                             Func<SPWeb, TResult> selector)
{
    return SelectUsing(() => site.OpenWeb(url), selector);
}

What do you think? Useful?