Refactoring with Iterators: Prime Factors

Andrew Woodward recently posted a comparison of his test-driven Prime Factors solution to one written by Uncle Bob. In the comments, someone suggested that Andrew use an iterator instead so I thought I’d give it a try.

First, let’s repost the original code:

private const int SMALLEST_PRIME = 2;

public List<int> Generate(int i)
{
    List<int> primes = new List<int>();
    int divider = SMALLEST_PRIME;
    while (HasPrimes(i))
    {
        while (IsDivisable(i, divider))
        {
            i = AddPrimeToProductsAndReduce(i, primes, divider);
        }
        divider++;
    }
    return primes;
}

private bool IsDivisable(int i, int divider)
{
    return i%divider == 0;
}

private bool HasPrimes(int i)
{
    return i >= SMALLEST_PRIME;
}

private int AddPrimeToProductsAndReduce(int i, List<int> primes, int prime)
{
    primes.Add(prime);
    i /= prime;
    return i;
}

By switching our method to return IEnumerable<int>, we can replace the primes list with an iterator. We will also remove the AddPrimeToProducts functionality from that helper method since we don’t have the list any more:

public IEnumerable<int> Generate(int i)
{
    int divider = SMALLEST_PRIME;
    while (HasPrimes(i))
    {
        while (IsDivisable(i, divider))
        {
            yield return divider;
            i = Reduce(i, divider);
        }
        divider++;
    }
}

private int Reduce(int i, int prime)
{
    return i / prime;
}

I think this is a good change for three reasons:

  1. There’s nothing about the problem that requires a List<int> be returned, we just want a sequence of the factors.
  2. AddPrimeToProductsAndReduce suggested that it had a side effect, but exactly what wasn’t immediately obvious.
  3. It’s much easier to see what values are being included in the result.

That said, I think we can clean this up even more with a second iterator. Specifically, I think we should break out the logic for our candidate factors:

private IEnumerable<int> Divisors
{
    get
    {
        int x = SMALLEST_PRIME;
        while (true)
            yield return x++;
    }
}

Which allows us to separate the logic for generating a divider from the code that consumes it:

public IEnumerable<int> Generate(int toFactor)
{
    foreach (var divider in Divisors)
    {
        if (!HasPrimes(toFactor))
            break;

        while (IsDivisable(toFactor, divider))
        {
            yield return divider;
            toFactor = Reduce(toFactor, divider);
        }
    }
}

We should also eliminate the negation by flipping HasPrimes to become IsFactored:

public IEnumerable<int> Generate(int toFactor)
{
    foreach (var divider in Divisors)
    {
        if (IsFactored(toFactor))
            break;

        while (IsDivisable(toFactor, divider))
        {
            yield return divider;
            toFactor = Reduce(toFactor, divider);
        }
    }
}

private bool IsFactored(int i)
{
    return i <= 1;
}

This does introduce a (very) minor inefficiency in that the Divisors enumerator will MoveNext() one extra time before breaking out of the loop, which could be mitigated by checking IsFactored both before the foreach and after the while loop. Less readable, insignificantly more efficient…take your pick.

The other advantage to breaking out the logic to generate Divisors is that we can easily pick smarter candidates. One option is to skip even numbers greater than 2. An even better optimization takes advantage of the fact that all primes greater than 3 are of the form x±1 where x is a multiple of 6:

private IEnumerable<int> Divisors
{
    get
    {
        yield return 2;
        yield return 3;
        int i = 6;
        while (true)
        {
            yield return i - 1;
            yield return i + 1;
            i += 6;
        }
    }
}

Implementing this sort of logic in the original version would have been much more difficult, both in terms of correctness and readability.

Posted in .NET. Tags: , . Comments Off on Refactoring with Iterators: Prime Factors

Refactoring with LINQ & Iterators: FindDescendantControl and GetDescendantControls

A while back I put together a quick and dirty implementation of a FindControl extension method:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    Control c = root;
    Queue<Control> q = new Queue<Control>();

    if (c == null || c.ID == id)
        return c as T;
    do
    {
        foreach (Control child in c.Controls)
        {
            if (child.ID == id)
                return child as T;
            if (child.HasControls())
                q.Enqueue(child);
        }
        c = q.Dequeue();
    } while (c != null);
    return null;
}

It got the job done (if the control exists!), but I think we can do better.

Refactoring with Iterators

My first concern is that the method is doing too much. Rather than searching for the provided ID, the majority of the code is devoted to navigating the control’s descendents. Let’s factor out that logic into its own method:

public static IEnumerable<Control> GetDescendantControls(this Control root)
{
    var q = new Queue<Control>();

    var current = root;
    while (true)
    {
        if (current != null && current.HasControls())
            foreach (Control child in current.Controls)
                q.Enqueue(child);

        if (q.Count == 0)
            yield break;

        current = q.Dequeue();
        yield return current;
    }
}

The new method is almost as long as the old one, but now satisfies the Single Responsibility Principle. I also added a check to prevent calling Dequeue() on an empty queue. For those that have studied algorithms, note that this is a breadth-first tree traversal.

Now we can update FindControl:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    Control c = root;

    if (c == null || c.ID == id)
        return c as T;

    foreach (Control child in c.GetDescendantControls())
    {
        if (child.ID == id)
            return child as T;
    }
    return null;
}

With the control tree traversal logic extracted, this updated version is already starting to smell better. But we’re not done yet.

DRY? Don’t Repeat Someone Else, Either

My second concern is how we’re checking for the ID in question. It’s not that the equality operator is a bad choice, as it will work in many scenarios, but rather that it’s not consistent with the existing FindControl method. In particular, the existing FindControl understands naming containers (IDs that contain ‘$’ or ‘:’). Rather than implement our own comparison logic, we should just leverage the framework’s existing implementation:

public static T FindControl<T>(this Control root, string id) where T : Control
{
    if (id == null)
        throw new ArgumentNullException("id");

    if (root == null)
        return null;

    Control c = root.FindControl(id);
    if (c != null)
        return c as T;

    foreach (Control child in c.GetDescendantControls())
    {
        c = child.FindControl(id);
        if (c != null)
            return child as T;
    }
    return null;
}

Fun fact: FindControl will throw a NullReferenceException if id is null.

Refactoring with LINQ

So we have extracted the descendant logic and leaned on the framework for finding the controls, but I’m still not quite satisfied. The method just feels too…procedural. Let’s break down what we’re really trying to do:

  1. Look at the current control and all its descendants.
  2. Use FindControl on each with the specified ID.
  3. When we find the control, return it as type T.

As the subheading might suggest, we can express these steps quite nicely with LINQ:

  1. var controls = root.AsSingleton().Concat(root.GetDescendantControls());
  2. var foundControls = from c in controls
                        let found = c.FindControl(id)
                        where found != null
                        select found;
  3. return foundControls.FirstOrDefault() as T;

Behind the scenes, this is how I might have thought through this code:

  1. We use AsSingleton() (my new preferred name, to align with F#’s Seq.singleton, for AsEnumerable(), which I introduced here) and Concat() to prepend root to the list of its descendants, returned as a lazy enumeration.
  2. We use a query over those controls to retrieve matches from FindControl(), again returned as a lazy enumeration.
  3. We grab the first control found, or null if none match, and return it as T.

Because all our enumerations are lazy, we put off traversal of the entire control tree until we know we need to. In fact, if our ID is found in the root control, GetDescendantControls() won’t even be called! Through just a bit of refactoring, we have both an efficient and readable solution.

For completeness, here’s the final version with a more descriptive name to contrast with the existing FindControl():

public static T FindDescendantControl<T>(this Control root, string id) where T : Control
{
    if (id == null)
        throw new ArgumentNullException("id");

    if (root == null)
        return null;

    var controls = root.AsSingleton().Concat(root.GetDescendantControls());

    var foundControls = from c in controls
                        let found = c.FindControl(id)
                        where found != null
                        select found;

    return foundControls.FirstOrDefault() as T;
}

I have added these methods, along with AsSingleton() and a host of others, to the SharePoint Extensions Lib project. Check it out!