Fabulous adventures in coding


For the last two decades or so I’ve admired the simplicity and power of the Python language without ever actually doing any work in it or learning about the details. Having recently dug into those details, I think it is fair to say that some of the implementation choices are let’s say not what I would have chosen, but it’s still an interesting language that I’d like to know more about.

A recent question on Stack Overflow got me thinking about how to turn a recursive algorithm into an iterative one, and it turns out that Python is a pretty decent language for this. The problem that the poster faced was:

  • A player is on a positive-integer numbered “space”.
  • The aim is to get back to space 1.
  • If you’re on an even numbered space, you must pay one coin and jump backwards half that many spaces.
  • If you’re on an odd numbered space, you have two choices: pay five coins to go directly to space 1, or pay one coin and jump back one space (to an even numbered space, obviously) and go from there.

The problem is: given the space the player is on, what is the lowest cost in coins to get home? The recursive solution is straightforward:

def cost(s):
    if s <= 1:
        return 0
    if s % 2 == 0:
        return 1 + cost(s // 2)
    return min(1 + cost(s - 1), 5)

However, apparently the user was experimenting with values so large that the program was crashing from exceeding the maximum recursion depth! Those must have been some big numbers. The question then is: how do we transform this recursive algorithm into an iterative algorithm in Python?
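Indeed they must: since every even step halves the space number, the recursion depth is only logarithmic in s, so hitting the limit takes truly enormous values. Here's a quick sketch of the crash, assuming the recursive cost above (the exponent is just illustrative):

import sys

print(sys.getrecursionlimit())  # typically 1000 in CPython

try:
    cost(2 ** 2001)  # depth is roughly log2(s), far past the default limit
except RecursionError:
    print("maximum recursion depth exceeded")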

Of course, the technique that I’m going to show you is not necessarily “Pythonic”. There are probably more Pythonic solutions using generators and so on. What I’d like to show here is that this sort of recursion can be removed by reorganizing the code through a series of small, careful refactorings until the program is in a form where removing the recursion is easy. Let’s see first how to get the program into that form.

The first step of our transformation is: I want the thing that precedes every recursive call to be a computation of the argument, and the thing that follows every recursive call to be a return of a method call that takes the recursive result:

def add_one(n):
    return n + 1

def get_min(n):
    return min(n + 1, 5)

def cost(s):
    if s <= 1:
        return 0
    if s % 2 == 0:
        argument = s // 2
        result = cost(argument)
        return add_one(result)
    argument = s - 1
    result = cost(argument)
    return get_min(result)

The second step is: I want to compute the argument in a helper function:

...
def get_argument(s):
    if s % 2 == 0:
        return s // 2
    return s - 1

def cost(s):
    if s <= 1:
        return 0
    argument = get_argument(s)
    result = cost(argument)
    if s % 2 == 0:
        return add_one(result)
    return get_min(result)

The third step is: I want to decide which function to call afterwards in a helper function. Notice that we have a function which returns a function here!

...
def get_after(s):
    if s % 2 == 0:
        return add_one
    return get_min

def cost(s):
    if s <= 1:
        return 0
    argument = get_argument(s)
    after = get_after(s)  # after is a function!
    result = cost(argument)
    return after(result)

And you know, let’s make this a little bit more general, and make the recursive case a bit more concise:

...
def is_base_case(s):
    return s <= 1

def base_case_value(s):
    return 0

def cost(s):
    if is_base_case(s):
        return base_case_value(s)
    argument = get_argument(s)
    after = get_after(s)
    return after(cost(argument))

I hope it is clear that at every small refactoring we have maintained the meaning of the program. We’re now doing a small amount of work twice; we test whether the space number is even twice per recursion, whereas before we tested it just once. We could solve that problem if we wanted by combining our two helpers into one function that returns a tuple (a quick sketch follows), but let’s not worry about it for the sake of this exercise.
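Here is a minimal sketch of that combined helper; the name get_argument_and_after is my invention, not part of the refactoring sequence:

# Hypothetical combined helper: one evenness test per recursion,
# returning both the recursive argument and the "after" as a tuple.
def get_argument_and_after(s):
    if s % 2 == 0:
        return s // 2, add_one
    return s - 1, get_min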

We’ve reduced our recursive method to an extremely general form:

  • If we are in the base case:
    • compute the base case value to be returned
    • return it
  • If we are not in the base case:
    • get the recursive argument
    • make the recursive call
    • compute the value to be returned
    • return it

Something important to notice at this stage is that none of the “afters” may itself contain any calls to cost; the technique that I’m showing today only removes a single recursion. If you’re recursing two or more times per activation, we’ll need more advanced techniques; the example below shows the sort of function this transformation cannot handle.
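The classic doubly recursive Fibonacci function is a minimal example; the work that happens after the first recursive call itself contains another recursive call, so there is no single “after” to stack:

# Two recursive calls per activation: the single-recursion technique
# shown in this post does not apply here.
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)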

Once we’ve got our recursive algorithm in this form, turning it into an iterative algorithm is straightforward. The trick is to think about what happens in the recursive program. As we do the recursive descent, we call get_argument before every recursion, and we call whatever is in after, well, after every recursion. That is, all of the calls to get_argument happen before all of the calls to the afters. Therefore we can turn that into two loops: one calls all the get_arguments and makes a list of all the afters, and the other calls all the afters:

...
def cost(s):
    # Let's make a stack of afters. Remember, these are functions
    # that take the value returned by the "recursive" call, and
    # return the value to be returned by the "recursive" method.
    afters = []
    while not is_base_case(s):
        argument = get_argument(s)
        after = get_after(s)
        afters.append(after)
        s = argument
    # Now we have a stack of afters:
    result = base_case_value(s)
    while len(afters) != 0:
        after = afters.pop()
        result = after(result)
    return result

No more recursion! It looks like magic, but of course all we are doing here is exactly what the recursive version of the program did, in the same order.
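A quick sanity check, with cost_recursive and cost_iterative as hypothetical names for the first and last versions of cost above: the two versions agree, and the iterative one happily handles values far beyond the recursion limit.

# cost_recursive and cost_iterative are hypothetical renamings of the
# original and final versions of cost() shown in this post.
assert all(cost_recursive(s) == cost_iterative(s) for s in range(1, 10000))
print(cost_iterative(2 ** 100000))  # no RecursionError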

This illustrates a point I make often about the call stack: its purpose is to tell you what is coming next, not what happened before! The only relevant information on the call stack in the recursive version of our program is what the value of after is, since that is the function that is going to be called next; nothing else matters. Instead of using the call stack as an inefficient and bulky mechanism for storing a stack of afters, we can just, you know, store a stack of afters.

Next time on FAIC, we’ll look at a more advanced technique for removing a recursion in Python that very-long-time readers of my blog will recognize.

Many thanks to Phil Burgess, who invited me to be this week’s guest on his podcast, IT Career Energizer. I enjoyed our conversation, and I’m looking forward to listening to past episodes.

Check it out on the link above, or wherever you get your podcasts.

Sorry for that unexpected interlude into customer service horrors; thanks for your patience and let’s get right back to coding horrors!

So as not to bury the lede: the C# compiler is correct to flag the program from the last episode as having a definite assignment problem. Thanks to everyone who participated in the discussion in the comments.

Particular mentions go out to Jeffrey L. Whitledge who sketched the solution mere moments after I posted the problem, C# architect Neal Gafter who posted a concise demonstration, and former C# compiler dev Chris Burrows who slyly pointed out on Twitter that he “solved the puzzle”. Indeed, he solved it nine years ago when he designed and implemented this logic in the compiler.

The reasoning the compiler has to go through here is rather arcane, so let’s dig into it.

First off, let’s remind ourselves of the fundamental rule of dynamic in C#. We identify all the places in the program where the compiler deduces that the type of a thing is dynamic. Imagine that you then replaced that type judgment with the actual runtime type of the thing. The behaviour of the program with dynamic should be the same as the behaviour of the imaginary program where the compiler knew the runtime type.

This means that the compiler has to assume that the control flow of a program containing dynamic could be any control flow possible for any runtime type, including types that are legal but extremely weird. So let’s make an extremely weird type or two and see what happens.

I’ve already discussed in this blog how operator overloading works for the && operator. Briefly, the rules in C# are:

  • if left is false then left && right does not evaluate right, and has the same value as left
  • otherwise, left && right evaluates both operands and has the same value as left & right

Obviously those rules work for bool; to overload &&, we therefore need a way to evaluate whether left is false or not, and a way to determine the value of left & right. The “operator true” and “operator false” operators tell you whether a given value is logically true or logically false, and both must be provided if you provide either. Let’s look at an example, but we’ll make it a crazy example!

class Crazy
{
  public static bool operator true(Crazy c)
  {
    Console.WriteLine("operator true");
    return true;  // Um…
  }
  public static bool operator false(Crazy c)
  {
    Console.WriteLine("operator false");
    return true; // Oh dear…
  }
  public static Crazy operator &(Crazy a, Crazy b)
  {
    Console.WriteLine("operator &");
    return a;
  }
}

This is obviously weird and wrong; this program says that all instances of Crazy are logically true, and also all of them are logically false (because “is this false?” returns true) and also if we & two of them together, we get the first unconditionally. But, whatever, this is a legal program. If we have

Crazy c = GetCrazy() && GetAnother();

The code generated will be the moral equivalent of:

Crazy c1 = GetCrazy();
Crazy c = Crazy.operator_false(c1) ?
  c1 :
  Crazy.operator_&(c1, GetAnother());

But of course at runtime for this particular program we will always say that c1 is false, and always just take the first operand without calling GetAnother().

Already we’ve got a very strange program, but maybe it is not entirely clear how this produces a code path that has a definite assignment error. To get that, we need a second super-weird class. This one overloads equality — C# requires you to overload equality and inequality in pairs as well — and instead of returning a Boolean, it returns, you guessed it, something crazy:

class Weird
{
  public static Crazy operator ==(Weird p1, Weird p2)
  {
    Console.WriteLine("==");
    return new Crazy();
  }
  public static Crazy operator !=(Weird p1, Weird p2)
  {
    Console.WriteLine("!=");
    return new Crazy();
  }
  public override bool Equals(object obj) => true;
  public override int GetHashCode() => 0;
}

Maybe now you see where this is going, but we’re not quite there yet. Suppose we have

Weird obj = new Weird();
string str;
if (obj != null && GetString(out str))

What’s going to happen? Weird overrides !=, so this will call Weird.operator_!=, which returns Crazy, and now we have Crazy && bool, which is not legal, so the program will not compile. But what if we make Crazy convertible from bool? Then it will work! So add this to Crazy:

  public static implicit operator Crazy(bool b)
  {
    Console.WriteLine("implicit conversion from bool " + b);
    return new Crazy();
  }

Now what does if (obj != null && GetString(out str)) … mean?  Let’s turn it into method calls. This thing all together is:

Crazy c1 = Weird.operator_!=(obj, null);
bool b1 = Crazy.operator_false(c1);
// it is true that it is false
Crazy c2 = b1 ? 
  c1 : // This is the branch we take
  Crazy.operator_implicit(GetString(out str));
if (c2) …

But wait a minute, is if(c2) legal? Yes it is, because Crazy implements operator true, which is then called by the if statement, and it returns true!

We have just demonstrated that it is possible to write the program fragment

if (obj != null && GetString(out str))
  Console.WriteLine(obj + str);

and have the consequence get executed even if GetString is never called, and therefore str would be read before it was written, and therefore this program must be illegal. And indeed, if you suppress the definite assignment error (say, by initializing str to null) and run it, you get the output

!=
operator false
operator true

In the case where obj is dynamic, the compiler must assume the worst possible case, no matter how unlikely. It must assume that the dynamic value is one of these weird objects that overloads equality to return something crazy, and that the crazy thing can force the compiler to skip evaluating the right side but still result in a true value.

What is the moral of this story? There are several. The most practical moral is: do not compare any dynamic value to null! The correct way to write this code is

if ((object)obj != null && GetString(out str))

And now the problem goes away, because dynamic to object is always an identity conversion, and now the compiler can generate a null check normally, rather than going through all this rigamarole of having to see if the object overloads equality.

The moral for the programming language designer is: operator overloading is a horrible horrible feature that makes your life harder.

The moral for the compiler writer is: get good at coming up with weird and unlikely possible programs, because you’re going to have to do that a lot when you write test cases.

Thanks to Stack Overflow contributor “NH.” for the question which prompted this puzzle.

I’ve banked with First Tech Credit Union since 1994, when they were the only bank that would give a non-resident Microsoft intern like me an account. I had nothing but good service from them for many years, but in the last couple of years things have gone really downhill (starting right around the time they merged with another credit union; probably not a coincidence).

The final straw was this evening; here’s a screenshot of their web site showing my balance on my home equity line of credit:

On July 10th I must have needed some cash for something, so I borrowed $1000 and my balance went from $0 to $1000.  Five days later I paid it back, and it said that my new balance was $10.52.  Now, this seems like exorbitant interest on a five-day loan, but whatever, as you can see, I paid the $10.52, and my balance went back to zero.

A couple months later I was assessed a late fee of $5. I did not notice this, since my balance was still listed on the web site as zero.

This afternoon I did a credit check and discovered that my credit score had gone down 111 points.  111 points!

I called up First Tech and they cheerfully explained to me that they had charged me $0.75 interest on the $10.52 of interest, and that since I had not paid it in 60 days, they reported my account as delinquent.

As you can see, they were good enough to waive the five dollar fee, and I transferred over the seventy-five cents. Hopefully they will remove the delinquency from my credit report, but we’ll see.

This is the third time this has happened in the last two years. The first time it was for 7 cents, and the second time it was for 17 cents. I informed them at the time that I would be taking my business elsewhere if there was a third time, and that was today.

Add to that a number of other recent issues — during the merger they somehow managed to lose about a year’s worth of my online bill payment records. A bank lost my banking records. At tax time. And I recently got what I think was a sales call from a representative who was so confusing, uninformed, vague and generally incompetent that I called them back because I genuinely believed I was probably being phished.

So, long story short: first, do not bank with First Tech. They cannot make a web site that shows you your balance, and they will mark an account as delinquent for a discrepancy that does not show up on the web site.  I’ve just had my credit ruined over the price of a pack of gum.

Second, I need recommendations for a credit union in the Seattle area that does not suck. Anyone have recommendations? Whack ’em down in the comments!

UPDATE: I had a meeting with my local branch manager, who was the first person in this saga to apologize to me for my inconvenience and clearly state that it was a bad idea to ruin people’s credit over the price of a cup of coffee. But the super amusing part was that he said that this happens all the time.

How? Well, the HELOC has an annual fee, and the fee is considered part of the loan, so you end up having to pay a few cents of interest on it if you do not pre-pay it, but that interest does not show up in the balance on the web site. So a great many people do not discover this until it goes to collections or shows up on their credit report.

Apparently, he told me, people getting their credit ruined because of a seven cent charge they didn’t know they had is bad for customer satisfaction metrics.

That’s some high quality Business Intelligence analysis right there.

What kind of clown town operation are these people running? Like I said, this has happened to me three times, over a period of several years, and apparently I am not alone. The only conclusion I can come up with is that the problem is not bad enough to bother fixing.

The branch manager asked me for advice, which seemed to me to be rather backwards, but, hey, I’ll answer the question. First Tech, the suggestions I gave your manager were:

  • Stop marking these accounts as delinquent. Put a WHERE clause in the SQL query that determines which accounts get flagged for collections or delinquency that filters out accounts that are delinquent by less than the price of, say, a Big Mac.
  • Fix your web site to show the actual balance in the “balance” column.
  • I was super irritated when I was on the phone getting this sorted out, but the mortgage specialist I spoke with was most interested in telling me that this was my fault, and that what they did was legal. But the problem is not that I think you’re criminals! The problem is that I think you’re incompetent clowns who are actively making my life worse. Staff who are asked to speak to irate customers should be trained on how to solve the real problem, which is retaining the customer’s business and good will.

I think those are reasonable suggestions that could be implemented in pretty short order, and any one of them would have mitigated the problem. Maybe do all three.

UPDATE #2:

  • On December 4th my credit rating went back up by 109 points, so likely the delinquency has been removed from my account.
  • I’ve spoken to I think five different customer service representatives, most of whom were sympathetic and apologetic.
  • I’ve been informed that credit unions are required by law — the Fair Credit Reporting Act or some such thing — to report all delinquencies, even if they are a penny. I understand that you don’t want banks to selectively apply sanctions to “undesirable” customers; that invites all sorts of abuses. But if we’re in a situation where reporting is mandatory, then it is incumbent upon the banks to ensure that tiny, accidental delinquencies don’t happen if it can be avoided by any means. I had plenty of money in my accounts when this happened; auto-payment of accounts that are about to become delinquent by less than a dollar should be the default. There should be a great many such mechanisms; this sort of customer service failure should be rare, but I am assured that it is common.
  • This was in many ways a failure of communication, and I recognize that communicating effectively is hard. But banks need to recognize the role they’ve played in making it hard. Let’s look at some of the ways it’s hard.
  • I got a letter from my bank saying that I was in arrears by 75 cents. Did I read it? Of course not. I get pointless, ignorable letters from my bank constantly, so I have been trained to ignore them. I get multiple statements a month, privacy policy change notifications, solicitations for services, and tax documents. When you’ve opened a few hundred envelopes containing a document that says “we’re letting you know that everything is fine, for no reason” or “please sign up for a service you didn’t ask for and don’t want” or “we’ve made a minor change to our privacy policy”, why would you open the next one in a timely manner? And if you did, and it said “you owe us $0.75”, would that suggest a need for immediate action? Maybe I am being naive and foolish, but let me tell you, when Internet Explorer puts up a “security alert” over nothing a thousand times, and you dismiss it a thousand times, and then the 1001st time your machine gets compromised, I don’t get to say “you shouldn’t have clicked OK that time”.
  • The same goes for email. Emails from my bank are sorted automatically into the “promotions” folder. Guess how often I read that? If you said “only when I think there’s miscategorized email in there”, you’re right.
  • The same goes for phone calls. The vast majority of the phone calls I get are from a boiler room in India full of angry young men trying to sell me fake Viagra. (I’m not sure how they got my phone number, but they sure are persistent, and pretty mad when I ask them to stop calling.) The last phone call I got from my bank, as I noted above, was so confusing that I was pretty sure I was being phished.

I’ve often noted that “dynamic” in C# is just “object” with a funny hat on, but there are some subtleties to that. Here’s a little puzzle; see if you can figure it out. I’ll post the answer next time. Suppose we have this trivial little program:

class Program
{
  static void Main()
  {
    object obj = GetObject();
    string str;
    if (obj != null && GetString(out str))
      System.Console.WriteLine(obj + str);
  }
  static object GetObject() => "hello";
  static bool GetString(out string str)
  {
    str = "goodbye";
    return true;
  }
}

There’s no problem here. C# knows that Main's str is definitely assigned on every code path that reads it, because we do not enter the consequence of the if statement unless GetString returned normally, and methods that take out parameters are required to write to the parameter before they return normally.

Now make a small change:

    dynamic obj = GetObject();

C# now gives error CS0165: Use of unassigned local variable 'str'

Is the C# compiler wrong to give this error? Why or why not?

Next time on FAIC: the solution to the puzzle.

Last time I gave a simple C# implementation of the first-order anti-unification algorithm. This is an interesting algorithm, but it’s maybe not so clear why anti-unification is useful at all. It’s pretty clear why unification is interesting: we have equations that need solving! But why would you ever need to have two (or more) trees that need anti-unifying to a more general tree?

Here’s a pair of code changes:

dog.drink();   —becomes—>   if (dog != null) dog.drink();
dog.bark();    —becomes—>   if (dog != null) dog.bark();

Now here’s the trick. Suppose we represent each of these code changes as a tree. The root has two children, “before” and “after”. The child of each side is the abstract syntax tree of the before and after code fragment, so our two trees are:
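Sketched in text (the real children are full abstract syntax trees; this only shows the shape):

change
  before: dog.drink();
  after:  if (dog != null) dog.drink();

change
  before: dog.bark();
  after:  if (dog != null) dog.bark();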

Now suppose we anti-unify those two trees; what would we get? We’d get this pattern:

dog.h0(); —> if (dog != null) dog.h0();

Take a look at that. We started with two code changes, anti-unified them, and now we have a template for making new code edits! We could take this template and write a tool that transforms all code of the form “an expression statement that calls a method on a variable called dog” into the form “an if statement that checks dog for null and calls the method if it is not null”.

What I’m getting at here is: if we have a pair of small, similar code edits, then we can use anti-unification to deduce a generalization of those two edits, in a form from which we could then build an automatic refactoring.

But what if we have three similar code edits?

edit 1: dog.drink(); —> if (dog != null) dog.drink();
edit 2: dog.bark();  —> if (dog != null) dog.bark();
edit 3: cat.meow();  —> if (cat != null) cat.meow();

Let’s take a look at the pairwise anti-unifications:

1 & 2: dog.h0(); —> if (dog != null) dog.h0();
1 & 3: h1.h0();  —> if (h1 != null) h1.h0();
2 & 3: the same.

Anti-unifying the first two makes a more specific pattern than any anti-unification involving the third. But the really interesting thing to notice here is that the anti-unification of 1&3 (and equally of 2&3) is itself a generalization of the anti-unification of 1&2!

Maybe that is not 100% clear. Let’s put all the anti-unifications into a tree, where the more general “abstract” patterns are at the top, and the individual “concrete” edits are at the leaves:
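Roughly, it looks like this:

               h1.h0(); —> if (h1 != null) h1.h0();
              /                                    \
   dog.h0(); —> if (dog != null) dog.h0();       edit 3
      /                      \
  edit 1                  edit 2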

Each parent node is the result of anti-unifying its children. This kind of tree, where the leaves are specific examples of a thing, and each non-leaf node is a generalization of everything below it, is called a dendrogram, and they are very useful when trying to visualize the output of a hierarchical clustering algorithm.

Now imagine that we took hundreds, or thousands, or hundreds of thousands of code edits, and somehow managed to work out a useful dendrogram of anti-unification steps for all of them. This is a computationally difficult problem, and in a future episode, I might describe some of the principled techniques and unprincipled hacks that you might try to make it computationally feasible. But just suppose for the moment we could.  Imagine what that dendrogram would look like.  At the root we’d have the most general anti-unification of our before-to-after pattern:

h0 —> h1

Which is plainly useless.  At the leaves, we’d have all of the hundreds of thousands of edits, which are not useful in themselves. But the nodes in the middle are pure gold! They are all the common patterns of code edits that get made in this code base, in a form that you could turn into a refactoring or automatic fix template. The higher in the tree they are, the more general they are.

You’ve probably deduced by now that this is not a mere flight of fancy; I spent eight months working on a tiny research team to explore the question of whether this sort of analysis is possible at the scale of a modern large software repository, and I am pleased to announce that indeed it is!

We started with a small corpus of code changes that were made in response to a static analysis tool (Infer) flagging Java code as possibly containing a null dereference, built tools to extract the portions of the AST which changed, and then did a clustering anti-unification on the corpus to find patterns. (How the AST extraction works is also very interesting; we use a variation on the Gumtree algorithm. I might do a blog series about that later.) It was quite delightful that the first things that popped out of the clustering algorithm were patterns like:

h0.h1(); —> if (h0 != null) h0.h1();
h0.h1(); —> if (h0 == null) return; h0.h1();
h0.h1(); —> if (h0 == null) throw …; h0.h1();
if (h0.h1()) h2; —> if (h0 != null && h0.h1()) h2;

and a dozen more variations. You might naively think that removing a null dereference is easy, but there are a great many ways to do it, and we found most of them in the first attempt.

I am super excited that this tool works at scale, and we are just scratching the surface of what we can do with it. Just a few thoughts:

  • Can it find patterns in bug fixes more complex than null-dereference fixes?
  • Imagine for example if you could query your code repository and ask “what was the most common novel code change pattern last month?” This could tell you if there was a large-scale code modification but the developer missed an example of it. Most static analysis tools are of the form “find code which fails to match a pattern”; this is a tool for finding new patterns and the AST operations that apply the pattern!
  • You could use it as signal to determine if there are new bug fix patterns emerging in the code base, and use them to drive better developer education.
  • And many more; if you had such a tool, what would you do with it? Leave comments please!

The possibilities of this sort of “big code” analysis are fascinating, and I was very happy to have played a part in this investigation.

The team has recently written a public-facing post on Facebook’s coding blog describing the high-level architecture of our pipeline, with much better graphics and figures than I’ve thrown together here. Please check it out and let me know what you think.

I have a lot of people to thank: our team leader Satish, who knows everyone in the code analysis community, our management Erik and Joe who are willing to take big bets on unproven technology, my colleagues Andrew and Johannes, who hit the ground running and implemented some hard algorithms and beautiful visualizations in very little time, our interns Austin and Waruna, and last but certainly not least, the authors of the enormous stack of academic papers I had to read to figure out what combination of techniques might work at FB scale. I’ll put some links to some of those papers below.

Last time we wrote all the boring boilerplate code for substitutions and trees. Now let’s implement the algorithm. As I noted a couple of episodes back, we can reduce the algorithm to repeated application of two rules that mutate three pieces of state: the current generalization, the current substitutions on s, and the current substitutions on t.

The function returns those three things, and they do not have any particular semantic connection to each other aside from being the solution to this problem, so let’s try returning them as a tuple.

This seems like a good place to try out nested functions in C# 7, since each rule is logically its own function, but also only useful in the context of the algorithm; there’s no real reason to make these private methods of the class since no other code calls them.  Also, they’re logically manipulating the local state of their containing function.

We’ll start by setting up the initial state as being the most general generalization:


public static (Tree, Substitutions, Substitutions)
  Antiunify(Tree s, Tree t)
{
  var h = MakeHole();
  var generalization = h;
  var sSubstitutions = Substitutions.Empty.Add(h, s);
  var tSubstitutions = Substitutions.Empty.Add(h, t);

Recall the first rule seeks situations where there is a substitution that is insufficiently specific. We want to go until no more rules apply, so we’ll have this return a Boolean indicating whether the rule was applied or not.

  bool RuleOne()
  {
    var holes = from subst in sSubstitutions
                let cs = subst.Value
                let ct = tSubstitutions[subst.Key]
                where cs.Kind == ct.Kind
                where cs.Value == ct.Value
                where cs.ChildCount == ct.ChildCount
                select subst.Key;
    var hole = holes.FirstOrDefault();
    if (hole == null)
      return false;
    var sTree = sSubstitutions[hole];
    var tTree = tSubstitutions[hole];
    sSubstitutions = sSubstitutions.Remove(hole);
    tSubstitutions = tSubstitutions.Remove(hole);
    var newHoles =
      sTree.Children.Select(c => MakeHole()).ToList();
    foreach (var (newHole, child) in newHoles.Zip(
        sTree.Children, (newHole, child) => (newHole, child)))
      sSubstitutions = sSubstitutions.Add(newHole, child);
    foreach (var (newHole, child) in newHoles.Zip(
        tTree.Children, (newHole, child) => (newHole, child)))
      tSubstitutions = tSubstitutions.Add(newHole, child);
    generalization = generalization.Substitute(
      hole, new Tree(sTree.Kind, sTree.Value, newHoles));
    return true;
  }

There is a small code smell here: we are using FirstOrDefault and a null check as the signal that no rule applies. In the second rule below, where we select pairs of holes, the same trick relies on the fact that tuples are value types, so the “default” when there is no matching pair is (null, null).

Notice that we’re using tuples to iterate over two sequences of equal size via zip. The code seems inelegant to me in a subtle way. The fundamental issue here is that C# has had mutable tuples since version 1.0; it just called them “argument lists”, and that’s weird. It has always struck me as bizarre that C# requires you to pass an argument tuple, but gives you no syntax for manipulating that tuple in any way other than extracting the arguments from it or mutating them. You cannot treat what is logically a tuple as a tuple; instead you have to write code that explicitly constructs a real tuple out of the logical tuple, and you end up writing what looks like it ought to be an identity:

(newHole, child) => (newHole, child)

For that matter, why do we need to zip at all? In this particular example it would be nice if the tuple syntax carried over into foreach loops; imagine if instead of that ugly zip code we could just write

foreach (var newHole, var child in newHoles, sTree.Children)

Zipping is only necessary here because the language lacks the feature of treating tuples as values consistently across the language. I’m hoping there will be further improvements in this area in C# 8.
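As an aside: later versions of .NET (Core 3.0 and up, so not available to this code) did add a Zip overload that pairs two sequences into tuples directly, which at least gets rid of the identity-looking lambda:

// Hypothetical rewrite of the loop above using the two-argument Zip
// overload, which returns a sequence of tuples:
foreach (var (newHole, child) in newHoles.Zip(sTree.Children))
  sSubstitutions = sSubstitutions.Add(newHole, child);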

But I digress. We’ve implemented the first rule, and the second is even more straightforward. Here we are looking for redundant holes and removing them:

  bool RuleTwo()
  {
    var pairs =
      from s1 in sSubstitutions
      from s2 in sSubstitutions
      where s1.Key != s2.Key
      where s1.Value == s2.Value
      where tSubstitutions[s1.Key] == tSubstitutions[s2.Key]
      select (s1.Key, s2.Key);
    var (hole1, hole2) = pairs.FirstOrDefault();
    if (hole1 == null)
      return false;
    sSubstitutions = sSubstitutions.Remove(hole1);
    tSubstitutions = tSubstitutions.Remove(hole1);
    generalization = generalization.Substitute(hole1, hole2);
    return true;
  }

Quite fine. And now the outer loop of the algorithm is trivial. We keep applying rules until we are in a situation where neither applies.

  while (RuleOne() || RuleTwo())
  { /* do nothing */ }
  return (generalization, sSubstitutions, tSubstitutions);
}

It’s slightly distasteful to have RuleOne and RuleTwo useful for both their side effects and their values, but really their values are only being used for control flow, not for the value that was computed, so I feel OK about this.

Let’s try it out! Again we’ll make a couple of local helper functions:

static void Main()
{
  Tree Cons(params Tree[] children) =>
    new Tree("call", "cons", children);
  Tree Literal(string value) =>
    new Tree("literal", value);
  var one = Literal("1");
  var two = Literal("2");
  var three = Literal("3");
  var nil = Literal("nil");
  var s = Cons(Cons(one, two), Cons(Cons(one, two), nil));
  var t = Cons(three, Cons(three, nil));
  var (generalization, sSubstitutions, tSubstitutions) =
    Tree.Antiunify(s, t);
  Console.WriteLine(generalization);
  Console.WriteLine(sSubstitutions.LineSeparated());
  Console.WriteLine(tSubstitutions.LineSeparated());
}

And when we run it, we get the right answer:

cons(h1,cons(h1,nil))
cons(1,2)/h1
3/h1

Nice!

Next time on FAIC: Why is this useful?