Understanding the Y Combinator by writing about it

1 January 2011


This post is written for those who, like me, have tried and failed to un­der­stand the Y Com­bi­na­tor despite the help of Wikipedia, blog posts con­struct­ing the Y Com­bi­na­tor in mul­ti­ple languages, stories on Hacker News, com­ments on Hacker News… I think I mostly get it now (this comes cour­tesy of a night in Helsinki where I was up at a strange time due to the fallout from a Helsinki-Stockholm-Helsinki booze cruise)—and so perhaps these notes will help someone else. If nothing else, writing them up helped me! (Maybe the only way to un­der­stand it is to write a blog post about it.)

I think a large part of the problem un­der­stand­ing the Y Com­bi­na­tor is that the various bits of in­for­ma­tion avail­able online are written from several dif­fer­ent perspectives, which means that they em­pha­sise dif­fer­ent aspects of the Y Com­bi­na­tor story, to wit:

De­riv­ing the Y Com­bi­na­tor

Step 1

As I men­tioned above, dis­cus­sion of the Y Com­bi­na­tor is typ­i­cally mo­ti­vated by a par­tic­u­lar problem: how to write “recursive” func­tions without ex­plicit recursion. I don’t know any better way of going about it, so let’s first con­sider that stan­dard ex­em­plar of recursion, the fac­to­r­ial function, and trans­form it into a version that uses the Y Combinator. The stan­dard implementation:

(I’m using JavaScript in these ex­am­ples because its widely known and has good support for first class functions.)

Step 2

How might you go about re­mov­ing the self-referential call to fact() within the func­tion body? (The exact reasons for wanting to do this are not important, but for example suppose that you want to inline the entire call.) One obvious way is to use a keyword pro­vided by the lan­guage that refers to the func­tion itself such as JavaScript’s arguments.callee, but, well, this is cheating and we can do better!

Step 3

Our first attempt is to trans­form the fact() func­tion slightly, so that fact becomes an argument to a wrapper function, instead of a free vari­able re­fer­ring to the fac­to­r­ial function. We can then use this ar­gu­ment in the “recursive” call:

At this point, the func­tion con­tains no ref­er­ences to vari­ables de­clared outside the scope of the func­tion itself—though overall we haven’t much progress because we also don’t have a working fact() function… If only there were some way to call fact_wrapper1() passing it its own return value as an argument! If we could get this to work, the return value of fact_wrapper1() would be the fac­to­r­ial func­tion we want.

Step 4

Okay, so that attempt didn’t quite work out; let’s try a slightly dif­fer­ent approach. Perhaps it’s pos­si­ble to modify fact_wrapper1() so that it can be passed as an ar­gu­ment to itself. (This cur­rently isn’t pos­si­ble because fact_wrapper1()’s ar­gu­ment (fact) is a simple func­tion that takes an integer ar­gu­ment and returns an integer; by con­trast fact_wrapper1() itself is a higher-order function that re­ceives a function as an argument, re­turn­ing a simple func­tion that takes an integer ar­gu­ment and re­turn­ing an integer.) If we can pass the func­tion to itself, we elim­i­nate the need to figure out how to extract just its return value. fact_wrapper1() can be trans­formed into such a func­tion as follows:

Note that the only real change is that f(f)(n - 1) re­places fact(n - 1). This is because f(f) returns a function, which we then apply to n - 1.

Step 5

Let’s just check to see if fact_wrapper2 can really be used as an ar­gu­ment to itself:

Okay, that works: so far, so good! (You can copy-and-paste all these ex­am­ples into a NodeJS session.)

We can also make fact_wrapper2() look a bit more like fact_wrapper1() by moving the weird f(f) call into a helper function:

Step 6

By adding another helper function, we can avoid the need for fact_wrapper2 to appear twice, and if we inline its definition, we can avoid the need to give the wrapper func­tion a name at all:

We still get the same result:

Step 7

Next, let’s trans­form Step 6 by elim­i­nat­ing the local vari­able g, instead passing it in as an ar­gu­ment of a new inlined function:

Step 8

This lets us move the fac­to­r­ial func­tion itself out of the helper code:

Step 9

Almost there! At this point we can also inline recur() itself:

Step 10

Finally, we can turn the helper code into a stand­alone func­tion (i.e. elim­i­nat­ing the ref­er­ence to fact_core in the body of Step 9’s fact by con­vert­ing it into an argument), which we’ll call Y():

Hurrah! We have finally achieved Y Combinator:

Notes and Ob­ser­va­tions

Neither the func­tion Y() nor fact_core() are self-referential—these names do not appear any­where in the code. A lot of other func­tions are involved, but since they’re all anonymous, they can’t refer to them­selves either. Hence, the Y Com­bi­na­tor can be used to im­ple­ment self-referential (i.e. recursive) pro­grams in lan­guages that don’t oth­er­wise support it.

The func­tion fact() is tech­ni­cally the “fixed point” of the func­tion fact_core() and the Y Com­bi­na­tor is a func­tion for finding such fixed points. The fixed point p of a func­tion f is a func­tion that sat­is­fies the con­di­tion f(p) = p. What this ac­tu­ally means is some­what hard to get one’s head around (see the Wikipedia page for more), but it’s easy to check that fact_core() and fact() do have these properties:

This means that the fol­low­ing ex­pres­sions are all (mathematically) equiv­a­lent to the func­tion fact: