# Understanding the Y Combinator by writing about it

1 January 2011

## Motivation

This post is written for those who, like me, have tried and failed to understand the Y Combinator despite the help of Wikipedia, blog posts constructing the Y Combinator in multiple languages, stories on Hacker News, comments on Hacker News… I think I mostly get it now (this comes courtesy of a night in Helsinki where I was up at a strange time due to the fallout from a Helsinki-Stockholm-Helsinki booze cruise)—and so perhaps these notes will help someone else. If nothing else, writing them up helped me! (Maybe the only way to understand it is to write a blog post about it.)

I think a large part of the problem understanding the Y Combinator is that the various bits of information available online are written from several different perspectives, which means that they emphasise different aspects of the Y Combinator story, to wit:

**The properties of the Y Combinator function itself.**Technically, the Y Combinator is one example of a fixed point combinator, meaning that it’s a higher order function with a particular property (to be described later). Sadly it seems this property is very hard to grok unless you’ve recently spent some time with lambda calculus (or at least it was for me), so discussion is usually motivated by one particular application of the Y Combinator, which is that…**The Y Combinator allows you to write “recursive” functions without explicit recursion.**(“Without explicit recursion” means “recursion” without explicitly calling a function named`foo`

in the body of a function that itself is called`foo`

.) Thus, discussion of the Y Combinator is tied up with anonymous (i.e. unnamed) functions. (Note that the Y Combinator-powered version of a function will almost certainly not be any faster than the recursive version, and it’s going to need just as much stack space.) Finally, Y Combinator posts and articles often have an extended section on…**The implementation of the Y Combinator in some language.**The Y Combinator is an important part of (theoretical) computer science, and its implementation typically stretches a language’s support for first class functions—it’s often instructive to compare how elegantly (or not) it can be implemented in various languages. (A few examples: JavaScript, Perl, PHP.)

## Deriving the Y Combinator

### Step 1

As I mentioned above, discussion of the Y Combinator is typically motivated by a particular problem: how to write “recursive” functions without explicit recursion. I don’t know any better way of going about it, so let’s first consider that standard exemplar of recursion, the factorial function, and transform it into a version that uses the Y Combinator. The standard implementation:

```
var fact = function (n) {
if (n < 2) return 1;
return n * fact(n - 1);
};
```

(I’m using JavaScript in these examples because its widely known and has good support for first class functions.)

### Step 2

How might you go about removing the self-referential call to `fact()`

within the function body? (The exact reasons for wanting to do this are not important, but for example suppose that you want to inline the entire call.) One obvious way is to use a keyword provided by the language that refers to the function itself such as JavaScript’s arguments.callee, but, well, this is *cheating* and we can do better!

### Step 3

Our first attempt is to transform the `fact()`

function slightly, so that `fact`

becomes an *argument* to a wrapper function, instead of a free variable referring to the factorial function. We can then use this argument in the “recursive” call:

```
var fact_wrapper1 = function (fact) {
return function (n) {
if (n < 2) return 1;
return n * fact(n - 1);
};
};
```

At this point, the function contains no references to variables declared outside the scope of the function itself—though overall we haven’t much progress because we also don’t have a working `fact()`

function… If only there were some way to call `fact_wrapper1()`

passing it its own return value as an argument! If we could get this to work, the return value of `fact_wrapper1()`

would be the factorial function we want.

### Step 4

Okay, so that attempt didn’t quite work out; let’s try a slightly different approach. Perhaps it’s possible to modify `fact_wrapper1()`

so that it can be passed as an argument to itself. (This currently isn’t possible because `fact_wrapper1()`

’s argument (`fact`

) is a simple function that takes an integer argument and returns an integer; by contrast `fact_wrapper1()`

itself is a higher-order function that receives a *function* as an argument, returning a simple function that takes an integer argument and returning an integer.) If we can pass the function to itself, we eliminate the need to figure out how to extract just its return value. `fact_wrapper1()`

can be transformed into such a function as follows:

```
var fact_wrapper2 = function (f) {
return function (n) {
if (n < 2) return 1;
return n * f(f)(n - 1);
};
};
```

Note that the only real change is that `f(f)(n - 1)`

replaces `fact(n - 1)`

. This is because `f(f)`

returns a function, which we then apply to `n - 1`

.

### Step 5

Let’s just check to see if `fact_wrapper2`

can really be used as an argument to itself:

```
> var fact = fact_wrapper2(fact_wrapper2);
> fact(5);
120
```

Okay, that works: so far, so good! (You can copy-and-paste all these examples into a NodeJS session.)

We can also make `fact_wrapper2()`

look a bit more like `fact_wrapper1()`

by moving the weird `f(f)`

call into a helper function:

```
var fact_wrapper2 = function (f) {
var g = function (n) {
return f(f)(n);
};
return function (n) {
if (n < 2) return 1;
return n * g(n - 1);
};
};
```

### Step 6

By adding another helper function, we can avoid the need for `fact_wrapper2`

to appear twice, and if we inline its definition, we can avoid the need to give the wrapper function a name at all:

```
var recur = function (f) {
return f(f);
};
var fact = recur(function (f) {
var g = function (n) {
return f(f)(n);
};
return function (n) {
if (n < 2) return 1;
return n * g(n - 1);
};
});
```

We still get the same result:

```
> fact(5);
120
```

### Step 7

Next, let’s transform Step 6 by eliminating the local variable `g`

, instead passing it in as an argument of a new inlined function:

```
var recur = function (f) {
return f(f);
};
var fact = recur(function (f) {
return (function (g) {
return function (n) { // unchanged from above
if (n < 2) return 1;
return n * g(n - 1);
}
})(function (n) {
return f(f)(n);
});
});
```

### Step 8

This lets us move the factorial function itself out of the helper code:

```
var fact_core = function (g) {
return function(n) {
if (n < 2) return 1;
return n * g(n - 1);
}
};
var recur = function (f) {
return f(f);
};
var fact = recur(function (f) {
return fact_core(function (n) {
return f(f)(n);
});
});
```

### Step 9

Almost there! At this point we can also inline `recur()`

itself:

```
var fact_core = function (g) { // same as above
return function(n) {
if (n < 2) return 1;
return n * g(n - 1);
}
};
var fact = (function (g) { // the inlined recur() function
return g(g);
})(function (f) { // same as above; now an argument to the inlined recur()
return fact_core(function (n) {
return f(f)(n);
});
});
```

### Step 10

Finally, we can turn the helper code into a standalone function (i.e. eliminating the reference to `fact_core`

in the body of Step 9’s `fact`

by converting it into an argument), which we’ll call `Y()`

:

```
var Y = function (h) {
return (function (f) {
return f(f);
})(function (f) {
return h(function (n) {
return f(f)(n);
});
});
};
var fact_core = function (g) { // same as above
return function(n) {
if (n < 2) return 1;
return n * g(n - 1);
}
};
var fact = Y(fact_core);
```

Hurrah! We have finally achieved Y Combinator:

```
> fact(5)
120
```

## Notes and Observations

Neither the function `Y()`

nor `fact_core()`

are self-referential—these names do not appear anywhere in the code. A lot of other functions are involved, but since they’re all anonymous, they can’t refer to themselves either. Hence, the Y Combinator can be used to implement self-referential (i.e. recursive) programs in languages that don’t otherwise support it.

The function `fact()`

is technically the “fixed point” of the function `fact_core()`

and the Y Combinator is a function for finding such fixed points. The fixed point *p* of a function *f* is a function that satisfies the condition *f(p) = p*. What this actually means is somewhat hard to get one’s head around (see the Wikipedia page for more), but it’s easy to check that `fact_core()`

and `fact()`

do have these properties:

```
> fact_core(fact)(5);
120
> fact(5)
120
```

This means that the following expressions are all (mathematically) equivalent to the function `fact`

:

`Y(fact_core)`

`fact_core(Y(fact_core))`

`fact_core(fact)`

`fact_core(fact_core(fact_core(Y(fact_core))))`

## References

- Wikipedia’s page on the fixed point combinator
- Deriving the Y Combinator in 7 Easy Steps – JavaScript
- Deriving the Y Combinator – JavaScript
- The Why of Y (PDF) – Richard P. Gabriel; Scheme
- Fixed-point combinators in JavaScript: Memoizing recursive functions – JavaScript
- Y-Combinator in PHP – PHP
- The PHP 5.3 Y-Combinator – PHP