Fold Left in JavaScript (with all thinkable examples)

Fold left (aka reduce or reduce left) is the basic list iterator. Straight, efficient and dumb: once it starts it has to get to the end of the list. No way to talk to the driver during the trip. At each element, a given operation (expressed by the function s) gets performed using as ingredients the result of the process so far and the current head of the list; the result gets brought along in the next recursion. The operation happens inside the recursion call. This makes fold left suited for tail recursion optimisation done by the compiler, but that’s its only advantage. To be frank, fold left is boring.

Fold it left

The definition of fold left (shortly foldl) is (*):

foldl (s) (a) ([]) = a

fold (s) (a) (x:xs) = foldl (s (s(a)(x)) (a) (xs))

Where:

  • x:xs is a binary tree that represents a list containing elements x of type α, that’s to say that the type of x:xs is [α]
  • s is a binary function of type β -> α -> β (**)
  • a (type β) is the accumulator for the end result; at the beginning it carries a starting value
  • foldl (s) (a) altogether is a function of type [α] -> β (***)

(*) I am putting brackets JavaScript-style; in Haskell they wouldn’t be necessary.

(**) I am embracing the Haskell habit of providing multiple parameters one at a time, by creating a chain of partially applied functions.

(***) you may see this in the fact that after foldl (s) (a) we have either [], x:xs or xs, which are all of type [α].

When you feed a β to an s (e.g. s(a) in the right side of the second equation) the next operand gotta be an α, like x is. Once the brackets are removed, the heart of the second equation reads saxand. This helps me remember the formula and that’s the main reason I picked s & a as variable names.

The hunting of the s-nark

Along the lines of thought that lead to the solution of a programming problem using fold right, I find it useful to think that expressing an operation g on a list (e.g. the sum of all its elements) as a fold left means finding a function s such that (please follow carefully the sequence of equalities):

g (x:xs) = foldl s a x:xs = foldl s(s a x) xs

with the additional constraint that:

g ([]) = a

In any real-world, non-trivial case, s is not a ready-made operator, but rather a custom-built anonymous lambda. Because my goal is to get used to generic problems and solutions, in the following Haskell examples I will use for g real names like sum, get, etc., but will write always anonymous lambdas for s, also in the cases where an operator is available.

In some cases I may write explicitly the lambda the first time it occurs and refer later to it as ‘s’.

Fold left in JavaScript – geiesfolds on GitHub

Having abandoned my initial goal of learning functional programming in Java (one of the most foolish undertakings of my whole professional life), since about one year I am thoroughly enjoying the study and applications of FP in JavaScript. I have made quite a few repositories on GitHub presenting various aspects of the matter. The repository handling folds is currently in beta status and its name is geiesfolds (which reads j-s-folds in Italian). It contains both fold right and left. All its implementations are covered by tests, using YUI.

Fold left in JavaScript reads like this:

// fold left (foldl)
function foldl(s){ // step
return function (a) { // accumulator (aka zero)
return function(xs){
if (isEmpty(xs)){
return a;
} else {
return foldl(s)(s(a)(head(xs)))(tail(xs)); // foldl s sax xs
}
};
};
};

JS implementations of all mainstream list functions (like head/car, tail/cdr, isEmpty) are presented in another post of mine and are to be found in geieslists.

CAVEAT EMPTOR: as I anticipated two paragraphs before, this implementation is based upon partially applied functions; you see that e.g. the arguments for f are given one at a time. That’s very elegant to see and I believe is the best way for understanding the inner working of the whole mechanism, but alas: appending partially applied functions is implemented very inefficiently in Javascript, because it creates a bunch of short-lived closures (see [3]). When I need a sound implementation for folds (and for a lot of other functions) my choice is always underscore.js. But there you have to respect the signatures and provide tuples of parameters exactly the way the API requires.

JavaScript examples of fold left

This page starts from the idea of deducing JavaScript implementations for the examples given in [1], a neat and thorough page in my opinion. As I believe it is with every learning experience, you develop a feeling of the matter at hand and begin experimenting on your own. And then the language you’re programming in starts to exert its influence. JavaScript does not offer the toolbox of features that Scala carries along, and forces you to express all manipulations using original functions rather than built-in operators like pattern matching, tuples, _ and other niceties. With JavaScript you gotta reach fast to the heart of the operator, which in the case of foldl is simple and ruthless.

The focus of this page is on developing a flair for the smartest way to create a solution for a practical case using a list iterator. This activity has given me new insights also in situations (e.g. using Java) where fold left was not an ingredient of the solution.

As I already said, in this page all needed list manipulation functions are provided by geieslists. Complete JS implementations are all available on GitHub.

Even though I am always greatly reluctant to fire up the clumsy Haskell REPL, I must admit that the syntax which allows me to better follow the various chained function applications is indeed Haskell. When I am trying to solve a case with paper and pencil, the syntax I find myself jotting down is kind of that one, with a few Scala/JavaScript keywords here and there.

Therefore, in all complex or somehow new cases, I will present first an Haskell-ish solution of the various cases, followed by a rigorous and runnable one in JavaScript.

sum, product, length, and, or (the basic stuff)

These are mentioned in many other pages. Here are the Haskell and JS implementations.

  • sum

The Haskell-ish definition is:

sum x:xs = foldl ((\a x -> a+x) a x) xs

In JS the function s is the following (I have renamed it adder for the sake of clarity):

function adder(a){
return function(x){
return a + x;
}
};
var sumWithFoldl = foldl(adder)(0);

Using 0 as the starting value for a.

  • product, count, and & or

To speed up things, which aren’t really complicated, here are the JS implementations, where I have put in bold the relevant bits. They all use anonymous lambdas rather than named helpers like adder in the previous case.

var productWithFoldl = foldl(function(a){return function(x){return a*x;};})(1);
var countWithFoldl = foldl(function(a){return function(x){return a+1;};})(0);
var andWithFoldl = foldl(function(a){return function(x){return a&&x;};})(true);
var orWithFoldl = foldl(function(a){return function(x){return a||x;};})(false);

You may appreciate how the various identity elements vary.

average

As [1] points out, one solution is found by carrying two values in the accumulator, making it a pair: one is the running average value, the other is the current count of list elements.
The Haskell-ish definition is:

average x:xs = foldl ((\(av,co) x -> ((av*co+x)/x,co+1)) (ave,count) x) xs

In JS the function s is the following:

var averager = function(aPair){ // that's the accumulator!
var ave = first(aPair), count = second(aPair);
return function(x){
var newAve = (ave*count+x)/(count+1);
return pair(newAve)(count+1);
};
};

The fold left implementation starts with a pair of 0’s in the accumulator and must return only the first element of the final value.

function averageWithFoldl(xs) {
return first(foldl(averager)(pair(0)(0))(xs));
}

last

Here there is a neat trick I’ve put together mostly for fun, using the accumulator as a signal to s as when it’s time to fetch the result. We put the tail of the input list in the accumulator and we slice it at each step, while fold left is slicing the input list on its own initiative. When the accumulator is empty, fold left is still processing the last list item. We stick that item in the accumulator; at the next recursion (how convenient!) the input list is empty, fold left chucks out the accumulator and here we have what we were looking for.

The Haskell-ish definition is:

last x:xs = foldl ((\a x -> if (isEmpty a) x else tail a) a x) xs

In JS the function s is the following:

var laster = function(a){
return function(x){
return isEmpty(a)?x:tail(a);
}
};

As previously said, this fold left starts with tail(xs) in the accumulator.


function lastWithFoldl(xs){
return foldl(laster)(tail(xs))(xs);
};

This implementation will work only in JavaScript: the type of a varies along the run from [β] to β. Don’t try this with a typed language.

penultimate (aka last-but-one)

In this case we may see the dumbness of fold left. The temptation would be to follow the line of the previous case and simply put tail(tail(xs)) in the starting accumulator.

Problem is: at the penultimate step, x jumps in the a, but foldl cannot stop here. It has to reach the end of the list, and next step screws the accumulator. We need an additional guard inside function s. To me this kills the neatness of the trick.

We can then follow the lines of [1] and use a pair to carry a couple of consequent elements of the list.

We start by storing head(list) and head(tail(list)) in the accumulator. At each step we shift second(pair) into first(pair) and we load the current head of the list into second(pair). At the end of the process we extract first from the return value, and here we go.


function pairShifter(aPair){ // (last,penultimate)
return function(x){
var penultimate = second(aPair);
return pair(penultimate)(x);
}
}

function penultimateWithFoldl(xs){
return first(foldl(pairShifter)(pair(head(xs))(head(tail(xs))))(xs));
};

contains

In this case the accumulator is a boolean. We start with false and we check x at each step. If x matches the element we are looking for, we set accumulator to true and, from that moment on, we don’t care about the rest of the list. But we got to go until its end.

Two possible JS implementations:

1) special operator with a foldl hidden inside
function containsWithFoldl(xs){
return function(elem){
var container = function(a){
return function(x){
return a || x===elem;
}
};
return foldl(container)(false)(xs);
};
};

Usage is: containsWithFoldl(list)(element);

2) explicit foldl invocation
var container = function(elem){
return function(a){
return function(x){
return a || x===elem;
}
};

Usage is: foldl(container(element))(list);

Personally I prefer the second. It’s more funky.

get with fold left

We want to grab the n-th element of a list.

As accumulator we need a pair (currentIndex,result). We start with the value (0,[]). At each step we check currentIndex to see if we are in position n. First time this is true, we load the current head of the list into second(a). From that time we just keep it like that, until the list is over.

function getWithFoldl(xs){
return function(index){
var getter = function(aPair){
var currentIndex = first(aPair), result = second(aPair);
return function(x){
var nextResult = (currentIndex<=index)?x:result;
return pair(currentIndex+1)(nextResult);
};
};
return second(foldl(getter)(pair(0)(EMPTY))(xs));
};
};

It is interesting to see how fold right handles this same problem. As fold right allows to exit the loop, we can implement it so that it returns as soon as the n-th position is reached. I have prepared a whole fold right examples page, which in my opinion is a lot more interesting and thought-provoking than the present one.

reverse

The first time you entrust fold left with a task that foresees a list as return value, the little bugger surprises you with one more bad habit: fold left returns lists of reversed results. The first element in the result list maps to the last element of the input list.

This is visible in the nature of the sax operation: at a certain step, a contains the current list of results. We elaborate x and the natural thing to do is to cons the result at the front of a. Therefore the previous results will be after this one. Right thing to do is to reverse everything at the end of the fold left run, because a concat to the tail of the accumulator performed at each step would be a disaster for the efficiency of the operation.

But if we expect a reversed result list, fold left is a gift from heaven. To reverse a list, all it needs is an s like:

\a x -> x:a

function(a){
return function(x){
return cons(x,a);
};
}

This basic behavior can be tailored to suit our specific needs.

uniq

We use the accumulator to collect items once, from a list where they may be present several times.

In this case we reuse contains (see a few paragraphs before) to check at each step if the current head of the list is already present in the accumulator. If not, we grab it. our s this time is:

function(a){
return function(x){
return containsWithFoldl(a)(x)?a:cons(x,a);
};
}

We already know that the resulting accumulator will contain the unique elements in reversed order.

encode

Given a list (a,a,b,c,c,c,a) return a list of pairs ((a,2),(b,1),(c,3),(a,1)).

At each step s checks whether the head of the list is equal to the previous element. This happens when the first pair in the accumulator is an encoding of x, i.e. (x,count). If yes, count must be increased; if not, a new encoding pair (x,1) is put at the front of the accumulator. Reverse the result as usual.

In pseudo-Haskell fold left receives this anonymous lambda:

\x (key,count):as -> if (x == key) ((key,count+1):as) else ((x,1):((key,count):as))

In JavaScript we call it encoder and then:

var encoder = function(pairs){
var currentEncoding, leadValue, currentValue;
return function(x){
if (isEmpty(pairs)) return cons(pair(x)(1),pairs);
else {
currentEncoding = head(pairs);
leadValue = first(currentEncoding);
currentValue = second(currentEncoding);
      if (x===leadValue) return cons(pair(leadValue)(currentValue+1),tail(pairs));
else return cons(pair(x)(1),pairs);
    }
};
};
var encodeWithFoldl = function(xs){return reverseWithFoldl(foldl(encoder)(EMPTY)(xs));}

decode

This is the reversed problem of the previous paragraph. Given ((a,2),(b,1),(c,3),(a,1)), return (a,a,b,c,c,c,a).

A plain, straight solution using a for-loop is not particularly insightful.

What I found interesting once I got to this point is to reason about the meaning of “doing something n times”. Doing something n times is an iteration, and therefore it’s akin to a fold left; but the fold left we’ve been reasoning about so far is hard-wired to recurse on lists.

After leaving aside this point for a while I got to reach page 70 of [2], where there’s mention of a foldn function, akin to fold right but recursing on natural numbers. This seems a fruitful road. We shall see…

WIP

[1] http://oldfashionedsoftware.com/2009/07/30/lots-and-lots-of-foldleft-examples/
[2] Introduction to Functional Programming using Haskell, R. Bird, Prentice Hall
[3] see http://spencertipping.com/js-instabench/ – please compare:
– allocating small structures -> 1M (int, int) pairs as array literals
– allocating small structures -> 1M (int, int) pairs as binary CPS closures
– accessing small structures -> 1M (int, int) pairs as array literals
– accessing small structures -> 1M (int, int) pairs as binary CPS closures

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s