Difference between revisions of "Foldr Foldl Foldl'"
(Added an example) 
(→Foldr: spelling, wording) 

(23 intermediate revisions by 11 users not shown)  
Line 1:  Line 1:  
−  To ''foldr'', ''foldl'' or ''foldl''' that's the question! This article demonstrates the differences between these different folds by a simple example. 

+  __NOTOC__ 

−  If you want you can copy/paste this article into your [http://haskell.org/haskellwiki/Haskell_mode_for_Emacs favorite editor] and run it. 

+  To ''foldr'', ''foldl'' or ''foldl''', that is the question! 

+  
+  This article demonstrates the differences between these different folds by a simple example. 

+  
+  If you want you can copy/paste this article into your favorite editor and run it. 

We are going to define our own folds so we hide the ones from the Prelude: 
We are going to define our own folds so we hide the ones from the Prelude: 

<haskell>> import Prelude hiding (foldr, foldl)</haskell> 
<haskell>> import Prelude hiding (foldr, foldl)</haskell> 

+  
+  ==Foldr== 

Say we want to calculate the sum of a very big list: 
Say we want to calculate the sum of a very big list: 

Line 11:  Line 17:  
<haskell>> veryBigList = [1..1000000]</haskell> 
<haskell>> veryBigList = [1..1000000]</haskell> 

−  +  Let's start with the following: 

<haskell> 
<haskell> 

Line 26:  Line 32:  
<tt><nowiki>*** Exception: stack overflow</nowiki></tt> 
<tt><nowiki>*** Exception: stack overflow</nowiki></tt> 

−  Too bad... So what happened: 
+  Too bad... So, what happened is this: 
<haskell> 
<haskell> 

try1 > 
try1 > 

+  sum1 veryBigList > 

+  foldr (+) 0 veryBigList > 

+  
foldr (+) 0 [1..1000000] > 
foldr (+) 0 [1..1000000] > 

1 + (foldr (+) 0 [2..1000000]) > 
1 + (foldr (+) 0 [2..1000000]) > 

Line 52:  Line 61:  
</haskell> 
</haskell> 

−  The problem, as you can see, is that a large chain of (+)'s is 

+  The problem is that (+) is strict in both of its arguments. This means that both arguments must be fully evaluated before (+) can return a result. So to evaluate: 

−  created which eventually won't fit in your stack anymore. This will 

−  then trigger a stack overflow exception. 

−  For a nice interactive animation of the above behaviour see: http://foldr.com 

+  <haskell>1 + (2 + (3 + (4 + (...))))</haskell> 

−  Lets think about how to solve it... 

+  <tt>1</tt> is pushed on the stack. Then: 

−  One problem with the chain of (+)'s is that we can't make it 

+  <haskell>2 + (3 + (4 + (...)))</haskell> 

−  smaller (reduce it) until at the very last moment when it's already 

−  too late. 

−  The reason we can't reduce it, is that the chain doesn't contain an 

+  is evaluated. So <tt>2</tt> is pushed on the stack. Then: 

−  expression which can be reduced (a so called "redex" for '''red'''ucible 

+  
+  <haskell>3 + (4 + (...))</haskell> 

+  
+  is evaluated. So <tt>3</tt> is pushed on the stack. Then: 

+  
+  <haskell>4 + (...)</haskell> 

+  
+  is evaluated. So <tt>4</tt> is pushed on the stack. Then: ... 

+  
+  ... your limited stack will eventually run full when you evaluate a large enough chain of (+)s. This then triggers the stack overflow exception. 

+  
+  Let's think about how to solve this. 

+  
+  ==Foldl== 

+  
+  One problem with the chain of (+)'s is that it can't be made smaller (reduced) until the very last moment, when it's already too late. 

+  
+  The reason we can't reduce it is that the chain doesn't contain [[reducible expressionan 

+  expression which can be reduced]] (a ''[[reducible expressionredex]]'', for '''red'''ucible 

'''ex'''pression.) If it did we could reduce that expression before going 
'''ex'''pression.) If it did we could reduce that expression before going 

to the next element. 
to the next element. 

−  +  We can introduce a redex by forming the chain in another way. If 

instead of the chain <tt>1 + (2 + (3 + (...)))</tt> we could form the chain 
instead of the chain <tt>1 + (2 + (3 + (...)))</tt> we could form the chain 

−  <tt>(((0 + 1) + 2) + 3) + ...</tt> then there would always be a redex. 
+  <tt>(((0 + 1) + 2) + 3) + ...</tt>, then there would always be a redex. 
−  We can form 
+  We can form such a chain by using a function called ''foldl'': 
<haskell> 
<haskell> 

> foldl f z [] = z 
> foldl f z [] = z 

−  > foldl f z (x:xs) = 
+  > foldl f z (x:xs) = let z' = z `f` x 
+  > in foldl f z' xs 

> sum2 = foldl (+) 0 
> sum2 = foldl (+) 0 

Line 92:  Line 115:  
<haskell> 
<haskell> 

try2 > 
try2 > 

+  sum2 veryBigList > 

+  foldl (+) 0 veryBigList > 

+  
foldl (+) 0 [1..1000000] > 
foldl (+) 0 [1..1000000] > 

−  foldl (+) (0 + 1) [2..1000000] > 

+  
−  foldl (+) ((0 + 1) + 2) [3..1000000] > 

+  let z1 = 0 + 1 

−  foldl (+) 
+  in foldl (+) z1 [2..1000000] > 
−  +  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  in foldl (+) z2 [3..1000000] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  in foldl (+) z3 [4..1000000] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  in foldl (+) z4 [5..1000000] > 

+  
+   ... after many foldl steps ... 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  in foldl (+) z999997 [999998..1000000] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  in foldl (+) z999998 [999999..1000000] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  z999999 = z999998 + 999999 

+  in foldl (+) z999999 [1000000] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  z999999 = z999998 + 999999 

+  z100000 = z999999 + 1000000 

+  in foldl (+) z1000000 [] > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  z999999 = z999998 + 999999 

+  z100000 = z999999 + 1000000 

+  in z1000000 > 

+  
+   Now a large chain of +'s will be created: 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  z999999 = z999998 + 999999 

+  in z999999 + 1000000 > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  in (z999998 + 999999) + 1000000 > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  in ((z999997 + 999998) + 999999) + 1000000 > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  in (((z999996 + 999997) + 999998) + 999999) + 1000000 > 

+  
 ... 
 ... 

 ... My stack overflows when there's a chain of around 500000 (+)'s !!! 
 ... My stack overflows when there's a chain of around 500000 (+)'s !!! 

 ... But the following would happen if you got a large enough stack: 
 ... But the following would happen if you got a large enough stack: 

 ... 
 ... 

−  foldl (+) ((((((0 + 1) + 2) + 3) + 4) + ...) + 999999) [1000000] > 

−  foldl (+) (((((((0 + 1) + 2) + 3) + 4) + ...) + 999999) + 1000000) [] > 

−  ((((((0 + 1) + 2) + 3) + 4) + ...) + 999999) + 1000000 > 

+  let z1 = 0 + 1 

−  (((((1 + 2) + 3) + 4) + ...) + 999999) + 1000000 > 

+  z2 = z1 + 2 

−  ((((3 + 3) + 4) + ...) + 999999) + 1000000 > 

+  z3 = z2 + 3 

−  (((6 + 4) + ...) + 999999) + 1000000 > 

+  z4 = z3 + 4 

−  (( 
+  in (((((z4 + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 
+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  in ((((((z3 + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  in (((((((z2 + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  let z1 = 0 + 1 

+  in ((((((((z1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  (((((((((0 + 1) + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+   Now we can actually start reducing: 

+  
+  ((((((((1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  (((((((3 + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  ((((((6 + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  (((((10 + 5) + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  ((((15 + ...) + 999997) + 999998) + 999999) + 1000000 > 

+  
+  (((499996500006 + 999997) + 999998) + 999999) + 1000000 > 

+  
+  ((499997500003 + 999998) + 999999) + 1000000 > 

(499998500001 + 999999) + 1000000 > 
(499998500001 + 999999) + 1000000 > 

−  499999500000 + 1000000 

+  
+  499999500000 + 1000000 > 

+  
500000500000 > 
500000500000 > 

</haskell> 
</haskell> 

−  For a nice interactive animation of the above behaviour see: http://foldl.com (actually this animation is not quite the same :( ) 

+  Well, you clearly see that the redexes are created. But instead of being directly reduced, they are allocated on the heap: 

−  Well, you clearly see that the redexen <tt>0 + 1</tt>, <tt>(0 + 1) + 2</tt>, 

+  <haskell> 

−  etc. are created. So why doesn't the chain reduce sooner than 

+  let z1 = 0 + 1 

+  z2 = z1 + 2 

+  z3 = z2 + 3 

+  z4 = z3 + 4 

+  ... 

+  z999997 = z999996 + 999997 

+  z999998 = z999997 + 999998 

+  z999999 = z999998 + 999999 

+  z1000000 = z999999 + 1000000 

+  in z1000000 

+  </haskell> 

+  
+  Note that your heap is only limited by the amount of memory in your system (RAM and swap). So the only thing this does is filling up a large part of your memory. 

+  
+  The problem starts when we finally evaluate z1000000: 

+  
+  We must evaluate <tt>z1000000 = z999999 + 1000000</tt>, so <tt>1000000</tt> is pushed on the stack. Then <tt>z999999</tt> is evaluated; <tt>z999999 = z999998 + 999999</tt>, so <tt>999999</tt> is pushed on the stack. Then <tt>z999998</tt> is evaluated; <tt>z999998 = z999997 + 999998</tt>, so <tt>999998</tt> is pushed on the stack. Then <tt>z999997</tt> is evaluated... 

+  
+  ...your stack will eventually fill when you evaluate a large enough chain of (+)'s. This then triggers the stack overflow exception. 

+  
+  But this is exactly the problem we had in the foldr case — only now the chain of (+)'s is going to the left instead of the right. 

+  
+  So why doesn't the chain reduce sooner than 

before? 
before? 

−  The answer is that GHC uses a lazy reduction strategy. This means 

+  It's because of GHC's lazy reduction strategy: expressions are reduced only when they are actually needed. In this case, the outerleftmost redexes are reduced first. In this case it's the outer <tt>foldl (+) ... [1..10000]</tt> 

−  that GHC only reduces an expression when its value is actually 

+  redexes which are repeatedly reduced. So the inner <tt>z1, z2, z3, ...</tt> redexes only get reduced when the foldl is completely gone. 

−  needed. 

−  The reduction strategy works by reducing the outerleftmost redex 

+  ==Foldl'== 

−  first. In this case it are the outer <tt>foldl (+) ... [1..10000]</tt> 

−  redexen which are repeatedly reduced. 

−  So the inner <tt>(((0 + 1) + 2) + 3) + 4</tt> redexen only get reduced when 

−  the foldl is completely gone. 

We somehow have to tell the system that the inner redex should be 
We somehow have to tell the system that the inner redex should be 

Line 138:  Line 181:  
''seq'' is a primitive system function that when applied to ''x'' and 
''seq'' is a primitive system function that when applied to ''x'' and 

−  ''y'' will first reduce ''x'' 
+  ''y'' will first reduce ''x'' then return ''y''. The idea is that ''y'' references ''x'' so that when ''y'' is reduced ''x'' will not be a big unreduced chain anymore. 
−  the latter. The idea is that ''y'' references ''x'' so that when ''y'' is 

−  reduced ''x'' will not be a big unreduced chain anymore. 

Now lets fill in the pieces: 
Now lets fill in the pieces: 

Line 160:  Line 203:  
<haskell> 
<haskell> 

try3 > 
try3 > 

+  sum3 veryBigList > 

+  foldl' (+) 0 veryBigList > 

+  
foldl' (+) 0 [1..1000000] > 
foldl' (+) 0 [1..1000000] > 

foldl' (+) 1 [2..1000000] > 
foldl' (+) 1 [2..1000000] > 

Line 176:  Line 222:  
first. 
first. 

+  ==Conclusion== 

Usually the choice is between <hask>foldr</hask> and <hask>foldl'</hask>, since <hask>foldl</hask> and <hask>foldl'</hask> are the same except for their strictness properties, so if both return a result, it must be the same. <hask>foldl'</hask> is the more efficient way to arrive at that result because it doesn't build a huge thunk. However, if the combining function is lazy in its ''first'' argument, <hask>foldl</hask> may happily return a result where <hask>foldl'</hask> hits an exception: 
Usually the choice is between <hask>foldr</hask> and <hask>foldl'</hask>, since <hask>foldl</hask> and <hask>foldl'</hask> are the same except for their strictness properties, so if both return a result, it must be the same. <hask>foldl'</hask> is the more efficient way to arrive at that result because it doesn't build a huge thunk. However, if the combining function is lazy in its ''first'' argument, <hask>foldl</hask> may happily return a result where <hask>foldl'</hask> hits an exception: 

Line 212:  Line 259:  
foldl' (?) 6 [undefined,5,0] > 
foldl' (?) 6 [undefined,5,0] > 

6 ? undefined > 
6 ? undefined > 

−  +  *** Exception: Prelude.undefined 

</haskell> 
</haskell> 

−  For another explanation about folds see the [http://haskell.org/haskellwiki/Fold Fold] article. 

+  Note that even <hask>foldl'</hask> may not do what you expect. 

+  The involved <hask>seq</hask> function does only evaluate the ''topmost constructor''. 

+  
+  If the accumulator is a more complex object, then <hask>fold'</hask> will still build up unevaluated thunks. You can introduce a function or a strict data type which forces the values as far as you need. Failing that, the "brute force" solution is to use {{HackagePackageid=deepseq}}. For a worked example of this issue, see [http://book.realworldhaskell.org/read/profilingandoptimization.html#id678431 ''Real World Haskell'' chapter 25]. 

+  
+  ==Rules of Thumb for Folds== 

+  
+  Folds are among the most useful and common functions in Haskell. They are an oftensuperior replacement for what in other language would be loops, but can do much more. Here are a few rules of thumb on which folds to use when. 

+  
+  <hask>foldr</hask> is not only the right fold, it is also most commonly the ''right'' fold to use, in particular when transforming lists (or other foldables) into lists with related elements in the same order. Notably, <hask>foldr</hask> will be effective for transforming even infinite lists into other infinite lists. For such purposes, it should be your first and most natural choice. For example, note that <hask>foldr (:) []==id</hask>. 

+  
+  
+  Note that the initial element is irrelevant when <hask>foldr</hask> is applied to an infinite list. For that reason, it is may be good practice when writing a function which should only be applied to infinite lists to replace <hask>foldr f []</hask> with <hask>foldr f undefined</hask>. This both documents that the function should only be applied to infinite lists and will result in an error when you try to apply it to a finite list. 

+  
+  
+  The other very useful fold is <hask>foldl'</hask>. It can be thought of as a foldr with these differences: 

+  
+  * <hask>foldl'</hask> conceptually reverses the order of the list. One consequence is that a <hask>foldl'</hask> (unlike <hask>foldr</hask>) applied to an infinite list will be bottom; it will not produce any usable results, just as an express <hask>reverse</hask> would not. Note that <hask>foldl' (flip (:)) []==reverse</hask>. 

+  * <hask>foldl'</hask> often has much better time and space performance than a <hask>foldr</hask> would for the reasons explained in the previous sections. 

+  
+  You should pick <hask>foldl'</hask> principally in two cases: 

+  
+  * When the list to which it is applied is large, but definitely finite, you do not care about the implicit reversal (for example, because your combining function is commutative like <hask>(+)</hask>, <hask>(*)</hask>, or <hask>Set.union</hask>), and you seek to improve the performance of your code. 

+  * When you actually do want to reverse the order of the list, in addition to possibly performing some other transformation to the elements. In particular, if you find that you precede or follow your fold with a reverse, it is quite likely that you could improve your code by using the other fold and taking advantage of the implicit reverse. 

+  
+  <hask>foldl</hask> is rarely the right choice. It gives you the implicit reverse of fold, but without the performance gains of <hask>foldl'</hask>. Only in rare, or specially constructed cases like in the previous section, will it yield better results than <hask>foldl'</hask>. 

+  
+  
+  Another reason that <hask>foldr</hask> is often the better choice is that the folding function can ''shortcircuit'', that is, terminate early by yielding a result which does not depend on the value of the accumulating parameter. When such possibilities arise with some frequency in your problem, shortcircuiting can greatly improve your program's performance. Left folds can never shortcircuit. 

+  
+  
+  To illustrate this consider writing a fold that computes the product of the last digits of a list of integers. One might think that <hask>foldl'</hask> is the superior fold in this situation as the result does not depend on the order of the list and is generally not computable on infinite lists anyway. 

+  
+  
+  On my workstation running GHC 7.10.2, <hask>foldl' (\a e > (mod e 10)*a) 1 [1..10^7]</hask> has a compiled runtime of 422 ms (all of it calculation) and allocates over 400 MBytes on the heap. Conversely, <hask>foldr (\e a > (mod e 10)*a) 1 [1..10^7]</hask> has a compiled runtime of 2203 ms (the same 422 ms calculation time and 1781 ms of garbage collection) and allocates more than 550 MBytes on the heap. That is what you should expect from reading the previous sections; <hask>foldr</hask> will build up a huge stack of nested expression before evaluating the result and that stack needs to be garbage collected. 

+  
+  
+  But think about the problem a little more. Once the fold hits a number with last digit 0, there is no need to evaluate any further. The ultimate result will always be 0, so you can shortcircuit to that answer immediately. Indeed <hask>foldr (\e a > if mod e 10==0 then 0 else (mod e 10)*a) 1 [1..10^7]</hask> has a measured runtime of 0ms and allocates less than 50 kBytes on the heap. This fold will run just as fast on the range <hask>[1..10^100]</hask> or even <hask>[1..]</hask>. 

+  
+  
+  The left fold cannot shortcircuit and is condemned to evaluate the entire input list. Running <hask>foldl' (\a e > if mod e 10==0 then 0 else (mod e 10)*a) 1 [1..10^7]</hask> takes 781 ms and allocates over 500 MByte of heap space; it is inferior to even the original left fold, not to mention the shortcircuiting right fold. 

+  
+  == See also == 

+  
+  * [[Fold]] 

+  * [[Foldl as foldr]] 

+  
+  [[Category:FAQ]] 

+  [[Category:Idioms]] 
Latest revision as of 10:26, 29 March 2019
To foldr, foldl or foldl', that is the question!
This article demonstrates the differences between these different folds by a simple example.
If you want you can copy/paste this article into your favorite editor and run it.
We are going to define our own folds so we hide the ones from the Prelude:
> import Prelude hiding (foldr, foldl)
Foldr
Say we want to calculate the sum of a very big list:
> veryBigList = [1..1000000]
Let's start with the following:
> foldr f z [] = z
> foldr f z (x:xs) = x `f` foldr f z xs
> sum1 = foldr (+) 0
> try1 = sum1 veryBigList
If we evaluate try1 we get:
*** Exception: stack overflow
Too bad... So, what happened is this:
try1 >
sum1 veryBigList >
foldr (+) 0 veryBigList >
foldr (+) 0 [1..1000000] >
1 + (foldr (+) 0 [2..1000000]) >
1 + (2 + (foldr (+) 0 [3..1000000])) >
1 + (2 + (3 + (foldr (+) 0 [4..1000000]))) >
1 + (2 + (3 + (4 + (foldr (+) 0 [5..1000000])))) >
 ...
 ... My stack overflows when there's a chain of around 500000 (+)'s !!!
 ... But the following would happen if you got a large enough stack:
 ...
1 + (2 + (3 + (4 + (... + (999999 + (foldr (+) 0 [1000000]))...)))) >
1 + (2 + (3 + (4 + (... + (999999 + (1000000 + ((foldr (+) 0 []))))...)))) >
1 + (2 + (3 + (4 + (... + (999999 + (1000000 + 0))...)))) >
1 + (2 + (3 + (4 + (... + (999999 + 1000000)...)))) >
1 + (2 + (3 + (4 + (... + 1999999 ...)))) >
1 + (2 + (3 + (4 + 500000499990))) >
1 + (2 + (3 + 500000499994)) >
1 + (2 + 500000499997) >
1 + 500000499999 >
500000500000
The problem is that (+) is strict in both of its arguments. This means that both arguments must be fully evaluated before (+) can return a result. So to evaluate:
1 + (2 + (3 + (4 + (...))))
1 is pushed on the stack. Then:
2 + (3 + (4 + (...)))
is evaluated. So 2 is pushed on the stack. Then:
3 + (4 + (...))
is evaluated. So 3 is pushed on the stack. Then:
4 + (...)
is evaluated. So 4 is pushed on the stack. Then: ...
... your limited stack will eventually run full when you evaluate a large enough chain of (+)s. This then triggers the stack overflow exception.
Let's think about how to solve this.
Foldl
One problem with the chain of (+)'s is that it can't be made smaller (reduced) until the very last moment, when it's already too late.
The reason we can't reduce it is that the chain doesn't contain an expression which can be reduced (a redex, for reducible expression.) If it did we could reduce that expression before going to the next element.
We can introduce a redex by forming the chain in another way. If instead of the chain 1 + (2 + (3 + (...))) we could form the chain (((0 + 1) + 2) + 3) + ..., then there would always be a redex.
We can form such a chain by using a function called foldl:
> foldl f z [] = z
> foldl f z (x:xs) = let z' = z `f` x
> in foldl f z' xs
> sum2 = foldl (+) 0
> try2 = sum2 veryBigList
Lets evaluate try2:
*** Exception: stack overflow
Good Lord! Again a stack overflow! Lets see what happens:
try2 >
sum2 veryBigList >
foldl (+) 0 veryBigList >
foldl (+) 0 [1..1000000] >
let z1 = 0 + 1
in foldl (+) z1 [2..1000000] >
let z1 = 0 + 1
z2 = z1 + 2
in foldl (+) z2 [3..1000000] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
in foldl (+) z3 [4..1000000] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
in foldl (+) z4 [5..1000000] >
 ... after many foldl steps ...
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
in foldl (+) z999997 [999998..1000000] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
in foldl (+) z999998 [999999..1000000] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
z999999 = z999998 + 999999
in foldl (+) z999999 [1000000] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
z999999 = z999998 + 999999
z100000 = z999999 + 1000000
in foldl (+) z1000000 [] >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
z999999 = z999998 + 999999
z100000 = z999999 + 1000000
in z1000000 >
 Now a large chain of +'s will be created:
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
z999999 = z999998 + 999999
in z999999 + 1000000 >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
in (z999998 + 999999) + 1000000 >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
in ((z999997 + 999998) + 999999) + 1000000 >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
in (((z999996 + 999997) + 999998) + 999999) + 1000000 >
 ...
 ... My stack overflows when there's a chain of around 500000 (+)'s !!!
 ... But the following would happen if you got a large enough stack:
 ...
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
in (((((z4 + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
in ((((((z3 + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
let z1 = 0 + 1
z2 = z1 + 2
in (((((((z2 + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
let z1 = 0 + 1
in ((((((((z1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
(((((((((0 + 1) + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
 Now we can actually start reducing:
((((((((1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
(((((((3 + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
((((((6 + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
(((((10 + 5) + ...) + 999997) + 999998) + 999999) + 1000000 >
((((15 + ...) + 999997) + 999998) + 999999) + 1000000 >
(((499996500006 + 999997) + 999998) + 999999) + 1000000 >
((499997500003 + 999998) + 999999) + 1000000 >
(499998500001 + 999999) + 1000000 >
499999500000 + 1000000 >
500000500000 >
Well, you clearly see that the redexes are created. But instead of being directly reduced, they are allocated on the heap:
let z1 = 0 + 1
z2 = z1 + 2
z3 = z2 + 3
z4 = z3 + 4
...
z999997 = z999996 + 999997
z999998 = z999997 + 999998
z999999 = z999998 + 999999
z1000000 = z999999 + 1000000
in z1000000
Note that your heap is only limited by the amount of memory in your system (RAM and swap). So the only thing this does is filling up a large part of your memory.
The problem starts when we finally evaluate z1000000:
We must evaluate z1000000 = z999999 + 1000000, so 1000000 is pushed on the stack. Then z999999 is evaluated; z999999 = z999998 + 999999, so 999999 is pushed on the stack. Then z999998 is evaluated; z999998 = z999997 + 999998, so 999998 is pushed on the stack. Then z999997 is evaluated...
...your stack will eventually fill when you evaluate a large enough chain of (+)'s. This then triggers the stack overflow exception.
But this is exactly the problem we had in the foldr case — only now the chain of (+)'s is going to the left instead of the right.
So why doesn't the chain reduce sooner than before?
It's because of GHC's lazy reduction strategy: expressions are reduced only when they are actually needed. In this case, the outerleftmost redexes are reduced first. In this case it's the outer foldl (+) ... [1..10000] redexes which are repeatedly reduced. So the inner z1, z2, z3, ... redexes only get reduced when the foldl is completely gone.
Foldl'
We somehow have to tell the system that the inner redex should be reduced before the outer. Fortunately this is possible with the seq function:
seq :: a > b > b
seq is a primitive system function that when applied to x and y will first reduce x then return y. The idea is that y references x so that when y is reduced x will not be a big unreduced chain anymore.
Now lets fill in the pieces:
> foldl' f z [] = z
> foldl' f z (x:xs) = let z' = z `f` x
> in seq z' $ foldl' f z' xs
> sum3 = foldl' (+) 0
> try3 = sum3 veryBigList
If we now evaluate try3 we get the correct answer and we get it very quickly:
500000500000
Lets see what happens:
try3 >
sum3 veryBigList >
foldl' (+) 0 veryBigList >
foldl' (+) 0 [1..1000000] >
foldl' (+) 1 [2..1000000] >
foldl' (+) 3 [3..1000000] >
foldl' (+) 6 [4..1000000] >
foldl' (+) 10 [5..1000000] >
 ...
 ... You see that the stack doesn't overflow
 ...
foldl' (+) 499999500000 [1000000] >
foldl' (+) 500000500000 [] >
500000500000
You can clearly see that the inner redex is repeatedly reduced first.
Conclusion
Usually the choice is between foldr
and foldl'
, since foldl
and foldl'
are the same except for their strictness properties, so if both return a result, it must be the same. foldl'
is the more efficient way to arrive at that result because it doesn't build a huge thunk. However, if the combining function is lazy in its first argument, foldl
may happily return a result where foldl'
hits an exception:
> (?) :: Int > Int > Int
> _ ? 0 = 0
> x ? y = x*y
>
> list :: [Int]
> list = [2,3,undefined,5,0]
>
> okey = foldl (?) 1 list
>
> boom = foldl' (?) 1 list
Let's see what happens:
okey >
foldl (?) 1 [2,3,undefined,5,0] >
foldl (?) (1 ? 2) [3,undefined,5,0] >
foldl (?) ((1 ? 2) ? 3) [undefined,5,0] >
foldl (?) (((1 ? 2) ? 3) ? undefined) [5,0] >
foldl (?) ((((1 ? 2) ? 3) ? undefined) ? 5) [0] >
foldl (?) (((((1 ? 2) ? 3) ? undefined) ? 5) ? 0) [] >
((((1 ? 2) ? 3) ? undefined) ? 5) ? 0 >
0
boom >
foldl' (?) 1 [2,3,undefined,5,0] >
1 ? 2 > 2
foldl' (?) 2 [3,undefined,5,0] >
2 ? 3 > 6
foldl' (?) 6 [undefined,5,0] >
6 ? undefined >
*** Exception: Prelude.undefined
Note that even foldl'
may not do what you expect.
The involved seq
function does only evaluate the topmost constructor.
If the accumulator is a more complex object, then fold'
will still build up unevaluated thunks. You can introduce a function or a strict data type which forces the values as far as you need. Failing that, the "brute force" solution is to use deepseq. For a worked example of this issue, see Real World Haskell chapter 25.
Rules of Thumb for Folds
Folds are among the most useful and common functions in Haskell. They are an oftensuperior replacement for what in other language would be loops, but can do much more. Here are a few rules of thumb on which folds to use when.
foldr
is not only the right fold, it is also most commonly the right fold to use, in particular when transforming lists (or other foldables) into lists with related elements in the same order. Notably, foldr
will be effective for transforming even infinite lists into other infinite lists. For such purposes, it should be your first and most natural choice. For example, note that foldr (:) []==id
.
Note that the initial element is irrelevant when foldr
is applied to an infinite list. For that reason, it is may be good practice when writing a function which should only be applied to infinite lists to replace foldr f []
with foldr f undefined
. This both documents that the function should only be applied to infinite lists and will result in an error when you try to apply it to a finite list.
The other very useful fold is foldl'
. It can be thought of as a foldr with these differences:

foldl'
conceptually reverses the order of the list. One consequence is that afoldl'
(unlikefoldr
) applied to an infinite list will be bottom; it will not produce any usable results, just as an expressreverse
would not. Note thatfoldl' (flip (:)) []==reverse
. 
foldl'
often has much better time and space performance than afoldr
would for the reasons explained in the previous sections.
You should pick foldl'
principally in two cases:
 When the list to which it is applied is large, but definitely finite, you do not care about the implicit reversal (for example, because your combining function is commutative like
(+)
,(*)
, orSet.union
), and you seek to improve the performance of your code.  When you actually do want to reverse the order of the list, in addition to possibly performing some other transformation to the elements. In particular, if you find that you precede or follow your fold with a reverse, it is quite likely that you could improve your code by using the other fold and taking advantage of the implicit reverse.
foldl
is rarely the right choice. It gives you the implicit reverse of fold, but without the performance gains of foldl'
. Only in rare, or specially constructed cases like in the previous section, will it yield better results than foldl'
.
Another reason that foldr
is often the better choice is that the folding function can shortcircuit, that is, terminate early by yielding a result which does not depend on the value of the accumulating parameter. When such possibilities arise with some frequency in your problem, shortcircuiting can greatly improve your program's performance. Left folds can never shortcircuit.
To illustrate this consider writing a fold that computes the product of the last digits of a list of integers. One might think that foldl'
is the superior fold in this situation as the result does not depend on the order of the list and is generally not computable on infinite lists anyway.
On my workstation running GHC 7.10.2, foldl' (\a e > (mod e 10)*a) 1 [1..10^7]
has a compiled runtime of 422 ms (all of it calculation) and allocates over 400 MBytes on the heap. Conversely, foldr (\e a > (mod e 10)*a) 1 [1..10^7]
has a compiled runtime of 2203 ms (the same 422 ms calculation time and 1781 ms of garbage collection) and allocates more than 550 MBytes on the heap. That is what you should expect from reading the previous sections; foldr
will build up a huge stack of nested expression before evaluating the result and that stack needs to be garbage collected.
But think about the problem a little more. Once the fold hits a number with last digit 0, there is no need to evaluate any further. The ultimate result will always be 0, so you can shortcircuit to that answer immediately. Indeed foldr (\e a > if mod e 10==0 then 0 else (mod e 10)*a) 1 [1..10^7]
has a measured runtime of 0ms and allocates less than 50 kBytes on the heap. This fold will run just as fast on the range [1..10^100]
or even [1..]
.
The left fold cannot shortcircuit and is condemned to evaluate the entire input list. Running foldl' (\a e > if mod e 10==0 then 0 else (mod e 10)*a) 1 [1..10^7]
takes 781 ms and allocates over 500 MByte of heap space; it is inferior to even the original left fold, not to mention the shortcircuiting right fold.