Trouble understanding the behavior of `foldr` and `map` in Haskell

Time:03-18

I have a function prefixes that, given [1, 2, 3], returns the prefixes [[1], [1, 2], [1, 2, 3]]. It is defined as follows:

prefixes :: Num a => [a] -> [[a]]
prefixes = foldr (\x acc -> [x] : (map ((:) x) acc)) []

I have spent nearly two days trying to understand why this works. When I debug this in my head, I imagine this for prefixes [1, 2, 3]:

foldr call|__________________________________________________________________________
    1     | [1] : (map ((:) 1) [])
          |
          |         where x = 1 and acc = []
          |         returns acc = [[1]]
          |
    2     | [2] : (map ((:) 2) [[1]])
          |
          |         where x = 2 and acc = [[1]]
          |         and (map ((:) 2) [[1]])
          |                  returns acc = [[1, 2]]
          |         and [2] : [[1, 2]]
          |                  returns [[2], [1, 2]]
          |
    3     | [3] : (map ((:) 3) [[2], [1, 2]])
          |
          |         where x = 3 and acc = [[2], [1, 2]]
          |         and (map ((:) 3) [[2], [1, 2]])
          |                  returns acc = [[2, 3], [1, 2, 3]]
          |         and [3] : [[2, 3], [1, 2, 3]]
          |                  returns [[3], [2, 3], [1, 2, 3]]
          |

And then the function terminates and returns [[3], [2, 3], [1, 2, 3]]. But obviously that is not happening. It returns [[1], [1, 2], [1, 2, 3]].

In Ghci, I find this:

Stopped in Main.prefixes, ex.hs:21:20-63

_result :: [a] -> [[a]] = _
[ex.hs:21:20-63] *Main> :step

Stopped in Main.prefixes, ex.hs:21:37-59

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 1

[ex.hs:21:37-59] *Main> :step

[[1]

Stopped in Main.prefixes, ex.hs:21:44-58

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 1

[ex.hs:21:44-58] *Main> :step
Stopped in Main.prefixes, ex.hs:21:37-59

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 2

[ex.hs:21:37-59] *Main> :step
,
Stopped in Main.prefixes, ex.hs:21:49-53

_result :: [Integer] -> [Integer] = _
x :: Integer = 1

[ex.hs:21:49-53] *Main> :step
[1,2]
Stopped in Main.prefixes, ex.hs:21:44-58

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 2

[ex.hs:21:44-58] *Main> :step
Stopped in Main.prefixes, ex.hs:21:37-59

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 3

[ex.hs:21:37-59] *Main> :step
,
[1Stopped in Main.prefixes, ex.hs:21:49-53

_result :: [Integer] -> [Integer] = _
x :: Integer = 2

[ex.hs:21:49-53] *Main> :step
,2,3]
Stopped in Main.prefixes, ex.hs:21:44-58

_result :: [[Integer]] = _
acc :: [[Integer]] = _
x :: Integer = 3

[ex.hs:21:44-58] *Main> :step

]

Which I interpret as:

__lines___|__________________________________________________________________________
 21:37-59 | [1] : (map ((:) 1) acc)           ->  [[1]
          |        
          |              
          |
 21:44-58 |       (map ((:) 1) acc)           ->  does nothing, as acc = []
          |                                                 
          |              
          |
 21:37-59 | [2] : (map ((:) 2) acc)           ->  ,
          |         
          |              
          |
 21:49-53 |            ((:) 1)                ->  [1, 2]
          |            
          |                 
          |
 21:44-58 |       (map ((:) 2) acc)           ->  outputs nothing
          |         
          |              
          |
 21:37-59 | [3] : (map ((:) 3) acc)           ->  ,[1
          |         
          |              
          |
 21:49-53 |            ((:) 2)                ->  , 2, 3]
          |             
          |
 21:44-58 |       (map ((:) 3) acc)           ->  ]
          |

This prints [[1], [1, 2], [1, 2, 3]]. Could someone explain why, when columns 49-53 of line 21 (the span GHCi reports as 21:49-53) are evaluated, x is the x value from the previous foldr invocation?

I know that (map ((:) x) acc) can be expanded to (foldr ((:) . ((:) x)) [] acc), as map f = foldr ((:) . f) []. So I rewrote the function into the following

prefixesSolution :: Num a => [a] -> [[a]]
prefixesSolution = foldr (\x acc -> [x] : (foldr ((:) . ((:) x)) [] acc)) []

And this works as well. Now, the lambda passed to the second foldr ((:) . ((:) x)) I would imagine could be refactored as (\ element accumulator -> (element:accumulator) . ((element:accumulator) x)). But this does not work: Couldn't match expected type ‘a -> a0 -> b0’ with actual type ‘[[a]]’. All this I have done in order to pinpoint exactly what is happening.

I also do not understand the function passed to map ((:) x).

I apologize for how convoluted this post is. At this point I don't even know what I don't know. If someone could clearly walk me through this function I would be so so grateful.

CodePudding user response:

foldr accumulates from the end of the list.

Initially acc = [] (using the second argument of foldr).

Starting from the end, we apply the given function \x acc -> [x] : (map ((:) x) acc) with x = 3:

[3] : map (3 :) []
= [[3]]

With acc = [[3]], add the preceding element, x = 2:

[2] : map (2 :) [[3]]
= [[2], [2,3]]

With acc = [[2], [2,3]], add the preceding element, x = 1:

[1] : map (1 :) [[2], [2,3]]
= [[1], [1,2], [1,2,3]]

You can also still evaluate foldr "left to right", but in that case, remember that acc gets instantiated with "the next recursive call".

foldr f b (x : xs) = f x (foldr f b xs)

prefixes [1,2,3]
= [1] : map (1 :) (prefixes [2,3])    -- acc = prefixes [2,3], the next recursive call
= [1] : map (1 :) ([2] : map (2 :) (prefixes [3]))
...
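The right-to-left steps above can be checked by naming the lambda and applying it by hand. This is only a minimal sketch: `step` and `stages` are names introduced here for illustration, and the `Num a` constraint from the question is dropped because nothing in the function needs it.

```haskell
-- The lambda from the question, given a name (hypothetical: `step`).
step :: a -> [[a]] -> [[a]]
step x acc = [x] : map (x :) acc

prefixes :: [a] -> [[a]]
prefixes = foldr step []

-- Every intermediate accumulator, from the last element to the first.
stages :: [[[Int]]]
stages =
  [ step 3 []             -- [[3]]
  , step 2 [[3]]          -- [[2],[2,3]]
  , step 1 [[2], [2, 3]]  -- [[1],[1,2],[1,2,3]]
  ]
```

Loading this in GHCi and evaluating `scanr step [] [1,2,3]` should show the same accumulators in the opposite order, followed by the initial `[]`.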

CodePudding user response:

Starting with the question about the function passed to map:

In Haskell, all operators are also functions. By itself, : is the list construction ("cons") operator:

 1 : [2,3]  -- > [1,2,3]

If you put parentheses around it, it becomes a prefix function instead of an infix operator:

 (:) 1 [2,3]  -- > [1,2,3]

Because Haskell functions are curried, applying (:) to only its first argument gives back another function: (:) 1 is a function that prepends 1 to a list:

 f = (:) 1
 f [2,3]   -- > [1,2,3]

So the function passed to map is one that takes a list as its argument and prepends x (the current item from the foldr) to that list.

The surrounding function prepends [x] to the result of that map, growing the list.
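As a small illustration of the point about `(:) x`, here are three equivalent spellings of the same function, followed by the composed function `(:) . ((:) x)` from the question written out as an explicit lambda. The names `prepend1`, `consMapped` and their primed variants are hypothetical, chosen only for this sketch.

```haskell
-- Three spellings of "prepend 1 to a list".
prepend1, prepend1', prepend1'' :: [Int] -> [Int]
prepend1   = (:) 1
prepend1'  = (1 :)
prepend1'' = \xs -> 1 : xs

-- The function handed to the inner foldr in prefixesSolution,
-- first in point-free form and then as an explicit lambda.
consMapped, consMapped' :: a -> [a] -> [[a]] -> [[a]]
consMapped  x = (:) . ((:) x)
consMapped' x = \element accumulator -> (x : element) : accumulator
```

This also suggests why the lambda attempted in the question is rejected: `.` composes two functions, whereas `(element:accumulator)` is already a list, so there is nothing to compose.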

Next let's talk about foldr itself. It may help to think of the list [1,2,3] as the sequence of cons calls required to create it. In tree form that looks like this:

   (:)
1      (:)
    2      (:)
        3      []

And in Haskell you could write it like this:

(:) 1 ( (:) 2 ( (:) 3 [] ) )

Given the above, what the call foldr func init [1,2,3] does is replace the final [] with the init value and all the (:)s with the supplied func. So the final result is the same as the result of this expression, which you can think of as an expansion of the foldr version:

func 1 ( func 2 ( func 3 init ) )

That is, foldr first calls the func on 3 (which becomes x) and [] (which becomes acc). (Technically, it calls the function on 3, and the result of that call is another function that it then calls on [], but that's just how function application works in Haskell; the difference is not important to this particular discussion.) Then it calls the func on 2 and the result of the first call, and then it calls it on 1 and the result of the second call.

As we established above, the func first does a map ((:) 3) [] - returning [], since mapping anything across the empty list just returns the empty list - and prepends [3] to the result, giving [[3]].

Then it calls the func on 2 and [[3]]. The map returns [[2,3]], to which it prepends [2], yielding [[2],[2,3]].

Finally it calls the func on 1 and [[2],[2,3]]. The map returns [[1,2],[1,2,3]] and the func prepends [1] to it, yielding the final answer [[1],[1,2],[1,2,3]].
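The "replace every (:) with func and the final [] with init" picture can be made visible by folding with a function that builds a string instead of a list. This is a sketch only; `showFold` and `rebuild` are hypothetical helpers, not library functions.

```haskell
-- Render the shape of a right fold as text, so the nesting is visible.
showFold :: [Int] -> String
showFold = foldr (\x acc -> "(func " ++ show x ++ " " ++ acc ++ ")") "init"

-- Replacing every (:) with (:) and [] with [] rebuilds the same list.
rebuild :: [a] -> [a]
rebuild = foldr (:) []
```

Here `showFold [1,2,3]` evaluates to `"(func 1 (func 2 (func 3 init)))"`, which matches the expansion written out above.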

CodePudding user response:

When evaluating something like prefixes [1,2,3] by hand, you should try to be very careful in writing out each step of the evaluation.

I would look at it like this:

Before we start, I suggest a little preparation: it will help to write foldr's pattern matches as a case expression. I'll also give variables fresh names as we go, to hopefully make things clearer.

We can observe that foldr can be written as

foldr f z list =
  case list of
    [] -> z
    (y:ys) -> f y (foldr f z ys)
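Typed into a file unchanged, that definition would clash with the Prelude's own foldr unless the import is hidden, so the sketch below renames it. `foldrCase` and `prefixesCase` are hypothetical names used only to try the definition out.

```haskell
-- The case-expression form of foldr, under a name that avoids the Prelude clash.
foldrCase :: (a -> b -> b) -> b -> [a] -> b
foldrCase f z list =
  case list of
    []       -> z
    (y : ys) -> f y (foldrCase f z ys)

-- The function from the question, built on the hand-written fold.
prefixesCase :: [a] -> [[a]]
prefixesCase = foldrCase (\x acc -> [x] : map (x :) acc) []
```

In GHCi, `prefixesCase [1,2,3]` should give `[[1],[1,2],[1,2,3]]`, the same as the original.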

I'll skip over some of the details of specific map applications later on and focus more on the foldr steps. If this is unclear, I can expand on that more.

Now that we've got that taken care of, we can evaluate. I'm not going to focus so much on the evaluation order, since this will not affect the final result. This will let me simplify a couple of things. As a result, you shouldn't necessarily assume this is exactly what the computer is doing (even though the result is the same here, it could have differences in terms of memory efficiency, time efficiency and possibly strictness properties).

    prefixes [1,2,3]

==> {definition of prefixes}
    foldr (\x acc -> [x] : (map ((:) x) acc)) [] [1,2,3]

==> {definition of foldr}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    case [1,2,3] of
      [] -> []
      (y:ys) -> f y (foldr f [] ys)

==> {reduce case match on known value}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (foldr f [] [2,3])

==> {definition of foldr}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (case [2,3] of
           [] -> []
           (y:ys) -> f y (foldr f [] ys))

==> {reduce case match on known value}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 (foldr f [] [3]))

==> {definition of foldr}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 (case [3] of
               [] -> []
               (y:ys) -> f y (foldr f [] ys)))

==> {reduce case match on known value}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 (f 3 (foldr f [] [])))

==> {definition of foldr}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 (f 3 (case [] of
                    [] -> []
                    (y:ys) -> f y (foldr f [] ys))))

==> {reduce case match on known value}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 (f 3 []))

==> {apply f}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 ([3] : map ((:) 3) []))

==> {apply map}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 ([3] : []))

==> {list sugar}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 (f 2 [[3]])

==> {apply f}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 ([2] : map ((:) 2) [[3]])

==> {apply map}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 ([2] : [((:) 2) [3]])

==> {list sugar}
    let f = \x acc -> [x] : (map ((:) x) acc)
    in
    f 1 [[2], [2,3]]

==> {apply f}
    [1] : map ((:) 1) [[2], [2,3]]

==> {apply map}
    [1] : [((:) 1) [2], ((:) 1) [2,3]]

==> {list sugar}
    [1] : [[1,2], [1,2,3]]

==> {list sugar}
    [[1], [1,2], [1,2,3]]

This general process can be used to understand the result of evaluating an expression. Note that every step is a valid Haskell expression that behaves identically to the original expression. Essentially, I just expanded definitions, reduced case expressions when the case is matching on a known (:) application or a [], applied functions (using beta-reduction) and introduced some syntactic sugar for lists to make things a bit easier to read in parts. Those kinds of steps already cover a significant portion of the tools you need to reduce most Haskell expressions by hand.

A very similar process can also be used for equational reasoning, which can be used as a systematic technique to optimize Haskell programs. It works by replacing expressions with other expressions that always give the same result but could have different efficiency characteristics. Essentially anything written by Richard Bird, among other authors, will provide examples of equational reasoning.
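As one small instance of that idea, the well-known law `map f . map g = map (f . g)` lets two list traversals be rewritten as one. The sketch below applies it to the kind of prepending functions used in this thread. `twoPasses` and `onePass` are hypothetical names, and whether the rewrite actually helps in practice depends on what the compiler already does.

```haskell
-- Left-hand side of the law: two traversals of the outer list.
twoPasses :: [[Int]] -> [[Int]]
twoPasses = map (1 :) . map (2 :)

-- Right-hand side: a single traversal with the composed function.
onePass :: [[Int]] -> [[Int]]
onePass = map ((1 :) . (2 :))
```

Both functions send `[[3]]` to `[[1,2,3]]`. The law says they agree on every input, which is what justifies swapping one for the other.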
