I created an instance of Functor
for the following definition of a DoubeList
(doubly linked list) data type recently. (Motivated by this post) My goal was to define an instance of Functor so that the function f
in fmap f db
is applied to the first field in each node in the doubly linked list - regardless of which value constructor is passed to fmap
. Afterward, I was having trouble explaining to myself how my fmap
definition would actually evaluate.
For example, passing in some value constructed with LeftEnd
will result in fmap
on a Middle
or RightEnd
. Likewise, a fmap
on some value with a Middle
or RightEnd
constructor will result in an fmap
on a previous or next node - and so on and so forth.
However, after exploring the result of fmapping on LeftEnd
, MiddleEnd
or RightEnd
using returnFirst
and returnNext
, it seems that for some function f
in fmap f db
, f
is applied to the first field in each node only once - like I set out to do. But I can't seem to grasp how this is possible.
My intuition is telling me that the lazy nature of Haskell is coming into play here. Any insight or explanation would be greatly appreciated!
module DoubleList where
data DoubleList a
= LeftEnd a (DoubleList a)
| Middle a (DoubleList a) (DoubleList a)
| RightEnd a (DoubleList a)
-- perserve identity
-- fmap id create = id create
-- perserve function composition
-- fmap (f . g) create = fmap f . fmap g $ create
instance Functor DoubleList where
fmap f (LeftEnd a nxt) = LeftEnd (f a) (fmap f nxt)
fmap f (Middle a prev nxt) = Middle (f a) (fmap f prev) (fmap f nxt)
fmap f (RightEnd a prev) = RightEnd (f a) (fmap f prev)
instance Show a => Show (DoubleList a) where
show (LeftEnd x next) = "Left End " show x "<->" show next
show (Middle x prev next) = show x "<->" show next
show (RightEnd x next) = show x " Right End"
create :: DoubleList Integer
create = let n1 = LeftEnd 1 n2
n2 = Middle 2 n1 n3
n3 = Middle 3 n2 n4
n4 = Middle 4 n3 n5
n5 = RightEnd 5 n4
in n1
returnFirst :: DoubleList a -> DoubleList a
returnFirst (Middle _ prev _) = returnFirst prev
returnFirst (RightEnd _ prev) = returnFirst prev
returnFirst firstNode = firstNode
returnPrev :: DoubleList a -> DoubleList a
returnPrev (Middle x fst@(LeftEnd _ _) _ ) = fst
returnPrev (Middle x mid@(Middle _ _ _) _ ) = mid
returnPrev (RightEnd x prev) = prev
returnPrev leftEnd = leftEnd
returnNext :: DoubleList a -> DoubleList a
returnNext (LeftEnd x nxt) = nxt
returnNext (Middle x prev nxt) = nxt
returnNext (RightEnd _ prev) = prev
CodePudding user response:
instance Functor DoubleList where
fmap f (LeftEnd a nxt) = LeftEnd (f a) (fmap f nxt)
fmap f (Middle a prev nxt) = Middle (f a) (fmap f prev) (fmap f nxt)
fmap f (RightEnd a prev) = RightEnd (f a) (fmap f prev)
Each call to a constructor represents a new node, and each call to fmap
represents a separate traversal over the data structure. In your definition of create
, you use let
bindings to ensure that the nodes are shared, but your definition of fmap
doesn’t preserve that sharing. However, you can’t actually observe this from normal safe code.
The reason it works is that the fields of your data structure are lazy, so your fmap
can produce a result incrementally: pattern-matching on the result of fmap f create
just evaluates up to the constructor (LeftEnd
/ Middle
/ RightEnd
) and doesn’t evaluate the fields. This means that even though create
contains reference cycles, you won’t run into an infinite loop when mapping a function over it and examining the results.
The reason it doesn’t quite work is that it duplicates work. If it could preserve the sharing in the input, the result would be structured like this:
fmap f create
===
let n1' = LeftEnd (f 1) n2'
n2' = Middle (f 2) n1' n3'
n3' = Middle (f 3) n2' n4'
n4' = Middle (f 4) n3' n5'
n5' = RightEnd (f 5) n4'
in n1'
But as it is, it’s structured like this:
let n1' = LeftEnd (f 1) n2'
n2' = Middle (f 2) (fmap f n1) (fmap f n3)
in n1'
Notice that n2'
doesn’t contain any reference to the new n1'
, it contains a thunk that will construct a value that happens to be equivalent to n1'
. So what you’ve written is a valid Functor
instance, but it will perhaps use more memory than you expected, depending on how you traverse the results. This data structure isn’t really a doubly-linked list, it’s a (possibly infinite) tree of computations, where each node may have 1 or 2 children.
By factoring, it’s equivalent to this, which you can think of as a type of infinite streams of values that may split into two at each point:
-- An infinite <3 -ary tree.
data LoveTree a
= LoveTree a (These (LoveTree a) (LoveTree a))
-- Using ‘Data.These.These’:
data These a b = This a | That b | These a b