I suppose this may be a controversial topic, because it touches language design deeply, and I know some people won't like that because they'll mistake it for denying the virtues of something they like.
Why does Haskell need IO/Actions even though it is lazily evaluated?
I understand the value of the IO/Actions mechanism for preserving the so-called "purity" of functional programming in an eagerly evaluated language such as C or JavaScript.
In fact, I emulated IO () in TypeScript, which evaluates eagerly, and then thought, "OK, cool, but why does Haskell need this?"
Haskell is lazy by default. So even if print were defined like console.log in JavaScript, laziness means print would never be executed unless it is connected to main :: IO ().
Any thoughts?
Edit:
Apparently, this question arose from a complete misconception of mine.
In Haskell, print (the counterpart of console.log) is defined as
print :: Show a => a -> IO () -- Defined in ‘System.IO’
I had simply misunderstood it as being defined as
print :: Show a => a -> _ -> IO ()
because it needs to be so in order to emulate it under eager evaluation.
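To make the corrected signature concrete: in Haskell, print 1 is already an inert value of type IO (), so no extra dummy argument is needed to delay it. A minimal sketch follows; the names actions and runSome are made up for illustration.

```haskell
-- Building this list prints nothing: each element is only a description
-- of an effect, a plain value of type IO ().
actions :: [IO ()]
actions = [print (1 :: Int), print (2 :: Int), print (3 :: Int)]

-- Only actions that are sequenced into the result get run, and they run in
-- the order written here, not the order in which they were defined.
-- `print 2` is never run.
runSome :: IO ()
runSome = (actions !! 2) >> (actions !! 0)
```

Running runSome (for example via main = runSome) prints 3 and then 1.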
CodePudding user response:
You've mixed things up! Haskell doesn't need IO in spite of laziness; it needs it because of laziness.
Let's imagine for a second that we don't have IO (or, equivalently, that everything that does IO is implicitly wrapped by unsafePerformIO). So, for example, I might write:
main = print (readLn + readLn)
This would get two lines of input from the user, parse them as numbers, add them up, and print the result. Nice! No problem so far. Now I decide I want to implement a little language. The thing I want to do is read a few -- say, 5 -- variable/value pairs from the user, stick them in a Map, and then read an expression from the user that might mention those variables. So an interaction with the user might look like
> 5
> 32
> 17
> -6
> 72
> (x1 + x4) * (x0 + x3)
< -104
where > marks lines I type in and < marks lines the program prints. The answer is -104 because x1=32, x4=72, x0=5, and x3=-6. The binding for x2=17 isn't used. Okay, let's write it.
import qualified Data.Map as M
interpret :: M.Map String Int -> String -> Int
interpret = {- not relevant, really... right? -}
main = print (interpret env expr) where
    env = M.fromList [("x0", readLn), ("x1", readLn), ("x2", readLn), ("x3", readLn), ("x4", readLn)]
    expr = getLine
Okay, now, pop quiz: what does this program do? Well, if we are taking laziness seriously, then all those readLn and getLine calls are deferred until somebody actually looks at them. And if anybody's looking, who is it? It's interpret! So, to know what this program does, we actually do have to know what interpret does. Okay, let's start filling it out:
interpret env s = case parseExpr s of
    Just expr -> evaluateArithmetic (replaceVariables env expr)
    Nothing -> 404 -- lol
...aaaand, now we're in trouble. For a bunch of reasons, actually. Because the first thing interpret does is evaluate s, which means the first line the user types actually plays the role of the expression, not the last line. So that's kind of unfortunate, but okay, maybe we just decide that's fine and reimagine our ideal interaction to conform to these implementation details:
> (x1 + x4) * (x0 + x3)
> 5
> 32
> 17
> -6
> 72
< -104
But even if we give up on the dream of putting the expression last, we're still in trouble. Because look what replaceVariables does:
data Expr = Lit Int | Var String | Add Expr Expr | Times Expr Expr
replaceVariables env (Lit n) = Lit n
replaceVariables env (Var v) = Lit (env M.! v)
replaceVariables env (Add x y) = Add (replaceVariables env x) (replaceVariables env y)
replaceVariables env (Times x y) = Times (replaceVariables env x) (replaceVariables env y)
Did you spot it? With the expression the user typed in, x1 is the first variable it tries to replace -- meaning that it is the first readLn that gets executed, and instead of being 32, the second number we entered, as we intended, it is 5, the first number we entered. Similarly, x4 becomes 32 instead of 72, etc. and we get a just plain wrong answer. (Also, the program replies after we enter the fourth number without waiting for the fifth. But maybe that's not such a big deal.)
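The mix-up can be worked through with a small pure simulation. This is only a sketch: it assumes the evaluator demands the left operand of Add and Times before the right one, it repeats the Expr type from above so that it stands alone, and demandOrder, bindByDemand, eval, and the other names are hypothetical helpers rather than part of the program being discussed.

```haskell
import Data.List (nub)
import qualified Data.Map as M

data Expr = Lit Int | Var String | Add Expr Expr | Times Expr Expr

-- Variables in the order a left-to-right evaluator would first demand them.
demandOrder :: Expr -> [String]
demandOrder (Lit _)     = []
demandOrder (Var v)     = [v]
demandOrder (Add x y)   = demandOrder x ++ demandOrder y
demandOrder (Times x y) = demandOrder x ++ demandOrder y

-- Without IO, the n-th typed number goes to the n-th variable demanded,
-- not to the n-th variable listed in the Map.
bindByDemand :: Expr -> [Int] -> M.Map String Int
bindByDemand e inputs = M.fromList (zip (nub (demandOrder e)) inputs)

eval :: M.Map String Int -> Expr -> Int
eval _   (Lit n)     = n
eval env (Var v)     = env M.! v
eval env (Add x y)   = eval env x + eval env y
eval env (Times x y) = eval env x * eval env y

-- (x1 + x4) * (x0 + x3)
example :: Expr
example = Times (Add (Var "x1") (Var "x4")) (Add (Var "x0") (Var "x3"))

-- The numbers in the order the user types them.
typed :: [Int]
typed = [5, 32, 17, -6, 72]

-- What we wanted: x0..x4 bound in typing order, giving -104.
intended :: Int
intended = eval (M.fromList (zip ["x0", "x1", "x2", "x3", "x4"] typed)) example

-- What demand-driven reading gives: x1=5, x4=32, x0=17, x3=-6, so 37 * 11 = 407.
observed :: Int
observed = eval (bindByDemand example typed) example
```

Under these assumptions the intended result is -104, while the demand-driven binding produces 407, and the fifth number (72) is never consumed.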
So this is the crux of the problem: without IO, the programmer has much less control over the order in which interactions with the user happen. There's a follow-on problem that we didn't explore here: not only is there little control, but refactoring can change the interface. If we made replaceVariables swap the arguments to Add for some reason, a change that really seems like it shouldn't affect anything, the order in which lines get read from the user would become even more different and confusing!
This is the core problem that IO solves. The implementation of (>>=) adds a data dependency that prevents later computations from executing until earlier ones finish. This means that when we write
main = readLn >>= \x -> {- rest of the program -}
we can be sure that x contains the contents of the first line the user types in, not some other line determined by the structure of the entire rest of the program.
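One way to see where that data dependency comes from is a toy model of IO. The sketch below is not how the real IO type is defined: Input, nextLine, nextInt, and readSession are hypothetical names, and the "outside world" is reduced to a list of lines still waiting to be read. The point is only that (>>=) hands the leftover input of the first step to the second step, so the second step cannot read ahead of the first.

```haskell
-- A toy stand-in for IO: a computation that consumes lines of input and
-- returns a result together with the lines it did not consume.
newtype Input a = Input { runInput :: [String] -> (a, [String]) }

instance Functor Input where
  fmap f (Input g) = Input $ \s -> let (a, rest) = g s in (f a, rest)

instance Applicative Input where
  pure a = Input $ \s -> (a, s)
  Input f <*> Input g = Input $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Input where
  -- The second step runs on `rest`, the input left over by the first step.
  -- That is the data dependency: it fixes the order of the reads.
  Input g >>= k = Input $ \s ->
    let (a, rest) = g s
    in runInput (k a) rest

-- Consume one line of input.
nextLine :: Input String
nextLine = Input $ \s -> case s of
  (l : ls) -> (l, ls)
  []       -> error "no more input"

-- Consume one line and parse it as a number.
nextInt :: Input Int
nextInt = fmap read nextLine

-- Read five numbers, then the expression, in exactly that order, regardless
-- of which of the results is inspected first later on.
readSession :: Input ([(String, Int)], String)
readSession = do
  values <- mapM (const nextInt) names
  expr   <- nextLine
  pure (zip names values, expr)
  where
    names = ["x0", "x1", "x2", "x3", "x4"]
```

With runInput readSession applied to the six lines of the ideal interaction, x1 is bound to 32 and the expression is the last line, no matter which of the two results the rest of the program looks at first.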
Having to understand entire programs at once to know what small chunks of it do just doesn't work at scale!