Coming from a Python and Javascript background, I really can't understand why updating the element of a list in Haskell isn't straightforward. I mean, why can't I write something as simple as myList[0] = "newValue"
?
So, I have two questions:
- Why isn't it straightforward to update an element of a list in Haskell? I would appreciate some theoretical response here.
- How can I go about updating an element of a list? So far, my approach has been to re-create the entire list while updating the elements I am interested on. But this approach doesn't seem performant.
CodePudding user response:
This doesn't have anything to do with lists. It is, by intention, not possible to update any values whatsoever at all in Haskell.
Why? Well, it's just the way the language is designed. You might as well ask why imperative languages permit updating values. I mean, you write
x = 3
and then you write
x = 4
wut? was the first one a lie then or what?? Surely, we should have been explicit about that we're referring to two different variables in time. This is just asking for bugs! Certainly it can't be worth it just to save some characters and enable some low-level optimisation that could also be achieved in other, safer ways?
...right, no?
Even in an imperative language, it actually doesn't make much sense to update values in (linked) lists – you need to traverse O (n) arguments anyway to even get to the one that should be changed. Creating a whole new list takes on average only twice as long as changing the old one, but unlike with an imperative update you never need to worry about whether something else still needed the old version because you never interfer with it anyway.†
And linked lists are typically pretty slow anyway, so then worrying about the 2× factor doesn't make much sense either. However, GHC often optimises lists away completely so they're never actually built up in memory at all, and that's among the things that would be much harder for the compiler to guarantee if it had to worry about someone mutating the list somewhere else.
Updating an element in an array, that's a different story. And indeed updating values in arrays is quite common in Haskell too, for the applications where this brings an important performance gain. It is still not possible to update a value of array type, but it is possible to update a value through a reference to a monadically mutable array. This ends up looking quite similar to how you update arrays in an imperative language, though it is normally a bit verbose because all the array tooling needs to be imported from a library, rather than using built-in syntax.
import qualified Data.Vector.Mutable as VM
main :: IO ()
main = do
myArray <- VM.generate 37 (const "this element isn't set yet")
print =<< VM.read myArray 0
VM.write myArray 0 "newValue"
print =<< VM.read myArray 0
Equivalent Python:
def main():
myArray = ["this element isn't set yet"] * 37
print(myArray[0])
myArray[0] = "newValue"
print(myArray[0])
However, very often you don't really need to update elements. In fact, we immediately see a problem here: you're indexing into an array. That means you need to ensure the index actually exists. In imperative language, this is so common that you hardly think about it, but in Haskell we really prefer total code, i.e. even if we accidentally swap two variables (like two different loop-indices) it should not give a runtime error, but preferrably a compiler one.
More often than not, if you're updating a single element, you'll also be updating other elements. In fact, very often you'll be sequentially updating all of them, and then there isn't much advantage anymore over simply building a new list from scratch that contains the updated values right away. And with that approach, there's very little that can go wrong, at least not at runtime.
†There's a big caveat here: if somebody else still uses the old list, it means the garbage collector can't reclaim the memory. This is the reason why it's quite easy to get memory leaks in Haskell – IMO the single biggest problem of the language.
CodePudding user response:
It should feel familiar: Python strings behave the same way. You can't do myString[0] = 'H'
, even though that's fine in other languages like C.
This makes writing code easier and helps to prevent bugs because you don't have to be constantly mindful of who modifies which string.
If you do want to "modify" a Python string? You wouldn't think twice about doing:
myString = 'H' myString[1:]
Sure, it requires that you restructure the program a bit if you need to propagate such a change, but overall it's helpful and effective. Python does the same with numbers and tuples for the same reason.
Haskell aims to be even safer, and therefore does the same with lists. (And maps. And sets. And everything else.)