When to use tabs and when to use spaces in Haskell?

Time:06-27

I am wondering when I should use tabs and when I should use spaces, especially in guards. I'm working through the Learn You a Haskell book and it said I should always use spaces. The book itself seems to use 4 spaces in definitions with guards though. For example this function:

replicate' :: (Num i, Ord i) => i -> a -> [a]  
replicate' n x  
    | n <= 0    = []  
    | otherwise = x:replicate' (n-1) x

When I replace the 4 spaces/tabs with a single space I get an indentation / missed brackets error in the otherwise case. However, if I use a tab or 4 spaces this works. Did I misunderstand something about using spaces over tabs? Should it be 4 spaces each time? Because often 1 space does work, just with guards ghci is almost always (infuriatingly not always) complaining here.

I'm using sublime btw, in case there is an issue there.

Thanks a lot in advance.

For example:

maximum' [] = error "maximum of empty list"  
maximum' [x] = x  
maximum' (x:xs)   
 | x > maxTail = x  
 | otherwise = maxTail  
 where maxTail = maximum' xs 

throws an indentation error

CodePudding user response:

It sounds like the OP's actual problem was a weird state in a particular file, but I thought I'd provide an answer to the general question here.

Most parts of Haskell syntax are completely insensitive to indentation (most of the common practices about laying out Haskell code are stylistic, rather than necessary). For example, all of these ways of writing the last equation in the OP's example work just fine:

-- guards indented more than where
maximum' (x:xs)   
         | x > maxTail = x  
         | otherwise = maxTail  
 where maxTail = maximum' xs

-- where indented more than guards
maximum' (x:xs)   
 | x > maxTail = x  
 | otherwise = maxTail  
              where maxTail = maximum' xs

-- all on one line, no indentation at all!
maximum' (x:xs) | x > maxTail = x | otherwise = maxTail where maxTail = maximum' xs

Even something horrifying like this works:

-- please, no
maximum' (x:xs) |
              x
  > maxTail = x
                  | otherwise
 = maxTail where
                      maxTail
                        = maximum' xs

There are exactly 2 things you can do to mess up indentation in the code shown:

  1. Use more than one line to define the equation, with any of the continuation lines not starting with at least one whitespace character
  2. Use more than one line to define the where clause, with any of the continuation lines starting at a character position less than that of the first character after the where keyword (i.e. the m in maxTail)

Otherwise, the whitespace in this example does not matter at all (apart from separating identifiers and keywords).

There is basically only one general way in which indentation matters in Haskell. And it's actually not indentation as such, but alignment that matters. That happens in the context of "blocks" containing a variable number of entries:

  1. a let <decls> in <expr> expression contains 1 or more declarations in the <decls> part
  2. a where clause introduces 1 or more declarations
  3. an instance definition's where part has zero or more method definitions
  4. a do block has 1 or more statements
  5. a case expression has 1 or more cases (zero or more with the EmptyCase extension)
  6. etc, etc

The variable number of entries in these blocks are the only places where alignment matters. There is always a keyword introducing the block, and the character position of the first entry in the block sets the alignment; after that every line that starts exactly at this character position is taken as the beginning of the next entry in the block, every line that starts past this position is taken as a continuation line of the previous entry, and the first line that starts before alignment position is taken as ending the block (and the contents of this line are not part of the block).
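
A minimal sketch of those three cases, using a let block. The function area and its local names are made up for illustration; they are not from the question:

```haskell
-- Hypothetical example: `area` and its local names are for illustration only.
area :: Double -> Double -> Double
area w h =
  let halfW = w / 2      -- first entry after `let`: sets the alignment column
      halfH = h
                / 2      -- starts past that column: continues the halfH entry
      total = 4 * halfW * halfH  -- starts exactly at the column: a new entry
  in total               -- starts before the column: ends the block

-- The same thing with explicit braces and semicolons, where alignment
-- no longer matters at all.
area' :: Double -> Double -> Double
area' w h = let { halfW = w / 2; halfH = h / 2; total = 4 * halfW * halfH } in total
```

Both definitions should behave identically; the braces version just spells out what the layout rule infers from the columns.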

As an aside, you may at this point be wondering why it's possible to get an error with code like this:

bar x y
= x + y

I haven't used any where, let, etc blocks above, but it's possible to get an alignment error here by continuing the bar definition onto a new line without indentation? Didn't I promise indentation only matters in blocks? Well, actually the entire global scope of a module is an aligned block! We just usually don't notice it because it's conventional to use alignment position 0 for this block. But technically, that's what's going on (thus you can't have a continuation line for one of the declarations in the global block that starts at alignment 0).
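
For comparison, here is a sketch of the version that should be accepted, assuming the body of bar is meant to be an addition. Any amount of leading whitespace, even a single space, puts the second line past the top-level alignment position and so makes it a continuation line:

```haskell
-- Assumption: the body of `bar` is an addition; the type is chosen for illustration.
bar :: Int -> Int -> Int
bar x y
 = x + y   -- one leading space: past column 0, so this continues `bar x y`
```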

This layout based on alignment rather than indentation is why tabs are often considered difficult to use to lay out Haskell code. As an example, consider this:

foo x y z = xy + yz
    where xy = x * y
          yz = y * z

Here I have used 4 spaces to indent the where part, and this is one of those places where the whitespace is completely irrelevant, so I could have used anything I like. Therefore, if I'm accustomed to using tabs as indentation in other programming languages, I might have been tempted to use a tab rather than 4 spaces.

Where things get nasty is that the correct indentation of the yz = y * z line is not "2 indent levels in", but rather "lining up exactly with the xy = x * y definition". So if I had used a tab to indent the where, the only correct way to indent the following line is to use a tab followed by 6 spaces. In my experience this is something that even smart formatting code editors never get right (let alone humans doing it manually); it is far more likely that if my view settings have a tab take up less space than the where keyword (such as the common 4 spaces) that I will get at least 2 leading tabs, followed by enough spaces to make the yz = y * z line appear to line up with the definition above.

Haskell compilers, by the spec, treat tab stops as eight spaces apart. So the situation I described above (where the first definition in the where is at 1 tab plus 6 normal characters and the second is at 2 tabs plus 2 normal characters) results in an invisible error. The compiler thinks these definitions are at positions 14 and 18, but to me they look the same. This sort of problem is not fun. Hence the upvoted comment "When to use tabs? Never! That was an easy one."
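
To make the arithmetic concrete, here is a small sketch that computes the zero-based position at which a line's first non-whitespace character lands for a given tab width. The name indentColumn is a hypothetical helper written for this answer, not a library function:

```haskell
-- | Zero-based column of the first non-blank character of a line, assuming
-- tab stops every `tabWidth` columns. `indentColumn` is a hypothetical helper.
indentColumn :: Int -> String -> Int
indentColumn tabWidth = go 0
  where
    -- A tab jumps to the next multiple of tabWidth.
    go col ('\t' : rest) = go (col + tabWidth - col `mod` tabWidth) rest
    -- A space advances by exactly one column.
    go col (' ' : rest)  = go (col + 1) rest
    -- Anything else (or end of line) is where the content starts.
    go col _             = col
```

With a tab width of 8 (what the compiler uses), "\t      xy" gives 14 and "\t\t  yz" gives 18. With a tab width of 4 (what the editor might show), both give 10, which is exactly why the two lines look aligned on screen.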

Technically you can set your editor to show tab stops at 8 spaces, and then it doesn't matter whether a given amount of indentation is all spaces or any mix of tabs and spaces that looks the same. However, most people don't like to have their editor set to show tabs as 8 spaces, and fixing any particular number defeats the entire point of indenting using tabs (having the visual appearance of "indent levels" be something that each user can configure independently in their editor).

It is also possible to adopt a code style that avoids the problem. Basically: always end the line immediately after a keyword introducing a block, so that the block starts on a new line (which you bump up to the next indent level). You would then write (for the OP's example):

maximum' (x:xs)   
    | x > maxTail = x  
    | otherwise = maxTail  
    where
        maxTail = maximum' xs 

If you do that then your alignment positions will always be an exact number of tabs and zero normal characters, so you will not end up forced to use leading space that is a mix of tabs and spaces. In fact Haskell's alignment rules become extremely similar to Python's indentation rules if you code like this (the major reason they are different is that Haskell allows you to start an aligned block on the same line as preceding code, whereas Python's blocks are always preceded by a line ending in a colon).

But by far the most common approach to using tabs in Haskell is: simply don't do it. Configure your editor to insert spaces up to the next "tab-stop" when you press the tab key, if you like. But make sure the physical source code file is saved with spaces.
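
If a file already contains tabs, converting it is mechanical. Here is a rough sketch, assuming tab stops every tabWidth columns; expandTabs is a hypothetical name for illustration, and in practice your editor's "convert tabs to spaces" command does the same job:

```haskell
-- | Replace every tab with the number of spaces needed to reach the next
-- tab stop. `expandTabs` is a hypothetical helper, not a library function.
expandTabs :: Int -> String -> String
expandTabs tabWidth = go 0
  where
    go _ [] = []
    go col (c : rest)
      | c == '\t' =
          -- Distance from the current column to the next tab stop.
          let n = tabWidth - col `mod` tabWidth
          in  replicate n ' ' ++ go (col + n) rest
      | c == '\n' = c : go 0 rest          -- a new line resets the column
      | otherwise = c : go (col + 1) rest  -- ordinary character: one column
```

Running it with a tab width of 8 reproduces what the compiler sees, so code that compiled before the conversion should keep the same layout afterwards.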


Gratuitous soapbox time!

For me personally, the reasoning above is why I don't like to use tabs in any language. Because sooner or later someone always ends up wanting to make something on one line visually align with something on another line, and this often needs indentation that is a mix of tabs and spaces. The tab and space mix is almost never correctly handled by the editor (to do so in general requires the editor to be able to tell when a line is a continuation line or the start of a new syntactic construct, before the coder has finished typing it, which is at best language-dependent and at worst just impossible). So they write code that is simply incorrectly formatted as soon as someone uses a different tab-width preference than they used.

An example would be this fairly common layout (in no particular language):

class Foo {
    public int foo(int x, char y, long listOfParameters,
                   bool z, double ooopsRanOutOfLetters) {
        codeStartsHere();

If indent levels are tabs, then the correct indentation for the continuation of the parameter list is 1 tab and 15 spaces, but someone is just as likely to get 4 tabs and 3 spaces, which throws off the alignment completely at any other tab-width setting. Basically, if indent levels are to be configured for each coder's preference (by setting the tab-width), then there is a fundamental difference between inserting an indent level and inserting a visually-equivalent number of spaces, requiring you to think about which you intend every time you hit the tab key. Even if the formatting is purely a visual aid to human readers and causes no change in how the compiler/interpreter will read the code, aiding human readers is arguably more important than merely writing something that the machine will accept.

And again, this problem can be addressed by rigidly adhering to a style guide that is carefully constructed to avoid layouts like the above ever happening. But I just don't want to have to think about that when I'm designing or evaluating a style guide, nor when I'm writing code. "Always indent with spaces" is an incredibly simple rule to put in a style guide, and then it's never an issue regardless of the other rules you adopt (and regardless of whether those other rules are strictly followed or there are exceptions).

CodePudding user response:

Only use spaces, because Haskell is indentation-sensitive and an implicit layout block must start on a column greater than that of the enclosing block, so it's important to keep track of columns exactly.

Furthermore, tab stops are 8 columns apart according to Haskell2010, which is huge by today's indentation standards which are usually at most 4 spaces.
