Home > Net >  Need assistance on function to determine largest list Haskell
Need assistance on function to determine largest list Haskell

Time:02-11

I need to create a function that reads from a list of courses (such as the one shown below), and return which courses has the largest number of programming languages.

Here is the list

progLanguages = 
     [ ("CptS121" , ["C"]), 
     ("CptS122" , ["C  "]), 
     ("CptS223" , ["C  "]), 
     ("CptS233" , ["Java"]), 
     ("CptS321" , ["C#"]), 
     ("CptS322" , ["Python", "JavaScript"]), 
     ("CptS355" , ["Haskell", "Python", "PostScript", "Java"]), 
     ("CptS360" , ["C"]), 
     ("CptS370" , ["Java"]), 
     ("CptS315" , ["Python"]), 
     ("CptS411" , ["C", "C  "]), 
     ("CptS451" , ["Python", "C#", "SQL"]), 
     ("CptS475" , ["Python", "R"]) 
     ] 

It needs to be compatible with the following

The type of the max_count function should be compatible with one of the following:   
max_count :: [(a1, [a2])] -> (a1, Int) 
max_count :: Foldable t => [(a1, t a2)] -> (a1, Int) 

So far, I have attempted the following code

max_count [] = error "bad"
max_count [x] = x
max_count (x:xs) = x max_helper (max_count xs)
     where
     max_helper (a,b) (a',b')
          | length b > length b' = (a, length b)
          | otherwise = (a', length b')

This has not worked in the slightest, and I am at a blank for what to do. Any help is appreciated.

CodePudding user response:

Your main problem is that you are assuming that max_helper takes a class/language-list pair as its second argument, when it takes a class/language-count as returned by max_count. (You are also incorrectly applying x to max_helper, rather than the other way around.

max_count :: [(a1, [a2])] -> (a1, Int)
max_count [] = error "bad"
max_count [(a, b)] = (a, length b)
max_count (x:xs) = max_helper x (max_count xs)
        where
        max_helper (a,bs) (a', b')
            | length bs > b' = (a, length bs)
            | otherwise = (a', b')

However, you can define this more declaratively. First, you don't care about the actual languages, only their count. Replace the lists with their lengths by taking advantage of the fact that fmap f (a, b) == (a, f b):

> map (fmap length) progLanguages
[("CptS121",1),("CptS122",1),("CptS223",1),("CptS233",1),("CptS321",1),("CptS322",2),("CptS355",4),("CptS360",1),("CptS370",1),("CptS315",1),("CptS411",2),("CptS451",3),("CptS475",2)]

Now what you want is the maximum of this list, but using a comparison function that ignores the class name and only looks at the lengths. You can construct such a function using Data.Ord.comparing:

-- "b" > "a", but 1 < 2
> comparing snd ("b", 1), ("a", 2)
LT

You can use this comparison function with Data.List.maximumBy:

> import Data.List
> import Data.Ord
> max_count = maximumBy (comparing snd) . map (fmap length)
> max_count progLanguages
("CptS355",4)
> :t max_count
max_count :: Foldable t => [(a1, t a2)] -> (a1, Int)

CodePudding user response:

You can use foldr and hold the information of max entry in accumulator variable:

maxCount :: Foldable t => [(a1, t a2)] -> (a1, Int)
maxCount [] = error "bad"
maxCount ((c,cs):l) = foldr opr (c, length cs) ((c,cs):l)
  where
    opr (c', xs) (c'', n)
      | length xs > n = (c', length xs)
      | otherwise = (c'', n)

I think error "bad" is bad practice here, we can use Maybe:


maxCount' :: Foldable t => [(a1, t a2)] -> Maybe (a1, Int)
maxCount' [] = Nothing
maxCount' ((c,cs):l) = Just $ foldr opr (c, length cs) ((c,cs):l)
  where
    opr (c', xs) (c'', n)
      | length xs > n = (c', length xs)
      | otherwise = (c'', n)

CodePudding user response:

In this case, function foldr is best. Your approach is also feasible. It is only necessary to thoroughly understand the error messages of the compiler. In this case, I would advise you not to use the where clause and implement the function max_helper globally. I don't know if you intended

max_helper1 :: Foldable t => (a1, t a2) -> (a1, Int) -> (a1, Int)

or

max_helper2 :: (a1, Int) -> (a1, Int) -> (a1, Int)

There are many possibilities. The second parameter is good to have the same as the output value of the max_count, because of recursion.

The left side of the max_count can then look like this:

max_count (x:xs)

or

max_count ((x,y):xs)

.

The right side could probably contain:

max_helper12 ... (max_count xs)

The periods must be replaced with x or (x,length y) according to the variant.

I also noticed that for max_helper1, there could be a double calculation of the length, and it isn't efficient in Haskell. Therefore, I would recommend counting only once in where:

max_helper1 (a,b) (a',b')
          | b'' > b' = (a, b'')
          | otherwise = (a', b')
  where
    b'' = length b

Also, try to correct the termination rule:

max_count [(x,y)] = (x,length y)

  • Related