Compare two lists of strings and return the elements that don't match in Haskell

Time:08-27

I'm trying to work out the logic of what I need to do. I have read in two files and saved them as lists of strings. One list is sample text and the other is a dictionary of words. I want to compare each element of the sample list with the dictionary and, if a word in the sample list is not in the dictionary, report that it's not there.

Can I use the filter function for this? E.g. take the first element of the sample list and check it against the dictionary; if it is found, move on to the next element of the sample list, and if it is not found, return that element.

The reason I'm thinking to use filter is it can do stuff like:

> filter (==3) [1,2,3,4]
[3]
-- but move on to the next element when true

I also found the question "Compare two lists and return the first element that is in both lists". It's not exactly what I want, but it might be a better approach than filter.

I don't know how to code this yet, as I'm trying to figure out the best way first. I just want to know whether I'm going down the correct path, and I'm open to suggestions for anything I would be better off looking into.

CodePudding user response:

Yes, filter is a good choice. Here's a quick sketch:

type Dictionary = {- up to you -}

isInDictionary :: Dictionary -> String -> Bool
isInDictionary = {- up to you -}

misspellings :: Dictionary -> [String] -> [String]
misspellings dict = filter (not . isInDictionary dict)

For example, using a simple [String] as the dictionary and elem (well, flip elem) as the isInDictionary, we could try this out in the interpreter:

> dict = ["brown", "blue", "red", "slow", "stupid", "quick", "smart", "lazy", "cat", "dog", "elephant", "fox", "ran", "walked", "jumped", "a", "the", "in", "over"]
> misspellings dict ["the", "quikc", "brown", "fox", "jumped", "over", "teh", "lazy", "dog"]
["quikc","teh"]
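To make the interpreter session above reproducible, here is a minimal sketch with the two placeholders filled in as just described: the dictionary is assumed to be a plain [String] and the lookup is flip elem. This is only one possible way to complete the sketch, chosen for simplicity rather than speed.

```haskell
-- Assumption: the dictionary is just a list of known words.
type Dictionary = [String]

-- True when the word occurs in the dictionary (a linear scan of the list).
isInDictionary :: Dictionary -> String -> Bool
isInDictionary = flip elem

-- Keep only the words that are NOT found in the dictionary.
misspellings :: Dictionary -> [String] -> [String]
misspellings dict = filter (not . isInDictionary dict)
```

Loading this file into GHCi and defining dict as above should make the misspellings call work as shown.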

Of course, in real development you'd want a more efficient type for your dictionary -- perhaps a trie (that's not a typo), a DAWG, or a Bloom filter.
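As a simpler stepping stone than the structures named above, the same sketch can be backed by Data.Set from the containers package (an assumption: it is not one of the structures mentioned in this answer, but it ships with GHC and gives logarithmic-time lookups instead of a linear scan). The helper name mkDictionary is hypothetical.

```haskell
import qualified Data.Set as Set

-- Assumption: the dictionary is a balanced-tree set of known words.
type Dictionary = Set.Set String

-- Hypothetical helper: build the dictionary from a list of words.
mkDictionary :: [String] -> Dictionary
mkDictionary = Set.fromList

-- Set.member takes the element first, so flip it to match the sketch's argument order.
isInDictionary :: Dictionary -> String -> Bool
isInDictionary = flip Set.member

-- Unchanged from the list version: only the dictionary type differs.
misspellings :: Dictionary -> [String] -> [String]
misspellings dict = filter (not . isInDictionary dict)
```

Because misspellings only talks to the dictionary through isInDictionary, swapping in a different structure later should not require touching it.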

CodePudding user response:

You could do this with a list comprehension.

list1 :: [String]
list1 = ["I'm", "trying", "to", "figure", "out", "the", 
         "logic", "of", "what", "I", "need", "to", "do.", 
         "I", "have", "read", "in", "two", "files", "and", 
         "save", "them", "as", "lists", "of", "strings."]

list2 :: [String]
list2 = ["the", "a", "not", "in", "there", "blue", "orange"]

checkElem :: (Foldable t, Eq a) => [a] -> t a -> [Bool]
checkElem xs ys = [x `elem` ys | x <- xs]
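checkElem returns one Bool per word. If you want the missing words themselves, as the question asks, a guard inside the comprehension can do the filtering. The following is a small sketch; the name missingWords is made up for illustration.

```haskell
-- Hypothetical helper: the elements of xs that do not occur in ys,
-- in their original order (duplicates in xs are kept).
missingWords :: Eq a => [a] -> [a] -> [a]
missingWords xs ys = [x | x <- xs, x `notElem` ys]
```

Calling missingWords list1 list2 would then list every word of list1 that is absent from list2.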

Note that this doesn't account for upper and lower case or for punctuation in the first list. You may want to preprocess both lists by converting them to uppercase and removing punctuation, if that's closer to what you need. For example:

import Data.Char (toUpper)
import Data.List.Split (splitOneOf)  -- from the split package
(map . map) toUpper $ splitOneOf ".,; " "Hello, my friend. There."
-- ["HELLO","","MY","FRIEND","","THERE",""]
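If the empty strings in that result are unwanted, one alternative is to normalise with functions from base only. This is a rough sketch under some assumptions: words are separated by whitespace, only letters and apostrophes are kept, and everything is lowercased rather than uppercased. The name normalizeWords is hypothetical.

```haskell
import Data.Char (isAlpha, toLower)

-- Hypothetical helper: split on whitespace, strip everything except
-- letters and apostrophes, lowercase, and drop words that end up empty.
normalizeWords :: String -> [String]
normalizeWords = filter (not . null) . map clean . words
  where
    clean = map toLower . filter keep
    keep c = isAlpha c || c == '\''
```

Applying the same function to the dictionary text keeps both sides comparable.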