Home > Net >  Check if two lists of lists are the same size, including all elements of the inner lists
Check if two lists of lists are the same size, including all elements of the inner lists

Time:11-30

Let's say, I have two lists of lists like this:

["hello", "hi", "hey"] and ["apple", "an", "hey"]

And I want to check if the two lists contain the same amount of lists and the corresponding inner lists contain the same amount of elements. In this case, this function should return True. It should also work for endless lists (so in a case like [[1..], [1..]] and [[1..], [3,4,5]] it should return False, since only the first list's second element is endless, the second list's second elemnt isn't).

So far, I've come up with this:

listsSize (x:xs) (y:ys)
  | length x == length y && length xs == length ys = True
  | otherwise = False

But it only checks if the lists contain the same number of lists, it doesn't evaluate the inner lists.

CodePudding user response:

You can't* check whether a list is endless without restricting how you can make endless lists. Assuming this, it is also impossible to check if two lists are both endless - let's say we had such a function f :: [a] -> [a] -> Bool. Then I could write isEndless xs = f xs xs to implement the "is this list endless" check.

If we restrict the question to be "at most one of the lists can be endless", then it becomes possible, via the same method that you would use to check whether they have the same length, which is to walk both of them simultaneously until one of them runs out. This will, as expected, loop if you give it two infinite lists.

P.S.:

if b then True else False

is the same as just b. Similarly, your guard can be converted to just length x == length y && length xs == length ys

* Assuming I had an isEndless :: [a] -> Bool which only takes finite time, I could solve the halting problem. Sketch: assume I have some types Program and Step, respectively representing a program and a single step of its execution (e.g. in a turing machine). Then I could write steps :: Program -> [Step] giving me a list of the "steps" of execution a program will make, because I can simulate the program running. Note how the result of steps is potentially infinite. Then,

halts :: Program -> Bool
halts p = not (isEndless (steps p))

telling me for an arbitrary program whether it halts.

CodePudding user response:

A simple version that's not quite fully featured is to start by implementing a simpler function that just checks whether two lists have the same shape:

sameShape :: [a] -> [b] -> Bool

Since length never finishes for infinite lists, you'll need to use pattern matching directly:

sameShape []     []     = ...
sameShape []     (y:ys) = ...
sameShape (x:xs) []     = ...
sameShape (x:xs) (y:ys) = ...

The caveat is that sameShape can't really ever return True if both lists are infinite; it should be able to get the right answer in a finite amount of time for all other kinds of inputs, though.

To check that a nested list has the same shape, you can check that the outer list has the same shape and that each pairing of inner lists has the same shape:

sameShapes :: [[a]] -> [[b]] -> Bool
sameShapes xss yss = and (sameShape xss yss : zipWith sameShape xss yss)

This is a nice compositional solution, but it suffers from a problem, which is that it's pretty hard to know which order to make the calls to sameShape in when some lists are infinite and others not. Here's the possibilities, and an example of an input where they could in principle compute the answer but spin forever:

  • Check the outer lists' shapes first. (This is what sameShapes above does.) Then checking the lists repeat [] and "abc" : repeat [] will never finish.
  • Check the first elements' shapes first. Checking the lists [repeat 'a', ""] and [repeat 'a', "abc"] will never finish.
  • In fact, the previous bullet generalizes: if you pick any particular element position's shapes first, there's a counterexample. If you look at index n first, then checking the lists formed by putting repeat 'a' at index n in both inputs and ""/"abc" at index n 1 will never finish.

So, unfortunately, these checks need to be interleaved with each other. There's a fairly standard trick called "diagonalization" that makes this kind of thing possible. Think of the outer list being the collection of rows and each inner list being a collection of column values for that row:

   1  2  3  4 ...
a a1 a2 a3 a4 ...
b b1 b2 b3 b4 ...
c c1 c2 c3 c4 ...
d d1 d2 d3 d4 ...

.  .  .  .  . .
.  .  .  .  .  .
.  .  .  .  .   .

You can make sure you visit every position by ordering them according to how far away they are from the top left, so in this order:

  _/    _/    _/
|/_   _/    _/
    _/    _/
  _/    _/
|/_   _/
    _/
  _/
|/_

(etc.) These are the diagonals that justify the name "diagonalization". If our two nested lists have different shapes, this guarantees that eventually we will try to look at an index that is out of bounds for one but not the other.

A very efficient implementation is kind of complicated. Fortunately, an inefficient version isn't too hard. We'll slowly look at more and more rows, tracking the visible ones in xss and yss and the ones we haven't started looking at in xss' and yss'. Each time we take a look, we'll look at one column in each row. You can see some more description of a similar solution to another problem in another answer of mine. Here's how it looks for this problem*:

sameShapesDiagonal :: [[a]] -> [[b]] -> Bool
sameShapesDiagonal = go [] [] where
    go xss yss xss' yss' = done || noMismatchYet where
        done = and ([null xss', null yss']    map null xss    map null yss)
        noMismatchYet = True
            && null xss' == null yss'
            && and (zipWith (\xs ys -> null xs == null ys) xss yss)
            && go (xssh    map (drop 1) xss) (yssh    map (drop 1) yss) xsst ysst
        (xssh, xsst) = splitAt 1 xss'
        (yssh, ysst) = splitAt 1 yss'

If given any mismatching shapes, this function will return False in finite time. If given matching finite shapes, this function will return True in finite time. If given matching shapes with some infinite list, it will never return; but it is not possible to both return and satisfy the previous two correctness properties in such a case.

Here's some examples of using it:

> sameShapesDiagonal ["hello", "hi", "hey"] ["apple", "an", "hey"]
True
> sameShapesDiagonal [[1..], [1..]] [[1..], [3,4,5]]
False
> sameShapesDiagonal [[1..], [1..]] [[3,4,5], [1..]]
False
> sameShapesDiagonal (repeat []) ("abc" : repeat [])
False
> sameShapesDiagonal [repeat 'a', ""] [repeat 'a', "abc"]
False

* Lest you copy and paste this as your own solution to a school exercise, let me assure you that there are multiple things about my implementation that would raise the eyebrows of somebody expecting to grade beginners' code. To avoid suspicion, you will need to understand this answer, then write your own implementation (either of the simple first algorithm or the more complicated second algorithm) using only techniques that are comfortable to you and your peers.

  • Related