Find a desired index in a binary matrix in 3n − ⌊lg n⌋ − 3-CodePudding

Given an n×n matrix M in which every entry is either a 0 or 1. Present an algorithm that determines if ∃i, 1 ≤ i ≤ n, such that M[i,j] = 0 and M[j,i] = 1, ∀j, 1 ≤j≤ n ∧ j!=i, using examining an entry of M as the key operation. Your algorithm must examine at most 3n − ⌊lg n⌋ − 3 entries of M.

Hint: Relate ‘examining M[i, j]’ with comparing i with j’. Initially, every index is potentially the desired index i. Eliminate the number of potential index i from n to 1. Then verify if the index is indeed the desired index i.

Any smart folks out there to shade more lights or hints to solve this issue? I am new in this area and any approach that comes to mind ends in O(n^2). Any recommendations?

I've considered basic cases looking for any patterns:

2-by-2 matrix M, considering each entry as a comparison, then we have 4-2(diagonal elts) = 2 comparisons. If we verify with desired running time T(n) we have 3n - floor(lgn)-3 = 3*2 - lg2 -3 = 6-4 = 2comparisons
3 -by 3 M, 9 (entries) - 3 (diagonal etls) = 6 comparisons And desired T(n) = 3*3 - floor(lg3) -3 = 9-4=5 comparisons
4-by 4 will give 16-4 = 12 comparisons, and T(n) = 3*4 - lg4 -3 = 12-5 = 7 comparisons Here we observe a big difference and the idea collapses. But if I can find an approach to pair the matrix entries, then I am good. The base cases above will be reduced to 1, 3, and 6 comparisons and will stay within T(n).

Next, I thought of reducing the problem into proving that M is or is not symmetric, which means there exist i such that Mij != Mji (or Mij = Mji) and the condition will be satisfied since M is binary. The idea was to see if I could prove or disprove it in linear time, but i'm yet to find an algorithm for it.

CodePudding user response：

Break up your rules into two parts. Index i is valid if and only if it satisfies both of:

Row condition: M[i, j] = 0 for all j != i
Column condition: M[k, i] = 1 for all k != i

This leads us to notice that the condition being true for index i is equivalent to having the i'th row be all zeros, and having the i'th column be all ones, except where they intersect at M[i, i], which can have any value.

For an illustration with i=3 valid (and non-numbered entries being irrelevant):

. . . 1 . .
. . . 1 . .
0 0 0 x 0 0
. . . 1 . .
. . . 1 . .
. . . 1 . .

Now, consider any pair i, j with i != j.

If M[i, j] = 1, we know i is not a valid index, as it fails the row condition.
If M[i, j] = 0, we know j is not a valid index, as it fails the column condition.

So with every query of a matrix element M[i, j], we can eliminate one index from consideration. This also implies that at most one index is valid. Otherwise, if i1 and i2 were both valid, we'd have a contradiction, based on testing M[i1, i2] according to the rules above.

Here's an example of the whole algorithm on a matrix of length 3. The tournament tree structure only depends on the number of leaf nodes, as detailed later.

0 0 1
1 0 1
0 1 1

Here, the first matchup (i.e. sibling leaf nodes) is (0, 1). The query M[0, 1] returns 0, so the second index, 1, is eliminated. We delete the parent node of the leafs, and replace it with the remaining index's node, 0. Then (0, 2) are matched up as new sibling leaf nodes. The query M[0, 2] returns 1, so the first index, 0, is eliminated. 2 is the only potentially valid index remaining.

Now, to test whether 2 is valid, we need to know the values of M[2, 0], M[2, 1], M[0, 2], M[1, 2]. We already queried the value of M[0, 2], so we don't repeat that query. This gives us 2 3 = 5 total queries, exactly meeting your bound of 3*n - 3 - floor(log_2(n)) = 9 - 3 - 1 = 5.

Once we have at most one potential valid index, we need 2n - 2 queries of its column and row to test both conditions. This currently gives us n-1 queries to eliminate all but one index, then 2n-2 queries to test that index, for a total of 3n-3 which is too many.

However, we can use the initial 'elimination' queries to count against the later 'test' queries. Create a full and complete binary tree, where the leaf nodes are the indices 0, 1, ... n-1. Use this as a tournament tree to determine the initial elimination queries: each leaf node is paired up against its sibling leaf node repeatedly, until one node remains.

With n leaf nodes, the smallest depth of a leaf node (and thus the smallest number of matchups/elimination queries any index participates in) in any full, complete binary tree is always floor(lg_2(n)). So the total number of queries we have to make is at most
(n-1) (2n-2) - floor(lg_2(n)), which is just (3n-3) - floor(lg_2(n)).

Here's a Python implementation of the algorithm, which shouldn't be far from imperative pseudocode. It's especially verbose, and broken up into small functions, so that all accesses to M are done through a special query function and logged. This should make it clearer that your bound for total queries is in fact met.

def find_valid_index(matrix: List[List[int]]) -> Optional[int]:
    """Given square binary matrix, return the index i
    such that, for all j != i,
    matrix[i, j] = 0 and matrix[j, i] = 1, or None if none exist."""

    num_rows = len(matrix)
    assert num_rows == len(matrix[0])

    if num_rows == 1:
        return 0

    fulfilled_queries = {}

    def query(i: int, j: int) -> int:
        if (i, j) not in fulfilled_queries:
            fulfilled_queries[i, j] = matrix[i][j]

        return fulfilled_queries[i, j]

    def index_to_eliminate(i: int, j: int) -> int:
        assert i != j

        return j if query(i, j) == 0 else i

    def is_valid_index(i: int) -> bool:
        """Test full row and column for validity"""
        for j in range(num_rows):
            if j == i:
                continue
            if query(i, j) == 1 or query(j, i) == 0:
                return False
        return True

    candidate_indices = list(range(num_rows))

    # Find distance from nearest power of two at most num_rows
    leftover_leafs = num_rows - 2**(math.floor(math.log2(num_rows)))

    if leftover_leafs > 0:
        eliminated_indices = set()
        for k in range(leftover_leafs):
            index1 = candidate_indices[2*k]
            index2 = candidate_indices[2*k 1]
            eliminated_indices.add(index_to_eliminate(index1, index2))

        candidate_indices = [x for x in candidate_indices
                             if x not in eliminated_indices]

    while len(candidate_indices) > 1:
        assert len(candidate_indices) % 2 == 0

        eliminated_indices = set()
        for k in range(0, len(candidate_indices), 2):
            index1 = candidate_indices[k]
            index2 = candidate_indices[k   1]
            eliminated_indices.add(index_to_eliminate(index1, index2))

        candidate_indices = [x for x in candidate_indices
                             if x not in eliminated_indices]

    if len(candidate_indices) == 0:
        return None

    if is_valid_index(candidate_indices[0]):
        return candidate_indices[0]

    return None