Construct chains of pairs of numbers with one common member-CodePudding

I need to construct a chain of pair of numbers where:

In each pair, the first one is smaller than the second
In order to form a chain between two consecutive nodes, they must have one number in common. In other words, the link (a,b) -- (c,d) can be made if and only if either a==c, b==c, a==d or b==d
A pair cannot be made of the same number. In other words, if (a,b) exists, then a!=b

This may look like a Longest increasing subsequence but I actually want to chain consecutive pairs that have one equal member.

Example:

Initial list (unordered):
(0,1)
(2,3)
(1,6)
(4,6)
(8,9)
(2,8)
Result:
----- chain #1
(0,1)
(1,6)
(4,6)
----- chain #2
(2,3)
(2,8)
(8,9)

I could do an algorithm that will iterate over the entire list for each cell (O(n^2)), but I want to make it faster and I have the flexibility of ordering my initial array in any way I want (std::set, std::map, std::unordered_map, etc.). My list is made of tens of thousands of pairs so I need an efficient solution in terms of processing time.

CodePudding user response：

You can solve it in O(N * log(N)) when you manage two lists, one sorted with respect to first the other sorted with respect to second.

The code has some duplication that I didnt bother to clean up yet.

#include <iostream>
#include <list>
#include <algorithm>
#include <tuple>
#include <any>

struct pair_and_iter {
    int first;
    int second;
    std::any other_iter;
};

struct compare_first {
    bool operator()(int x,pair_and_iter p){ return x < p.first; }
    bool operator()(pair_and_iter p, int x){ return p.first < x; }
};
struct compare_second {
    bool operator()(int x,pair_and_iter p){ return x < p.second; }
    bool operator()(pair_and_iter p, int x){ return p.second < x; }
};
template <typename Iter,typename Comp>
Iter my_find(Iter first,Iter last,int x, Comp comp) {
    auto it = std::lower_bound(first,last,x,comp);
    if (it != last && (!comp(x,*it) && !comp(*it,x))){
        return it;
    } else {
        return last;
    }
}

int main() {
    std::list<pair_and_iter> a {{0,1},{2,3},{1,6},{4,6},{8,9},{2,8}};
    std::list<pair_and_iter> b;

    for (auto it = a.begin(); it != a.end();   it){
        b.push_back({it->first,it->second,it});
        it->other_iter = std::prev(b.end());
    }

    a.sort([](const auto& x,const auto& y){ 
        return std::tie(x.first,x.second) < std::tie(y.first,y.second); });
    b.sort([](const auto& x,const auto& y){ 
        return std::tie(x.second,x.first) < std::tie(y.second,y.first); });
    
    std::vector<std::vector<pair_and_iter>> result;
    std::vector<pair_and_iter> current_result;
    current_result.push_back(a.front());
    auto current = current_result.begin();
    b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(current->other_iter));
    a.erase(a.begin());        

    while (a.size() && b.size()) {
        // look for an element with same first
        auto it = my_find(a.begin(),a.end(),current->first,compare_first{});
        if (it == a.end()) {
            // look for element where current->second == elem.first
            it = my_find(a.begin(),a.end(),current->second,compare_first{});
        }
        if (it != a.end()){
            current_result.push_back(*it);
            current = std::prev(current_result.end());
            b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(it->other_iter));
            a.erase(it);
            continue;
        }

        // look for element with current->first == elem.second
        it = my_find(b.begin(),b.end(),current->first,compare_second{});
        if (it == b.end()) {
            // look for element with same second
            it = my_find(b.begin(),b.end(),current->second,compare_second{});
        }
        if (it != b.end()) {
            current_result.push_back(*it);            
            current = std::prev(current_result.end());
            a.erase(std::any_cast<std::list<pair_and_iter>::iterator>(it->other_iter));
            b.erase(it);
            continue;
        }        
        // no matching element found
        result.push_back(current_result);
        current_result.clear();
        current_result.push_back(a.front());
        current = current_result.begin();
        b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(current->other_iter));
        a.erase(a.begin());
    }
    result.push_back(current_result);
    for (const auto& chain : result){
        for (const auto& elem : chain){
            std::cout << elem.first << " " << elem.second << "\n";
        }
        std::cout << "\n";
    }
}

Output:

I used std::list because it has stable iterators and constant time erase. std::any for type erasure because each list contains iterators to the other list. a is sorted with respect to first and b is sorted with respect to second. Hence std::lower_bound can be used to to find a match in O(logN). A single linear search is traded against 2 binary searchs to find either current->first or current->second in a first of a and 2 binary searchs to find either current->first or current->second in a second of b. In total it is O(N log(N)) for sorting plus O( log(N) log(N-1) log(N-2) .... log(1)) which equals O(log( n! )) if I am not mistaken.

PS: You didn't mention that you are looking for a longest chain, and this algorithm is not finding the longest chain. It just picks the first element of the remaining ones and uses the next element it finds to continue the chain.