I have 4 integer two dimensional arrays of the form:
{{1,2,3},{1,2,3},{1,2,3}}
The first 3(A,B,C) are of size/shape [1000][3] and the 4th(D) [57100][3].
The combination of integers in the 3 elements sub-arrays in A,B,C are all unique, while combinations of integers in the 3 elements sub-arrays in D are not.
What I have to do is to find in what array from A,B or C a sub-array from D is present and do something with it after, depending in which array I find it.
I tried using nested for loops with if statements and check each array one at a time (A->B->C) and break out when I find it but it takes to long.
Thank you
CodePudding user response:
As suggested by my comments, let's assume your approach is a naive, look at one item at a time in a for
loop to see if a match is found.
Instead of doing that, another approach is to store the arrays as a std::set<std::tuple<int, int, int>>
and do a lookup on the sets. Doing this reduces the time complexity from linear to logarithmic. For example, a 1000 item set will have at most 10 tests to finally determine if the item exists instead of having to go through all 1000 items.
Here is a somewhat untested example. Note that the data in the arrays has not been populated (the assumption is that the data has been populated into the arrays):
#include <tuple>
#include <set>
#include <array>
#include <iostream>
using TupleSet = std::set<std::tuple<int, int, int>>;
int A[1000][3];
int B[1000][3];
int C[1000][3];
int D[57100][3];
void printTuple(const std::tuple<int,int,int>& ts)
{
std::cout << std::get<0>(ts) << "," <<
std::get<1>(ts) << "," <<
std::get<2>(ts) << "\n";
}
// Initialize the tuple set with the 3 column 2D array of items
void initializeTuple(int pArray[][3], int numItems, TupleSet& ts)
{
for (int i = 0; i < numItems; i)
ts.insert({pArray[i][0], pArray[i][1], pArray[i][2]});
}
int main()
{
// Assume that A, B, C, and D have been filled in with data
//...
static constexpr char *arrayName = "ABC";
// Declare an array of 3 tuple sets that represent A,B,C
std::array<TupleSet, 3> allTuples;
// This is for the "special" D Array
TupleSet tD;
//Store the data in the tuples
initializeTuple(A, 1000, allTuples[0]);
initializeTuple(B, 1000, allTuples[1]);
initializeTuple(C, 1000, allTuples[2]);
initializeTuple(D, 57000, tD);
// Now go through each tuple in D to see if it's
// in A, B, or C
for (auto& t : tD)
{
// print the tuple we're searching for
printTuple(t);
// Now see if it's in A, B, or C
for (int i = 0; i < 3; i)
{
auto iter = allTuples[i].find(t);
if ( iter != allTuples[i].end() )
{
std::cout << "Found a match in Array " <<
arrayName[i] << "\n";
break;
}
}
}
}
Note how each 2D array is placed in a tuple set. Also, since a std::set
does not store duplicates, the D
array will have all of its duplicates removed when placed in the tuple set tD
. Thus right there, the number of items that D
represents will be minimized.
However, there are some issues you should be aware of:
The initial setup time of the tuples is part of this solution. However the setup time is done only once.
It may be even faster to use a
std::unordered_set
, but that would require you to write a hash function, and I leave that as an exercise to you.