Home > Back-end >  Using element in vector as index in array?
Using element in vector as index in array?

Time:03-20

I'm writing code that randomly generates a vector of indices, fetches a random one, then uses that index to fetch another index, and so on. However, my code seems to repeat a cycle of indices. Here is my full code:

vector<uint16_t>* genBuffer() {
    vector<uint16_t>* buffer = new vector<uint16_t>(256);
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> distr(0, 255);
    for (uint16_t i = 0; i < 256; i  ) {
        (*buffer)[i] = distr(gen);
    }
    shuffle(buffer->begin(), buffer->end(), gen);
    return buffer;
}

double timeAccess(vector<uint16_t>* buff, uint64_t buffSize) {
    struct timespec start, stop;
    random_device rd;
    mt19937 gen(rd());
    uniform_int_distribution<> distr(0, 255);
    auto index = distr(gen);
    auto nextIndex = (*buff)[index];
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (uint64_t i = 0; i <= buffSize; i  ) {
        cout << nextIndex << endl;
        nextIndex = (*buff)[nextIndex];
    }    
    clock_gettime(CLOCK_MONOTONIC, &stop);
    double time_taken = (stop.tv_sec - start.tv_sec) - (double)(stop.tv_nsec - start.tv_nsec);
    double avgTime = time_taken/buffSize;
    return avgTime;
}

int main(int argc, char* argv[]) {
    if (argc != 2) {
        cout << "Please enter only one numerical argument." << endl;
        return -1;
    }
    uint64_t buffSize = atoi(argv[1]);
    auto randBuff = genBuffer();
    auto timeTaken = timeAccess(randBuff, buffSize);
    cout << "Average time per buffer read = " << timeTaken << " ns" << endl;
    return 0;
}

Here is an example run with an argument of 25:

35
218
157
9
4
214
225
246
123
92
195
114
200
33
138
13
17
35
218
157
9
4
214
225
246
123

As you can see, the pattern eventually repeats, although it shouldn't do that.

This code is part of a cache benchmark I was asked to write for class. Here is the full code for anyone willing to try:

https://github.com/aabagdi/CacheBenchmark

As well, I'm trying to time the average time per read in ns. Am I doing that correctly? Thanks!

CodePudding user response:

The generation of your traversal vector is flawed, and can easily generate loops. An example of what you're actually trying to generate for maximal traversal is similar to an interesting (albeit outdated) interview question concerning dropping a stack of bus tickets, each of which has a starting city, ending city, and as a city pair is unique among all others.

Consider holding four tickets, each of which goes from some city Sn to some other city Sm.

S4->S2
S1->S5
S3->S4
S5->S3

the proper order can be thought of like this:

S1->S5
    S5->S3
        S3->S4
            S4->S2

The only difference here is that you need to get back from S2 to S1 if needed, since you don't really know where you're starting from. By adding one faux ticket, S2->S1, you can produce a maximal loop. Now it doesn't matter where you start from; eventually you'll hit every city and be back where you started (to do it again if you want).

One way to do that is by building a pairing sequence, then patching in the final return-to-origin as a last step. First, n indexes, build an index array of values 0..(n-1).

// build index vector and shuffle
std::vector<uint16_t> idx(n);
std::iota(idx.begin(), idx.end(), 0);
std::mt19937 gen{ std::random_device{}() };
std::shuffle(idx.begin(), idx.end(), gen);

This is the rock on which we will build our church. We're going to traverse this random ordered sequence, and store the i 1'th index in the i'th slot of our final result. Like this:

// build chain
std::vector<uint16_t> result(n);
for (size_t i=0; i<(n-1);   i)    
    result[idx[i]] = idx[i 1];

Finally, the last step is to link the last element of this chain to the initial element of the chain (wherever it is).

// set loop
result[idx[n-1]] = idx[0];

That's it. Now, it doesn't matter where you start in result, it will eventually lead to touching each index before returning to your starting point where it loops (maximally).

The full genBuffer looks like this:

std::vector<uint16_t> genBuffer(size_t n) 
{
    // build index vector and shuffle
    std::vector<uint16_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    std::mt19937 gen{ std::random_device{}() };
    std::shuffle(idx.begin(), idx.end(), gen);

    // build chain
    std::vector<uint16_t> result(n);
    for (size_t i=0; i<(n-1);   i)    
        result[idx[i]] = idx[i 1];

    // set loop
    result[idx[n-1]] = idx[0];

    return result;
}

To prove this actually works, a simple test harness to generate a 32-element chain with a maximal loop.

int main() 
{
    auto randBuff = genBuffer(32);

    // pick some random starting point.
    std::mt19937 gen{ std::random_device{}() };
    std::uniform_int_distribution<size_t> dist(0, randBuff.size()-1);
    auto idx = dist(gen);

    for (size_t i=0; i<randBuff.size();   i)
    {
        std::cout << idx << " -> " << randBuff[idx] << '\n';
        idx = randBuff[idx];
    }
}

Output (varies, obviously)

19 -> 10
10 -> 21
21 -> 12
12 -> 22
22 -> 28
28 -> 6
6 -> 16
16 -> 1
1 -> 7
7 -> 3
3 -> 27
27 -> 8
8 -> 14
14 -> 15
15 -> 17
17 -> 2
2 -> 9
9 -> 25
25 -> 31
31 -> 5
5 -> 18
18 -> 13
13 -> 29
29 -> 24
24 -> 26
26 -> 4
4 -> 20
20 -> 11
11 -> 0
0 -> 30
30 -> 23
23 -> 19

Note the last production leads to the first production.

In summary, your index vector generation is susceptible to loops. You want a loop, but only one and should be maximal. Doing what I showed above can give you that.

CodePudding user response:

Couple of thoughts

  • your generate function will generate the same data each time since you do not seed your (p)rng

  • You randomly pick an entry and then follow the links, why wouldnt there be a loop? WHy not simply print them out and see if there is a loop.

I mean your generate function can create in the list

1,2,3,0

Thats a loop

  • Related