access to synchronized data with multiple mutex-CodePudding

In multithreaded environment, let's say I am adding data to a vector(V0) which is shared among different threads. I synchronized the access to the vector using a mutex. At one point, once the vector reaches a certain threshold, I take another mutex and swap the vector with an empty vector(V1). Now V1 contains the shared data.

My understanding is that if I access V1 by taking second mutex only, then I should always see the correct original V0 contents(before swap). So I can synchronize V1 by using second mutex only.

Is this understanding correct?

std::vector<int> primary;
std::vector<int> secondary;

std::mutex p1;
std::mutex s1;

// called on thread1, thread2, thread3
void f1(int x) { 
    std::lock_guard l1{p1};
    primary.push_back(x);
}
// could be called on thread1, thread2, thread3
void f2() { 
    std::lock_guard l1{p1};
    std::lock_guard l2{s1};
    std::swap(primary, secondary);
}

// called on thread4
void f3() { 
    std::lock_guard l2{s1};
    std::copy(secondary.begin(), secondary.end(), std::ostream_iterator<int>(std::cout)); 
}

In the example above, f3() is always called last, my understanding is that the usage inside f3 is correct because we have synchronized access to secondary memory by using mutex1 and mutex2 in f2(). So we can use either of the two mutexes to access the secondary state.

Thanks, DDG

CodePudding user response：

In the shown code:

primary is accessed only when the m1 mutex is locked.
secondary is accessed only when the s1 mutex is locked.
When both mutexes get locked at the same time, m1 gets locked first, followed by s1.

As long as the rest of the program holds to the same rules, exactly primary and secondary access is thread safe.

So we can use either of the two mutexes to access the secondary state.

No, in order for secondary to have thread-safe access the secondary mutex must be locked. A lock on the primary mutex is immaterial. If some other part of the code locks primary, and then access secondary, this will not be thread-safe in conjunction with the shown code.

CodePudding user response：

To keep things sane, I find that it's usually best to associate a mutex closely with the data it's protecting.

class buffer {
    mutable std::mutex protector;
    std::vector<int> data;
public:
    void push_back(int i) {
       std::lock_guard<std::mutex> L(protector);
       data.push_back(i);
    }

    void swap(buffer &other) { 
        std::lock_guard<std::mutex> L(protector);
        std::lock_guard<std::mutex> O(other.protector);
        std::swap(data, other.data);
    }

    void write() const {
        std::lock_guard<std::mutex> L(protector);
        std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout));
        // And after processing the vector, we probably want to empty it:
        data.clear();
    }
};

For the most part, each buffer protects itself. The only place they need to deal with both is where you swap them, since that modifies both simultaneously. Personally, I'd probably name the two buffers input and output, rather than primary and alternate. That way, the input threads (T1, 2 and 3) all work with the input buffer, and the output thread always works with the output buffer.

One mildly tricky point here though. You've assured that you won't modify a vector while it's in use--but you haven't assured that data will be written out in order (buffers could get swapped multiple times, and data pushed more or less randomly to each before one of them gets written out).

There are a couple of ways of fixing this. The easiest is probably to get rid of the swap as a separate function, and just do the swapping at the beginning of the write function:

void write(buffer &src) { 
     std::lock_guard<std::mutex> L(protector);
     {
         std::lock_guard<std::mutex> L(src.protector);
         std::swap(data, other.data);
     }
     std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout));
     data.clear();
}

This can lead to further simplification: the output buffer is only ever used inside of write, and only ever used in one thread. As such, we don't really need a mutex to protect it. Instead we can do something like this:

void write() const { 
    std::vector<int> temp;
    {
        std::lock_guard<std::mutex> L(protector);
        std::swap(temp, data);
    }
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout));
}

This lets us create one buffer object. T1, T2, and T3 write to it as needed. Occasionally T4 writes it out. And (most importantly) the threads each only deal with what they care about--the input threads push data to the buffer. The output thread writes data from the buffer. And neither one needs to know details of the buffer itself, like locking and swapping.