I have the following minimal reproducible example that uses several child processes to append strings to a shared vector. On some executions, my program either freezes or segfaults after all the child processes finish; on others, it works with no issues.
When the segfault does happen, it seems to occur when the parent process accesses the shared vector. Can anyone more experienced with Boost help me debug this, please?
...
namespace bip = boost::interprocess;
typedef bip::managed_shared_memory msm;
bip::shared_memory_object::remove("shmem_streak");
bip::managed_shared_memory shm(bip::open_or_create, "shmem_streak", 10000000);
bip::allocator<char, msm::segment_manager> chr_altr(shm.get_segment_manager());
typedef bip::basic_string<char, char_traits<char>, decltype(chr_altr)> str;
bip::allocator<str, msm::segment_manager> str_altr(shm.get_segment_manager());
typedef vector<str, decltype(str_altr)> vec;
shm.construct<vector<str, decltype(str_altr)>>("res_vec")(str_altr);
for (int i = 0, pid; i < num_procs; i++)
{
if ((pid = fork()) == 0)
{
bip::interprocess_mutex mutex;
bip::scoped_lock<bip::interprocess_mutex> lock(mutex);
auto child_res_vec = shm.find<vec>("res_vec").first;
cout << " pushing " << i << " lock: " << endl; //<< lock << endl;
str tmp_str(chr_altr);
tmp_str = to_string(i).c_str();
child_res_vec->push_back(tmp_str);
exit(0);
}
else
cout << "pid: " << pid << " created" << endl;
}
while (wait(NULL) > 0)
;
auto child_res_vec = shm.find<vec>("res_vec").first;
cout << "test res pq: " << endl;
for (auto elem : *child_res_vec)
cout << elem << endl;
cout << child_res_vec->back() << endl;
...
Here is the buggy output:
pid: 91625 created
pid: 91626 created
pid: 91627 created
pid: 91628 created
pushing 1 lock:
pushing 0 lock:
pushing 2 lock:
pushing 3 lock:
pid: 91629 created
pid: 91630 created
pushing 4 lock:
pid: 91631 created
pushing 5 lock:
pid: 91632 created
pushing 6 lock:
pid: 91633 created
pushing 7 lock:
pid: 91634 created
pushing 8 lock:
pid: 91635 created
pushing 9 lock:
pushing 10 lock:
pid: 91636 created
pushing 11 lock:
test res pq:
zsh: segmentation fault python app.py
And here is the correct output:
pid: 91819 created
pushing 0 lock:
pid: 91820 created
pushing 1 lock:
pid: 91821 created
pushing 2 lock:
pid: 91822 created
pushing 3 lock:
pid: 91823 created
pushing 4 lock:
pid: 91824 created
pushing 5 lock:
pushing 6 lock:
pid: 91825 created
pid: 91826 created
pushing 7 lock:
pid: 91827 created
pushing 8 lock:
pid: 91828 created
pushing 9 lock:
pushing 10 lock:
pid: 91829 created
pid: 91830 created
pushing 11 lock:
test res pq:
0
1
2
3
4
5
6
7
8
9
10
11
CodePudding user response:
The problem is with the mutex. You are creating a new mutex for every process. You have to make sure there is a single mutex that is shared by all processes. Just moving the declaration of mutex outside the for-loop isn't enough though; the mutex has to be stored inside the shared memory segment for this to work. See the description of boost::interprocess::interprocess_mutex.
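A minimal sketch of that fix (the object name "mtx" is illustrative, not from the original code): construct the mutex once in the managed segment before forking, then have every child look up that same object. find_or_construct returns the existing object if it has already been created:

// In the parent, before the fork loop:
shm.construct<bip::interprocess_mutex>("mtx")();

// In each child: every process gets a pointer to the same mutex
// stored in the segment, so the scoped_lock actually serializes
// the push_backs.
auto* mtx = shm.find_or_construct<bip::interprocess_mutex>("mtx")();
bip::scoped_lock<bip::interprocess_mutex> lock(*mtx);
child_res_vec->push_back(tmp_str);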
Some other things to note:
- Prefer '\n' over std::endl.
- wait() can return -1 for reasons other than no child processes being left. Be sure to check that errno == ECHILD before exiting the while-loop (see the sketch after this list).
- Use for (auto& elem : ...); the reference avoids making unnecessary copies of the elements.
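For example, a more robust version of the reaping loop might look like this (a sketch; how you handle errors other than ECHILD is up to you):

#include <cerrno>
#include <cstdio>
#include <sys/wait.h>

// Reap children until wait() reports that none are left (ECHILD),
// instead of treating every -1 return as "done".
while (true)
{
    if (wait(NULL) == -1)
    {
        if (errno == ECHILD)
            break;      // no children remain
        if (errno == EINTR)
            continue;   // interrupted by a signal: retry
        perror("wait"); // unexpected error
        break;
    }
}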
CodePudding user response:
Each of your locks is created in its own forked process and is completely unrelated to the others, so they don't actually synchronize access to the vector.
You need to construct the mutex in the shared memory segment:
shm.construct<vector<str, decltype(str_altr)>>("res_vec")(str_altr);
shm.construct<bip::interprocess_mutex>("mtx")(); // constructed once, inside the segment, before forking
for (int i = 0, pid; i < num_procs; i++)
{
if ((pid = fork()) == 0)
{
auto* mtx = shm.find<bip::interprocess_mutex>("mtx").first; // same mutex object in every process
bip::scoped_lock<bip::interprocess_mutex> lock(*mtx);
Or you can use a named_mutex:
for (int i = 0, pid; i < num_procs; i++)
{
if ((pid = fork()) == 0)
{
bip::named_mutex mtx(bip::open_or_create, "your_globally_unique_mutex_name");
bip::scoped_lock<bip::named_mutex> lock(mtx);
// When finished (e.g., in the parent process)
bip::named_mutex::remove("your_globally_unique_mutex_name");
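Note that a named_mutex has kernel or filesystem persistence, so it outlives the processes that use it; if the remove() call is skipped, a stale mutex from a previous (possibly crashed) run can linger and interfere with later runs. The same applies to the shared memory segment itself, which is why the code above removes "shmem_streak" before calling open_or_create.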