I wrote a simple C program in which every thread multiplies its index by 1000000 and adds the result to sum. I created 5 threads, so the expected answer would be (0+1+2+3+4)*1000000, which is 10000000, but it prints 14000000 instead. Could anyone help me understand this?
#include <pthread.h>
#include <stdio.h>
typedef struct argument {
    int index;
    int sum;
} arg;
void *fonction(void *arg0) {
    ((arg *) arg0)->sum += ((arg *) arg0)->index * 1000000;
    return NULL;
}
int main() {
    pthread_t thread[5];
    int order[5];
    arg a;
    for (int i = 0; i < 5; i++)
        order[i] = i;
    a.sum = 0;
    for (int i = 0; i < 5; i++) {
        a.index = order[i];
        pthread_create(&thread[i], NULL, fonction, &a);
    }
    for (int i = 0; i < 5; i++)
        pthread_join(thread[i], NULL);
    printf("%d\n", a.sum);
    return 0;
}
CodePudding user response:
It is 14000000 because the behavior is undefined. The result will differ between machines and with other environmental factors. The undefined behavior arises because all threads access the same object (see the &a given to each thread), and that object is modified after the first thread is created.
When each thread runs, it accesses the same index (as part of accessing a member of the same object, &a). Thus the assumption that the threads will see [0,1,2,3,4] is incorrect: multiple threads likely see the same value of index (e.g. [0,2,4,4,4], see footnote 1) when they run. This depends on how the threads are scheduled relative to the loop that creates them, since that loop also modifies the shared object.
When each thread updates sum, it has to read and write the same shared memory. This is inherently prone to race conditions and unreliable results. For example, it could be a lack of memory visibility (thread X doesn't see the value updated by thread Y), or it could be a conflicting thread schedule between the read and the write (thread X reads, thread Y reads, thread X writes, thread Y writes), etc.
If a new arg object is created for each thread, both of these problems are avoided. While the sum issue can be fixed with appropriate locking, the index issue can only be fixed by not sharing the object given as the thread input.
// create 5 arg objects, one for each thread
arg a[5];
for (int i = 0; i < 5; i++) {
    a[i].index = i;
    a[i].sum = 0;
    // give a DIFFERENT object to each thread
    pthread_create(&thread[i], NULL, fonction, &a[i]);
}
// after all threads complete (i.e. after every pthread_join)
int sum = 0;
for (int i = 0; i < 5; i++) {
    sum += a[i].sum;
}
1. Even assuming that there is no race condition on sum in the current execution, a sequence in which the threads see the index values [0,2,4,4,4], which sum to 14, might look as follows:
- a.index <- 0 ; create thread A
- thread A reads a.index (0)
- a.index <- 1 ; create thread B
- a.index <- 2 ; create thread C
- thread B reads a.index (2)
- a.index <- 3 ; create thread D
- a.index <- 4 ; create thread E
- thread D reads a.index (4)
- thread C reads a.index (4)
- thread E reads a.index (4)