Computing total sum of word frequency by using pthreads in C

Time:07-23

I am running three threads and trying to compute the total frequency of each word.
I need to add a mutex to prevent counting errors.

57848 index is 0
37389 index is 1
8447 index is 2
10016 index is 3
2756 index is 4

The results are counted as follows:
he -> 57848
she -> 37389
they -> 8447
him -> 10016
me -> 2756



void threadfunc(char *path, char *filetowrite, long specialfreq[], int num_threads){

  DIR* dir = opendir(path);
  if(dir == NULL){ return; }
  struct dirent* entity;
  entity = readdir(dir);

  
  for (int i = 0; i < num_threads; i++) {
    if (pthread_join(thread_id[i],  NULL) != 0) {
      perror("Failed to join thread");
    }
  }
}

CodePudding user response:

The threads are reporting the values of data->freq[i], which is shared among all threads.

To count locally in each thread, first create a local counter array in the thread function, before the loop:

long threadCounter[SPECIALSIZE] = {0};

Then, count using this counter instead of data->freq:

        while(tkn != NULL){  
          if((strcasecmp(tkn, "he") == 0)){
            threadCounter[0] = threadCounter[0] + 1;
          }else if((strcasecmp(tkn, "she") == 0)){
            threadCounter[1] = threadCounter[1] + 1;
          }else if((strcasecmp(tkn, "they") == 0)){
            threadCounter[2] = threadCounter[2] + 1;
          }else if((strcasecmp(tkn, "him") == 0)){
            threadCounter[3] = threadCounter[3] + 1;
          }else if((strcasecmp(tkn, "me") == 0)){
            threadCounter[4] = threadCounter[4] + 1;
          }
          tkn = strtok(NULL, " ");
        }

(Note: threadCounter is local to each thread, so no lock is required when updating it.)

And finally report the value of the counter and add that to data->freq:

  pthread_mutex_lock(&mutex);
  for (size_t i = 0; i < 5; i++)
  {
    /* threadCounter[i] is a long and i is a size_t, hence %ld and %zu */
    printf("%ld index is %zu\n", threadCounter[i], i);
    data->freq[i] += threadCounter[i];
  }
  pthread_mutex_unlock(&mutex);
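
Putting the pieces above together, here is a minimal sketch of the "count locally, then merge under the mutex" pattern. It is not the asker's program: the file reading is left out, each thread receives an already tokenized slice, and the names token_slice, shared_data, worker_arg, count_worker, count_in_parallel and MAX_THREADS are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp */

#define SPECIALSIZE 5
#define MAX_THREADS 8  /* hypothetical upper bound for this sketch */

static const char *const special_words[SPECIALSIZE] = {
    "he", "she", "they", "him", "me"
};

/* Hypothetical input: the tokens one thread is responsible for. */
struct token_slice {
    const char *const *tokens;
    size_t count;
};

/* Totals shared by all threads, guarded by the mutex. */
struct shared_data {
    pthread_mutex_t mutex;
    long freq[SPECIALSIZE];
};

struct worker_arg {
    struct token_slice slice;
    struct shared_data *data;
};

static void *count_worker(void *p)
{
    struct worker_arg *arg = p;
    long threadCounter[SPECIALSIZE] = {0};   /* private to this thread */

    /* No lock needed here: only this thread touches threadCounter. */
    for (size_t t = 0; t < arg->slice.count; t++) {
        for (size_t i = 0; i < SPECIALSIZE; i++) {
            if (strcasecmp(arg->slice.tokens[t], special_words[i]) == 0) {
                threadCounter[i]++;
                break;
            }
        }
    }

    /* Merge once at the end; this is the only access to shared state. */
    pthread_mutex_lock(&arg->data->mutex);
    for (size_t i = 0; i < SPECIALSIZE; i++)
        arg->data->freq[i] += threadCounter[i];
    pthread_mutex_unlock(&arg->data->mutex);
    return NULL;
}

/* Starts one thread per slice and copies the totals into out.
   Returns 0 on success, -1 if a thread could not be created or joined. */
static int count_in_parallel(const struct token_slice slices[],
                             int num_threads, long out[SPECIALSIZE])
{
    assert(num_threads > 0 && num_threads <= MAX_THREADS);

    struct shared_data data = { .freq = {0} };
    struct worker_arg args[MAX_THREADS];
    pthread_t thread_id[MAX_THREADS];
    int started = 0, rc = 0;

    if (pthread_mutex_init(&data.mutex, NULL) != 0)
        return -1;

    for (int i = 0; i < num_threads; i++) {
        args[i].slice = slices[i];
        args[i].data = &data;
        if (pthread_create(&thread_id[i], NULL, count_worker, &args[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(thread_id[i], NULL) != 0)
            rc = -1;
    }

    /* All workers have been joined, so reading data.freq is safe now. */
    for (size_t i = 0; i < SPECIALSIZE; i++)
        out[i] = data.freq[i];
    pthread_mutex_destroy(&data.mutex);
    return rc;
}
```

Build with -pthread. Each thread takes the lock exactly once, so contention stays low no matter how many tokens it scans.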

Also note that strtok() is not thread-safe: it keeps its parsing position in hidden static state, so it cannot be called from multiple threads simultaneously. For correct processing, add pthread_mutex_lock(&mutex); before char* tkn = strtok(line, " "); and pthread_mutex_unlock(&mutex); before }fclose(fp);.
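
As an alternative to locking around strtok(), the POSIX function strtok_r() keeps its position in a caller-supplied pointer, so each thread can tokenize its own line without a mutex. This is a different technique from the locking described above, shown only as a sketch: count_line and tracked_words are hypothetical names, and the delimiter set " \t\n" is an assumption (the original code splits on " " only).

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>    /* strtok_r */
#include <strings.h>   /* strcasecmp */

#define SPECIALSIZE 5

static const char *const tracked_words[SPECIALSIZE] = {
    "he", "she", "they", "him", "me"
};

/* Tokenizes line in place (the buffer is modified) and increments
   counter[i] for every token matching tracked_words[i], ignoring case.
   Returns the number of tokens seen. */
static size_t count_line(char *line, long counter[SPECIALSIZE])
{
    assert(line != NULL && counter != NULL);

    char *saveptr = NULL;   /* per-call state, so nothing is shared */
    size_t ntokens = 0;

    for (char *tkn = strtok_r(line, " \t\n", &saveptr);
         tkn != NULL;
         tkn = strtok_r(NULL, " \t\n", &saveptr)) {
        ntokens++;
        for (size_t i = 0; i < SPECIALSIZE; i++) {
            if (strcasecmp(tkn, tracked_words[i]) == 0) {
                counter[i]++;
                break;
            }
        }
    }
    return ntokens;
}
```

A thread would call count_line(line, threadCounter) for each line it reads and merge threadCounter into data->freq under the mutex only once, at the end.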
