remove duplicates in a string with a buffer

Time:11-12

I’m trying to remove duplicates in a string using a map. Running it through GDB I'm not able to figure out where the code is failing, though the logic to me seems right. Could anyone please point out the mistake?

int main() {

    char *str="I had my morning tea";
    int len = strlen(str);
    int dupArr[256] = {0};

    //Build the map
    int i=0;
    for(i;i<256;i++)
        dupArr[str[i]]++;

    //If the count is 1 then print that value.
    i=0;
    for(i;i<256;i++) {
        if(dupArr[str[i]] == 1) {
            printf("%c\n",str[i]);
        }
    }
}

output

I h y o r i g t % c 4  @ } ` 8 � F J

I get up to 't', which is correct, but then I see garbage characters.

CodePudding user response:

Your string has length len, but you are traversing up to 256, which is out of bounds. Use len as the loop limit when inserting into the hash.

    int i=0;
    for(i;i<len;i++)
      dupArr[str[i]]++;

Also, if you are checking for duplicates, the count should be greater than 1, since the first occurrence of a character already counts as 1:

if(dupArr[str[i]] > 1)

CodePudding user response:

In addition to Mark Ezberg's good answer, note that dupArr[str[i]]++; poses a problem when str[i] < 0, which can happen if char is signed: the negative index falls outside the array.

Better to treat the characters as unsigned char:

int dupArr[UCHAR_MAX + 1] = {0};
....
dupArr[(unsigned char) str[i]]++;

Rolling this and other ideas together:

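#include <limits.h>
#include <stdio.h>

int main(void) {
  char *str="I had my morning tea";

  size_t dupArr[UCHAR_MAX + 1] = {0};

  // Count occurrences, treating each character as unsigned char.
  unsigned char *s = (unsigned char *) str;
  while (*s) {
    dupArr[*s]++;
    s++;
  }

  for(unsigned i = 0; i <= UCHAR_MAX; i++) {
    // A duplicate is when dupArr[i] is _more_ than 1.
    if(dupArr[i] > 1) {
      // i is the character value itself; str[i] would index past the string.
      printf("%c\n", (int) i);
    }
  }
}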