GNU memmem vs C strstr

Time:08-01

Is there any use case that can be solved by memmem but not by strstr? I was thinking of being able to find a string of raw bytes (the needle) inside a bigger string of raw bytes (the haystack), for example finding a particular raw byte pattern inside a blob of raw bytes read with C's read function.

CodePudding user response:

There is the case you mention: raw binary data.

This is because raw binary data may contain zero bytes, which strstr interprets as a string terminator, making it ignore the rest of the haystack or needle.

Additionally, if the raw binary data contains no zero bytes, and there is no valid extra zero byte after it (inside the same array or buffer allocation), then strstr will happily read past the end of the data, which is Undefined Behavior (a buffer over-read).


Or, to the point: strstr can't be used if the data is not strings. memmem doesn't have this limitation.
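To make the zero-byte problem concrete, here is a minimal sketch. The helper names find_bytes and find_cstr and the sample blob are made up for illustration, and it assumes a platform that provides memmem (for example glibc with _GNU_SOURCE defined). The blob deliberately ends in a zero byte so that calling strstr on it is at least well defined.

```c
#define _GNU_SOURCE   /* memmem is an extension, not ISO C; define before any #include */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of needle inside hay, or -1 if absent.
   Lengths are passed explicitly, so zero bytes are ordinary data. */
static long find_bytes(const void *hay, size_t hay_len,
                       const void *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);
    const unsigned char *hit = memmem(hay, hay_len, needle, needle_len);
    return hit ? (long)(hit - (const unsigned char *)hay) : -1;
}

/* The same search with strstr, which stops at the first zero byte
   in either argument. */
static long find_cstr(const char *hay, const char *needle)
{
    const char *hit = strstr(hay, needle);
    return hit ? (long)(hit - hay) : -1;
}

/* Sample data (an assumption for this example): a zero byte at index 2,
   plus a trailing zero so that strstr cannot run off the end. */
static const char blob[] = { 'A', 'B', '\0', 'C', 'D', 'E', '\0' };
```

With this data, find_bytes(blob, 6, "DE", 2) reports offset 4, while find_cstr(blob, "DE") reports -1 because strstr only ever sees "AB". A needle that itself contains a zero byte, such as the two bytes "\0C", can only be expressed with the memmem version.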

CodePudding user response:

In addition to searching in non-string data, memmem() can be used to look for substrings in just a portion of a longer string, something strstr() can't do:

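#define _GNU_SOURCE   // memmem() is an extension; define this before any #include
#include <string.h>

char somestr[] = "a long string with the word apple";
// Look in just the 5th through 14th characters (offsets 4..13)
// (the string must be at least 14 characters long, or the search reads past its end)
// loc is NULL here: "pp" only occurs later in the string, outside that window
char *loc = memmem(somestr + 4, 10, "pp", 2);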

Also, if you already know the lengths of the strings, memmem() might be faster than strstr() when used on the entire haystack string, but that depends a lot on the implementation and should be benchmarked.
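As a rough sketch of both uses, the hypothetical helper below (find_in_window is not a library function) searches only a window of a string whose length is already known, and reports the match as an index into the whole string. Passing start 0 and the full length searches the entire haystack without strstr having to find the terminator. It assumes memmem is available, and it makes no claim about speed.

```c
#define _GNU_SOURCE   /* memmem is an extension, not ISO C; define before any #include */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look for needle only inside text[start .. start+len-1].
   Returns the match position as an index into text, or -1 if not found. */
static long find_in_window(const char *text, size_t text_len,
                           size_t start, size_t len, const char *needle)
{
    /* The window must lie entirely inside the string; memmem does not check. */
    assert(start <= text_len && len <= text_len - start);
    const char *hit = memmem(text + start, len, needle, strlen(needle));
    return hit ? (long)(hit - text) : -1;
}
```

For the string "a long string with the word apple" (33 characters), find_in_window(text, 33, 4, 10, "pp") gives -1 because "pp" lies outside the window, find_in_window(text, 33, 0, 33, "pp") gives 29, and find_in_window(text, 33, 4, 10, "string") gives 7.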
