Find exact word match in C++ string

Time:09-25

I have the following strings:

std::string s1 = "IAmLookingForAwordU and I am the rest of the phrase";
std::string keyWord = "IAmLookingForAword";

I want to know if the keyWord has an exact match in s1.

I used:

    if (s1.find(keyWord) != std::string::npos)
    {
        std::cout << "Found " << keyWord << std::endl;
    }

But the find function also matches the IAmLookingForAword inside IAmLookingForAwordU, so the if condition evaluates to true. I would like to match only when the keyWord I am looking for appears as a whole word.

Any way to do this with C++ strings?

CodePudding user response:

If you want to stay with std::string::find you can test if the characters before and after the word are beyond the bounds of the string, a punctuation character or a space:

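#include <cctype>
#include <iostream>
#include <string>

bool find_word(const std::string& haystack, const std::string& needle) {
    const auto index = haystack.find(needle);
    if (index == std::string::npos) return false;

    // A position is "not part of a word" if it lies outside the string
    // or holds whitespace or punctuation.
    auto not_part_of_word = [&](std::size_t i) {
        if (i >= haystack.size()) return true;
        // <cctype> functions need a value representable as unsigned char.
        const auto c = static_cast<unsigned char>(haystack[i]);
        return std::isspace(c) || std::ispunct(c);
    };
    // If index is 0, index - 1 wraps around to the largest std::size_t,
    // which is out of bounds and therefore counts as a boundary.
    return not_part_of_word(index - 1) && not_part_of_word(index + needle.size());
}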
 

int main()
{
    std::cout << find_word("test","test") << "\n";    // 1
    std::cout << find_word(" test ","test") << "\n";  // 1
    std::cout << find_word("AtestA","test") << "\n";  // 0
    std::cout << find_word("testA","test") << "\n";   // 0
    std::cout << find_word("Atest","test") << "\n";   // 0
}
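
Note that the function above only inspects the first occurrence of the needle, so a haystack such as "testA test" is rejected even though a standalone "test" follows later. Below is a minimal sketch that keeps searching, assuming the same notion of a boundary (string edge, whitespace, or punctuation). The name find_word_anywhere and the choice to reject an empty needle are assumptions made for illustration, not part of the original answer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `needle` occurs in `haystack` as a whole word.
// A boundary is the start/end of the string, whitespace, or punctuation.
bool find_word_anywhere(const std::string& haystack, const std::string& needle) {
    if (needle.empty()) return false;  // assumption: an empty needle never matches

    auto is_boundary = [&](std::size_t i) {
        // i == haystack.size() is "one past the end" and counts as a boundary.
        if (i >= haystack.size()) return true;
        const auto c = static_cast<unsigned char>(haystack[i]);
        return std::isspace(c) != 0 || std::ispunct(c) != 0;
    };

    // Try every occurrence instead of stopping at the first one.
    for (std::size_t pos = haystack.find(needle); pos != std::string::npos;
         pos = haystack.find(needle, pos + 1)) {
        const bool left_ok = (pos == 0) || is_boundary(pos - 1);
        const bool right_ok = is_boundary(pos + needle.size());
        if (left_ok && right_ok) return true;
    }
    return false;
}
```

With this version, find_word_anywhere("testA test", "test") should return true, while the five cases in the main function above keep their results.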

CodePudding user response:

One idea is to use regular expressions. Here's a quick example. The regular expression uses \b on both sides of the word "exact". In a regular expression, \b matches only at a word boundary, i.e. between a word character (letter, digit or underscore) and anything else, such as a space, punctuation, or the start or end of the string. This regular expression will therefore match the word "exact" but not the word "exactly". N.B. it's often easier to use raw string literals with regular expressions, because the backslash character has special meaning in both C++ string literals and regular expressions.

#include <string>
#include <regex>
#include <iostream>

int main() {
    std::regex re(R"(\bexact\b)");
    std::smatch m;

    std::string string1 = "Does this match exactly?";
    std::string string2 = "Does this match with exact precision?";

    if (std::regex_search(string1, m, re))
    {
        // this shouldn't print
        std::cout << "It matches string1" << std::endl;
    }

    if (std::regex_search(string2, m, re))
    {
        // this should print
        std::cout << "It matches string2" << std::endl;
    }

    return 0;
}

If the word you are searching for is variable (i.e. the word you are looking for is different every time) then using regular expressions becomes a lot more complicated, as you have to ensure you properly validate input as well as properly escape characters with special meaning in regular expressions. Due to this, I would probably opt for other solutions.
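
For completeness, here is a rough sketch of what the escaping could look like if you do go the regex route with a variable word. The helper names escape_regex and contains_word are hypothetical, and the sketch assumes the default ECMAScript grammar of std::regex and a word that starts and ends with a word character (letter, digit or underscore); otherwise \b does not sit where you would expect.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: backslash-escape every character that has a special
// meaning in the (default) ECMAScript regex grammar.
std::string escape_regex(const std::string& text) {
    static const std::string special = R"chars(\^$.|?*+()[]{})chars";
    std::string out;
    out.reserve(text.size() * 2);
    for (char c : text) {
        if (special.find(c) != std::string::npos) out += '\\';
        out += c;
    }
    return out;
}

// Hypothetical helper: whole-word search for a word supplied at run time.
// Assumes `word` begins and ends with a word character.
bool contains_word(const std::string& text, const std::string& word) {
    const std::regex re("\\b" + escape_regex(word) + "\\b");
    return std::regex_search(text, re);
}
```

Constructing a std::regex is comparatively expensive, so if the same word is searched for many times it is probably worth building the regex once and reusing it.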

CodePudding user response:

The find function catches the IAmLookingForAword in IAmLookingForAwordU and the if statement is set to true. However, I would like to only catch the exact match of the keyWord I am looking for.

Any way to do this with C++ strings?

You could define a helper function for this:

#include <string>
#include <cctype>
// ...

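bool has_word(std::string const& s, std::string const& key_word) {
    auto const found_at = s.find(key_word);
    if (found_at == std::string::npos)
        return false;
    auto const end = found_at + key_word.size();  // one past the match
    auto const is_alpha = [&](std::size_t i) {
        return std::isalpha(static_cast<unsigned char>(s[i])) != 0;
    };
    // The match must not be preceded or followed by a letter.
    return (found_at == 0 || !is_alpha(found_at - 1))
        && (end == s.size() || !is_alpha(end));
}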

Then use it like so:

if (has_word(s1, keyWord))
    std::cout << "Found " << keyWord << std::endl;