I have a string in my program that contains certain values for parameters. I need to extract the values from the parameters using regex.
The regex looks like this:
std::smatch param;
std::string str = "--name=AName --age=AnAge --gender=AGender";
if (std::regex_match(str, param, std::regex(".*--name=(\\w+) .*--age=(\\d+) .*--gender=(\\w+) .*")))
{
// If the string matches this exact order, execution reaches here and the values are stored in param[1] through param[3].
}
The problem is that the parameters can appear in any order, for example:
std::string str = "--gender=AGender --name=AName --age=AnAge";
std::string str = "--age=AnAge --gender=AGender --name=AName";
std::string str = "--name=AName --gender=AGender --age=AnAge ";
Is there a way to write a single regex that captures the values regardless of their order, instead of using one regex per parameter? If so, how can I access each value? In Python it is possible to name a group with an <id> and later access it by that identifier. In my example I use a std::smatch variable, but access to it depends on the order of the parameters in the string, and I cannot rely on that.
Answer:
Use this regex:
"^(?=.*--name=(\\w+))(?=.*--age=(\\d+))(?=.*--gender=(\\w+)).+"
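Capture groups are numbered by their position in the pattern, not in the input, so param[1] is always the name, param[2] the age, and param[3] the gender, whatever order the parameters arrive in.
The drawbacks are that the match fails if any of the three parameters is missing, and that \d+ only matches a numeric age.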
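A minimal sketch of how a lookahead regex like this could be used with std::regex, assuming C++17 for std::optional. The Params struct and the parseParams function are hypothetical names introduced here, and the age group uses \w+ instead of \d+ only so that the sample value AnAge matches.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical holder for the three values; not part of the original question.
struct Params {
    std::string name;
    std::string age;
    std::string gender;
};

// Returns the captured values, or std::nullopt if any parameter is missing.
// Assumption: age uses \w+ because the sample value "AnAge" is not numeric.
std::optional<Params> parseParams(const std::string& str) {
    static const std::regex re(
        "^(?=.*--name=(\\w+))(?=.*--age=(\\w+))(?=.*--gender=(\\w+)).+");
    std::smatch m;
    if (!std::regex_match(str, m, re)) {
        return std::nullopt;
    }
    // Group numbers follow the pattern, not the order in the input.
    return Params{m[1].str(), m[2].str(), m[3].str()};
}
```

Calling parseParams on any of the three orderings from the question should give the same name, age, and gender. The result is empty when a parameter is missing.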
A regex-free way to solve this is to use std::string::find.
For example:
std::string str = "--name=AName --age=AnAge --gender=AGender";
size_t namePos = str.find("--name=");
size_t agePos = str.find("--age=");
size_t genderPos = str.find("--gender=");
std::string name = "";
std::string gender = "";
std::string age = "";
if(namePos != std::string::npos)
{
// Add 7 to namePos since the size of "--name=" is 7.
// Assuming that the delimiter of the name is whitespace so find the first
// whitespace after --name=
name = str.substr(namePos + 7, str.find_first_of(" \n\r", namePos + 7) - (namePos + 7));
}
if(agePos != std::string::npos)
{
// Add 6 to agePos since the size of "--age=" is 6.
// Assuming that the delimiter of the age is whitespace so find the first
// whitespace after --age=
age = str.substr(agePos + 6, str.find_first_of(" \n\r", agePos + 6) - (agePos + 6));
}
if(genderPos != std::string::npos)
{
// Add 9 to genderPos since the size of "--gender=" is 9.
// Assuming that the delimiter of the gender is whitespace so find the first
// whitespace after --gender=
gender = str.substr(genderPos + 9, str.find_first_of(" \n\r", genderPos + 9) - (genderPos + 9));
}
std::cout << name << " " << gender << " " << age << std::endl;
Output:
AName AGender AnAge
Answer:
There are better tools for parsing command lines, but if you really want to use regex, you will find that Boost.Regex makes this much easier than std::regex.
In particular, it supports named groups (see e.g. Boost Regular Expression: Getting the Named Group), which is the feature you ask for in your question title.
You can combine that with BOOST_REGEX_MATCH_EXTRA to keep all matches for all named groups (by default, only the last match for each capture group is accessible after the search).
Then you can write one big disjunction, ((?<group1>...)|(?<group2>...)|...), in your regex covering all the groups you may encounter, and you will be able to get all the values out regardless of their order.
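For comparison, here is a rough sketch of the same disjunction idea using only the standard library rather than Boost. std::regex has no named groups, so this variant captures the parameter name itself as group 1 and walks the matches with std::sregex_iterator. The function name collectParams and the std::map return type are illustrative choices, not something from the answers above.

```cpp
#include <map>
#include <regex>
#include <string>

// Collects every --key=value pair, in whatever order the pairs appear.
// The key set (name, age, gender) comes from the question; the function
// name and the map return type are illustrative assumptions.
std::map<std::string, std::string> collectParams(const std::string& str) {
    static const std::regex re("--(name|age|gender)=(\\w+)");
    std::map<std::string, std::string> out;
    for (std::sregex_iterator it(str.begin(), str.end(), re), end; it != end; ++it) {
        // Group 1 is the parameter name, group 2 its value.
        // A repeated key keeps its last value.
        out[(*it)[1].str()] = (*it)[2].str();
    }
    return out;
}
```

Looking up "name", "age", or "gender" in the returned map then gives the value by identifier, and a missing parameter is simply absent from the map instead of making the whole match fail.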