I want to extract the front (first) part of a string, up to the first occurrence of certain symbols. For example, from "ABC, ZXC" and "AB.QWE,CV" I want to get "ABC" and "AB". Also, if the sentence contains Chinese characters, like "1月1日,天气晴", how do I get the front part (1月1日)?
This is easy to do in Python:
import re
# If no symbol is found, return the whole content.
front_part = re.findall(r'(.*?)[.,]', content)[0] if re.findall(r'[.,]', content) else content
However, I tried the following code in C++, but it still returns the whole content:
#include <regex>
#include <string>
#include <iostream>
int main()
{
    std::string content = "AB.AC";
    std::string front_part;
    std::smatch frt_pt_sm;
    std::regex frt_pt_patt(".*[.,]");
    if (std::regex_match(content, frt_pt_sm, frt_pt_patt))
    {
        for (unsigned i = 0; i < frt_pt_sm.size(); ++i)
        {
            std::cout << frt_pt_sm[i] << std::endl;
        }
        front_part = frt_pt_sm[0];
    }
    return 0;
}
I am a novice in C++, so any suggestion is helpful!
CodePudding user response:
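C++ regexes do support the lazy (.*?) that you're using in Python (in the default ECMAScript grammar), but std::regex_match only succeeds when the pattern matches the entire string, so it is the wrong call here.
With std::regex_search you'll want to use something like: [^.,]+
to match the part up to (but not including) the first . or ,.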
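As a rough sketch of the character-class idea, the helper below uses std::regex_search with an anchored variant of the pattern, ^[^.,]*, so that a string starting with a delimiter yields an empty front part instead of a later token. The name front_by_regex is made up for illustration and is not from the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): returns the text before the
// first '.' or ','; returns the whole string when neither occurs.
std::string front_by_regex(const std::string& content)
{
    // '^' anchors at the start; [^.,]* consumes everything up to the first delimiter.
    static const std::regex front_patt("^[^.,]*");
    std::smatch match;
    if (std::regex_search(content, match, front_patt))
    {
        return match[0].str();
    }
    // Defensive fallback; the pattern above can also match an empty prefix.
    return content;
}
```

Calling front_by_regex("AB.QWE,CV") should give "AB", and a string without any delimiter comes back unchanged.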
On the other hand, given how simple of a pattern you're looking for, you could easily forego using regexes altogether:
std::string input = "AB.QWE,CV";
auto pos = input.find_first_of(".,");
auto front = input.substr(0, pos);
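The same idea can be wrapped in a small function, sketched here under two assumptions: the string is UTF-8 encoded and the delimiters are the ASCII characters . and , as in the question. In UTF-8, the bytes of a multi-byte character never equal an ASCII byte, so the Chinese example is handled without special treatment. A full-width comma would need a different search, since find_first_of compares single bytes. The name front_part_of is hypothetical.

```cpp
#include <string>

// Hypothetical helper (not from the thread): text before the first '.' or ','.
std::string front_part_of(const std::string& input)
{
    // find_first_of returns std::string::npos when no delimiter is present,
    // and substr(0, npos) then yields the whole string.
    const auto pos = input.find_first_of(".,");
    return input.substr(0, pos);
}
```

With a UTF-8 source file, front_part_of("1月1日,天气晴") should return "1月1日", and an input without delimiters is returned whole, matching the fallback in the Python version.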
CodePudding user response:
Note that the first element of std::smatch is the whole match. If you just want the first part of the string before . or , you can use the first capture group, like this:
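#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> contents {"AB.AC", "AB.QWE,CV", "ABC, ZXC"};
    std::string front_part;
    std::smatch frt_pt_sm;
    // Group 1 is the leading word. The unescaped '.' in (.|,) matches any
    // character, which is why groups 2 and 3 capture single letters below.
    std::regex frt_pt_patt(R"((\w+)(.|,)+(\s*\w+))");
    for (const auto& content : contents) {
        std::cout << "content: " << content << std::endl;
        if (std::regex_match(content, frt_pt_sm, frt_pt_patt))
        {
            // frt_pt_sm[0] is the whole match; capture groups start at index 1.
            for (unsigned i = 0; i < frt_pt_sm.size(); ++i)
            {
                std::cout << frt_pt_sm[i] << std::endl;
            }
            front_part = frt_pt_sm[1];
            std::cout << "front part: " << front_part << std::endl;
        }
    }
    return 0;
}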
The result is as follows:
content: AB.AC
AB.AC
AB
A
C
front part: AB
content: AB.QWE,CV
AB.QWE,CV
AB
C
V
front part: AB
content: ABC, ZXC
ABC, ZXC
ABC
X
C
front part: ABC
More elegantly, you can use a regex iterator to split the string at the delimiters . and , like this:
#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> contents {"AB.AC", "AB.QWE,CV", "ABC, ZXC"};
    // Each match is either a single delimiter or a run of non-delimiter characters.
    std::regex frt_pt_patt("([.,]|[^.,]+)");
    for (auto content : contents) {
        std::cout << "content: " << content << std::endl;
        std::regex_iterator<std::string::iterator> rit(content.begin(), content.end(), frt_pt_patt);
        std::regex_iterator<std::string::iterator> rend;
        while (rit != rend) {
            std::cout << rit->str() << std::endl;
            ++rit;
        }
    }
    return 0;
}
Result.
content: AB.AC
AB
.
AC
content: AB.QWE,CV
AB
.
QWE
,
CV
content: ABC, ZXC
ABC
,
ZXC
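If only the pieces between the delimiters are needed, a possible variation is std::sregex_token_iterator with submatch index -1, which yields the text between matches rather than the matches themselves. This is a sketch, not part of the answer above, and split_on_delims is a hypothetical name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits on '.' or ',' and
// returns only the pieces between delimiters.
std::vector<std::string> split_on_delims(const std::string& content)
{
    static const std::regex delim_patt("[.,]");
    // Submatch index -1 selects the text between matches instead of the matches.
    std::sregex_token_iterator first(content.begin(), content.end(), delim_patt, -1);
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}
```

The front part is then the first element of the returned vector. Whitespace is not trimmed, so the second piece of "ABC, ZXC" keeps its leading space.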