I have to remove some characters from a string, but I am running into a problem. I found this piece of code online, but it does not work quite right: it removes the characters, but it also removes the whitespace.
string messaggio = "{Questo e' un messaggio} ";
char chars[] = {'Ì', '\x1', '\"', '{', '}', ':'};
for (unsigned int i = 0; i < strlen(chars); i++)
{
    messaggio.erase(remove(messaggio.begin(), messaggio.end(), chars[i]), messaggio.end());
}
Can someone explain how this code works and why it also removes the whitespace?
CodePudding user response:
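Because you use strlen on your chars array. This function stops only when it encounters a \0, and you inserted none. So you are reading memory past the end of your array, which is undefined behavior: it may crash with a segfault, or it may silently pick up whatever bytes happen to follow the array (a space, for instance) and remove those from the string too.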
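Also, note that std::remove on its own only moves the kept characters to the front and returns the new logical end; the erase call is still needed to actually shrink the string.
A correction could be:
char chars[] = {'I', '\x1', '\"', '{', '}', ':'};
for (unsigned int i = 0; i < sizeof(chars); i++)
{
    messaggio.erase(std::remove(messaggio.begin(), messaggio.end(), chars[i]), messaggio.end());
}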
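One caveat worth checking here is what std::remove does when it is called without erase. The sketch below is only an illustration (the helper names lengthAfterRemoveOnly and removeAndErase are made up): std::remove shifts the kept characters forward and returns an iterator to the new logical end, but the string keeps its original length until erase is called.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: calls std::remove alone and reports the resulting length.
std::size_t lengthAfterRemoveOnly(std::string text, char unwanted)
{
    // std::remove cannot resize the container; the returned iterator is
    // deliberately discarded here to show that size() stays the same.
    (void)std::remove(text.begin(), text.end(), unwanted);
    return text.size();
}

// Hypothetical helper: the full erase-remove idiom.
std::string removeAndErase(std::string text, char unwanted)
{
    text.erase(std::remove(text.begin(), text.end(), unwanted), text.end());
    return text;
}
```

With these helpers, lengthAfterRemoveOnly("{abc}", '{') still reports the original length of 5, while removeAndErase("{a b}", '{') returns the shortened string and leaves the space intact.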
CodePudding user response:
Wisblade's answer is more or less correct; it just lacks some technical details.
As mentioned, strlen searches for the terminating character '\0'.
Since chars does not contain such a character, this code invokes undefined behavior (it reads past the end of the array).
Undefined behavior means anything can happen: the code may work, may crash, or may give invalid results.
So the first step is to drop strlen and use a different way to get the size of the array.
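One possible way to do that, sketched under the assumption that only plain ASCII characters need removing (the function name stripChars is made up for this example): a range-based for loop walks the array directly, so no length has to be computed at all.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: strips a fixed set of ASCII characters from text.
std::string stripChars(std::string text)
{
    // ASCII-only subset of the original list, so the example does not
    // depend on the source file encoding.
    const char unwanted[] = {'\x1', '"', '{', '}', ':'};

    // The range-based for knows the array bounds; no strlen or sizeof needed.
    for (char c : unwanted) {
        text.erase(std::remove(text.begin(), text.end(), c), text.end());
    }
    return text;
}
```

Called on the string from the question, stripChars("{Questo e' un messaggio} ") drops the braces and keeps every space.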
There is also another problem. Your code uses a non-ASCII character: 'Ì'.
I assume that you are using Windows and Visual Studio. By default, the MSVC compiler assumes that the source file is encoded using your system locale and uses the same locale for the generated executable. Windows by default uses a single-byte encoding specific to your language (to be compatible with very old software). Only in such a case does your code have a chance to work. On platforms or configurations with a multibyte encoding, like UTF-8, this code can't work even after Wisblade's fixes.
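To illustrate the multibyte problem, here is a small sketch that assumes UTF-8, where 'Ì' (U+00CC) is encoded as the two bytes 0xC3 0x8C. No single char can match it, so one option is to remove the whole byte sequence as a substring. The function name removeSequence is hypothetical, and this is only one of several ways to approach it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: removes every occurrence of a byte sequence from text.
std::string removeSequence(std::string text, const std::string& sequence)
{
    if (sequence.empty()) {
        return text; // nothing to remove; also avoids an endless loop
    }
    std::size_t pos = 0;
    // After each erase, keep searching from the same position.
    while ((pos = text.find(sequence, pos)) != std::string::npos) {
        text.erase(pos, sequence.size());
    }
    return text;
}
```

For example, removeSequence(message, "\xC3\x8C") would strip every UTF-8 encoded 'Ì' from message. Writing the bytes as escapes keeps the example independent of how the source file itself is encoded.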
Wisblade's fix can take this form (note that I changed the order of the loops; iterating over the characters to remove is now the inner loop):
#include <algorithm>
#include <iterator>
#include <string>

// Returns true if ch is one of the characters that should be stripped.
bool isCharToRemove(char ch)
{
    constexpr char skipChars[] = {'Ì', '\x1', '\"', '{', '}', ':'};
    return std::find(std::begin(skipChars), std::end(skipChars), ch) != std::end(skipChars);
}

std::string removeMagicChars(std::string message)
{
    message.erase(
        std::remove_if(message.begin(), message.end(), isCharToRemove),
        message.end());
    return message;
}
Let me know if you need a solution that can handle more complex text encodings.