How to erase a string if it contains any character in a set [c++]

Time:08-08

I am new to C++ and could not find a solution in any post for this.

I have a vector of strings and I wish to erase a string from this vector if it contains any of these symbols: {'.', '%', '&','(',')', '!', '-', '{', '}'}.

I am aware of find(), which only takes one character to search for; however, I want to go through each word in the string vector and erase them if they contain any of these characters. E.g. find('.') does not suffice.

I have tried multiple routes such as creating a char vector of all these characters and looping through each one as a find() parameter. However, this logic is very flawed, as it will cause an abort trap if the vector only has one line with a '.' in it, or leaves some strings with the unwanted character inside.

vector<std::string> lines = {"hello..","Hello...", "hi%", "world","World!"};
vector<char> c = {'.', '%', '&','(',')', '!', '-', '{', '}'};

    for (int i=0; i < lines.size(); i++){
        for(int j=0; j < c.size(); j++){
            if (lines.at(i).find(c.at(j)) != string::npos ){
                lines.erase(lines.begin() + i);
            }
        }
    }

I have also tried find_first_of() inside a loop over the vector 'lines', which yields the same result as the code above.

if (lines.at(i).find_first_of(".%&()!-{}") != string::npos ){
                lines.erase(lines.begin() + i);

Can someone please help me with this logic?

EDIT:

When I add --i after erasing the line, nothing is displayed and I get an abort trap because the loop runs outside the vector's range.

CodePudding user response:

There are two issues in your code, both 'inside' the inner for loop when a match is found.

First, you keep checking the same vector element for a (further) match, even after you erase it; to fix this, add a break; statement inside the if block, to prevent further runs of that inner loop after a match has been found and the erase() call has been made.

Second, when you do erase an element, you need to decrement the i index (which will be incremented before the start of the next outer loop), so that you aren't skipping the check for the element that i will index after the erasure.

Here's a fixed version of your code:

#include <iostream>
#include <vector>
#include <string>

int main()
{
    std::vector<std::string> lines = { "hello..", "Hello...", "hi%", "world", "World!" };
    std::vector<char> c = { '.', '%', '&','(',')', '!', '-', '{', '}' };

    for (size_t i = 0; i < lines.size(); i++) {
        for (size_t j = 0; j < c.size(); j++) {
            if (lines.at(i).find(c.at(j)) != std::string::npos) {
                lines.erase(lines.begin() + static_cast<ptrdiff_t>(i));
                i--; // Decrement the i index to avoid skipping next string
                break; // Need to break out of inner loop as "i" is now wrong!
            }
        }
    }

    for (auto l : lines) {
        std::cout << l << std::endl;
    }
    return 0;
}

However, as pointed out in other answers, you can improve your code significantly by making more use of the functions offered by the Standard Library.
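As a rough illustration of what that could look like, here is a minimal sketch that combines std::remove_if with std::string::find_first_of. The helper name erase_strings_containing and its parameters are made up for this example and do not appear in the question.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions for this sketch):
// removes every string that contains at least one character from `unwanted`.
void erase_strings_containing(std::vector<std::string>& lines,
                              const std::string& unwanted)
{
    // std::remove_if moves the strings we keep to the front and returns an
    // iterator to the start of the leftover tail.
    auto new_end = std::remove_if(
        lines.begin(), lines.end(),
        [&unwanted](const std::string& s) {
            // find_first_of returns npos when none of the characters occur.
            return s.find_first_of(unwanted) != std::string::npos;
        });
    // A single range erase then cuts off that tail.
    lines.erase(new_end, lines.end());
}
```

Calling erase_strings_containing(lines, ".%&()!-{}") on the vector from the question should leave only "world". Because no index is maintained by hand, there is nothing to decrement and no inner loop to break out of.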

CodePudding user response:

You have a bug where you increment the vector index after the vector has shrunk: after you remove the i:th element, the element that used to be at i + 1 is now at i, so you step over it.
If you removed the last element, you step outside the vector.
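If you would rather keep an index-based loop, one possible way around both problems is to walk the vector from back to front, since erasing element i then only shifts elements that have already been checked. The sketch below assumes a hypothetical helper named erase_marked_backwards; it is an illustration, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for this sketch): erases strings that
// contain any character from `unwanted`, visiting indices in reverse order.
void erase_marked_backwards(std::vector<std::string>& lines,
                            const std::string& unwanted)
{
    // `i-- > 0` tests i before decrementing, so the body sees
    // size()-1, ..., 1, 0 and the loop also handles an empty vector.
    for (std::size_t i = lines.size(); i-- > 0;) {
        if (lines[i].find_first_of(unwanted) != std::string::npos) {
            // Only elements after i move, and those were already visited.
            lines.erase(lines.begin() + static_cast<std::ptrdiff_t>(i));
        }
    }
}
```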

You can avoid this kind of issue by raising the abstraction level and making more use of <algorithm>.

Something like this:

#include <algorithm>
#include <iostream>
#include <set>
#include <string>
#include <vector>

const std::set<char> symbols = {'.', '%', '&','(',')', '!', '-', '{', '}'};

bool invalid(const std::string& s)
{
    return std::find_if(s.begin(),
                        s.end(),
                        // From C++20, use 'contains' instead of 'count'.
                        [](char c) { return symbols.count(c) != 0; })
        != s.end();
}

int main()
{
    std::vector<std::string> data = {"abc", "abc.", "def", "d&ef", "!ghi", "ghi"};
    auto end = std::remove_if(data.begin(), data.end(), invalid);
    data.erase(end, data.end());
    for (const auto& s: data)
    {
        std::cout << s << std::endl;
    }
}

CodePudding user response:

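You can use std::remove_if to solve this. Note that the snippet below strips the unwanted characters out of each string; it does not erase the strings themselves from the vector.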

std::vector<std::string> lines = {"hello..","Hello...", "hi%", "world","World!"};
std::vector<char> ign_c = {'.', '%', '&','(',')', '!', '-', '{', '}'};

std::transform(lines.begin(), lines.end(), lines.begin(), [&](std::string str){
    str.erase(std::remove_if(str.begin(), str.end(), 
        [&](char c){
            return std::find(ign_c.begin(), ign_c.end(), c) != ign_c.end();
        }), 
    str.end());
    return str;
});
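The snippet above edits each string in place. If the aim is instead to drop whole strings, as the question asks, the same erase-remove idiom can be applied to the outer vector. The following is a minimal sketch under that assumption; the function name erase_lines_with is invented for illustration, and it uses the std::find_first_of algorithm so that the character set can stay a std::vector<char>.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for this sketch): removes every string
// in `lines` that contains at least one of the characters in `ign_c`.
void erase_lines_with(std::vector<std::string>& lines,
                      const std::vector<char>& ign_c)
{
    lines.erase(
        std::remove_if(lines.begin(), lines.end(),
            [&ign_c](const std::string& str) {
                // std::find_first_of returns str.end() when no character
                // of str matches any element of ign_c.
                return std::find_first_of(str.begin(), str.end(),
                                          ign_c.begin(), ign_c.end())
                       != str.end();
            }),
        lines.end());
}
```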

CodePudding user response:

The logic you use to search the strings is not optimal, but it is correct. The problem is that you modify the container you are iterating over (lines) without adjusting your position in it. Instead, control the iterator yourself in the for loop and update it after each erase. There is also no need to continue the inner loop once you have erased an element:

std::vector<std::string> lines { "hello..","Hello...", "hi%", "world","World!" };
constexpr std::array chars { '.', '%', '&','(',')', '!', '-', '{', '}' };

for (auto it = lines.cbegin(); it != lines.cend();) {
    auto str = *it;
    auto found = false;
    for(auto &ch : chars) {
        if (str.find(ch) != std::string::npos) {
            // If element is found erase it and update the iterator
            it = lines.erase(it);
            found = true;
            // breaks inner loop
            break;
        }
    }
    // If nothing was erased, increment the iterator to point to the next element
    if (!found) {
        ++it;
    }
}
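The same iterator pattern can be shortened if the inner loop is replaced by std::string::find_first_of, which removes the need for the found flag. This is only a sketch; the function name erase_with_iterator and the choice to pass the characters as a std::string are assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (an assumption for this sketch): iterator-based loop
// that erases strings containing any character from `chars`.
void erase_with_iterator(std::vector<std::string>& lines,
                         const std::string& chars)
{
    for (auto it = lines.begin(); it != lines.end();) {
        if (it->find_first_of(chars) != std::string::npos) {
            // erase() returns an iterator to the element that followed the
            // removed one, so the loop must not advance again here.
            it = lines.erase(it);
        } else {
            ++it;
        }
    }
}
```

Each erase on a std::vector still shifts the following elements, so for large inputs the remove_if based approaches are likely to do less work overall.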

CodePudding user response:

If you use the right abstractions, you can get to code as readable as this:

auto result = lines | remove_if([](std::string s){
                                    return any_of(c, is_contained_in(s));
                                });

That's not far from "remove the string s if any of the items in c is contained in s", which is what you want to do.

For that thing to work without writing any special code yourself, you need a couple of libraries.

Here's the complete example:

#include <boost/hana/functional/curry.hpp>
#include <iostream>
#include <vector>
#include <string>
#include <range/v3/view/remove_if.hpp>
#include <range/v3/algorithm/contains.hpp>
#include <range/v3/algorithm/any_of.hpp>

std::vector<std::string> lines = {"hello..","Hello...", "hi%", "world","World!"};
std::vector<char> c = {'.', '%', '&','(',')', '!', '-', '{', '}'};

using namespace ranges;
using namespace ranges::views;
using namespace boost::hana;

auto is_contained_in = curry<2>(ranges::contains);

int main() {

    auto result = lines | remove_if([](std::string s){
        return any_of(c, is_contained_in(s));
    });

    for (auto i : result) {
        std::cout << i << std::endl;
    }
}

Notice that there's no low level logic anywhere in this code. I haven't written one single function. I've just plugged together well tested functions from existing libraries.
