How can I remove specific words from a string - Ruby

Time:10-29

I have the following string, from which I want to extract any 'words' which do not contain numbers or special characters. For now, commas, question marks or full stops are accepted:

b? Dl )B 4(V! A. MK, YtG ](f 1m )CNxuNUR {PG?

Desired output:

b? Dl A. MK, YtG

5

Current output:

b? Dl A. MK, YtG 1m

6

At the moment, the function below successfully removes words made up entirely of digits, but words which include both numbers and letters are not omitted. This is why '1m' is included in my current output.

Current function:

def howMany(sentence)

    if sentence.is_a? String
        
        output = sentence.split
        count = 0

        test_output = []

        output.each {|word| 

            if word !~ /\D/ || word =~ /[!@#$%^&*()_+{}\[\]:;'"\/\\><]/
                count
            else
                test_output.push(word)
                count += 1
            end

        }   

        puts test_output 
        puts count 
    
    else
        puts "Please enter a valid string" 
    end

end 

My assumption is I'll have to somehow iterate through each word in the string in order to find whether it includes numbers, however, I'm not sure how to go about that specific solution. I thought about using .split("") inside my output.each function but was unsuccessful after a few attempts.

Any suggestions would be hugely appreciated.

Thanks in advance!

CodePudding user response:

I would suggest trying something like this.

Turn the sentence into an array using split: sentence.split(' '). Then keep only the words that match the pattern using filter. Finally, use the filtered list for both puts operations. It should look something like this:

def how_many(sentence)
  sentence.split(' ').filter { |word| matches_pattern?(word) }.tap do |words|
    puts words.size
    puts words # or words.join(' ')
  end
end

def matches_pattern?(word)
  word.match?(/some_regular_expression/)
end

You can of course modify this to handle any edge cases, etc. This would be a more idiomatic solution.
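As one possible way to fill in the placeholder pattern, here is a minimal sketch. It assumes a word is acceptable when it consists only of ASCII letters, full stops, commas and question marks; that regex is my assumption based on the question, not something the answer above specifies.

```ruby
def matches_pattern?(word)
  # Assumed pattern: letters plus . , ? only, anchored to the whole word,
  # so a word like "1m" is rejected because of the digit.
  word.match?(/\A[a-z.,?]+\z/i)
end

def how_many(sentence)
  # Array#filter is an alias of Array#select (Ruby 2.6+).
  sentence.split.filter { |word| matches_pattern?(word) }.tap do |words|
    puts words.size
    puts words.join(' ')
  end
end

how_many("b? Dl )B 4(V! A. MK, YtG ](f 1m )CNxuNUR {PG?")
# prints 5, then: b? Dl A. MK, YtG
```

Because of tap, how_many also returns the filtered array, so the caller can reuse it instead of only printing it.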

Note that you can also use .filter(&method(:matches_pattern?)) but that might be confusing to some.
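For anyone unfamiliar with that form, here is a small sketch. method(:matches_pattern?) returns a Method object, and the & converts it into the block that filter expects. The pattern inside matches_pattern? and the helper name allowed_words are assumed stand-ins for illustration.

```ruby
def matches_pattern?(word)
  # Assumed stand-in pattern: only letters, full stops, commas, question marks.
  word.match?(/\A[a-z.,?]+\z/i)
end

# Hypothetical helper that passes the method by reference instead of a block.
def allowed_words(sentence)
  sentence.split.filter(&method(:matches_pattern?))
end

p allowed_words("b? Dl )B 1m YtG")
# prints ["b?", "Dl", "YtG"]
```

The block form, filter { |word| matches_pattern?(word) }, gives the same result and is probably easier to read.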

Edit: rubular.com is a good place to try your regexps.

Edit: when things get hard, try breaking them into smaller chunks (i.e. try not to make methods longer than 5 lines).

CodePudding user response:

This is a job for String#scan using a regular expression.

str = "b? Dl )B 4(V! A. MK, YtG ](f 1m )CNxuNUR {PG?"

r = /(?<!\S)[a-z.,\?\r\n]+(?!\S)/i

str.scan(r)
  #=> ["b?", "Dl", "A.", "MK,", "YtG"]
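To get both pieces of output the question asks for (the words and the count), the scan result can be joined and counted. This is a minimal sketch, assuming the character class is followed by a + quantifier; the helper name scan_words is hypothetical, and I have left \r\n out of the character class for simplicity.

```ruby
# Hypothetical helper: returns the words made only of letters, ".", "," and "?".
def scan_words(str)
  # (?<!\S) and (?!\S) require whitespace or a string boundary on each side,
  # so a fragment such as the "m" in "1m" cannot match on its own.
  str.scan(/(?<!\S)[a-z.,?]+(?!\S)/i)
end

words = scan_words("b? Dl )B 4(V! A. MK, YtG ](f 1m )CNxuNUR {PG?")
puts words.join(' ') # b? Dl A. MK, YtG
puts words.size      # 5
```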