Trouble figuring out how to delete hash from array based on conditions

I am trying to delete hashes from an array when particular keys in the hash contain certain words. The array is below:

BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked']

    data = [
       {
           "news_url": "https://www.benzinga.com/markets/cryptocurrency/21/10/23391043/north-vancouver-to-heat-buildings-with-bitcoin-mining",
           "image_url": "https://crypto.snapi.dev/images/v1/m/v/fw-69939.jpeg",
           "title": "North Vancouver To Heat Buildings With Bitcoin Mining",
           "text": "Canadian hack Bitcoin (CRYPTO: BTC) mining firm MintGreen has partnered with state-owned Lonsdale Energy Corporation (LEC) to heat 100 residential and commercial buildings in North Vancouver with recovered energy from crypto mining.",
           "source_name": "Benzinga",
           "date": "Fri, 15 Oct 2021 12:16:19 -0400",
           "topics": [
               "mining"
           ],
           "sentiment": "Neutral",
           "type": "Article",
           "tickers": [
               "BTC"
           ]
       },
       {
           "news_url": "https://u.today/ethereum-20-next-steps-to-mainnet-shared-by-ethereum-foundation",
           "image_url": "https://crypto.snapi.dev/images/v1/b/t/10169-69937.jpg",
           "title": "Ethereum 2.0 Next Steps to Mainnet Shared by Ethereum Foundation",
           "text": "Ethereum (ETH) developers have entered final phase of testing before hotly anticipated ETH1-ETH2 transition",
           "source_name": "UToday",
           "date": "Fri, 15 Oct 2021 12:11:00 -0400",
           "topics": [],
           "sentiment": "Neutral",
           "type": "Article",
           "tickers": [
               "ETH"
           ]
       }
    ]

I want to delete any hash whose text or title contains any word in the BANNED_WORDS array above.

I have tried the code below and some other variations, but none of them seem to work. I am new to Ruby; can someone please point out what I am doing wrong? Thanks.

data.select{|coin| coin[:text].split(" ").select{ |word| !BANNED_WORDS.include?(word) || coin[:title].split(" ").select{ |word| !BANNED_WORDS.include?(word)}}

So the result should be:

filtered_result = [
           {
               "news_url": "https://u.today/ethereum-20-next-steps-to-mainnet-shared-by-ethereum-foundation",
               "image_url": "https://crypto.snapi.dev/images/v1/b/t/10169-69937.jpg",
               "title": "Ethereum 2.0 Next Steps to Mainnet Shared by Ethereum Foundation",
               "text": "Ethereum (ETH) developers have entered final phase of testing before hotly anticipated ETH1-ETH2 transition",
               "source_name": "UToday",
               "date": "Fri, 15 Oct 2021 12:11:00 -0400",
               "topics": [],
               "sentiment": "Neutral",
               "type": "Article",
               "tickers": [
                   "ETH"
               ]
           }
        ]

CodePudding user response：

This is a job for a regular expression.

R = /\b(?:#{BANNED_WORDS.join('|')})\b/
  #=> /\b(?:Hacked|hack|fraud|hacked)\b/

data.reject { |h| h[:title].match?(R) || h[:text].match?(R) }
  #=> [{:news_url=>"https://u.today/ethereum-20-next-steps...,
  #     ...
  #     :tickers=>["ETH"]}]

See Regexp#match?.
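Here is a minimal, self-contained sketch of the same idea that can be pasted into irb. The `sample` array is a hypothetical, shortened stand-in for `data` (only the two keys that matter), and `BANNED_RE` is just my name for the same pattern as `R`, built with `Regexp.union` instead of `join` so that any regex metacharacters in a banned word would be escaped. The last part shows, as far as I can tell, why the attempt in the question keeps every hash.

```ruby
BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked'].freeze

# Same pattern as R, but Regexp.union escapes each word first
# (a swapped-in detail, not part of the answer's original code).
BANNED_RE = /\b(?:#{Regexp.union(BANNED_WORDS).source})\b/

# Hypothetical, shortened stand-in for `data`.
sample = [
  { title: 'North Vancouver To Heat Buildings With Bitcoin Mining',
    text: 'Canadian hack Bitcoin mining firm MintGreen has partnered with LEC' },
  { title: 'Ethereum 2.0 Next Steps to Mainnet Shared by Ethereum Foundation',
    text: 'Ethereum (ETH) developers have entered final phase of testing' }
]

# reject returns a new array; `sample` itself is left unchanged.
filtered = sample.reject do |h|
  h[:title].match?(BANNED_RE) || h[:text].match?(BANNED_RE)
end

# For comparison: a select block whose value is an array keeps everything,
# because every array, even an empty one, is truthy in Ruby.
kept_all = sample.select do |h|
  h[:text].split.select { |word| BANNED_WORDS.include?(word) }
end

puts filtered.map { |h| h[:title] }
puts kept_all.size
```

This prints only the Ethereum title, followed by 2. If you want to modify the array in place rather than get a new one, `reject!` takes the same block.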

\b in the regular expression matches a word boundary. The two \b's are there to prevent matches inside longer words such as 'hackintosh' and 'defraud'.
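To see the effect of the word boundaries, here is a small sketch comparing the pattern with and without them. The sample strings and the names `bounded`, `unbounded` and `bounded_i` are made up for illustration. The `/i` variant is an assumption on my part: it is only useful if matching should ignore case, which the question does not say.

```ruby
BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked'].freeze
alternation  = BANNED_WORDS.join('|')

bounded   = /\b(?:#{alternation})\b/   # the pattern used in the answer
unbounded = /(?:#{alternation})/       # same words, no word boundaries
bounded_i = /\b(?:#{alternation})\b/i  # assumed variant: ignore case

# Hypothetical sample strings.
samples = [
  'a hack was reported',
  'built a hackintosh',
  'tried to defraud investors',
  'Fraud alert'
]

samples.each do |s|
  puts "#{s.inspect}: bounded=#{bounded.match?(s)} " \
       "unbounded=#{unbounded.match?(s)} bounded_i=#{bounded_i.match?(s)}"
end
```

Only the first string matches `bounded`. The 'hackintosh' and 'defraud' strings match `unbounded` because the banned word appears inside a longer word. 'Fraud alert' matches only `bounded_i`, since the other two patterns are case-sensitive.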