I am trying to delete hashes from an array when particular keys in a hash contain certain words. The array is below:
BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked']
data = [
{
"news_url": "https://www.benzinga.com/markets/cryptocurrency/21/10/23391043/north-vancouver-to-heat-buildings-with-bitcoin-mining",
"image_url": "https://crypto.snapi.dev/images/v1/m/v/fw-69939.jpeg",
"title": "North Vancouver To Heat Buildings With Bitcoin Mining",
"text": "Canadian hack Bitcoin (CRYPTO: BTC) mining firm MintGreen has partnered with state-owned Lonsdale Energy Corporation (LEC) to heat 100 residential and commercial buildings in North Vancouver with recovered energy from crypto mining.",
"source_name": "Benzinga",
"date": "Fri, 15 Oct 2021 12:16:19 -0400",
"topics": [
"mining"
],
"sentiment": "Neutral",
"type": "Article",
"tickers": [
"BTC"
]
},
{
"news_url": "https://u.today/ethereum-20-next-steps-to-mainnet-shared-by-ethereum-foundation",
"image_url": "https://crypto.snapi.dev/images/v1/b/t/10169-69937.jpg",
"title": "Ethereum 2.0 Next Steps to Mainnet Shared by Ethereum Foundation",
"text": "Ethereum (ETH) developers have entered final phase of testing before hotly anticipated ETH1-ETH2 transition",
"source_name": "UToday",
"date": "Fri, 15 Oct 2021 12:11:00 -0400",
"topics": [],
"sentiment": "Neutral",
"type": "Article",
"tickers": [
"ETH"
]
}
]
I want to delete any hash whose text or title contains any word in the BANNED_WORDS array above.
I have tried the code below and some other variations, but none seem to work. I am new to Ruby; can someone please point out what I am doing wrong? Thanks.
data.select{|coin| coin[:text].split(" ").select{ |word| !BANNED_WORDS.include?(word) || coin[:title].split(" ").select{ |word| !BANNED_WORDS.include?(word)}}
So the result should be:
filtered_result = [
{
"news_url": "https://u.today/ethereum-20-next-steps-to-mainnet-shared-by-ethereum-foundation",
"image_url": "https://crypto.snapi.dev/images/v1/b/t/10169-69937.jpg",
"title": "Ethereum 2.0 Next Steps to Mainnet Shared by Ethereum Foundation",
"text": "Ethereum (ETH) developers have entered final phase of testing before hotly anticipated ETH1-ETH2 transition",
"source_name": "UToday",
"date": "Fri, 15 Oct 2021 12:11:00 -0400",
"topics": [],
"sentiment": "Neutral",
"type": "Article",
"tickers": [
"ETH"
]
}
]
CodePudding user response:
This is a job for a regular expression.
R = /\b(?:#{BANNED_WORDS.join('|')})\b/
#=> /\b(?:Hacked|hack|fraud|hacked)\b/
data.reject { |h| h[:title].match?(R) || h[:text].match?(R) }
#=> [{:news_url=>"https://u.today/ethereum-20-next-steps...,
# ...
# :tickers=>["ETH"]}]
See Regexp#match?.
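As for what goes wrong in the question's attempt: the block given to the outer select returns the result of an inner select, which is an Array, and in Ruby every Array (even an empty one) is truthy, so no hash is ever dropped. A minimal word-by-word sketch that returns a real boolean could look like the following; the helper name contains_banned? and the sample rows are hypothetical, made up for illustration.

```ruby
BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked']

# An empty Array is still truthy, so a block that returns one keeps every element.
always_kept = [1, 2, 3].select { |_n| [] }

# Hypothetical helper: true when any whitespace-separated word is banned.
def contains_banned?(str)
  str.split.any? { |word| BANNED_WORDS.include?(word) }
end

# Made-up rows with the same :title and :text keys as the question's data.
rows = [
  { title: "Mining news",  text: "Canadian hack Bitcoin mining firm" },
  { title: "Ethereum 2.0", text: "Developers enter final testing" }
]

clean = rows.reject { |h| contains_banned?(h[:title]) || contains_banned?(h[:text]) }
puts clean.map { |h| h[:title] }
```

Note that splitting on whitespace compares whole tokens, so a word with punctuation attached (for example "hack,") would not be caught by this sketch; the regular expression does not have that limitation.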
\b in the regular expression is a word boundary. The word boundaries are there to prevent matches of, say, 'hackintosh' and 'defraud'.
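To see the effect of the word boundaries, here is a small sketch. It makes two changes that are not part of the answer above: it builds the alternation with Regexp.union (which escapes any regex metacharacters in a banned word) and adds the /i flag so case variants need not be listed separately. The helper banned?, the constant BANNED_RE and the sample hashes are hypothetical.

```ruby
BANNED_WORDS = ['Hacked', 'hack', 'fraud', 'hacked']

# Regexp.union escapes each word; .source gives the pattern text to interpolate.
BANNED_RE = /\b(?:#{Regexp.union(BANNED_WORDS).source})\b/i

# Hypothetical helper: true if any of the given keys holds a banned word.
def banned?(hash, keys = [:title, :text])
  keys.any? { |k| hash[k].to_s.match?(BANNED_RE) }
end

# Made-up sample data for illustration only.
sample = [
  { title: "Exchange hacked overnight", text: "Funds moved." },
  { title: "Building a hackintosh",     text: "Lawyers defraud no one here." },
  { title: "Ethereum 2.0 next steps",   text: "Developers enter final testing." }
]

kept = sample.reject { |h| banned?(h) }
puts kept.map { |h| h[:title] }
```

Only the first hash is removed: 'hackintosh' and 'defraud' contain banned words as substrings, but not at word boundaries, so they do not match.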