tweets = [
    "Wow, what a great day today!! #sunshine",
    "I feel sad about the things going on around us. #covid19",
    "I'm really excited to learn Python with @JovianML #zerotopandas",
    "This is a really nice song. #linkinpark",
    "The python programming language is useful for data science",
    "Why do bad things happen to me?",
    "Apple announces the release of the new iPhone 12. Fans are excited.",
    "Spent my day with family!! #happy",
    "Check out my blog post on common string operations in Python. #zerotopandas",
    "Freecodecamp has great coding tutorials. #skillup"
]
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
Question: Determine the number of tweets in the dataset that can be classified as happy.
My code:
number_of_happy_tweets = 0
for i in tweets:
    for x in i:
        if x in happy_words:
            number_of_happy_tweets = len(x)
Why is this code not working?
CodePudding user response:
You are iterating over the individual characters of each tweet and checking whether each character is in happy_words, so nothing ever matches.
What you need to do is this:
for tweet in tweets:
    number_of_happy_tweets += any(word in tweet for word in happy_words)
This increases number_of_happy_tweets by one whenever any of the happy words is found in the tweet, because any() returns True, which counts as 1 when added.
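The same idea can also be written as a single expression. This is only a sketch, using the tweets and happy_words lists from the question; it relies on the fact that Python treats True as 1 and False as 0 when summing.

```python
tweets = [
    "Wow, what a great day today!! #sunshine",
    "I feel sad about the things going on around us. #covid19",
    "I'm really excited to learn Python with @JovianML #zerotopandas",
    "This is a really nice song. #linkinpark",
    "The python programming language is useful for data science",
    "Why do bad things happen to me?",
    "Apple announces the release of the new iPhone 12. Fans are excited.",
    "Spent my day with family!! #happy",
    "Check out my blog post on common string operations in Python. #zerotopandas",
    "Freecodecamp has great coding tutorials. #skillup",
]
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']

# any() yields one True/False per tweet; sum() adds them up as 1/0,
# so each tweet is counted at most once.
number_of_happy_tweets = sum(
    any(word in tweet for word in happy_words) for tweet in tweets
)
print(number_of_happy_tweets)  # 6
```

Note that `word in tweet` is a case-sensitive substring test, so a tweet containing "Great" would not be counted unless the tweet is lowercased first.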
CodePudding user response:
Hi, in the second loop you are iterating over the characters of each tweet, not its words. To iterate over words, use split() as below. Also note that number_of_happy_tweets should increase by one each time, not be set to the length of the word:
for i in tweets:
    for x in i.split():
        if x in happy_words:
            number_of_happy_tweets += 1
But notice that if one tweet contains two or more happy words, this code counts it two or more times, and a happy word that is attached to other symbols, such as # in #happy, is not counted at all. So I suggest using the following code instead:
for tweet in tweets:
    if any(happy_word in tweet for happy_word in happy_words):
        number_of_happy_tweets += 1
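To make the difference concrete, here is a small sketch comparing the two approaches on three tweets taken from the question. The function names are made up for illustration, and the third function (stripping punctuation from each token before comparing) is an assumed middle ground rather than something proposed above.

```python
import string

happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
sample = [
    "Wow, what a great day today!! #sunshine",
    "Apple announces the release of the new iPhone 12. Fans are excited.",
    "Spent my day with family!! #happy",
]

def count_exact_tokens(tweets):
    # split() keeps punctuation attached, so "excited." and "#happy" do not match.
    return sum(any(tok in happy_words for tok in t.split()) for t in tweets)

def count_substring(tweets):
    # Substring test: finds "happy" inside "#happy" and "excited" inside "excited."
    return sum(any(w in t for w in happy_words) for t in tweets)

def count_clean_tokens(tweets):
    # Assumed variant: strip surrounding punctuation and lowercase each token.
    return sum(
        any(tok.strip(string.punctuation).lower() in happy_words for tok in t.split())
        for t in tweets
    )

print(count_exact_tokens(sample))  # 1
print(count_substring(sample))     # 3
print(count_clean_tokens(sample))  # 3
```

The substring version is the simplest, but it would also match a happy word embedded in a longer word; the token-based variant avoids that at the cost of a little more code.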
CodePudding user response:
The problem is in the second loop of your code.
The first line, for i in tweets, is fine.
But in the second line you iterate over i itself, which is a string, so you get one character at a time. Try printing x:
for i in tweets:
    for x in i:
        print(x)
W
o
w
,
w
h
a
t
a
.
.
.
Here you get the individual letters of each tweet.
After that, you check whether each letter is in your happy_words list, which never matches because the list contains whole words.
Try this code instead:
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
number_of_happy_tweets = 0
for i in tweets:
    for x in happy_words:
        if x in i:
            number_of_happy_tweets += 1
            break  # count each tweet only once
CodePudding user response:
Form a regex alternation of the terms and then use a list comprehension along with re.search:
import re
happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
regex = r'(?:' + r'|'.join(happy_words) + r')'
num_tweets = len([x for x in tweets if re.search(regex, x)])
print(num_tweets)  # 6
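A plain alternation matches a happy word anywhere, including inside a longer word. If that is not wanted, one possible refinement (an assumption on my part, not required by the question) is to add \b word boundaries, escape the terms with re.escape, and ignore case. The third sample tweet below is hypothetical and only there to show the difference.

```python
import re

happy_words = ['great', 'excited', 'happy', 'nice', 'wonderful', 'amazing', 'good', 'best']
sample = [
    "Spent my day with family!! #happy",
    "Apple announces the release of the new iPhone 12. Fans are excited.",
    "I am unhappy with this update",  # hypothetical tweet, not in the dataset
]

alternation = '|'.join(re.escape(word) for word in happy_words)

# Plain alternation: also matches "happy" inside "unhappy".
plain = re.compile(r'(?:' + alternation + r')')
# Word-bounded, case-insensitive alternation: matches whole words only.
bounded = re.compile(r'\b(?:' + alternation + r')\b', re.IGNORECASE)

plain_count = sum(1 for t in sample if plain.search(t))
bounded_count = sum(1 for t in sample if bounded.search(t))
print(plain_count)    # 3
print(bounded_count)  # 2
```

The \b boundary still matches "#happy" and "excited." because # and . are not word characters.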