I am scraping a list of values from 20 of the most recent social media posts. For each post, I am attempting to record the number of likes and comments associated with it.
My loop either skips every like value (except the first) for each post while successfully returning comments, or prints all twenty instances of likes and comments, but all the like values are duplicates of the first.
In cases where no like or comment value exists, I'd like the script to return 0, as attempted in the except clause.
Here is my latest attempt:
data = []
likes = driver.find_elements_by_css_selector(".v-align-middle.social-details-social-counts__reactions-count")
comments = driver.find_elements_by_css_selector(".social-details-social-counts__comments.social-details-social-counts__item")
counter = 1
for like in likes:
    for comment in comments:
        if counter <= 20:
            try:
                data.append({
                    "Post Likes": like.text,
                    "Post Comments": comment.text
                })
                counter = counter + 1
                time.sleep(2)
            except (ElementNotVisibleException, NoSuchElementException):
                data.append({
                    "Post Likes": 0,
                    "Post Comments": 0
                })
                pass
I am looking to produce an outcome like the following; however, the issue is that my script has duplicated the first post's like value:
[{'Post Likes': '435', 'Post Comments': '8 comments'},
{'Post Likes': '435', 'Post Comments': '1 comment'},
{'Post Likes': '435', 'Post Comments': '62 comments'},
{'Post Likes': '435', 'Post Comments': '2 comments'},
{'Post Likes': '435', 'Post Comments': '4 comments'},
{'Post Likes': '435', 'Post Comments': '6 comments'},
{'Post Likes': '435', 'Post Comments': '3 comments'},
{'Post Likes': '435', 'Post Comments': '45 comments'},
{'Post Likes': '435', 'Post Comments': '17 comments'},
{'Post Likes': '435', 'Post Comments': '30 comments'},
{'Post Likes': '435', 'Post Comments': '56 comments'},
{'Post Likes': '435', 'Post Comments': '31 comments'},
{'Post Likes': '435', 'Post Comments': '40 comments'},
{'Post Likes': '435', 'Post Comments': '74 comments'},
{'Post Likes': '435', 'Post Comments': '1 comment'},
{'Post Likes': '435', 'Post Comments': '29 comments'},
{'Post Likes': '435', 'Post Comments': '1 comment'},
{'Post Likes': '435', 'Post Comments': '37 comments'},
{'Post Likes': '435', 'Post Comments': '25 comments'},
{'Post Likes': '435', 'Post Comments': '3 comments'}]
If anyone can help point me in the right direction, I'd really appreciate it.
CodePudding user response:
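I believe you intend to iterate through the two lists simultaneously, which is where zip could help you. To limit the result to just 20 entries, a break, as discussed in the other answer, would be best. In your nested loop, like never advances to the second like until the inner loop has gone through every comment, so the first like value is paired with each comment in turn.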
data = []

def make_entry(like_text, comments_text):
    return {'Post Likes': like_text, 'Post Comments': comments_text}

for like, comment in zip(likes, comments):
    try:
        data.append(make_entry(like.text, comment.text))
        time.sleep(2)  # what is this for?
    except (ElementNotVisibleException, NoSuchElementException):
        data.append(make_entry('0', '0 comments'))
        # I assume these should count against the 20
    if len(data) >= 20:
        break
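To see the difference without a browser, here is a minimal sketch that uses plain strings in place of Selenium elements. The first like value and the comment strings come from the question's output; the other like values are made up, and nested_pairs and zipped_pairs are hypothetical helper names. It reproduces the duplicated first like from the nested loop and shows zip pairing the two lists position by position, with itertools.islice as one alternative to the explicit break for capping the result.

```python
from itertools import islice

# Stand-ins for the .text values of the scraped elements.
likes = ["435", "12", "98"]  # "12" and "98" are made-up values
comments = ["8 comments", "1 comment", "62 comments"]


def nested_pairs(likes, comments, limit):
    """Mimic the question's nested loop with a counter."""
    data = []
    counter = 1
    for like in likes:
        for comment in comments:
            if counter <= limit:
                data.append({"Post Likes": like, "Post Comments": comment})
                counter += 1
    return data


def zipped_pairs(likes, comments, limit):
    """Pair the i-th like with the i-th comment, keeping at most `limit` entries."""
    pairs = islice(zip(likes, comments), limit)
    return [{"Post Likes": l, "Post Comments": c} for l, c in pairs]


# Every entry carries the first like, as in the question's output.
print(nested_pairs(likes, comments, limit=3))
# Each like is paired with the comment at the same position.
print(zipped_pairs(likes, comments, limit=3))
```

Keep in mind that zip stops at the shorter list and pairs purely by position, so if some post has no like element at all, the two lists fall out of step. Looking up the like and comment elements inside each post's own container would avoid that, at the cost of a different selector strategy.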
CodePudding user response:
"number of likes and comments"
Do you really need a nested loop for this? If you have a list of each, then just get the length.
posts = []  # some list of more than 20 elements
if len(posts) > 20:
    posts = posts[:20]  # keep the first 20 (assumes newest first; use posts[-20:] otherwise)
data = [{'likes': len(p.likes), 'comments': len(p.comments)} for p in posts]
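If the scraped values are display strings such as '435' or '8 comments' rather than lists you can take the length of, a small helper can turn them into integers and fall back to 0 when nothing was found. This is only a sketch: parse_count is a hypothetical name, the sample rows are made up apart from the strings seen in the question, and it assumes the count is the first run of digits in the text, optionally with comma thousands separators.

```python
import re


def parse_count(text):
    """Return the first integer found in `text`, or 0 if there is none."""
    match = re.search(r"\d[\d,]*", text or "")
    if match is None:
        return 0
    return int(match.group().replace(",", ""))


# (like text, comment text) pairs; the last row simulates a post with no values.
rows = [("435", "8 comments"), ("1,204", "1 comment"), ("", None)]
data = [{"likes": parse_count(l), "comments": parse_count(c)} for l, c in rows]
print(data)
```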
Regarding the problem, if you have 20 or more comments on the first like element, then your counter is preventing you from adding any more data, regardless of the remaining likes or comments. Ideally, you'd use a break statement for that, not a counter and an if.
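The difference between the two approaches can be shown with a plain list. In this sketch (the function names and the sample data are made up for illustration), both versions collect the same entries, but the counter version keeps visiting every remaining item after the limit is reached, while the break version stops as soon as it has enough.

```python
def collect_with_counter(items, limit):
    """Counter plus if: the loop still walks the whole list."""
    data, visited = [], 0
    counter = 1
    for item in items:
        visited += 1
        if counter <= limit:
            data.append(item)
            counter += 1
    return data, visited


def collect_with_break(items, limit):
    """Break: leave the loop once `limit` entries are collected (assumes limit >= 1)."""
    data, visited = [], 0
    for item in items:
        visited += 1
        data.append(item)
        if len(data) >= limit:
            break
    return data, visited


items = list(range(100))
print(collect_with_counter(items, 20)[1])  # 100 items visited
print(collect_with_break(items, 20)[1])    # 20 items visited
```

Note that inside a nested loop a break only leaves the innermost loop, so this pattern is simplest when the likes and comments are walked in a single loop.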