Getting text in span

Time:08-12

HTML code:

<span data-testid="post_timestamp" data-click-id="timestamp" style="color: rgb(129, 131, 132);">8 months ago</span>

I want to extract "8 months ago". The code I am using does not return any result.

data.find_all('span', attrs={'data-testid': True,'data-click-id' : True,'color':True})

CodePudding user response:

There are different approaches. Select the element by an exact attribute value:

soup.select_one('[data-testid="post_timestamp"]').text

by the text it contains ("ago"), if that text is always present:

soup.select_one('span:-soup-contains("ago")').text

or by the presence of "color" in its style attribute:

soup.select_one('span[style*="color"]').text

Example

from bs4 import BeautifulSoup
html='''
<span  data-testid="post_timestamp" data-click-id="timestamp" style="color: rgb(129, 131, 132);">8 months ago</span>
'''
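# Name the parser explicitly; omitting it makes BeautifulSoup guess and emit a warning
soup = BeautifulSoup(html, 'html.parser')

# Each selector below targets the same <span>
print(soup.select_one('[data-testid="post_timestamp"]').text)
#print(soup.select_one('span:-soup-contains("ago")').text)
#print(soup.select_one('span[style*="color"]').text)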

CodePudding user response:

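Currently, you are only asking for an element that has these attributes, whatever their values, which might return more elements than you'd like. Note also that color is not an attribute of the span; it is part of the value of the style attribute, so 'color': True matches nothing. Instead, you can ask for an element whose attribute has a specific value.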

The data-testid="post_timestamp" attribute could come in handy to identify the span, as it seems that it identifies the timestamp of the post.

data.find_all('span', attrs={'data-testid': 'post_timestamp'})

Now you are telling BeautifulSoup to return every <span> with data-testid="post_timestamp", which should match the element you want.
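To make the difference between matching on attribute presence and matching on attribute value concrete, here is a minimal sketch. It reuses the span from the question and assumes the built-in html.parser; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = ('<span data-testid="post_timestamp" data-click-id="timestamp" '
        'style="color: rgb(129, 131, 132);">8 months ago</span>')
data = BeautifulSoup(html, "html.parser")

# 'color': True requires an attribute literally named "color".
# The span only has a "style" attribute, so nothing matches.
with_color = data.find_all(
    "span", attrs={"data-testid": True, "data-click-id": True, "color": True}
)

# Presence-only match on attributes that actually exist.
by_presence = data.find_all(
    "span", attrs={"data-testid": True, "data-click-id": True}
)

# Exact-value match, as suggested in this answer.
by_value = data.find_all("span", attrs={"data-testid": "post_timestamp"})

print(len(with_color), len(by_presence), len(by_value))  # 0 1 1
```

On a real page the presence-only query could return many spans, which is why the exact-value query is usually the safer choice.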

If you want a single element rather than a list, use data.find instead of data.find_all. data.find returns only the first element that data.find_all would return, or None if nothing matches.
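A short sketch of how the two calls relate when several elements match. The markup with two timestamps is hypothetical, made up for illustration, and html.parser is assumed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing two timestamp spans
html = '''
<span data-testid="post_timestamp">8 months ago</span>
<span data-testid="post_timestamp">2 days ago</span>
'''
data = BeautifulSoup(html, "html.parser")

# find_all returns a list-like result with every match, in document order
matches = data.find_all("span", attrs={"data-testid": "post_timestamp"})

# find returns just the first match
first = data.find("span", attrs={"data-testid": "post_timestamp"})

print([m.text for m in matches])
print(first.text == matches[0].text)
```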

You can get the text of the element through its text property.

timestamp_element = data.find("span", {"data-testid": "post_timestamp"})
text = timestamp_element.text
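If the span might be missing from some pages, data.find returns None and accessing .text on it raises an AttributeError. A minimal sketch of a guard follows; extract_timestamp is a hypothetical helper name, not part of BeautifulSoup, and html.parser is assumed.

```python
from bs4 import BeautifulSoup

def extract_timestamp(html):
    """Hypothetical helper: return the timestamp text, or None if absent."""
    data = BeautifulSoup(html, "html.parser")
    element = data.find("span", {"data-testid": "post_timestamp"})
    if element is None:
        # No matching span on this page
        return None
    # get_text(strip=True) also trims surrounding whitespace
    return element.get_text(strip=True)

found = extract_timestamp('<span data-testid="post_timestamp"> 8 months ago </span>')
absent = extract_timestamp('<div>no timestamp here</div>')

print(found)   # 8 months ago
print(absent)  # None
```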