I am trying to grab some data from a URL. My code does not work; it raises an error.
import requests
from bs4 import BeautifulSoup as bs
headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0"}
r = requests.get('https://bscscan.com/tx/0x6cf01dda40e854b47117bb51c8d2148c6ae851e7d79d792867d1fa66d5ba6bad', headers = {'User-Agent':'Mozilla/5.0'})
soup = bs(r.content, 'lxml')
test = soup.find('div', class_='rawtab').get_text()
print (test)
Wanted Output:
Function: lockTokens(address _token, address _withdrawer, uint256 _amount, uint256 _unlockTimestamp) ***
MethodID: 0x7d533c1e
[0]: 000000000000000000000000109fddb8d49ae139449cb061883b685827b23937
[1]: 000000000000000000000000d824567e3821e99e90930a84d3e320a0fdb6f867
[2]: 00000000000000000000000000000000000000000000a768da96dc82245ed943
[3]: 00000000000000000000000000000000000000000000000000000000628d8187
Current Output:
AttributeError: 'NoneType' object has no attribute 'get_text'
CodePudding user response:
You're close to your goal, but you have to switch the attribute type.
<div data-target-group="inputDataGroup" id="rawtab">
The <div> does not have a class named rawtab; rawtab is its id. Select it by id and, more specifically, take the text of its textarea:
soup.find('div', id='rawtab').textarea.text
or with a CSS selector:
soup.select_one('div#rawtab textarea').text
Example
import requests
from bs4 import BeautifulSoup as bs
headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0"}
# Pass the headers dict defined above instead of a second inline one
r = requests.get('https://bscscan.com/tx/0x6cf01dda40e854b47117bb51c8d2148c6ae851e7d79d792867d1fa66d5ba6bad', headers=headers)
soup = bs(r.content, 'lxml')
# The element is identified by id="rawtab", not by a class
test = soup.select_one('div#rawtab textarea').text
print(test)
Output
Function: lockTokens(address _token, address _withdrawer, uint256 _amount, uint256 _unlockTimestamp) ***
MethodID: 0x7d533c1e
[0]: 000000000000000000000000109fddb8d49ae139449cb061883b685827b23937
[1]: 000000000000000000000000d824567e3821e99e90930a84d3e320a0fdb6f867
[2]: 00000000000000000000000000000000000000000000a768da96dc82245ed943
[3]: 00000000000000000000000000000000000000000000000000000000628d8187
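The AttributeError in the question happens because find() and select_one() return None when nothing matches. A minimal sketch of a guard follows; sample_html and extract_raw_input are hypothetical names used only for illustration, and the inline HTML is a stand-in for the downloaded page so the snippet runs without a network request. It uses the built-in 'html.parser' rather than 'lxml' only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for r.content, so the sketch runs offline.
sample_html = """
<div data-target-group="inputDataGroup" id="rawtab">
  <textarea>MethodID: 0x7d533c1e</textarea>
</div>
"""


def extract_raw_input(html):
    """Return the text of the textarea inside div#rawtab, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("div#rawtab textarea")
    if node is None:
        # Nothing matched: report that instead of raising AttributeError.
        return None
    return node.get_text().strip()


found = extract_raw_input(sample_html)
missing = extract_raw_input("<div class='rawtab'></div>")
print(found)
print(missing)
```

Checking for None before calling .text or .get_text() also helps when the site returns a different page than expected, for example a block or captcha page.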
Getting the items as a list:
textArea = soup.select_one('div#rawtab textarea').text.split('\r\n')
data = [n.split()[-1] for n in textArea if ']:' in n]
Output data (you can pick by index):
['000000000000000000000000109fddb8d49ae139449cb061883b685827b23937',
'000000000000000000000000d824567e3821e99e90930a84d3e320a0fdb6f867',
'00000000000000000000000000000000000000000000a768da96dc82245ed943',
'00000000000000000000000000000000000000000000000000000000628d8187']
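If you want values rather than raw hex strings, the list above can be interpreted using the parameter types from the Function line (address, address, uint256, uint256). The following is only a sketch under that assumption: raw_text is copied from the output shown earlier instead of being fetched, and parse_input_data, word_to_address and word_to_uint are hypothetical helper names. It uses splitlines() so it does not depend on the page using '\r\n' line endings.

```python
import re

# Text copied from the output above; in practice this would come from
# soup.select_one('div#rawtab textarea').text
raw_text = (
    "Function: lockTokens(address _token, address _withdrawer, "
    "uint256 _amount, uint256 _unlockTimestamp) ***\r\n"
    "MethodID: 0x7d533c1e\r\n"
    "[0]: 000000000000000000000000109fddb8d49ae139449cb061883b685827b23937\r\n"
    "[1]: 000000000000000000000000d824567e3821e99e90930a84d3e320a0fdb6f867\r\n"
    "[2]: 00000000000000000000000000000000000000000000a768da96dc82245ed943\r\n"
    "[3]: 00000000000000000000000000000000000000000000000000000000628d8187\r\n"
)

# Matches lines such as "[0]: 0000...3937" and captures the hex word.
WORD_RE = re.compile(r"^\[\d+\]:\s*([0-9a-fA-F]+)$")


def parse_input_data(text):
    """Return (method_id, words) parsed from the raw input-data text."""
    method_id = None
    words = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("MethodID:"):
            method_id = line.split(":", 1)[1].strip()
            continue
        match = WORD_RE.match(line)
        if match:
            words.append(match.group(1))
    return method_id, words


def word_to_address(word):
    # An address sits in the last 20 bytes (40 hex characters) of the word.
    return "0x" + word[-40:]


def word_to_uint(word):
    # An unsigned integer is the whole word read as base-16.
    return int(word, 16)


method_id, words = parse_input_data(raw_text)
token = word_to_address(words[0])
withdrawer = word_to_address(words[1])
amount = word_to_uint(words[2])
unlock_timestamp = word_to_uint(words[3])
print(method_id, token, withdrawer, amount, unlock_timestamp)
```

The regular expression skips the Function and MethodID lines, so only the indexed words end up in the list, just like the ']:' filter in the list comprehension above.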