'NoneType' object has no attribute 'find_all' with Beautifulsoup

Time:06-18

I tried this code that I found, but it gives me the error message AttributeError: 'NoneType' object has no attribute 'find_all'. I am not familiar with BeautifulSoup and don't know how to fix this. I tried to find a solution that ignores the tabpane part, but could not figure it out. Do you have any suggestions?

import datetime
import pandas as pd # pip install pandas
import requests # pip install requests
from bs4 import BeautifulSoup # pip install beautifulsoup4

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0',
}
url = 'https://www.marketwatch.com/tools/earningscalendar'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

tabpane = soup.find('div', 'tabpane')
earning_tables = tabpane.find_all('div', {'id': True})

dfs = {}
current_datetime = datetime.datetime.now().strftime('%m-%d-%y %H_%M_%S')
xlsxwriter = pd.ExcelWriter('Earning Calendar ({0}).xlsx'.format(current_datetime), index=False)

for earning_table in earning_tables:
    if not 'Sorry, this date currently does not have any earnings announcements scheduled' in earning_table.text:
        earning_date = earning_table['id'].replace('page', '')
        earning_date = earning_date[:3] + '_' + earning_date[3:]
        print(earning_date)
        dfs[earning_date] = pd.read_html(str(earning_table.table))[0]
        dfs[earning_date].to_excel(xlsxwriter, sheet_name=earning_date, index=False)

xlsxwriter.save()
print('earning tables Excel file exported')

CodePudding user response:

To grab all the tables on the page:

tables = pd.read_html("https://www.marketwatch.com/tools/earnings-calendar")

To look at just the first one:

print(tables[0].head())

If you are sure all the tables have the same columns, you can concatenate them into a single DataFrame:

df = pd.concat(pd.read_html("https://www.marketwatch.com/tools/earnings-calendar"))
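If you are not sure the columns match, a small check before concatenating may help. This is only a sketch: the helper name concat_if_same_columns and the column names Symbol and EPS are made up for illustration, and the two small DataFrames stand in for the list that pd.read_html would return. It also passes ignore_index=True, which the one-liner above does not, so the combined rows get a fresh 0..n-1 index.

```python
import pandas as pd


def concat_if_same_columns(tables):
    """Concatenate DataFrames only when every one has identical column labels."""
    first_columns = list(tables[0].columns)
    for table in tables[1:]:
        if list(table.columns) != first_columns:
            raise ValueError('tables have different columns; not concatenating')
    # ignore_index=True renumbers the rows instead of repeating 0, 1, ... per table
    return pd.concat(tables, ignore_index=True)


# Hypothetical stand-ins for the list returned by pd.read_html(...)
tables = [
    pd.DataFrame({'Symbol': ['ALYA'], 'EPS': [-0.01]}),
    pd.DataFrame({'Symbol': ['ALLG'], 'EPS': [-0.04]}),
]
df = concat_if_same_columns(tables)
print(df)
```

With the real page you would pass the result of pd.read_html(url) instead of the hand-built list.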

CodePudding user response:

If this is your desired output, you can follow this example:

from bs4 import BeautifulSoup
import requests
import pandas as pd
import openpyxl

url='https://www.marketwatch.com/tools/earnings-calendar'

req=requests.get(url)
#print(req)
soup = BeautifulSoup(req.content,"lxml")
data = []
for tr in soup.select('table tbody tr'):  # rows of every table on the page; narrow the selector to target one table
    t = list(tr.stripped_strings)
    data.append(t)
    #print(t)

df=pd.DataFrame(data)#.to_excel('out.xlsx',index=False)
print(df)

Output:

0                        1     2  ...      4      5                6 
0  Alithya Group Inc. Cl A  Alithya Group Inc. Cl A  ALYA  ...  -0.01  -0.06  -0.05 (680.28%) 
1              Allego N.V.              Allego N.V.  ALLG  ...  -0.04  -0.03   0.01 (-25.00%) 

[2 rows x 7 columns]
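As for the AttributeError in the question: soup.find returns None when nothing matches, so calling find_all on that result fails. A likely cause is that the page no longer contains a div with class tabpane, though I have not verified that against the live site. Below is a minimal sketch of guarding against it. The function name find_earning_tables and the two HTML strings are hypothetical stand-ins, not the real page markup.

```python
from bs4 import BeautifulSoup


def find_earning_tables(html):
    """Return the id-bearing <div>s inside div.tabpane, or [] if it is missing."""
    soup = BeautifulSoup(html, 'html.parser')
    tabpane = soup.find('div', 'tabpane')  # None when no <div class="tabpane"> exists
    if tabpane is None:
        return []
    return tabpane.find_all('div', {'id': True})


# Hypothetical markup standing in for the real page
with_pane = '<div class="tabpane"><div id="pageMar01">a</div><div>b</div></div>'
without_pane = '<div class="other"><div id="x">a</div></div>'

print(len(find_earning_tables(with_pane)))     # 1
print(len(find_earning_tables(without_pane)))  # 0
```

Returning an empty list keeps the later for loop from crashing; you could instead raise an error with a clear message if a missing tabpane should stop the script.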