So I have an Excel dataset with 2 sheets that I want to read using pandas. To make the code more dynamic, I decided to store the sheet names list into a variable which I later pass into the 'get' method to create separate DataFrames for both.
import pandas as pd
# access all sheets as one DataFrame
excel_file = pd.ExcelFile(r"C:\Users\user\Desktop\example.xlsx")
email_list = pd.read_excel(excel_file)
sheets = excel_file.sheet_names
# separate sheets into separate DataFrames.
sheet1 = email_list.get(sheets[0])
sheet2 = email_list.get(sheets[1])
Then from these DataFrames I exract a column of emails as a Series
sheet1list = sheet1['Email']
sheet2list = sheet2['Email']
But when I run it, I run into the following error:
Traceback (most recent call last):
File "c:\Users\ga201\Desktop\My Python Projects\main.py", line 30, in <module>
sheet1list = sheet1['Email']
TypeError: 'NoneType' object is not subscriptable
Why does it make the
sheet1
variable a NoneType?Is there a way to actually use it through variables or is the only way to type in the sheet name?
Just curious. I've tried using f strings, but they don't work either.
CodePudding user response:
You're misinterpreting the default behaviour of pd.read_excel. If you don't specify the parameter sheet_name
, the function will only return the first sheet as a DataFrame.
Use email_list = pd.read_excel(excel_file, sheet_name=None)
to get DataFrames for all sheets.
The return value will now be a dictionary of DataFrames with the sheet_names as keys.