I would like to read specific cells from multiple excel spreadsheets into single pandas dataframe.
so far, I have tried this. (without a success)
import pandas as pd
import glob
import xlrd
file_list = glob.glob("*.xls")
df = pd.DataFrame()
for f in file_list:
wb = xlrd.open_workbook(f)
sheet = wb.sheet_by_index(0)
name = sheet.cell_value(rowx=9, colx=2)
city = sheet.cell_value(rowx=15, colx=2)
df = df.append([name,city])
Desired output is pandas dataframe as this
name city
Tom NY
Alex Toronto
Anne Atlanta
... ...
Thanks
CodePudding user response:
I think you need two sets of [[]]
around what is being appended. With one set of brackets, it tries to add name as a row and city as a row, rather than as columns in the same row.
import pandas as pd
import glob
import xlrd
file_list = glob.glob("*.xls")
df = pd.DataFrame()
for f in file_list:
wb = xlrd.open_workbook(f)
sheet = wb.sheet_by_index(0)
name = sheet.cell_value(rowx=9, colx=2)
city = sheet.cell_value(rowx=15, colx=2)
df = df.append([[name,city]])
This will have columns named 0
and 1
, though (since you didn't define names in the creation of the dataframe), so a last step would be to rename those:
df = df.rename(columns={0:'name',1:'city'})