How to read excel with several dataframes in one sheet and common index-CodePudding

I have an .xlsx file with several dataframes with common index in one sheet. An example:

What is a good way to load it and have two separate dataframes, one being first dataframe and second being second dataframe both with the same index as in the .xlsx file?

The results are:

df1 = pd.DataFrame(data={'col1': [aa, aa, aa], 'col2': [bb, bb, bb], 'col3': [cc, cc, cc]}, index=[0,1,2])
df2 = pd.DataFrame(data={'col1': [dd, dd, dd], 'col2': [ee, ee, ee], 'col3': [ff, ff, ff]},index=[0,1,2])

CodePudding user response：

try something like this

import pandas as pd

df0 = pd.read_excel('mybook.xlsx', header=[0,1])
df0.head()

outputs:

    Unnamed: 0_level_0    firstdataframe          seconddataframe
           index        col1    col2    col3    col1    col2    col3
0              0          aa    bb        cc    dd      ee      ff
1              1          aa    bb        cc    dd      ee      ff
2              2          aa    bb        cc    dd      ee      ff

so, multindex might help (header=[0,1] means first and second row is used as col index)

then,

df1=df0.loc[:,'firstdataframe']
df1

outputs

    col1    col2    col3
0     aa      bb    cc
1     aa      bb    cc
2     aa      bb    cc

and then,

df2=df0.loc[:,'seconddataframe']
df2

ouputs

    col1    col2    col3
0     dd      ee    ff
1     dd      ee    ff
2     dd      ee    ff

read https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html

CodePudding user response：

Use the usecols parameter from read_excel
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html