Split Pandas dataframe into multiple dataframes based on empty column delimiter

I'm reading the following Excel sheet into a DataFrame.

I want to split it into three DataFrames, one per product. The tables are always separated by a single blank column, but each table can have a different number of columns.

User response:

Following the approach from the article mentioned in the comments, you can process the sheet as follows.

import pandas as pd

#### Read the Excel file into a DataFrame (header=None keeps the title and header rows as data)
df = pd.read_excel('test.xlsx', index_col=None, header=None)

#### Find the empty columns and list them (labels equal positions because header=None)
empcols = [col for col in df.columns if df[col].isnull().all()]
df.fillna('', inplace=True)

#### Split into consecutive columns of valid data
allcols = list(range(len(df.columns)))
start = 0
colslist = []
for sepcol in empcols:
    colslist.append(allcols[start:sepcol])
    start = sepcol + 1

colslist.append(allcols[start:])

#### Extract consecutive columns of valid data and store them in a dictionary
dfdic = {}
for i in range(len(colslist)):
    wkdf = df.iloc[:, colslist[i]].copy()  # copy so the in-place drop below does not act on a view of df
    title = ''.join(map(str, wkdf.iloc[0].tolist()))
    wkcols = wkdf.iloc[1].tolist()
    wkdf.drop(wkdf.index[[0,1]], inplace=True)
    wkdf.columns = wkcols
    dfdic[title] = wkdf.reset_index(drop=True)

#### Display each DataFrame stored in the dictionary
dfkeys = dfdic.keys()
for k in dfkeys:
    print(k)
    print(dfdic[k])
    print()
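
For comparison, here is a more compact sketch of the same idea that can be tried without an Excel file. It is not the method from the article mentioned above: the function name `split_on_blank_columns` and the sample data are made up for illustration, and it assumes the same layout as the question (row 0 holds the table title, row 1 the column names, blank columns are entirely NaN). Unlike the code above, it also drops rows that are entirely empty within a table, which is an added assumption for tables of unequal length.

```python
import numpy as np
import pandas as pd


def split_on_blank_columns(raw):
    """Split a header-less frame into sub-frames at all-NaN columns.

    Hypothetical helper: assumes row 0 is the title and row 1 the header.
    """
    is_blank = raw.isna().all(axis=0).to_numpy()
    # Every blank column bumps the running count, so the columns between
    # two separators share one group id.
    group_ids = np.cumsum(is_blank)
    tables = {}
    for gid in np.unique(group_ids[~is_blank]):
        positions = np.flatnonzero((group_ids == gid) & ~is_blank)
        block = raw.iloc[:, positions]
        title = "".join(str(v) for v in block.iloc[0].dropna())
        # Data starts at row 2; drop rows that are empty for this table.
        table = block.iloc[2:].dropna(how="all").reset_index(drop=True)
        table.columns = block.iloc[1].tolist()
        tables[title] = table
    return tables


# Made-up sample that mimics the sheet: three tables, one blank column between each.
nan = np.nan
raw = pd.DataFrame([
    ["Product A", nan, nan, "Product B", nan, nan, nan, "Product C", nan],
    ["id", "qty", nan, "id", "qty", "price", nan, "id", "qty"],
    [1, 10, nan, 7, 3, 2.5, nan, 4, 8],
    [2, 20, nan, 8, 4, 3.5, nan, nan, nan],
])

tables = split_on_blank_columns(raw)
for name, table in tables.items():
    print(name)
    print(table)
    print()
```

With a real workbook, `raw` would instead come from `pd.read_excel('test.xlsx', header=None)` as in the answer above.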