Loop through a pandas DataFrame and split it into multiple DataFrames based on a repeated column name

Master File example:

import pandas as pd

d = pd.read_excel(path, header=1, sheet_name='Master File example')
dct = {}
# every slice starts at a Customer column but always ends at 'TAMBA'
for i in d.filter(like="Customer"):
    dct[f'df {i}'] = d.loc[:, i:'TAMBA']
print(dct)

Objective: I want to slice the dataframe from one Customer column up to the next (Customer to Customer.1, and so on) and create multiple dataframes to compare with each other. If a column that is present for Customer is missing for Customer.1, I want to create the same column for Customer.1, and the comparison should continue through the last customer.

When pandas reads an Excel file that has repeated column names, it renames the repeats to Customer.1, Customer.2, and so on.
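
The renaming is easy to reproduce without an Excel file. The sketch below is an illustration, not the asker's data: it uses `pd.read_csv` on an in-memory string, which de-duplicates repeated header names with the same numeric suffixes, and the column names A and B are made up.

```python
import io

import pandas as pd

# Hypothetical header with repeated names (not the asker's file).
csv_text = "Customer,A,B,Customer,A,B\n1,2,3,4,5,6\n"

demo = pd.read_csv(io.StringIO(csv_text))

# Repeated names get a numeric suffix, so every column label is unique.
print(list(demo.columns))
```

Every repeated name is suffixed, not only Customer, so the columns inside later customer blocks may also carry .1, .2 suffixes.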

Result: Only one Customer is displayed.

CodePudding user response:

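import numpy as np
import pandas as pd

# mock data: two 'customer' blocks, each followed by columns a and b
df = pd.DataFrame(np.arange(5*6).reshape(5,6), columns=['customer',*'ab']*2)

# group horizontally: the counter goes up by one at every 'customer' column
# (note: groupby(..., axis=1) is deprecated since pandas 2.1)
grouper = df.columns.str.match('.*customer.*', case=False).cumsum()
groups = {f'Frame {i}: {gr.columns[0]}': gr for i, gr in df.groupby(grouper, axis=1)}

for k, v in groups.items():
    print(f'Name: {k}')
    print(v, '\n')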

Output:

Name: Frame 1: customer
   customer   a   b
0         0   1   2
1         6   7   8
2        12  13  14
3        18  19  20
4        24  25  26 

Name: Frame 2: customer
   customer   a   b
0         3   4   5
1         9  10  11
2        15  16  17
3        21  22  23
4        27  28  29
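
Because `groupby(..., axis=1)` is deprecated as of pandas 2.1, the same split can also be sketched with positional slicing. The sketch below additionally covers the second half of the question, adding columns that one customer block lacks. It rests on assumptions: the helper names `split_on_marker` and `align_columns` are hypothetical, the mock columns are invented, column labels are strings, and a trailing `.1`, `.2`, ... is assumed to come only from pandas' renaming of repeated names.

```python
import re

import numpy as np
import pandas as pd


def split_on_marker(df, pattern="customer"):
    """Split df column-wise; a new block starts at each column matching pattern."""
    is_start = df.columns.str.contains(pattern, case=False)
    starts = np.flatnonzero(is_start)
    ends = list(starts[1:]) + [df.shape[1]]
    # Columns before the first match (if any) are ignored.
    return [df.iloc[:, s:e] for s, e in zip(starts, ends)]


def align_columns(frames):
    """Strip '.N' suffixes, then give every frame the same columns (NaN-filled)."""
    cleaned = [f.rename(columns=lambda c: re.sub(r"\.\d+$", "", c)) for f in frames]
    all_cols = []
    for f in cleaned:
        for c in f.columns:
            if c not in all_cols:
                all_cols.append(c)
    return [f.reindex(columns=all_cols) for f in cleaned]


# Mock data: the second customer block is missing column 'b'.
df = pd.DataFrame(
    np.arange(5 * 5).reshape(5, 5),
    columns=["Customer", "a", "b", "Customer.1", "a.1"],
)

frames = align_columns(split_on_marker(df))
for i, frame in enumerate(frames, start=1):
    print(f"Frame {i}")
    print(frame, "\n")
```

Here the second frame gains a column b filled with NaN, so all frames share the same column labels and can be compared column by column.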