I have a excel with multiple sheets in the below format. I need to create a single dataframe by concatenating all the sheets, unmerging the cell and then transposing them into a column based on the sheet
Final Dataframe should look like below Result expected - I need the below format with an extra coulmn as below
Code So far: Reading File:
df = pd.concat(pd.read_excel('/Users/john/Desktop/Internal/Raw Files/Med/Dig/File_2017_2022.xlsx', sheet_name=None, skiprows=1))
Creating Column :
df_1 = pd.concat([df.assign(name=n) for n,df in dfs.items()])
CodePudding user response:
I created my own version of your excel, which looks like this
The code below is far from perfect but it should do fine as long as you do not have millions of sheets
# First, obtain all sheet names
full_df = pd.read_excel(r'C:\Users\.\Downloads\test.xlsx',
sheet_name=None, skiprows=0)
# Store them into a list
sheet_names = list(full_df.keys())
# Create an empty Dataframe to store the contents from each sheet
final_df = pd.DataFrame()
for sheet in sheet_names:
df = pd.read_excel(r'C:\Users\.\Downloads\test.xlsx', sheet_name=sheet, skiprows=0)
# Get the brand name
brand = df.columns[1]
# Remove the header columns and keep the numerical values only
df.columns = df.iloc[0]
df = df[1:]
df = df.iloc[:, 1:]
# Set the brand name into a new column
df['Brand'] = brand
# Append into the final dataframe
final_df = pd.concat([final_df, df])
Your final_df should look like this once exported back to excel
EDIT: You might need to drop the dataframe's index upon saving it by using the df.reset_index(drop=True) function, to remove the first column shown in the image right above.
CodePudding user response:
Use:
dfs = pd.read_excel('Input.xlsx',sheet_name=None, header=[0,1], index_col=[0,1])
df_1 = (pd.concat([df.stack(0).assign(name=n) for n,df in dfs.items()])
.rename_axis(index=['Date','WK','Brand'], columns=None)
.reset_index())
print (df_1)
Date WK Brand A1 A10 A11 A12 A13 A2 \
0 1 20 XYZ 0.879300 0.0261 0.231800 0.2412 0.27180 0.7796
1 2 21 XYZ 0.807900 0.8426 0.386300 0.8667 0.84040 0.8457
2 3 22 XYZ 0.872900 0.6538 0.681700 0.0752 0.68900 0.4728
3 4 23 XYZ 0.954200 0.4246 0.642500 0.4989 0.69840 0.6284
4 5 24 XYZ 0.110200 0.2384 0.817600 0.1307 0.70820 0.6780
5 6 25 XYZ 0.162800 0.6447 0.004100 0.0528 0.33490 0.1759
6 7 26 XYZ 0.594600 0.9878 0.711700 0.9426 0.79770 0.0709
7 8 27 XYZ 0.191000 0.9067 0.800300 0.0115 0.77450 0.3929
8 9 28 XYZ 0.495700 0.5614 0.612100 0.3440 0.83600 0.7441
9 10 29 XYZ 0.732700 0.8441 0.190500 0.3554 0.77730 0.3197
10 11 30 XYZ 0.182700 0.0233 0.312400 0.1103 0.14030 0.7587
11 1 20 ABC 0.452776 0.9780 0.709827 0.2569 0.82034 0.0330
12 2 21 ABC 0.639173 0.5810 0.224544 0.9578 0.31603 0.8100
13 3 22 ABC 0.781147 0.5610 0.393751 0.2752 0.89457 0.1560
14 4 23 ABC 0.267159 0.0140 0.510502 0.8702 0.02365 0.0590
15 5 24 ABC 0.533062 0.9880 0.423679 0.8909 0.19700 0.5450
16 6 25 ABC 0.212764 0.5850 0.295257 0.8627 0.02916 0.6440
17 7 26 ABC 0.210353 0.0290 0.344722 0.5893 0.58645 0.2710
18 8 27 ABC 0.400770 0.4960 0.657140 0.1964 0.24827 0.4680
19 9 28 ABC 0.784346 0.6200 0.389639 0.6611 0.96997 0.7160
20 10 29 ABC 0.563395 0.9440 0.111782 0.9427 0.97576 0.8460
A3 A4 A5 A6 A7 A8 A9 name
0 0.9385 0.7753 0.7786 0.07560 0.6403 0.436500 0.02470 Sheet1
1 0.8395 0.7836 0.6573 0.25450 0.9042 0.105200 0.86310 Sheet1
2 0.2821 0.1440 0.2007 0.22530 0.6513 0.469500 0.95240 Sheet1
3 0.6310 0.9538 0.0459 0.64120 0.3661 0.255400 0.32200 Sheet1
4 0.2798 0.4732 0.7211 0.05150 0.3435 0.820600 0.65000 Sheet1
5 0.4093 0.7066 0.0728 0.05840 0.3661 0.429500 0.42950 Sheet1
6 0.3238 0.0201 0.3456 0.14760 0.9303 0.133900 0.22700 Sheet1
7 0.6638 0.9480 0.0811 0.34470 0.0816 0.089800 0.17560 Sheet1
8 0.8123 0.7700 0.3594 0.13080 0.7899 0.206200 0.44610 Sheet1
9 0.7114 0.3807 0.8601 0.60490 0.6659 0.157800 0.56270 Sheet1
10 0.3607 0.2414 0.6609 0.92080 0.9176 0.740000 0.13420 Sheet1
11 0.0389 0.6040 0.6508 0.57646 0.6231 0.985285 0.88505 Sheet2
12 0.7297 0.1390 0.2412 0.30669 0.3714 0.572789 0.73314 Sheet2
13 0.7449 0.8730 0.0929 0.24519 0.6180 0.464157 0.09081 Sheet2
14 0.9603 0.6090 0.9586 0.42426 0.5402 0.383903 0.56418 Sheet2
15 0.5817 0.8250 0.6945 0.19948 0.0507 0.071899 0.70657 Sheet2
16 0.9808 0.3340 0.8652 0.09141 0.5028 0.400358 0.19238 Sheet2
17 0.0821 0.2770 0.0378 0.55174 0.1446 0.291331 0.11745 Sheet2
18 0.9705 0.7380 0.3926 0.78333 0.8904 0.887290 0.84794 Sheet2
19 0.3167 0.6270 0.7094 0.01363 0.7816 0.143793 0.49700 Sheet2
20 0.4576 0.0220 0.5176 0.17527 0.2589 0.223100 0.54703 Sheet2