I have a Dataframe df
like the following:
Warehouse Date Count
0 Delhivery Goa Warehouse 2022-05-12 83
1 Delhivery Goa Warehouse 2022-05-15 1
2 Delhivery Goa Warehouse 2022-05-18 100
3 Delhivery Tauru Warehouse 2022-05-19 100
4 Delhivery Tauru Warehouse 2022-05-20 100
and another dataframe df_orig
like the following:
index Goa Tauru
0 2022-05-12Delhivery Goa Warehouse 100.0 0.0
1 2022-05-15Delhivery Goa Warehouse 100.0 0.0
2 2022-05-18Delhivery Goa Warehouse 100.0 0.0
3 2022-05-20Delhivery Tauru Warehouse 0.0 50.0
4 2022-05-19Delhivery Tauru Warehouse 0.0 70.0
How can I pick values from the df_orig
columns based on combination of warehouse and Date columns of the df
?
Expected output:
Warehouse Date Count original
0 Delhivery Goa Warehouse 2022-05-12 83 100
1 Delhivery Goa Warehouse 2022-05-15 1 100
2 Delhivery Goa Warehouse 2022-05-18 100 100
3 Delhivery Tauru Warehouse 2022-05-19 100 70
4 Delhivery Tauru Warehouse 2022-05-20 100 50
My approach:
df['index1'] = str(df['Date']) str(df['Warehouse'])
original = []
for index, row in df.iterrows():
if row['index1'] == df_orig['index']:
original.append(????)
CodePudding user response:
Concatenate Date
and Warehouse
columns in first dataframe, and calculate the sum of values of all the columns except first using iloc[:,1:]
, then merge two dataframes, and finally take only the columns of interest:
(df
.assign(index=df['Date'] df['Warehouse'])
.merge(df_orig.assign(original=df_orig.iloc[:,1:].sum(1)))
)[['Warehouse', 'Date', 'Count', 'original']]
OUTPUT:
Warehouse Date Count original
0 Delhivery Goa Warehouse 2022-05-12 83 100.0
1 Delhivery Goa Warehouse 2022-05-15 1 100.0
2 Delhivery Goa Warehouse 2022-05-18 100 100.0
3 Delhivery Tauru Warehouse 2022-05-19 100 70.0
4 Delhivery Tauru Warehouse 2022-05-20 100 50.0
CodePudding user response:
merge()
works here. You can also map the sum of the values in df_orig
to df
rows using map()
method. As index
column in df_orig
is the same as the concatenation of Date
and Warehouse
columns in df
, first concatenate those columns to make the mapping keys match.
# map the sum of the values in df_orig to df.Warehouse via df_orig.index
df['original'] = (df['Date'].astype(str) df['Warehouse']).map(df_orig.set_index('index').sum(1))