The following data has been extracted from two larger dataframes. df1
{'NID': {7: '212017463',
8: '212017463',
9: '212017463',
10: '212017463',
11: '212017463',
12: '212017463',
13: '212017463'},
'Lact': {7: 1, 8: 2, 9: 3, 10: 4, 11: 5, 12: 6, 13: 7},
'M-305': {7: 5018, 8: 5479, 9: 7134, 10: 9046, 11: 8972, 12: 9360, 13: 5295},
'PI-M': {7: 94, 8: 83, 9: 96, 10: 116, 11: 117, 12: 112, 13: 0}}
df 2
{'NID': {1: '212017463', 2: '212017463', 3: '212017463'},
'Lact': {1: 7, 2: 7, 3: 7}}
I would like to merge the df1 series M-305 and PI-M from the previous lactation (Lact == 6) with df2 (Lact =7).
if I do a simple merge
df3 = pd.merge(df2, df1[['NID', 'Lact', 'M-305', 'PI-M']], on=['NID', 'Lact'], how='left')
df3
I get M-305 and PI-M merged for lactation (Lact) 7 giving
NID Lact M-305 PI-M 0 212017463 7 5295 0 1 212017463 7 5295 0 2 212017463 7 5295 0
The output I would like to get is M-305 = 9360 and PI-M = 110 (Lact 6) values
Is there a way to merge lact with lact-1
Thanks
John
CodePudding user response:
IF you are used to SQL and the flexibility of its JOIN
operations, then unfortunately Pandas doesn't match those. Pandas only offers equi-join (where the join columns on both frames must match) and limited range join (in the form of merge_asof
).
You can calculate the join column manually:
pd.merge(
df2.assign(Lact_Merge = df2['Lact'] - 1),
df1[['NID', 'Lact', 'M-305', 'PI-M']],
left_on=['NID', 'Lact_Merge'],
right_on=['NID', 'Lact'],
how='left'
).drop(columns=['Lact_Merge', 'Lact_y']).rename(columns={'Lact_x': 'Lact'})
(I'm using "join" and "merge" interchangably. I have a background SQL Server where such operations are equivalent to INNER JOIN
, LEFT JOIN
, ...)