I've got two dataframes and need to merge some data from the second one into the first but the column to pull data from changes from row to row.
df1 contains columns 'IDs' 'dates' 'choices'
ID dates choices
0 1 2022-07-05 choice1
1 2 2022-07-05 choice2
2 3 2022-07-05 choice2
3 4 2022-01-01 choice2
4 5 2022-07-10 choice7
df1 = pd.DataFrame({'ID': ['1', '2', '3','4','5'],
'dates': ['2022-07-05', '2022-07-05', '2022-07-05','2022-01-01','2022-07-10'],
'choices': ['choice1', 'choice2', 'choice2','choice2','choice7']})
df2 contains columns 'dates' 'value of choice1' 'value of choice2' 'value of choice3' etc
dates choice1 choice2 choice3 choice7
0 2022-01-01 122.4 34 20 33
1 2022-07-05 111.0 202.1 31.1 31.1
2 2022-07-10 115.6 1000.2 34.2 51.4
df2 = pd.DataFrame({'dates': ['2022-01-01', '2022-07-05', '2022-07-10'],
'choice1': ['122.4', '111.0', '115.6'],
'choice2': ['34', '202.1', '1000.2'],
'choice3': ['20', '31.1', '34.2'],
'choice7': ['33', '31.1', '51.4']})
I need to create a column in df1 that contains the value of the choice in which df1'dates'==df2'dates'
eg, at the end df1 would look like this
ID dates choices values
0 1 2022-07-05 choice1 111.0
1 2 2022-07-05 choice2 202.1
2 3 2022-07-05 choice2 202.1
3 4 2022-01-01 choice2 34.0
4 5 2022-07-10 choice7 51.4
df1_final = pd.DataFrame({'ID': ['1', '2', '3','4','5'],
'dates': ['2022-07-05', '2022-07-05', '2022-07-05','2022-01-01','2022-07-10'],
'choices': ['choice1', 'choice2', 'choice2','choice2','choice7'],
'values': ['111.0', '202.1', '202.1','34.0','51.4']})
Currently I can merge them with a series of masks for each of the choices, Im hoping to find a cleaner/more efficient way to do this (there are a lot of rows and more than 10 choices)
edit: used a different method to display df1 output, added in df1 and df2 tables/code
CodePudding user response:
you can use melt
to unpivot your df2 and then join with df1:
df2 = df2.melt(id_vars='dates', var_name='choices')
out = df1.merge(df2 , on=['dates','choices'])
output:
>>
ID dates choices value
0 1 2022-07-05 choice1 111.0
1 2 2022-07-05 choice2 202.1
2 3 2022-07-05 choice2 202.1
3 4 2022-01-01 choice2 34
4 5 2022-07-10 choice7 51.4