This is a somewhat odd question, but I can't work it out.
I have two dataframes, named df1 and df2. Their structure is:
df1:
Eval Lang Average Model
df2:
Eval Lang Mean
The Model column has exactly six distinct values, and each of the six occurs exactly once for every possible combination (Eval, Lang). I want these six values to become six new columns in df2, each holding the value of df1['Average'] for that (Eval, Lang, Model) combination.
The resulting structure would then be:
Eval Lang Mean ModelAverage1 ModelAverage2 ModelAverage3 ModelAverage4 ModelAverage5 ModelAverage6
Is there an easy, clever way to do this? I am no longer sure whether this is something that standard pandas commands support.
To visualise that, here is an example. df1.csv:
Eval,Lang,Average,Model
F1,German,0.62,flairmulti
F1,German,0.7363,flairsingle
F1,German,0.72,bertmulti
F1,German,0.7527,bertsingle
F1,German,0.78,robertamulti
F1,German,0.7349,robertasingle
df2.csv:
Eval,Lang,Mean
F1,German,0.54
Precision,German,0.54
Recall,German,0.53
The result would then look like this. total.csv:
Eval,Lang,Mean,flairmulti,flairsingle,bertmulti,bertsingle,robertamulti,robertasingle
F1,German,0.54,0.62,0.7363,0.72,0.7527,0.78,0.7349
Precision,German,0.54,...
Recall,German,0.53,...
Of course, Eval and Lang take other values as well, but every (Eval, Lang) pair has exactly the six Model values shown above.
CodePudding user response:
You can pivot df1 so that the values of the Model column become the new columns and the values of the Average column fill them, then merge the result into df2:
df3 = df2.merge(df1.pivot(index=['Eval', 'Lang'], columns='Model', values='Average').reset_index())
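Output for the sample data (merge defaults to an inner join on the shared columns Eval and Lang, so only the F1/German row, which appears in both dataframes, is kept):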
>>> df3
   Eval    Lang  Mean  bertmulti  bertsingle  flairmulti  flairsingle  robertamulti  robertasingle
0    F1  German  0.54       0.72      0.7527        0.62       0.7363          0.78         0.7349
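For a version that can be run end to end, here is a minimal sketch built on the sample CSV data from the question. It departs from the one-liner in two ways, both of which are assumptions on my part about what is wanted: how="left" keeps the Precision and Recall rows of df2 (filled with NaN, since the sample df1 has no data for them), and the model columns are put back in the order in which they first appear in df1. The names wide, total and model_order are only illustrative. Passing a list as index to pivot needs pandas 1.1 or later.

```python
import io

import pandas as pd

# Sample data copied from the question.
df1 = pd.read_csv(io.StringIO("""Eval,Lang,Average,Model
F1,German,0.62,flairmulti
F1,German,0.7363,flairsingle
F1,German,0.72,bertmulti
F1,German,0.7527,bertsingle
F1,German,0.78,robertamulti
F1,German,0.7349,robertasingle
"""))

df2 = pd.read_csv(io.StringIO("""Eval,Lang,Mean
F1,German,0.54
Precision,German,0.54
Recall,German,0.53
"""))

# One row per (Eval, Lang), one column per Model, filled with Average.
wide = (
    df1.pivot(index=["Eval", "Lang"], columns="Model", values="Average")
       .reset_index()
)
wide.columns.name = None  # clear the "Model" label pivot leaves on the column axis

# Assumption: keep every row of df2; pairs missing from df1 get NaN.
total = df2.merge(wide, on=["Eval", "Lang"], how="left")

# Assumption: show the models in the order they first appear in df1
# (pivot sorts the new columns alphabetically).
model_order = df1["Model"].drop_duplicates().tolist()
total = total[["Eval", "Lang", "Mean", *model_order]]

print(total)
```

Note that pivot raises a ValueError if some (Eval, Lang, Model) combination occurs more than once. According to the question that cannot happen here, so no aggregation step is needed.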