I have a dataframe which I had to convert between long and wide formats and perform certain calculations.
After the convertions I have ended up with a dataframe, as below:
df={'Time':['0', '1', '0', '1','0', '1','0', '1'],
'Parameter':['down_A', 'down_A', 'up_A','up_A','down_A', 'down_A', 'up_A','up_A'],
'NodeA':['2.56', '0.06', '0.14', '1.005','NaN', 'NaN', 'NaN','NaN'],
'NodeB':['NaN', 'NaN','NaN', 'NaN', '1.44', '1.11','0.56','1.98'],}
and I would like to multiply the values of NodeA and NodeB for the same Time-Parameter pairs.
The output I would ultimately like to have is:
df = {'Time':['0', '1', '0', '1'],
'Parameter':['down_A', 'down_A', 'up_A','up_A'],
'NodeA':['2.56', '0.06', '0.14', '1.005'],
'NodeB':['1.44', '1.11','0.56','1.98'],
'Multiplied':['3.6864','0.0666','0.0784','1.9899']}
How can I remove the duplicated rows and have a single row for each Time-Parameter pair?
CodePudding user response:
First convert values to floats and then aggregate first
or sum
, min
, 'max'..., last multiple columns:
df[['NodeA','NodeB']] = df[['NodeA','NodeB']].astype(float)
df= df.groupby(['Time','Parameter'], as_index=False, sort=False).first()
df['Multiplied'] = df['NodeA'].mul(df['NodeB'])
print (df)
Time Parameter NodeA NodeB Multiplied
0 0 down_A 2.560 1.44 3.6864
1 1 down_A 0.060 1.11 0.0666
2 0 up_A 0.140 0.56 0.0784
3 1 up_A 1.005 1.98 1.9899