Home > Blockchain >  Repeated rows filled with NaNs after convertion to wide format in pandas
Repeated rows filled with NaNs after convertion to wide format in pandas

Time:06-27

I have a dataframe which I had to convert between long and wide formats and perform certain calculations.

After the convertions I have ended up with a dataframe, as below:

df={'Time':['0', '1', '0', '1','0', '1','0', '1'],
    'Parameter':['down_A', 'down_A', 'up_A','up_A','down_A', 'down_A', 'up_A','up_A'],
    'NodeA':['2.56', '0.06', '0.14', '1.005','NaN', 'NaN', 'NaN','NaN'],
    'NodeB':['NaN', 'NaN','NaN', 'NaN', '1.44', '1.11','0.56','1.98'],}

and I would like to multiply the values of NodeA and NodeB for the same Time-Parameter pairs.

The output I would ultimately like to have is:

df  =  {'Time':['0', '1', '0', '1'],
        'Parameter':['down_A', 'down_A', 'up_A','up_A'],
        'NodeA':['2.56', '0.06', '0.14', '1.005'],
        'NodeB':['1.44', '1.11','0.56','1.98'],
        'Multiplied':['3.6864','0.0666','0.0784','1.9899']}

How can I remove the duplicated rows and have a single row for each Time-Parameter pair?

CodePudding user response:

First convert values to floats and then aggregate first or sum, min, 'max'..., last multiple columns:

df[['NodeA','NodeB']] = df[['NodeA','NodeB']].astype(float)

df= df.groupby(['Time','Parameter'], as_index=False, sort=False).first()
df['Multiplied'] = df['NodeA'].mul(df['NodeB'])
print (df)

  Time Parameter  NodeA  NodeB  Multiplied
0    0    down_A  2.560   1.44      3.6864
1    1    down_A  0.060   1.11      0.0666
2    0      up_A  0.140   0.56      0.0784
3    1      up_A  1.005   1.98      1.9899
  • Related