Home > Enterprise >  How to accumulate not null values of each row into separate column/series?
How to accumulate not null values of each row into separate column/series?

Time:06-24

I have a dataframe with a subset of columns with the form:

   col_A  col_B  col_C
0    NaN    1.0    NaN
1    NaN    NaN    NaN
2    NaN    NaN    2.0
3    3.0    NaN    4.0

I want to create a series of the notnull values for each row. For rows with multiple not null values, I want either the average or first value e.g.

   new_col
0      1.0
1      NaN
2      2.0
3      3.5

I'll eventually want to add this back to the original dataframe as a separate column so I need to persist rows with all NaN values by forward filling them e.g.

   new_col
0      1.0
1      1.0
2      2.0
3      3.5

I know how to determine if there is a not null value in the dataframe, but I don't know how to select for it:

df[['col_A', 'col_B', 'col_C']].count(axis=1) >= 1

CodePudding user response:

You can use:

df.mean(axis=1).ffill()

Or to restrict the columns:

df[['col_A', 'col_B', 'col_C']].mean(axis=1).ffill()

Output:

0    1.0
1    1.0
2    2.0
3    3.5
dtype: float64
  • Related