Python: Mask is replacing column values with NaNs

Time:10-11

I am trying to replace the zeroes in ~15000 columns [columns 6:14844] with the group mean, while leaving the group label column [column 1] and a couple of other identifying columns [columns 2:5] untouched. This is the code I came up with. It works, except that it replaces the columns I want skipped [1:5] with NaNs:

df = df.mask(df.iloc[:, np.r_[1, 6:14844]].eq(0), df.iloc[:, np.r_[1, 6:14844]].groupby('group_label').transform('mean'))

Thanks in advance.

CodePudding user response:

You need to assign to only the subset of columns, not the whole DataFrame.

df.iloc[:, 6:14844] = (df.iloc[:, 6:14844]
                         .mask(df.iloc[:, 6:14844].eq(0),
                               df.iloc[:, np.r_[1, 6:14844]]
                                 .groupby('group_label')
                                 .transform('mean')))
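
As far as I understand it, the NaNs appear because `df.mask` aligns both the condition and the replacement to the whole frame. Columns missing from the condition are treated as masked, and the replacement has no values for them, so they come back as NaN. Here is a minimal sketch of the subset assignment on a toy frame. The column names `id`, `v1`, `v2` and the data are made up for illustration, and it selects columns by label rather than by position:

```python
import pandas as pd

# Toy stand-in for the real data; column names and values are hypothetical.
df = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "group_label": ["x", "x", "y", "y"],
    "v1": [0.0, 4.0, 3.0, 0.0],
    "v2": [2.0, 0.0, 0.0, 0.0],
})

# Only these columns should have their zeros replaced.
value_cols = ["v1", "v2"]

# Per-group mean of each value column, broadcast back to the original row order.
group_means = df.groupby("group_label")[value_cols].transform("mean")

# Mask only the value columns and assign back to the same subset,
# so the identifying columns are never touched.
df[value_cols] = df[value_cols].mask(df[value_cols].eq(0), group_means)

print(df)
```

Note that, as in the original code, the zeros are included when the group mean is computed, so a group whose values are all zero stays at zero.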