Replace a row in Pandas Dataframe with NaN if an ID value in that row isn't a certain value?

Time:03-04

I apologize for the potentially confusing title, but I will try to explain my situation as best I can.

Let's say I have a hypothetical Dataframe df, which has an id column and is arranged like...

  time  id   x    y
  1.0    0   5    9 
  2.0    1   6    8
  3.0    2   7    7
  4.0    1   8    6

Now let's say I want only the data from rows where df['id'] == 1, but instead of dropping the other rows I want to fill them with NaN, like this...

  time  id   x    y
  1.0    0   NaN  NaN
  2.0    1   6    8
  3.0    2   NaN  NaN
  4.0    1   8    6

Note that I specifically want to keep the time and id columns, and only change the values of x and y to NaN for any rows that don't have the id column set to 1.

My first attempt was to use DataFrame.groupby(), but this leads to any rows without the specific id value being dropped entirely, which I don't want. My first instinct is to go into df row by row, checking the id column, and changing the values to NaN manually if id != 1, but this seems like a very cumbersome and un-Pythonic way of doing this.

Any ideas? Thanks in advance!

CodePudding user response:

You can use simple selection with a boolean mask:

df.loc[df['id'].ne(1), ['x', 'y']] = float('nan')

Output:

   time  id    x    y
0   1.0   0  NaN  NaN
1   2.0   1  6.0  8.0
2   3.0   2  NaN  NaN
3   4.0   1  8.0  6.0
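
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same boolean-mask assignment. The frame is rebuilt from the question's example. The explicit cast of x and y to float is my own addition, on the assumption that newer pandas versions may warn about (or refuse) writing NaN into an integer column; the `cols` name is just for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "time": [1.0, 2.0, 3.0, 4.0],
    "id": [0, 1, 2, 1],
    "x": [5, 6, 7, 8],
    "y": [9, 8, 7, 6],
})

cols = ["x", "y"]  # hypothetical name for the columns to blank out

# Assumption: cast to float up front so NaN fits the column dtype.
df = df.astype({c: "float64" for c in cols})

# Boolean mask is True where id is not 1; only x and y are overwritten,
# so time and id keep their original values.
df.loc[df["id"].ne(1), cols] = np.nan

print(df)
```

The row selector and the column list do separate jobs here: the mask picks which rows to touch, and the list limits the assignment to x and y.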

CodePudding user response:

from numpy import nan  # the uppercase NAN alias was removed in NumPy 2.0
df.loc[df['id'] != 1, ['x', 'y']] = nan
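
If you would rather not modify df in place, a possible variant (not from the answers above, just a sketch) is to use Series.where, which keeps values where the condition is True and puts NaN everywhere else. The names `keep` and `result` are hypothetical.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({
    "time": [1.0, 2.0, 3.0, 4.0],
    "id": [0, 1, 2, 1],
    "x": [5, 6, 7, 8],
    "y": [9, 8, 7, 6],
})

# True for the rows whose x and y values should be kept.
keep = df["id"].eq(1)

# Series.where keeps values where `keep` is True and fills NaN elsewhere.
# assign returns a new frame, so df itself is left untouched.
result = df.assign(x=df["x"].where(keep), y=df["y"].where(keep))

print(result)
```

This leaves the original frame unchanged, which can be handy if you need both versions later.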