Change cell value with condition

Time:03-30

I have a dataframe:

df = pd.DataFrame(
        {'a': ['banana', 'coconut', 'banana', 'apple'],
         'b': ['rice', 'bean', 'rice', 'soap'],
         'c': ['mouse', 'dog', None,'apple'],
         'd': ['cat', 'soap', 'beef', 'rabbit']}
    )


         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice   None     cat
3    apple  soap  apple  rabbit

If a row contains None (here, the row at index 2), we look for another row whose remaining values are exactly the same and replace None with that row's value from the same column. Here, rows 0 and 2 have the same values except in column 'c', so None is replaced with 'mouse'. The expected result is therefore:

         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice  mouse     cat
3    apple  soap  apple  rabbit

Does anyone have a solution to this problem? Thanks.

CodePudding user response:

# Fill missing 'c' values with the 'c' of rows whose ('a', 'b') pair occurs again later
df.loc[df['c'].isnull(), 'c'] = df[df.duplicated(subset=['a', 'b'], keep='last')]['c'].values

df

Output:

|index|    a    | b  |  c  |  d   |
|-----|---------|----|-----|------|
|  0  | banana  |rice|mouse| cat  |
|  1  | coconut |bean| dog | soap |
|  2  | banana  |rice|mouse| beef |
|  3  | apple   |soap|apple|rabbit|
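
The one-liner above relies on the rows flagged by `duplicated` lining up, in number and order, with the rows where 'c' is missing. As a hedged alternative (a sketch, not the answerer's method), the same idea can be written with `groupby` so that each missing 'c' is taken from the first non-null 'c' in its own ('a', 'b') group. The choice of 'a' and 'b' as the matching columns is carried over from the answer and is an assumption about which columns identify "the same" row.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': ['banana', 'coconut', 'banana', 'apple'],
     'b': ['rice', 'bean', 'rice', 'soap'],
     'c': ['mouse', 'dog', None, 'apple'],
     'd': ['cat', 'soap', 'beef', 'rabbit']}
)

# For every row, the first non-null 'c' found among rows sharing the same ('a', 'b')
first_c = df.groupby(['a', 'b'])['c'].transform('first')

# Only overwrite cells that are actually missing
df['c'] = df['c'].fillna(first_c)

print(df)
```

Rows whose ('a', 'b') group has no non-null 'c' at all would simply stay missing.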

CodePudding user response:

This code would do the trick for any number of Nones:

In [183]: df = pd.DataFrame(
     ...:         {'a': ['banana', 'coconut', 'banana', 'apple', None],
     ...:          'b': ['rice', 'bean', 'rice', 'soap', 'soap'],
     ...:          'c': ['mouse', 'dog', None, 'apple', 'apple'],
     ...:          'd': ['cat', 'soap', 'cat', 'rabbit', None]}
     ...:     )

In [184]: df
Out[184]: 
         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice   None     cat
3    apple  soap  apple  rabbit
4     None  soap  apple    None

In [185]: rows = df.isnull().any(axis=1).to_numpy().nonzero()[0] # rows with None
     ...: for i in rows:
     ...:     row = df.iloc[i]
     ...:     cols = df.columns[row.notnull()] # columns without None
     ...:     replacement = (df[cols] == row[cols]).all(axis=1).to_numpy().nonzero()[0]
     ...:     for j in replacement:
     ...:         if i != j:
     ...:             df.loc[i] = df.loc[j]
     ...:             break

In [186]: df
Out[186]: 
         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice  mouse     cat
3    apple  soap  apple  rabbit
4    apple  soap  apple  rabbit
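
The loop above copies the whole matching row and mixes positional indices (`iloc`) with labels (`loc`), which works here because the frame has the default 0..n-1 index. A minimal sketch of a variant that uses positions throughout and fills only the missing cells might look like the following; the function name `fill_from_matching_rows` is hypothetical, and taking the first matching row as the donor is an assumption carried over from the loop above.

```python
import pandas as pd


def fill_from_matching_rows(df):
    """Return a copy of df with missing cells filled from a matching row."""
    out = df.copy()
    for i in range(len(out)):
        row = out.iloc[i]
        missing = row.isna().to_numpy()
        if not missing.any():
            continue  # nothing to fill in this row
        known = out.columns[~missing]
        # Positions of rows that agree with row i on every column it has a value for
        same = (out[known] == row[known]).all(axis=1).to_numpy().nonzero()[0]
        donors = [j for j in same if j != i]
        if not donors:
            continue  # no matching row: leave the missing cells as they are
        j = donors[0]
        # Copy only the cells that were missing in row i
        for k in missing.nonzero()[0]:
            out.iat[i, k] = out.iat[j, k]
    return out


df = pd.DataFrame(
    {'a': ['banana', 'coconut', 'banana', 'apple', None],
     'b': ['rice', 'bean', 'rice', 'soap', 'soap'],
     'c': ['mouse', 'dog', None, 'apple', 'apple'],
     'd': ['cat', 'soap', 'cat', 'rabbit', None]}
)

filled = fill_from_matching_rows(df)
print(filled)
```

For the five-row frame used in this answer, this should give the same result as Out[186]. The original frame is left untouched because the function works on a copy. If the donor row is itself missing a value in the same column, that cell stays missing.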