Home > database >  Setting Values with pandas DataFrame.loc
Setting Values with pandas DataFrame.loc

Time:11-20

Consider I have a data frame :

>>> data
   c0  c1  c2   _c1   _c2
0   0   1   2  18.0  19.0
1   3   4   5   NaN   NaN
2   6   7   8  20.0  21.0
3   9  10  11   NaN   NaN
4  12  13  14   NaN   NaN
5  15  16  17   NaN   NaN

I want to update the values in the c1 and c2 columns with the values in the _c1 and _c2 columns whenever those latter values are not NaN. Why won't the following work, and what is the correct way to do this?

>>> data.loc[~(data._c1.isna()),['c1','c2']]=data.loc[~(data._c1.isna()),['_c1','_c2']]
>>> data
   c0    c1    c2   _c1   _c2
0   0   NaN   NaN  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6   NaN   NaN  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

For completeness's sake I want the result to look like

>>> data.loc[~(data._c1.isna()),['c1','c2']]=data.loc[~(data._c1.isna()),['_c1','_c2']]
>>> data
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0. 18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

CodePudding user response:

I recommend update after rename

df.update(df[['_c1','_c2']].rename(columns={'_c1':'c1','_c2':'c2'}))
df
Out[266]: 
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

CodePudding user response:

You can use np.where:

df[['c1', 'c2']] = np.where(df[['_c1', '_c2']].notna(),
                            df[['_c1', '_c2']],
                            df[['c1', 'c2']])
print(df)

# Output:
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN
  • Related