I have the following dataframes:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(data={'col1': ['a', 'd', 'g', 'j'],
                         'col2': ['b', 'c', 'i', np.nan],
                         'col3': ['c', 'f', 'i', np.nan],
                         'col4': ['x', np.nan, np.nan, np.nan]},
                   index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4'], name='index'))
index | col1 | col2 | col3 | col4 |
---|---|---|---|---|
ind1 | a | b | c | x |
ind2 | d | c | f | NaN |
ind3 | g | i | i | NaN |
ind4 | j | NaN | NaN | NaN |
df2 = pd.Series(data=[True, False, True, False],
                index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4']))
ind1 | True |
ind2 | False |
ind3 | True |
ind4 | False |
How do I set the last 2 values of each row in df1 to NA, based on the boolean values in df2?
In this case, since ind1 and ind3 are True, the rows with the same index labels in df1 would be affected:
index | col1 | col2 | col3 | col4 |
---|---|---|---|---|
ind1 | a | b | NaN | NaN |
ind2 | d | c | f | NaN |
ind3 | g | i | NaN | NaN |
ind4 | j | NaN | NaN | NaN |
CodePudding user response:
A possible solution, based on pandas.DataFrame.mask:
df1[['col3', 'col4']] = df1[['col3', 'col4']].mask(df2)
Output:
col1 col2 col3 col4
index
ind1 a b NaN NaN
ind2 d c f NaN
ind3 g i NaN NaN
ind4 j NaN NaN NaN
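The same idea can be generalized to "the last n columns" instead of hard-coding col3 and col4. The following is only a minimal sketch: the helper name blank_last_n and its parameters are hypothetical, not part of pandas, and it assumes the boolean Series shares its index labels with the DataFrame, as in the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    data={
        "col1": ["a", "d", "g", "j"],
        "col2": ["b", "c", "i", np.nan],
        "col3": ["c", "f", "i", np.nan],
        "col4": ["x", np.nan, np.nan, np.nan],
    },
    index=pd.Series(["ind1", "ind2", "ind3", "ind4"], name="index"),
)
df2 = pd.Series([True, False, True, False],
                index=["ind1", "ind2", "ind3", "ind4"])


def blank_last_n(df, flags, n=2):
    """Hypothetical helper: return a copy of df whose last n columns
    are NaN in every row where flags is True."""
    out = df.copy()
    cols = out.columns[-n:]             # labels of the last n columns
    out[cols] = out[cols].mask(flags)   # mask replaces values where flags is True
    return out


result = blank_last_n(df1, df2, n=2)
print(result)
```

Because the helper works on a copy, df1 itself is left unchanged; assign the result back to df1 if in-place behaviour is wanted.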
CodePudding user response:
You can use boolean indexing:
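# Number of trailing columns to blank out.
N = 2
# Rows are selected with a plain boolean array (iloc can reject a boolean
# Series as a mask); the last N columns are selected by position.
df1.iloc[df2.to_numpy(), -N:] = np.nan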
NB. What you call df2 is actually a Series; s or ser might be a more appropriate name.
Output:
col1 col2 col3 col4
index
ind1 a b NaN NaN
ind2 d c f NaN
ind3 g i NaN NaN
ind4 j NaN NaN NaN
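Note that iloc is positional, so the approach above relies on the boolean values being in the same row order as df1. If that is not guaranteed, a label-based variant with loc may be safer, since loc aligns a boolean Series on the index. This is a sketch under that assumption; the name flags is only illustrative, and its rows are deliberately shuffled to show the alignment.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    data={
        "col1": ["a", "d", "g", "j"],
        "col2": ["b", "c", "i", np.nan],
        "col3": ["c", "f", "i", np.nan],
        "col4": ["x", np.nan, np.nan, np.nan],
    },
    index=pd.Series(["ind1", "ind2", "ind3", "ind4"], name="index"),
)

# Same flags as in the question, but listed in a different row order.
flags = pd.Series([False, True, False, True],
                  index=["ind4", "ind3", "ind2", "ind1"])

N = 2
# loc matches the boolean Series to df1 by index label, not by position;
# the last N column labels are taken from df1.columns.
df1.loc[flags, df1.columns[-N:]] = np.nan
print(df1)
```

If flags is missing any label that exists in df1, loc raises an error rather than guessing, which makes mismatches visible early.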