I am trying to replace a value based on the row index, and only for certain columns in a dataframe.
For columns b and c, I want to replace the value 1 with np.nan, for rows 1, 2 and 3.
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a': ['"dog", "cat"', '"dog"', '"mouse"', '"mouse", "cat", "bird"', '"circle", "square"', '"circle"', '"triangle", "square"', '"circle"'],
                        'b': [1,1,3,4,5,1,2,3],
                        'c': [3,4,1,3,2,1,0,0],
                        'd': ['a','a','b','c','b','c','d','e'],
                        'id': ['group1','group1','group1','group1', 'group2','group2','group2','group2']})
I am using the following line, but it's not updating in place, and if I try assigning the result, I get back only the subset of amended rows rather than an updated version of the original dataframe.
df[df.index.isin([1,2,3])][['b','c']].replace(1, np.nan, inplace=True)
CodePudding user response:
You could do it like this:
df.loc[1:3, ['b', 'c']] = df.loc[1:3, ['b', 'c']].replace(1, np.nan)
Output:
>>> df
                        a    b    c  d      id
0            "dog", "cat"  1.0  3.0  a  group1
1                   "dog"  NaN  4.0  a  group1
2                 "mouse"  3.0  NaN  b  group1
3  "mouse", "cat", "bird"  4.0  3.0  c  group1
4      "circle", "square"  5.0  2.0  b  group2
5                "circle"  1.0  1.0  c  group2
6    "triangle", "square"  2.0  0.0  d  group2
7                "circle"  3.0  0.0  e  group2
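As for why the chained version in the question has no effect, here is a rough sketch of what I believe is going on: the boolean filter builds a new DataFrame, so anything done to it never reaches df, while a single .loc call selects rows and columns together and assigns into df itself. The small frame and the variable names below are made up for illustration, and I start from float columns so that NaN fits without a dtype change.

```python
import numpy as np
import pandas as pd

# Small made-up frame for illustration; floats so NaN fits without a dtype change
df = pd.DataFrame({'b': [1.0, 1.0, 3.0, 4.0, 1.0],
                   'c': [3.0, 4.0, 1.0, 3.0, 1.0]})
original = df.copy()

# Chained indexing: the boolean filter builds a new DataFrame,
# so replacing values in it never reaches df
subset = df[df.index.isin([1, 2, 3])][['b', 'c']]
subset = subset.replace(1, np.nan)
chained_changed_df = not df.equals(original)

# One .loc call selects rows and columns together and assigns into df itself
df.loc[[1, 2, 3], ['b', 'c']] = df.loc[[1, 2, 3], ['b', 'c']].replace(1, np.nan)
loc_changed_df = not df.equals(original)

print(chained_changed_df, loc_changed_df)
```

Rows 0 and 4 keep their 1 values because they fall outside the selected rows.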
A more dynamic version:
cols = ['b', 'c']
rows = slice(1, 3)  # .loc slices by label and includes both ends, so this is rows 1, 2 and 3; [1, 2, 3] also works
df.loc[rows, cols] = df.loc[rows, cols].replace(1, np.nan)
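If you need this in more than one place, the same idea could be wrapped in a small function. This is only a sketch: replace_in_block is a hypothetical name, not a pandas function, and it works on a copy so the caller's frame is left alone. The cast to float is my own assumption, added because recent pandas versions may warn about (or refuse) writing NaN into integer columns.

```python
import numpy as np
import pandas as pd

def replace_in_block(df, rows, cols, old, new):
    """Hypothetical helper: replace `old` with `new` only in df.loc[rows, cols]."""
    out = df.copy()  # leave the caller's frame untouched
    if isinstance(new, float):
        # Assumption: cast up front so float replacements such as NaN fit the columns
        out[cols] = out[cols].astype(float)
    out.loc[rows, cols] = out.loc[rows, cols].replace(old, new)
    return out

df = pd.DataFrame({'b': [1, 1, 3, 4, 5, 1, 2, 3],
                   'c': [3, 4, 1, 3, 2, 1, 0, 0]})
result = replace_in_block(df, slice(1, 3), ['b', 'c'], 1, np.nan)
print(result)
```

Here rows can be a label slice or a list of labels, just like in the version above.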