I have the following df:
colA colB colC
12 33 66
13 35 67
14 44 77
15 55 79
18 56 81
I would like to replace the values of colB and colC with None starting from index 2 all the way to the end of df. The expected output is:
colA colB colC
12 33 66
13 35 67
14 None None
15 None None
18 None None
CodePudding user response:
Use DataFrame.loc with df.index[2:] for the rows, which works with any index, and the column names in a list:
df.loc[df.index[2:], ['colB','colC']] = None
If the DataFrame has the default RangeIndex, you can use 2: directly:
df.loc[2:, ['colB','colC']] = None
print (df)
colA colB colC
0 12 33.0 66.0
1 13 35.0 67.0
2 14 NaN NaN
3 15 NaN NaN
4 18 NaN NaN
Because the columns are numeric, the None values are converted to NaN, which also makes those columns float.
If you need integers with missing values, use the nullable Int64 dtype:
df[['colB','colC']] = df[['colB','colC']].astype('Int64')
print (df)
colA colB colC
0 12 33 66
1 13 35 67
2 14 <NA> <NA>
3 15 <NA> <NA>
4 18 <NA> <NA>
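For reference, the steps in this answer can be put together as one self-contained sketch. The DataFrame construction is an assumption based on the table in the question; the rest follows the lines above.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to be plain integers).
df = pd.DataFrame({
    "colA": [12, 13, 14, 15, 18],
    "colB": [33, 35, 44, 55, 56],
    "colC": [66, 67, 77, 79, 81],
})

# df.index[2:] slices the index by position, so the resulting labels
# are valid for .loc whatever the index looks like.
df.loc[df.index[2:], ["colB", "colC"]] = None

# The missing values are stored as NaN at this point; converting to the
# nullable Int64 dtype restores integers and shows the gaps as <NA>.
df[["colB", "colC"]] = df[["colB", "colC"]].astype("Int64")

print(df)
```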
CodePudding user response:
You can do something like this:
df.loc[2:, "colB":] = None
This uses loc to select the rows from index label 2 onward and the columns from colB through the last column (here colB and colC), then assigns None to them. Because the columns are numeric, the replaced values show up as NaN.
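One thing to keep in mind with the open-ended "colB": slice is that it covers every column from colB to the right edge of the frame. The sketch below illustrates this with a hypothetical extra column colD, which is not in the question, and compares it with a slice bounded on both ends (label slices in loc include both endpoints).

```python
import pandas as pd

df = pd.DataFrame({
    "colA": [12, 13, 14, 15, 18],
    "colB": [33, 35, 44, 55, 56],
    "colC": [66, 67, 77, 79, 81],
    "colD": [1, 2, 3, 4, 5],  # hypothetical extra column, for illustration only
})

# Open-ended column slice: colB, colC and colD are all overwritten.
open_ended = df.copy()
open_ended.loc[2:, "colB":] = None

# Bounded column slice: only colB through colC (inclusive) are overwritten.
bounded = df.copy()
bounded.loc[2:, "colB":"colC"] = None

print(open_ended)
print(bounded)
```

If more columns might be added later, the bounded slice or an explicit list of column names is the safer choice.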
CodePudding user response:
Apart from pandas.DataFrame.loc (which jezrael mentions), one can use pandas.DataFrame.iloc as follows:
df.iloc[2:, 1:] = None
[Out]:
colA colB colC
0 12 33.0 66.0
1 13 35.0 67.0
2 14 NaN NaN
3 15 NaN NaN
4 18 NaN NaN
Note that colB and colC are floats, because NaN is a float. If one doesn't want those columns to be float64, one approach is to use pandas.Int64Dtype as follows:
df[['colB', 'colC']] = df[['colB', 'colC']].astype(pd.Int64Dtype())
[Out]:
colA colB colC
0 12 33 66
1 13 35 67
2 14 <NA> <NA>
3 15 <NA> <NA>
4 18 <NA> <NA>
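Since iloc works purely by position, it behaves the same when the index is not the default one. Below is a minimal sketch under that assumption, using a hypothetical string index and looking up the column position with Index.get_loc instead of hard-coding 1.

```python
import pandas as pd

# Same data as in the question, but with a hypothetical string index.
df = pd.DataFrame(
    {
        "colA": [12, 13, 14, 15, 18],
        "colB": [33, 35, 44, 55, 56],
        "colC": [66, 67, 77, 79, 81],
    },
    index=list("abcde"),
)

# Position of colB among the columns (1 here), so the slice does not
# rely on a hard-coded number.
start = df.columns.get_loc("colB")

# Rows from position 2 onward, columns from colB's position onward.
df.iloc[2:, start:] = None

print(df)
```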