I have a dataframe like this:
import pandas as pd
df = pd.DataFrame(columns=['Dog', 'Small', 'Adult'])
df.Dog = ['Poodle', 'Shepard', 'Bird dog','St.Bernard']
df.Small = [1,1,0,0]
df.Adult = 0
That will look like this:
          Dog  Small  Adult
0      Poodle      1      0
1     Shepard      1      0
2    Bird dog      0      0
3  St.Bernard      0      0
Then I would like to change one column based on another. I can do that:
df.loc[df.Small == 0, 'Adult'] = 1
However, I only want to do so for the first three rows.
I can select the first three rows:
df.iloc[0:3]
But if I try to change values in the first three rows:
df.iloc[0:3, df.Small == 0, 'Adult'] = 1
I get an error.
I also get a warning if I chain the two:
df.iloc[0:3].loc[df.Small == 0, 'Adult'] = 1
It tells me that I am trying to set a value on a copy of a slice.
How should I do this correctly?
Answer:
You could include the range as another condition in your .loc
selection (for the general case, I'll explicitly include the 0):
df.loc[(df.Small == 0) & (0 <= df.index) & (df.index <= 2), 'Adult'] = 1
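For anyone who wants to run it, here is a minimal self-contained sketch of the line above. It rebuilds the frame from the question with a dict constructor (my shortcut, not the asker's code) and stores the combined condition in a variable named mask, which is just an illustrative name:

```python
import pandas as pd

# Same data as in the question, built in one step
df = pd.DataFrame({
    'Dog': ['Poodle', 'Shepard', 'Bird dog', 'St.Bernard'],
    'Small': [1, 1, 0, 0],
    'Adult': 0,
})

# True only where Small == 0 AND the index label lies in 0..2 (inclusive)
mask = (df.Small == 0) & (0 <= df.index) & (df.index <= 2)

# A single .loc call with row mask and column label, so no chained indexing
df.loc[mask, 'Adult'] = 1

print(df)
```

Only 'Bird dog' (index 2) should be updated; 'St.Bernard' also has Small == 0 but falls outside the first three rows.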
Another option is to transform the index into a series to use pd.Series.between:
df.loc[(df.Small == 0) & (df.index.to_series().between(0, 2)), 'Adult'] = 1
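Both options compare index labels, so they rely on the default 0, 1, 2, ... index. If your index is something else (strings, dates, or a shuffled range), a positional mask may be closer to what you want. This is only a sketch under that assumption: the string labels, n, and in_first_n are names I made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'Dog': ['Poodle', 'Shepard', 'Bird dog', 'St.Bernard'],
        'Small': [1, 1, 0, 0],
        'Adult': 0,
    },
    index=['a', 'b', 'c', 'd'],  # hypothetical non-numeric labels
)

n = 3  # number of leading rows to consider

# Positional condition: True for the first n rows regardless of their labels
in_first_n = np.arange(len(df)) < n

# Combine with the value condition as plain boolean arrays, then assign once
df.loc[(df.Small == 0).to_numpy() & in_first_n, 'Adult'] = 1

print(df)
```

Because the row selection is a plain boolean array of the same length as the frame, .loc applies it by position, while the column is still selected by its label.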