pandas - filter only works on 1 row when used as a stored variable?

Time:10-24

I'm having a strange issue when using a stored boolean filter with loc.

example df:

  Name  trail
0  XYZ   True
1    A   True
2    B   True
3    C   True

# Trail filter
filter_trail = (df['trail'] == False)

# Set a row to False to check
df.at[3, 'trail'] = False

# use the filter, using loc because I will combine conditions
df.loc[filter_trail]
# I get the expected result

# Test further
df.at[0, 'trail'] = False
# use loc statement from earlier
# result only shows the 1st row i.e. the row with index 3
# No error in terminal

# decide to try dropping the column and setting column
df.drop('trail', axis=1, inplace=True)
df['trail'] = [True, False, False, False]

# run loc
df.loc[filter_trail]
# result still only shows row with index 3

# run without loc
df[filter_trail]
# result still only shows row with index 3

# run
df[df['trail'] == False]
# Get the desired result i.e. row index: 1,2,3

I am not sure what I am doing wrong here. Never seen this happen before.

CodePudding user response:

filter_trail is not a reference to the DataFrame. It is a boolean Series computed from the trail column at the moment that line runs. The result is a new, independent set of values, so later changes to df (or to the trail column) do not update it.

Two other Stack Overflow contributors (see the comments above) and I tried your code, and all three of us got an empty result when filtering with filter_trail.
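To make the point concrete, here is a minimal sketch that rebuilds the example df from the question (the constructor call and the names stale, fresh, is_not_trail and lazy are my own, not from the original post). It compares the stored mask against one recomputed after the edits, and shows one possible way to keep a reusable filter by passing a callable to loc, so the condition is evaluated against the current data each time.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"Name": ["XYZ", "A", "B", "C"],
                   "trail": [True, True, True, True]})

# The mask is computed once, from the values present right now:
# every row is True, so the mask is all False.
filter_trail = df["trail"] == False

# Later edits change df, but not the already-computed mask.
df.at[3, "trail"] = False
df.at[0, "trail"] = False

stale = df.loc[filter_trail]          # uses the old snapshot
fresh = df.loc[df["trail"] == False]  # recomputed on the current data

# A callable is evaluated by loc each time it is used, so it always
# sees the current contents of the frame it is applied to.
is_not_trail = lambda d: ~d["trail"]
lazy = df.loc[is_not_trail]

print(len(stale))
print(list(fresh.index))
print(list(lazy.index))
```

Since you plan to combine conditions, the same idea should carry over: either rebuild each mask right before use, or combine inside the callable, e.g. df.loc[lambda d: ~d["trail"] & (d["Name"] != "A")].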
