KeyError: False when I enter conditions in Pandas

Time:11-26

I'm getting a KeyError: False when I run this line:

df['Eligible'] = df[('DeliveryOnTime' == "On-time") | ('DeliveryOnTime' == "Early")]

I've also tried to apply this condition with np.where and .loc, but neither works. I'm open to other ideas on how to populate the new column Eligible using the data in DeliveryOnTime.

I've tried these:

np.where

df['Eligible'] = np.where((df['DeliveryOnTime'] == "On-time") | (df['DeliveryOnTime'] == "Early"), 1, 1)

.loc()

df['Eligible'] = df.loc[(df['DeliveryOnTime'] == "On-time") & (df['DeliveryOnTime'] == "Early"), 'Total Orders'].sum()

Sample Data:

import pandas as pd

data = {'ID': [1, 1, 1, 2, 2, 3, 4, 5, 5],
        'DeliveryOnTime': ["On-time", "Late", "Early", "On-time", "On-time", "Late", "Early", "Early", "On-time"],
       }

df = pd.DataFrame(data)

# For the sake of example data, the count of `DeliveryOnTime` will be the total number of orders.
df['Total Orders'] = df['DeliveryOnTime'].count() 

CodePudding user response:

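In your line, 'DeliveryOnTime' == "On-time" compares two plain strings, which is just False, so pandas ends up looking for a column named False. Compare the column df['DeliveryOnTime'] instead. The right syntax is: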

df['Eligible'] = (df['DeliveryOnTime'] == "On-time") | (df['DeliveryOnTime'] == "Early")

# OR

df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])

Output:

>>> df
   ID DeliveryOnTime  Total Orders  Eligible
0   1        On-time             9      True
1   1           Late             9     False
2   1          Early             9      True
3   2        On-time             9      True
4   2        On-time             9      True
5   3           Late             9     False
6   4          Early             9      True
7   5          Early             9      True
8   5        On-time             9      True
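
To see where the KeyError: False comes from, here is a minimal sketch that reproduces it with the sample data from the question. The variable names key and error_text are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 4, 5, 5],
    "DeliveryOnTime": ["On-time", "Late", "Early", "On-time", "On-time",
                       "Late", "Early", "Early", "On-time"],
})

# Two string literals are compared here; the DataFrame is never consulted.
key = ("DeliveryOnTime" == "On-time") | ("DeliveryOnTime" == "Early")
print(key)  # False

try:
    df[key]  # same as df[False]: a lookup for a column labelled False
except KeyError as exc:
    error_text = str(exc)
    print("KeyError:", error_text)
```

Because the expression inside the brackets is evaluated by Python first, pandas only ever receives the single value False, not a row-by-row mask.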

CodePudding user response:

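The df references are misplaced: each comparison has to be made on the column df['DeliveryOnTime'], not on the bare string 'DeliveryOnTime', and the result is assigned directly instead of being used to index df. Please try: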

df['Eligible'] = (df['DeliveryOnTime'] == "On-time") | (df['DeliveryOnTime'] == "Early")

Output:

>>> df
   ID DeliveryOnTime  Eligible
0   1        On-time      True
1   1           Late     False
2   1          Early      True
3   2        On-time      True
4   2        On-time      True
5   3           Late     False
6   4          Early      True
7   5          Early      True
8   5        On-time      True
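
If you would rather have a 1/0 flag than True/False, np.where also works once the condition is built from the column. Note that the np.where attempt in the question passes 1 for both branches, so every row would come out as 1. A sketch, assuming 1 should mean eligible and 0 not eligible (that encoding is my assumption, not something stated in the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 4, 5, 5],
    "DeliveryOnTime": ["On-time", "Late", "Early", "On-time", "On-time",
                       "Late", "Early", "Early", "On-time"],
})

# Row-by-row boolean mask built from the column itself.
mask = df["DeliveryOnTime"].isin(["On-time", "Early"])

# Second argument is used where the mask is True, third where it is False.
df["Eligible"] = np.where(mask, 1, 0)
print(df)
```

mask.astype(int) should give the same column without NumPy; np.where is mainly useful when the two values are something other than 1 and 0.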

CodePudding user response:

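You cannot refer to a column by its bare name as a string; the comparison has to go through the DataFrame. Have a look at this solution, which flags all rows that are either 'On-time' or 'Early' and then counts the flagged rows per ID: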

df["eligible"] = df.DeliveryOnTime.isin(['On-time', 'Early'])
# Assign the per-ID count of eligible rows; without the assignment the result is discarded
df['TotalOrders'] = df['eligible'].groupby(df['ID']).transform('sum')
df
   ID DeliveryOnTime  eligible  TotalOrders
0   1        On-time      True            2
1   1           Late     False            2
2   1          Early      True            2
3   2        On-time      True            2
4   2        On-time      True            2
5   3           Late     False            0
6   4          Early      True            1
7   5          Early      True            2
8   5        On-time      True            2
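
On the .loc attempt from the question: joining the two comparisons with & asks for rows that are "On-time" and "Early" at the same time, which no row can be, so the selection is empty. A sketch of the difference, assuming each row is one order (my assumption) and using the illustrative names both, either and eligible_rows:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 4, 5, 5],
    "DeliveryOnTime": ["On-time", "Late", "Early", "On-time", "On-time",
                       "Late", "Early", "Early", "On-time"],
})

on_time = df["DeliveryOnTime"] == "On-time"
early = df["DeliveryOnTime"] == "Early"

both = on_time & early    # a single cell cannot hold both values
either = on_time | early  # rows that hold one value or the other

print(both.sum())    # 0
print(either.sum())  # 7

# .loc with a boolean mask keeps only the matching rows.
eligible_rows = df.loc[either]
print(len(eligible_rows))  # 7
```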