Home > Blockchain >  How to handle KeyError(key)
How to handle KeyError(key)

Time:12-29

import pandas as pd

colnames = ['Date', 'Items', 'Quantity', 'Price']
df1 = pd.read_csv('data_assignment_1.txt',sep=" ",names=colnames, header=None)

print(df1)


Output:

         Date   Items  Quantity  Price

0  2020-09-23  Item_A         1    1.9
1  2020-09-23  Item_B         1    1.2
2  2020-09-23  Item_A         2    1.9
3  2020-09-23  Item_B         1    1.2
4  2020-09-24  Item_A         1    1.9
5  2020-09-24  Item_B         1    1.2
6  2020-09-24  Item_C         1    1.3
7  2020-09-25  Item_D         1    2.7

Now I groupby date and items to find the total Quantity:

groupby_date_item = df1.groupby(['Date', 'Items'])['Quantity'].sum()

print(groupby_date_item['2020-09-23','Item_A'])

Output result:
 3

Now the problem is if i put Item D with the date 2020-09-23 I will get an error:

print(groupby_date_item['2020-09-23','Item_D'])

Output result:
raise KeyError(key) from err
KeyError: ('2020-09-23', 'Item_d')

How do I handle the error if item does not exist on that date or input wrong date and item?

CodePudding user response:

As you have discovered, you will get an Error if you try to retrieve a value that does not exist. One way to resolve this issue is to use a try-except clause.

See example below - where I use a custom function to handle the data retrieval using a try-except clause. If the function generates an error, it will return np.nan.

Code:

import numpy as np
import pandas as pd

def get_Item(my_Date, my_Item, gb_date_item):
    try:
        result = gb_date_item[my_Date, my_Item]
    except:
        result = np.nan
        
    return result

df1 = pd.DataFrame({ 'Date': ['2020-09-23', '2020-09-23', '2020-09-23', '2020-09-23', '2020-09-24', '2020-09-24', '2020-09-24', '2020-09-25'],
                    'Items': ['Item_A', 'Item_B', 'Item_A', 'Item_B', 'Item_A', 'Item_B', 'Item_C', 'Item_D'],
                    'Quantity': [1, 1, 2, 1, 1, 1, 1, 1],
                    'Price': [1.9, 1.2, 1.9, 1.2, 1.9, 1.2, 1.3, 2.7]})

groupby_date_item = df1.groupby(['Date', 'Items'])['Quantity'].sum()


print(get_Item('2020-09-23', 'Item_A', groupby_date_item))
print(get_Item('2020-09-23', 'Item_D', groupby_date_item))

Output:

3
nan

CodePudding user response:

Here is an alternative where you check for the key. In simple terms direct to your question, you would just use if ('2020-09-23', 'Item_D') in groupby_date_item: ...

This is a more complete example:

import pandas as pd

df1 = pd.DataFrame({

    "Date":['2020-09-23','2020-09-23','2020-09-25'],
    "Items":['Item_A','Item_A','Item_D'],
    "Quantity":[1,1,1]
})

g = df1.groupby(['Date','Items'])['Quantity'].sum()

for tup in [('2020-09-23', 'Item_A'), ('2020-09-23', 'Item_D')]:
    if tup in g.index:
        print(f'{str(tup)}: {g[tup]}')
    else:
        print(f'{str(tup)}: not found')
  • Related