How to handle KeyError(key)

import pandas as pd

colnames = ['Date', 'Items', 'Quantity', 'Price']
df1 = pd.read_csv('data_assignment_1.txt',sep=" ",names=colnames, header=None)

print(df1)


Output:

         Date   Items  Quantity  Price
0  2020-09-23  Item_A         1    1.9
1  2020-09-23  Item_B         1    1.2
2  2020-09-23  Item_A         2    1.9
3  2020-09-23  Item_B         1    1.2
4  2020-09-24  Item_A         1    1.9
5  2020-09-24  Item_B         1    1.2
6  2020-09-24  Item_C         1    1.3
7  2020-09-25  Item_D         1    2.7

Now I group by Date and Items to find the total Quantity:

groupby_date_item = df1.groupby(['Date', 'Items'])['Quantity'].sum()

print(groupby_date_item['2020-09-23','Item_A'])

Output result:
 3

Now the problem: if I look up Item_D with the date 2020-09-23, I get an error:

print(groupby_date_item['2020-09-23','Item_D'])

Output result:
raise KeyError(key) from err
KeyError: ('2020-09-23', 'Item_D')

How do I handle the error when an item does not exist on that date, or when the date or item is entered incorrectly?

CodePudding user response：

As you have discovered, pandas raises a KeyError when you try to retrieve a value for a key that does not exist in the index. One way to handle this is a try-except clause.

See the example below, where a custom function wraps the lookup in a try-except clause. If the lookup fails, the function returns np.nan.

Code:

import numpy as np
import pandas as pd

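def get_Item(my_Date, my_Item, gb_date_item):
    # Look up the summed Quantity for one (Date, Items) pair.
    try:
        result = gb_date_item[my_Date, my_Item]
    except KeyError:
        # The pair is not in the grouped index, so report a missing value.
        result = np.nan

    return result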

df1 = pd.DataFrame({ 'Date': ['2020-09-23', '2020-09-23', '2020-09-23', '2020-09-23', '2020-09-24', '2020-09-24', '2020-09-24', '2020-09-25'],
                    'Items': ['Item_A', 'Item_B', 'Item_A', 'Item_B', 'Item_A', 'Item_B', 'Item_C', 'Item_D'],
                    'Quantity': [1, 1, 2, 1, 1, 1, 1, 1],
                    'Price': [1.9, 1.2, 1.9, 1.2, 1.9, 1.2, 1.3, 2.7]})

groupby_date_item = df1.groupby(['Date', 'Items'])['Quantity'].sum()


print(get_Item('2020-09-23', 'Item_A', groupby_date_item))
print(get_Item('2020-09-23', 'Item_D', groupby_date_item))

Output:

3
nan
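
A shorter variant of the same idea, as a rough sketch: Series.get returns a default value instead of raising when the key is missing. The data is copied from the question (without the Price column), and using np.nan as the default is my own choice here to match the function above.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2020-09-23', '2020-09-23', '2020-09-23', '2020-09-23',
             '2020-09-24', '2020-09-24', '2020-09-24', '2020-09-25'],
    'Items': ['Item_A', 'Item_B', 'Item_A', 'Item_B',
              'Item_A', 'Item_B', 'Item_C', 'Item_D'],
    'Quantity': [1, 1, 2, 1, 1, 1, 1, 1],
})

groupby_date_item = df1.groupby(['Date', 'Items'])['Quantity'].sum()

# Series.get hands back the default instead of raising KeyError.
# The np.nan default is an assumption; pick whatever suits your data.
found = groupby_date_item.get(('2020-09-23', 'Item_A'), default=np.nan)
missing = groupby_date_item.get(('2020-09-23', 'Item_D'), default=np.nan)

print(found)
print(missing)
```

This avoids writing the try-except yourself, but it also hides the difference between a mistyped key and a pair that genuinely has no sales, so it may not suit every case.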

CodePudding user response：

Here is an alternative that checks whether the key exists before looking it up. For your question, the short version is: if ('2020-09-23', 'Item_D') in groupby_date_item: ...

This is a more complete example:

import pandas as pd

df1 = pd.DataFrame({
    "Date": ['2020-09-23', '2020-09-23', '2020-09-25'],
    "Items": ['Item_A', 'Item_A', 'Item_D'],
    "Quantity": [1, 1, 1]
})

g = df1.groupby(['Date','Items'])['Quantity'].sum()

for tup in [('2020-09-23', 'Item_A'), ('2020-09-23', 'Item_D')]:
    if tup in g.index:
        print(f'{str(tup)}: {g[tup]}')
    else:
        print(f'{str(tup)}: not found')
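
If you need to look up several (Date, Items) pairs at once, one possible approach is to reindex the grouped Series against the pairs you want. This is only a sketch: the name wanted is made up for illustration, and fill_value=0 assumes that a missing pair should count as zero quantity, which may not be what you want.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Date": ['2020-09-23', '2020-09-23', '2020-09-25'],
    "Items": ['Item_A', 'Item_A', 'Item_D'],
    "Quantity": [1, 1, 1],
})

g = df1.groupby(['Date', 'Items'])['Quantity'].sum()

# Hypothetical list of pairs to look up; the second one is not in g.
wanted = pd.MultiIndex.from_tuples(
    [('2020-09-23', 'Item_A'), ('2020-09-23', 'Item_D')],
    names=['Date', 'Items'],
)

# reindex keeps existing totals and fills absent pairs with fill_value
# (0 here is an assumption: "not sold" is treated as quantity zero).
totals = g.reindex(wanted, fill_value=0)

print(totals)
```

Compared with the loop above, this returns a Series with one row per requested pair, so the result can be used directly in further calculations.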