I have a function that is raising "TypeError: unhashable type: 'DataFrame'". There are two calls to the function: the first succeeds while the second throws the error. The only difference between the two calls is the data; the second call's data is a subset of the first (rows where the absolute value of Incurred Loss & ALAE is at least 250,000).
import pandas as pd

def loss_by_entity(data, col_list, sum_list):
    agg_list = sum_list
    for i, item in enumerate(sum_list, start=0):
        agg_list[i] = data.groupby(col_list)[item].sum()
        agg_list[i] = pd.DataFrame(agg_list[i])
        agg_list[i]['Loss_Type'] = item
    agg_data = pd.concat(agg_list)
    return agg_data
These are the calls and the adjustment to the data:
Loss_Type = ['Incurred Loss', 'Paid ALAE', 'Incurred Loss & ALAE']
Quarterly_Total = loss_by_entity(data, ['lg_loss_entity'], Loss_Type)
lg_loss_data = data[abs(data['Incurred Loss & ALAE']) >= 250000]
Large_Loss = loss_by_entity(lg_loss_data, ['lg_loss_entity'], Loss_Type)
This is the error in its entirety:
Traceback (most recent call last):
  File "H:\Python\Excel_Automation\Large_Loss_Report.py", line 47, in <module>
    Large_Loss = loss_by_entity(lg_loss_data, ['lg_loss_entity'], Loss_Type)
  File "H:\Python\Excel_Automation\Large_Loss_Report.py", line 12, in loss_by_entity
    agg_list[i] = data.groupby(col_list)[item].sum()
  File "H:\Python\Excel_Automation\venv\lib\site-packages\pandas\core\groupby\generic.py", line 1338, in __getitem__
    return super().__getitem__(key)
  File "H:\Python\Excel_Automation\venv\lib\site-packages\pandas\core\base.py", line 249, in __getitem__
    if key not in self.obj:
  File "H:\Python\Excel_Automation\venv\lib\site-packages\pandas\core\generic.py", line 1994, in __contains__
    return key in self._info_axis
  File "H:\Python\Excel_Automation\venv\lib\site-packages\pandas\core\indexes\base.py", line 5008, in __contains__
    hash(key)
TypeError: unhashable type: 'DataFrame'
CodePudding user response:
I think the problem is agg_list = sum_list, which makes agg_list a second name for the same list object, so sum_list (and therefore Loss_Type) is also modified inside the loop. After the first call, the elements of Loss_Type had been replaced with DataFrames, and passing a DataFrame as a column key to the groupby caused the error on the second call. You can use agg_list = sum_list.copy() to avoid it, or create an empty list for agg_list and append the new DataFrames to it.
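As a minimal sketch of the second option, assuming the same column layout as in the question, the function can accumulate results in a fresh list and leave the caller's argument untouched:

import pandas as pd

def loss_by_entity(data, col_list, sum_list):
    # Build a new list instead of overwriting sum_list in place,
    # so the caller's Loss_Type list is never modified.
    agg_list = []
    for item in sum_list:
        agg = data.groupby(col_list)[item].sum().to_frame()
        agg['Loss_Type'] = item
        agg_list.append(agg)
    return pd.concat(agg_list)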
Here is a similar question you can refer to: How do I clone a list so that it doesn't change unexpectedly after assignment?
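For illustration, a small standalone demonstration of the aliasing behaviour (the variable names here are hypothetical):

a = ['Incurred Loss', 'Paid ALAE']
b = a                # b and a are the same list object, not a copy
b[0] = 'replaced'
print(a)             # ['replaced', 'Paid ALAE'] -- a was modified too

c = a.copy()         # an independent shallow copy
c[0] = 'safe'
print(a)             # still ['replaced', 'Paid ALAE'] -- a is unaffected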