Using 'isin()' function to compare values in two different pandas series - unhashable type

Time:11-14

I have the following code.

I am trying to check if a 'date-time' value in the column numberofeachconditiononthatdate['Date'] is in the column 'luckonthatdate['Date']'.

If it is, then I want that particular date-time value to be assigned to the variable 'value'.

If not, then I want the variable 'value' to equal 0.

In other words, I want to add a new column to the 'numberofeachconditiononthatdate' dataframe, indicating the number of 'luck' trials on a given date.

luckvalues = []

for idx in numberofeachconditiononthatdate.iterrows():
    if numberofeachconditiononthatdate['Date'][[idx]].isin(luckonthatdate['Date']):
       value = luckonthatdate['Date'][[idx]]
       luckvalues = luckvalues.append(value)
    else:
       value = 0
       luckvalues = luckvalues.append(value) 

print(luckvalues)

However, this gives me the error 'unhashable type: 'Series''.

I would be so grateful for a helping hand!

numberofeachconditiononthatdate['Date']

0   2020-04-06
1   2020-04-06
2   2020-04-06
3   2020-05-06
4   2020-05-06
5   2020-05-06
6   2020-06-06
7   2020-06-06
8   2020-06-06
9   2020-06-13

luckonthatdate['Date'].head(10)

0    2020-04-06
3    2020-05-06
6    2020-06-06
9    2020-06-13
16   2020-10-06
20   2020-11-06
23   2020-12-06

CodePudding user response:

Instead of an explicit for loop, you can use merge, which does the lookup for every row at once. Something like:

numberofeachconditiononthatdate = (numberofeachconditiononthatdate
                                  .merge(luckonthatdate[['Date', 'luck']], how='left', on='Date'))

numberofeachconditiononthatdate['luck'] = numberofeachconditiononthatdate['luck'].fillna(0)

This adds a luck column to the numberofeachconditiononthatdate dataframe, with 0 for dates that have no match in luckonthatdate. You can rename it afterwards.
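To make the merge approach concrete, here is a minimal sketch on toy data. The frame names df1 and luck and the values in the luck column are made up for illustration, since the post does not show that column. The validate argument is an optional safeguard, not part of the answer above: a left merge duplicates rows when the right-hand frame repeats a date, and validate makes pandas raise instead.

```python
import pandas as pd

# Toy data; the 'luck' values are invented for illustration.
df1 = pd.DataFrame({"Date": pd.to_datetime(
    ["2020-04-06", "2020-04-06", "2020-05-06", "2020-06-13", "2020-07-01"])})
luck = pd.DataFrame({
    "Date": pd.to_datetime(["2020-04-06", "2020-05-06", "2020-06-13", "2020-10-06"]),
    "luck": [5, 1, 3, 2],
})

# A left merge keeps every row of df1; dates missing from `luck` get NaN.
# validate="many_to_one" raises if `luck` contains duplicate dates, which
# would otherwise silently duplicate rows of df1.
merged = df1.merge(luck[["Date", "luck"]], how="left", on="Date",
                   validate="many_to_one")

# Replace the NaN left by unmatched dates with 0 and restore an integer dtype.
merged["luck"] = merged["luck"].fillna(0).astype(int)
print(merged)
```

The last row (2020-07-01) has no match in luck, so it ends up with 0.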

CodePudding user response:

If you want to add a column with the number of times each date occurs in the other dataframe, use value_counts(), pass the result to map(), and finish with fillna(). For brevity, I am going to rename the dataframes:

df1 = numberofeachconditiononthatdate.copy()
df2 = luckonthatdate.copy()

And then create the luck column using:

df2['Luck'] = df2['Date'].map(df1['Date'].value_counts()).fillna(0)

Returning:

         Date  Luck
0  2020-04-06   3.0
1  2020-05-06   3.0
2  2020-06-06   3.0
3  2020-06-13   1.0
4  2020-10-06   0.0
5  2020-11-06   0.0
6  2020-12-06   0.0
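A self-contained sketch of the same idea, rebuilding the two Date columns from the values posted in the question (luckonthatdate is only shown up to head(10), so df2 here is that excerpt). It also shows isin() used on whole columns, which is how it is meant to be called: it returns one boolean per row rather than a single True/False for an if statement.

```python
import pandas as pd

# Date columns rebuilt from the question's printed output.
df1 = pd.DataFrame({"Date": pd.to_datetime(
    ["2020-04-06"] * 3 + ["2020-05-06"] * 3 + ["2020-06-06"] * 3 + ["2020-06-13"])})
df2 = pd.DataFrame(
    {"Date": pd.to_datetime(["2020-04-06", "2020-05-06", "2020-06-06", "2020-06-13",
                             "2020-10-06", "2020-11-06", "2020-12-06"])},
    index=[0, 3, 6, 9, 16, 20, 23],
)

# isin() is vectorised: one boolean per row of df2, no loop required.
present = df2["Date"].isin(df1["Date"])

# value_counts() gives occurrences per date in df1; map() looks each df2 date
# up in those counts, and fillna(0) covers dates that never occur in df1.
df2["Luck"] = df2["Date"].map(df1["Date"].value_counts()).fillna(0)

print(present.tolist())
print(df2)
```

Swapping df1 and df2 in the map() line gives the count in the other direction, that is, a column on numberofeachconditiononthatdate.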

CodePudding user response:

I solved the issue by using a for loop with an 'if string is in row' command.

newvalues = []

for i in range(len(numberofeachconditiononthatdate)):
    # Keep the trial count for 'luck' rows, 0 for everything else
    if 'luck' in numberofeachconditiononthatdate['condition'].iloc[i]:
        newvalue = numberofeachconditiononthatdate['numberoftrials'].iloc[i]
    else:
        newvalue = 0
    newvalues.append(newvalue)

print(newvalues)

[5, 0, 0, 1, 0, 0, 1, 0, 0, 3, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 6, 0, 0]
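For comparison, the same result can probably be had without a loop. This is a sketch on made-up rows, reusing only the column names condition and numberoftrials from the loop above. str.contains with regex=False is a plain substring test, like the 'luck' in ... check, and where() keeps a value where the mask is True and substitutes 0 elsewhere.

```python
import pandas as pd

# Made-up rows; only the column names come from the loop above.
df = pd.DataFrame({
    "condition": ["luck", "skill", "control", "luck"],
    "numberoftrials": [5, 2, 4, 1],
})

# True for rows whose condition contains the substring 'luck'.
is_luck = df["condition"].str.contains("luck", regex=False)

# Keep the trial count on 'luck' rows, use 0 on all other rows.
newvalues = df["numberoftrials"].where(is_luck, 0).tolist()
print(newvalues)  # [5, 0, 0, 1]
```

Assigning the where() result straight to a new column, instead of calling tolist(), would store it on the dataframe.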

Thanks so much for your help :)
