Check which value from my list is not in my dataframe column

Time:12-10

I need to check whether any of the values in my list are missing from my df column. I used this:

data_xls['date'].isin([datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)])

But this only flags the column values that appear in the list. I also want to know which values from my list are missing from the column. How can I do that?

CodePudding user response:

If you need the values in dates that do not appear in the data_xls['date'] column, use:

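from datetime import datetime
import pandas as pd

# sample data: five consecutive minutes starting at 11:11
data_xls = pd.DataFrame({'date': pd.date_range(datetime(2015, 7, 20, 11,11), 
                                               freq='1Min', periods=5)})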
print (data_xls)
                 date
0 2015-07-20 11:11:00
1 2015-07-20 11:12:00
2 2015-07-20 11:13:00
3 2015-07-20 11:14:00
4 2015-07-20 11:15:00

dates = [datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),
         datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)]

# build the set once instead of on every iteration; list order is preserved
existing = set(data_xls['date'])
missing = [x for x in dates if x not in existing]
print (missing)
[datetime.datetime(2015, 7, 20, 11, 7), datetime.datetime(2015, 7, 20, 11, 16)]

# set difference; sorted() because sets do not guarantee any order
missing = sorted(set(dates) - set(data_xls['date']))
print (missing)
[datetime.datetime(2015, 7, 20, 11, 7), datetime.datetime(2015, 7, 20, 11, 16)]
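
A pandas-only variant is also possible by turning the check around: put the list into a Series and test it against the column with isin. The sketch below reuses the sample data from this answer; the name wanted is only illustrative. It keeps the order of the list, and note that the results come back as pandas Timestamp objects rather than plain datetime objects.

```python
from datetime import datetime
import pandas as pd

# Same sample data as above: five consecutive minutes starting at 11:11
data_xls = pd.DataFrame({'date': pd.date_range(datetime(2015, 7, 20, 11, 11),
                                               freq='min', periods=5)})

dates = [datetime(2015, 7, 20, 11, 7), datetime(2015, 7, 20, 11, 13),
         datetime(2015, 7, 20, 11, 14), datetime(2015, 7, 20, 11, 16)]

# Put the list into a Series, then keep the entries NOT found in the column
wanted = pd.Series(dates)
missing = wanted[~wanted.isin(data_xls['date'])].tolist()
print(missing)
```

Here missing should hold the 11:07 and 11:16 entries, in the same order as they appear in dates.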

CodePudding user response:

Use the ~ operator to negate the isin mask. This selects the dates in the column that are not in your list:

lst = [datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)]
data_xls['date'][~data_xls['date'].isin(lst)]

But since you want the opposite, the dates in your list that are missing from data_xls, use a set difference:

set(lst).difference(data_xls['date'])
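
To make the two directions concrete, here is a minimal self-contained sketch. The sample frame (five consecutive minutes starting at 11:11) and the names not_in_list and not_in_column are assumptions for illustration, not part of the question. It uses pd.Index.difference in place of a plain set, which returns the result sorted.

```python
from datetime import datetime
import pandas as pd

# Assumed sample data: 11:11 through 11:15, one row per minute
data_xls = pd.DataFrame({'date': pd.date_range('2015-07-20 11:11',
                                               freq='min', periods=5)})

lst = [datetime(2015, 7, 20, 11, 7), datetime(2015, 7, 20, 11, 13),
       datetime(2015, 7, 20, 11, 14), datetime(2015, 7, 20, 11, 16)]

# Direction 1: values in the column that are not in the list
not_in_list = data_xls.loc[~data_xls['date'].isin(lst), 'date']

# Direction 2: values in the list that are missing from the column
not_in_column = pd.Index(lst).difference(pd.Index(data_xls['date']))

print(not_in_list.tolist())
print(list(not_in_column))
```

With this sample data, the first result should contain 11:11, 11:12 and 11:15, and the second should contain 11:07 and 11:16, which is what the question asks for.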