I need to check whether any of the values in my list are missing from my df column. I used this:
data_xls['date'].isin([datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)])
But I also want to know which values from my list are missing. How can I do that?
CodePudding user response:
If you need the difference between the dates list and the data_xls['date'] column, use:
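from datetime import datetime
import pandas as pd

# Sample data: five consecutive minutes starting at 11:11
data_xls = pd.DataFrame({'date': pd.date_range(datetime(2015, 7, 20, 11, 11),
                                               freq='1min', periods=5)})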
print (data_xls)
date
0 2015-07-20 11:11:00
1 2015-07-20 11:12:00
2 2015-07-20 11:13:00
3 2015-07-20 11:14:00
4 2015-07-20 11:15:00
dates = [datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),
datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)]
# Build the lookup set once instead of on every iteration
existing = set(data_xls['date'])
missing = [x for x in dates if x not in existing]
print (missing)
[datetime.datetime(2015, 7, 20, 11, 7), datetime.datetime(2015, 7, 20, 11, 16)]
# Alternative: set difference (the order of the result is not guaranteed)
missing = list(set(dates) - set(data_xls['date']))
print (missing)
[datetime.datetime(2015, 7, 20, 11, 7), datetime.datetime(2015, 7, 20, 11, 16)]
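Since the set difference above does not promise any particular order, here is a rough sketch of a vectorised variant that keeps the order of the list. It assumes the same sample frame as in this answer; the names `wanted` and `missing` are only illustrative.

```python
from datetime import datetime

import pandas as pd

# Sample frame from this answer: five consecutive minutes starting at 11:11.
data_xls = pd.DataFrame({
    'date': pd.date_range(datetime(2015, 7, 20, 11, 11), freq='1min', periods=5)
})

dates = [datetime(2015, 7, 20, 11, 7), datetime(2015, 7, 20, 11, 13),
         datetime(2015, 7, 20, 11, 14), datetime(2015, 7, 20, 11, 16)]

# Wrap the list in a Series so isin can test every element at once.
wanted = pd.Series(dates)

# Keep the list entries that do NOT occur in the column; list order is preserved.
missing = wanted[~wanted.isin(data_xls['date'])].tolist()

print(missing)  # the two entries at 11:07 and 11:16
```

The result holds pandas Timestamp objects rather than plain datetime objects, but they compare equal to the corresponding datetime values.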
CodePudding user response:
You need the ~ operator to select the dates in the column that are not in that list:
lst = [datetime(2015, 7, 20, 11,7),datetime(2015, 7, 20, 11,13),datetime(2015, 7, 20, 11,14),datetime(2015, 7, 20, 11,16)]
data_xls['date'][~data_xls['date'].isin(lst)]
But since you want the dates from your list that are missing in data_xls, you can find them with:
set(lst).difference(data_xls['date'])
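The two directions are easy to mix up, so here is a minimal sketch that runs both side by side. The sample frame is an assumption (the question does not show the real data), and the names `not_in_list` and `missing_from_column` are made up for illustration.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data: the column holds 11:11 through 11:15.
data_xls = pd.DataFrame({
    'date': pd.date_range(datetime(2015, 7, 20, 11, 11), freq='1min', periods=5)
})
lst = [datetime(2015, 7, 20, 11, 7), datetime(2015, 7, 20, 11, 13),
       datetime(2015, 7, 20, 11, 14), datetime(2015, 7, 20, 11, 16)]

# Direction 1: values of the column that are NOT in the list.
not_in_list = data_xls.loc[~data_xls['date'].isin(lst), 'date']

# Direction 2: list entries that do NOT appear in the column.
# Sets are unordered, so sort for a stable result.
missing_from_column = sorted(set(lst).difference(data_xls['date']))

print(not_in_list.tolist())
print(missing_from_column)
```

With this sample data, the first result contains 11:11, 11:12 and 11:15, while the second contains 11:07 and 11:16, which is what the question asks for.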