Pandas DataFrame: check if value exists

Time:03-04

I have a dataframe that looks something like this.

Date         |   Hour
--------------------------
11/06/2020   |   13
17/06/2020   |   12
02/07/2020   |   9

I want to check whether a given Date exists in a row where the Hour is also below 10. Below is my code, but it gives the wrong result.

if((dataFrame['Hour'] < 10).any() & (dataFrame['Date'] == date).any()):
            point = 1

I would appreciate some help. Thanks in advance!

CodePudding user response:

I would use isinstance as follows:

import datetime
import pandas as pd

df = pd.DataFrame({'Date':[datetime.datetime(2020, 6, 11), datetime.datetime(2020, 6, 17), datetime.datetime(2020, 7, 2), datetime.datetime(2020, 7, 12), 0],
                    'Hour':[13, 12, 9, 5, 3]})

df.loc[df.Hour < 10, 'Date'].apply(lambda x: isinstance(x, datetime.datetime))

df:

Date Hour
0 2020-06-11 00:00:00 13
1 2020-06-17 00:00:00 12
2 2020-07-02 00:00:00 9
3 2020-07-12 00:00:00 5
4 0 3

Result:

2     True
3     True
4    False
Name: Date, dtype: bool

Issues:

If the only non-datetime values in the "Date" column are missing values (None or np.nan), pandas will infer the entire column as datetime and this method will not work as intended. To solve this problem, replace the missing values with 0.

Solution:

df.loc[df.Hour < 10, 'Date'].replace(np.nan, 0).apply(lambda x: isinstance(x, datetime.datetime))
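
For the case described under "Issues", where the only invalid entries are None or np.nan, a possible alternative is to test for missing values with notna() instead of isinstance. This is a minimal sketch, not part of the answer above; the sample data and the name has_date are made up for illustration.

```python
import datetime

import numpy as np
import pandas as pd

# Illustrative data: two real dates and two missing values (None and np.nan).
df = pd.DataFrame({
    'Date': [datetime.datetime(2020, 6, 11), datetime.datetime(2020, 7, 2), None, np.nan],
    'Hour': [13, 9, 5, 3],
})

# Keep only rows with Hour < 10, then flag which of them actually have a date.
# notna() is False for None, np.nan and NaT alike.
has_date = df.loc[df['Hour'] < 10, 'Date'].notna()
print(has_date)
```

Note that this only detects missing values; a stray value such as 0 in the column would still count as present, so the isinstance check remains the one to use for mixed-type columns.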

CodePudding user response:

You can filter a DataFrame on multiple conditions with the & and | operators, or with the query() method, and then check the length of the resulting DataFrame to see how many rows match.

Method 1: Using & operator

import pandas as pd

dataFrame = pd.DataFrame({
    'Date': ['11/06/2020', '17/06/2020', '02/07/2020'],
    'Hour': [13, 12, 9]
})

date = '02/07/2020' # target date to match

if len(dataFrame[(dataFrame['Hour'] < 10) & (dataFrame['Date'] == date)]) != 0:
  # at least one row in dataframe matches both conditions
  point = 1
else:
  point = 0

Method 2: Using query() method.

results = dataFrame.query(f"Hour  < 10 and Date == '{date}'")
if len(results) != 0:
  print("match")
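
It may help to see why the code in the question gives the wrong result: calling .any() on each condition separately only tells you that some row has Hour < 10 and some (possibly different) row has the date. A minimal sketch of the difference, where the helper name date_has_early_hour and its limit parameter are made up for illustration:

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Date': ['11/06/2020', '17/06/2020', '02/07/2020'],
    'Hour': [13, 12, 9],
})

def date_has_early_hour(frame, date, limit=10):
    """Return True if at least one row has this Date AND Hour below limit."""
    # Combine the conditions row by row first, then reduce with any().
    mask = (frame['Hour'] < limit) & (frame['Date'] == date)
    return bool(mask.any())

# The check from the question: each condition is reduced on its own,
# so the two matches do not have to be in the same row.
separate = bool((dataFrame['Hour'] < 10).any() & (dataFrame['Date'] == '11/06/2020').any())

print(separate)                                        # True, although 11/06/2020 has Hour 13
print(date_has_early_hour(dataFrame, '11/06/2020'))    # False
print(date_has_early_hour(dataFrame, '02/07/2020'))    # True

point = 1 if date_has_early_hour(dataFrame, '02/07/2020') else 0
```

This is equivalent to Method 1, except that it reduces the boolean mask directly instead of building the filtered DataFrame and taking its length.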