I have a DataFrame in Python Pandas like below:
ID | DATE | LOG |
---|---|---|
123 | 2021-12-31 | 2021-12-30 |
445 | 2021-12-31 | 2022-01-15 |
2232 | 2021-12-31 | NaN |
I need to create a function that takes the date from column "DATE" as its argument and returns the IDs (column "ID") of clients who logged in (column "LOG") before the date in column "DATE", or who have NaN in column "LOG". So for example:
my_function(df["DATE"])
will return the following, because these are the clients who have LOG < DATE or whose LOG is NaN:
ID
-----
123
2232
CodePudding user response:
You could write your condition and use boolean indexing:
def my_function(df):
    msk = (df['DATE'] > df['LOG']) | df['LOG'].isna()
    return df.loc[msk, 'ID']

>>> my_function(df)
0     123
2    2232
Name: ID, dtype: int64
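For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question. It assumes both date columns should be parsed with `pd.to_datetime` (the question does not say how they are stored), so that the comparison is chronological and the missing LOG value becomes `NaT`.

```python
import pandas as pd

# Sample data from the question. Parsing the dates is an assumption:
# the original post does not state the column dtypes.
df = pd.DataFrame({
    "ID": [123, 445, 2232],
    "DATE": pd.to_datetime(["2021-12-31", "2021-12-31", "2021-12-31"]),
    "LOG": pd.to_datetime(["2021-12-30", "2022-01-15", None]),  # None becomes NaT
})

# Comparisons against NaT evaluate to False, so missing LOG values
# have to be included explicitly with isna().
mask = (df["DATE"] > df["LOG"]) | df["LOG"].isna()
ids = df.loc[mask, "ID"]
print(ids.tolist())
```

The `| df["LOG"].isna()` part matters: without it, the row with the missing LOG would be dropped, because a comparison with a missing date is never true.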
CodePudding user response:
This function takes two arguments and returns the IDs of clients whose LOG is NaN or who logged in before the date-time value from "DATE":
def my_function(df, date):
    # keep rows where LOG is missing or earlier than `date` (a scalar or an aligned Series)
    msk = df["LOG"].isna() | (df["LOG"] < date)
    return df.loc[msk, "ID"]
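As a usage sketch for a two-argument version, the date can be either the "DATE" column itself (compared row by row) or a single cutoff. The function name `ids_logged_before`, the cutoff `2022-02-01`, and the datetime parsing are assumptions made for illustration; they are not part of the original post.

```python
import pandas as pd

def ids_logged_before(df, date):
    """Return IDs whose LOG is missing or earlier than `date`.

    `date` may be a single timestamp or a Series aligned with `df`.
    (Hypothetical helper name, chosen for this sketch.)
    """
    mask = df["LOG"].isna() | (df["LOG"] < date)
    return df.loc[mask, "ID"]

# Sample data from the question, with dates parsed (an assumption).
df = pd.DataFrame({
    "ID": [123, 445, 2232],
    "DATE": pd.to_datetime(["2021-12-31"] * 3),
    "LOG": pd.to_datetime(["2021-12-30", "2022-01-15", None]),
})

# Row-by-row comparison against the DATE column.
per_row = ids_logged_before(df, df["DATE"])

# Comparison against one fixed cutoff date (illustrative value).
fixed = ids_logged_before(df, pd.Timestamp("2022-02-01"))
```

Note that element-wise conditions on a Series are combined with `|`, `&` and `~`; Python's `not`, `and` and `or` raise a `ValueError` when applied to a whole Series.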