I have three Series
trans = pd.Series([2020-01-01, 2022-03-15 ...]) #transaction
start = pd.Series([2020-02-03, 2021-05-13]) #valid dates start
end =pd.Series([2020-02-28,2021-06-13]) #valid dates end
and I want to see if a transaction is "valid" which it is, if it was made in a date-range defined by start
and end
i.e start[i]
and end[i]
composes one date-range of which the transaction could be valid, e.g for the first transaction, I want to check if 2020-01-01
is between (2020-02-03,2020-02-28)
or (2021-05-13,2021-06-13)
.
I have tried using between
e.g trans.between(start,end)
but since trans
is not the same size, it cannot be done in that way.
I have written a list comprehension but it loops through all start
,end
pairs for each element in trans
, which is not optimal, since it should terminate when it just finds one pair.
I can remove some of the transactions by checking, if the date is above min(start)
and below max(end)
, but it still takes a while by my own solution
CodePudding user response:
You can use numpy broadcasting for test all values between all values of both Series
and test if at least one value match in any
:
ser = trans.to_numpy()[:, None]
left = start.to_numpy()
right = end.to_numpy()
m = ((left <= ser) & (ser <= right)).any(axis=1)
out = trans[m]
print (out)
CodePudding user response:
You could do it like this. Hope it helps:
import pandas as pd
import datetime
trans = pd.Series(["2020-01-01", "2022-03-15", "2021-05-20", "2022-02-10", "2009-09-09"])
start = pd.Series(["2020-02-03", "2021-05-13", "2020-05-06"])
end =pd.Series(["2020-02-28","2021-06-13", "2020-05-08"])
start = list(map(datetime.date.fromisoformat, start))
end = list(map(datetime.date.fromisoformat, end))
trans = list(map(datetime.date.fromisoformat, trans))
for elem in trans:
print(elem)
for lft,rgt in zip(start,end):
if lft <= elem <= rgt:
print(f'{elem} is between {lft} & {rgt}')
break
output:
2020-01-01
2022-03-15
2021-05-20
2021-05-20 is between 2021-05-13 & 2021-06-13
2022-02-10
2009-09-09
it checks all elements but stops looking in start/end when finding a match