I am extracting data and getting two separate lists. One list holds the dates on which a test was done, and the other holds whether each test was approved or not. For example:
TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24',
'2021-04-20', '2021-04-20']
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline']
What I am trying to do is extract the corresponding date every time there is a Decline response in Action_lst.
The code that I have tried is:
DateTest = []
for i in TestDate_lst:
    for a in Action_lst:
        if a == "Decline":
            DateTest.append(i)
        else:
            pass
Output:
DateTest_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-24', '2022-03-24',
'2022-03-24', '2022-03-24', '2022-03-24', '2022-03-24', '2022-03-24', '2022-03-24',
'2022-03-24', '2022-03-04', '2022-03-04', '2022-03-04', '2022-03-04', '2022-03-24',
'2022-03-24', '2022-03-24', '2022-03-24', '2021-04-20', '2021-04-20', '2021-04-20',
'2021-04-20']
Expected:
DateTest_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2021-04-20']
I know the nested for loop is what is causing the issue, and I thought about using a dictionary to solve this problem but wasn't sure if that was the right path to take.
CodePudding user response:
You don't need a double loop:
DateTest = []
for i in range(len(TestDate_lst)):
    if Action_lst[i] == 'Decline':
        DateTest.append(TestDate_lst[i])
print(DateTest)
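If it helps, the same single loop can be wrapped in a small function that also checks that both lists have the same length, since indexing by position assumes the two lists line up one-to-one. This is only a sketch: the name dates_for_action and its parameters are made up for illustration and are not part of the original code.

```python
def dates_for_action(dates, actions, wanted="Decline"):
    """Return the dates whose action at the same position equals `wanted`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Positional lookup only makes sense if the lists are the same length.
    if len(dates) != len(actions):
        raise ValueError("dates and actions must have the same length")
    matched = []
    for i in range(len(dates)):
        if actions[i] == wanted:
            matched.append(dates[i])
    return matched


TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24',
                '2021-04-20', '2021-04-20']
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline']

DateTest = dates_for_action(TestDate_lst, Action_lst)
print(DateTest)
# ['2022-03-24', '2022-03-24', '2022-03-24', '2021-04-20']
```

Passing a different value for wanted (for example "Approve") returns the dates for that action instead.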
CodePudding user response:
You can use zip to iterate over both lists in parallel:
TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24',
'2021-04-20', '2021-04-20']
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline']
result = [t for t, a in zip(TestDate_lst, Action_lst) if a == 'Decline']
print(result)
Output:
['2022-03-24', '2022-03-24', '2022-03-24', '2021-04-20']
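As a possible alternative to the comprehension, the standard library's itertools.compress does the same kind of filtering: it keeps the items of its first argument wherever the matching selector is truthy. This is just a sketch of that variant, not something from the original answer, and whether it reads better is a matter of taste.

```python
from itertools import compress

TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24',
                '2021-04-20', '2021-04-20']
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline']

# Build one True/False selector per action, then keep only the dates
# whose selector is True.
selectors = (a == 'Decline' for a in Action_lst)
result = list(compress(TestDate_lst, selectors))
print(result)
# ['2022-03-24', '2022-03-24', '2022-03-24', '2021-04-20']
```

Like zip, compress stops at the shorter of its two inputs, so it also assumes the lists are the same length.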
Check Performance:
TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24','2021-04-20', '2021-04-20']*100_000
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline']*100_000
Benchmark on Colab:
%timeit [t for t,a in zip(TestDate_lst, Action_lst) if a=='Decline']
# 68.8 ms ± 18.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Second approach:
%%timeit
res = []
for i in range(len(TestDate_lst)):
    if Action_lst[i] == 'Decline':
        res.append(TestDate_lst[i])
# 108 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
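%timeit and %%timeit are IPython/Jupyter magics, so they won't run in a plain Python script. A rough equivalent with the standard timeit module might look like the sketch below. The function names with_zip and with_index are made up for illustration, and the timings will differ from machine to machine, so no numbers are shown.

```python
import timeit

TestDate_lst = ['2022-03-24', '2022-03-24', '2022-03-24', '2022-03-04', '2022-03-24',
                '2021-04-20', '2021-04-20'] * 100_000
Action_lst = ['Decline', 'Decline', 'Decline', 'Approve', 'Approve', 'Ignore', 'Decline'] * 100_000


def with_zip():
    # First approach: list comprehension over zip.
    return [t for t, a in zip(TestDate_lst, Action_lst) if a == 'Decline']


def with_index():
    # Second approach: explicit loop over indices.
    res = []
    for i in range(len(TestDate_lst)):
        if Action_lst[i] == 'Decline':
            res.append(TestDate_lst[i])
    return res


for fn in (with_zip, with_index):
    # Run each function 5 times per trial, 3 trials, and report the best trial.
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1000:.1f} ms per call")
```

Taking the minimum over several trials is a common way to reduce noise from other processes on the machine.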