I have thousands of CSV files. Each file has about 1k lines with the following structure:
Hour Val1 Val2
9:00 2 3
9:05 1 4
9:10 5 6
9:15 4 8
9:20 6 4
What I need is to verify whether Val2 in row x is greater than Val1 in both row x-1 and row x+1.
The expected output for the sample above is:
Hour Cond
9:00 False
9:05 False
9:10 True
9:15 True
9:20 False
Of course, I know I can do this with a for loop, but I don't know whether that is the most efficient way. Searching other Stack Overflow questions, I found this post, and the author of the second answer explicitly says not to iterate over a pandas DataFrame.
So, finally, my questions are:
1. Considering the answer in the referenced post, is there a way to solve this problem without iterating over the DataFrame's rows?
2. Is pandas the best approach to do this?
CodePudding user response:
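Iterate over the list of DataFrames and compute cond for each one as follows. Please don't iterate over the DataFrame rows; that is exactly what pandas lets you avoid. shift() lines up Val1 from row x-1 and shift(-1) lines up Val1 from row x+1, so both comparisons run on whole columns at once.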
df = df.assign(cond=df['Val2'].gt(df['Val1'].shift()) & df['Val2'].gt(df['Val1'].shift(-1)))
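To check the one-liner against the sample from the question, here is a minimal sketch. The helper names `add_cond` and `process_folder` are made up for illustration, and `process_folder` assumes the files are comma-separated with a `Hour,Val1,Val2` header row; if yours are whitespace-separated, as the sample is displayed, pass `sep=r"\s+"` to `pd.read_csv`.

```python
from pathlib import Path

import pandas as pd


def add_cond(df: pd.DataFrame) -> pd.DataFrame:
    """Return Hour plus Cond: Val2 is greater than Val1 in both neighbouring rows."""
    prev_val1 = df["Val1"].shift(1)    # Val1 from row x-1 (NaN on the first row)
    next_val1 = df["Val1"].shift(-1)   # Val1 from row x+1 (NaN on the last row)
    # Comparisons against NaN evaluate to False, so the edge rows come out False.
    cond = df["Val2"].gt(prev_val1) & df["Val2"].gt(next_val1)
    return df[["Hour"]].assign(Cond=cond)


def process_folder(folder: str) -> dict:
    """Hypothetical helper: apply add_cond to every CSV file in a folder.

    The loop is over files, not over rows; each file is still handled
    with whole-column operations. Assumes comma-separated files.
    """
    return {
        path.name: add_cond(pd.read_csv(path))
        for path in sorted(Path(folder).glob("*.csv"))
    }


# The sample data from the question, built in memory.
sample = pd.DataFrame({
    "Hour": ["9:00", "9:05", "9:10", "9:15", "9:20"],
    "Val1": [2, 1, 5, 4, 6],
    "Val2": [3, 4, 6, 8, 4],
})
result = add_cond(sample)
print(result)
```

For the sample this gives False, False, True, True, False, which matches the expected output in the question. The first and last rows are False because they have no neighbour on one side.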