How to efficiently iterate over a pandas dataframe dynamically looking to its rows

Time:11-04

I have thousands of CSV files. Each file has about 1k lines in the following structure:

Hour  Val1 Val2
9:00  2    3
9:05  1    4
9:10  5    6 
9:15  4    8
9:20  6    4

What I need is to check whether Val2 in row x is bigger than Val1 in both row x-1 and row x+1.

A basic output for this problem is:

Hour Cond
9:00 False
9:05 False
9:10 True
9:15 True
9:20 False

Of course I know that I can do this using a for loop, but I don't know if it is the most optimized way. Searching other Stack Overflow references, I found this post, and the author of the second answer says explicitly not to iterate over a pandas DataFrame.

So, finally, my questions are:

1. Considering the answer in the referenced post, is there a way to solve this problem without iterating over the DataFrame's rows?

2. Is pandas the best approach to do this?

CodePudding user response:

Iterate over the list of DataFrames and compute cond for each one as follows. Please don't iterate over the DataFrame rows; it's an anti-pattern.

df = df.assign(cond=df['Val2'].gt(df['Val1'].shift()) & df['Val2'].gt(df['Val1'].shift(-1)))
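To make the one-liner above easier to check, here is a minimal, self-contained sketch that runs it on the sample data from the question. The helper name `add_cond`, the whitespace-separated sample text, and the output column name `Cond` (taken from the question's expected output) are assumptions for illustration only; real comma-separated files would not need the `sep` argument.

```python
import io

import pandas as pd


def add_cond(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows where Val2 exceeds Val1 of both the previous and the next row."""
    prev_val1 = df["Val1"].shift(1)   # Val1 from row x-1 (NaN on the first row)
    next_val1 = df["Val1"].shift(-1)  # Val1 from row x+1 (NaN on the last row)
    # A comparison against NaN evaluates to False, so the first and last rows are False.
    return df.assign(Cond=df["Val2"].gt(prev_val1) & df["Val2"].gt(next_val1))


# Sample data copied from the question, laid out as whitespace-separated text.
sample = io.StringIO(
    "Hour Val1 Val2\n"
    "9:00 2 3\n"
    "9:05 1 4\n"
    "9:10 5 6\n"
    "9:15 4 8\n"
    "9:20 6 4\n"
)

df = add_cond(pd.read_csv(sample, sep=r"\s+"))
print(df[["Hour", "Cond"]])
```

For thousands of files, the same function can be applied once per file, for example inside a loop over the paths returned by `glob.glob`. The loop then runs over files, not over rows, so each file is still processed with vectorized operations.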