Read rows from the middle of a pandas DataFrame

Time:09-29

Can I read rows from the middle of a file into a pandas dataframe? For example, I have a dataframe with 10 million records and I want to read only the records between 2 million and 3 million. I know I can use skiprows, but that won't solve my problem.

CodePudding user response:

You can pass a callable to skiprows: pd.read_csv calls it with each row index and skips every row for which it returns True, so it knows where to start and stop (this is also useful if you want a more complicated row-selection operation). For your question, this works:

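# x != 0 keeps the header line (assuming the file has one); otherwise the first row read is used as the column names
pd.read_csv(filepath, skiprows=lambda x: x != 0 and x not in range(2000000, 3000000))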
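To see how the callable form behaves, here is a minimal sketch on a tiny in-memory CSV. The data, the column names id and value, and the 3 to 5 line range are all made up for illustration, and it assumes the file starts with a header line, which the callable keeps by never skipping index 0.

```python
import io

import pandas as pd

# Hypothetical stand-in for the large file: one header line plus 10 data rows.
csv_text = "id,value\n" + "".join(f"{i},{i * 10}\n" for i in range(1, 11))

# The callable receives each 0-based line index of the file and returns True to skip it.
# Line 0 is the header, so it is kept explicitly; lines 3 to 5 are the wanted slice.
df = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda x: x != 0 and x not in range(3, 6),
)

print(df)
```

The same pattern should scale to the real file by swapping range(3, 6) for the range you need.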

Edit: as you suggested, the following also works:

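# range(1, 2000000) skips only data lines, so the header on line 0 is kept (assuming the file has one)
pd.read_csv(file, skiprows=range(1, 2000000), nrows=1000000)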
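One thing to watch with skiprows plus nrows is the header. An integer skiprows drops the first lines of the file including the header, so the first remaining line is treated as the column names. The sketch below shows this on a small made-up CSV (hypothetical columns id and value) and compares it with passing a range that starts at 1, which leaves the header in place. It assumes the file has a header line.

```python
import io

import pandas as pd

# Hypothetical stand-in for the large file: one header line plus 10 data rows.
csv_text = "id,value\n" + "".join(f"{i},{i * 10}\n" for i in range(1, 11))

start, count = 3, 3  # first file line to keep, and how many rows to read

# Integer skiprows: lines 0..2 (header included) are dropped,
# so the first remaining line is used as the header.
naive = pd.read_csv(io.StringIO(csv_text), skiprows=start, nrows=count)

# Range starting at 1: only data lines are skipped and the real header survives.
with_header = pd.read_csv(io.StringIO(csv_text), skiprows=range(1, start), nrows=count)

print(naive.columns.tolist())
print(with_header)
```

If the file has no header line at all, passing header=None together with the integer skiprows would be the simpler choice.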