Can I read a range of rows from the middle of a file into a pandas DataFrame? For example, I have a file with 10 million records and I want to read only the records between 2 million and 3 million. I know I can use skiprows, but on its own that won't solve my problem.
CodePudding user response:
You can pass a callable to skiprows so that pd.read_csv knows which lines to skip (this is also useful if you want a more complicated row selection). The callable receives each line index, with the header on line 0, and returns True for lines to skip. For your question, this works and keeps the header row:
pd.read_csv(filepath, skiprows=lambda x: x != 0 and x not in range(2000000, 3000000))
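Edit: as you suggested, combining skiprows with nrows also works. Pass a range that starts at 1 so the header on line 0 is kept; a plain skiprows=2000000 would skip the header as well and treat the first row read as column names:
pd.read_csv(filepath, skiprows=range(1, 2000000), nrows=1000000)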
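To check the row arithmetic before running it on the full file, here is a minimal sketch on a tiny in-memory CSV. The helper name read_row_window, the column names, and the sample data are made up for illustration, and it assumes the file has a single header row:

```python
import io

import pandas as pd


def read_row_window(source, start, stop):
    """Read data rows start..stop-1 (0-based, header not counted).

    Hypothetical helper; assumes the CSV has exactly one header row.
    """
    # File line 0 is the header, so data row i sits on file line i + 1.
    # Skipping lines 1..start drops the first `start` data rows.
    return pd.read_csv(source, skiprows=range(1, start + 1), nrows=stop - start)


# Small stand-in for the 10-million-row file: ids 0..9.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

window = read_row_window(io.StringIO(csv_text), 2, 5)
print(window)
```

With nrows set, the parser can stop once it has collected the requested rows, whereas the callable form on its own has to be evaluated for every line through the end of the file.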