select only some of the rows of a dataframe

Time:10-28

I have a pandas DataFrame, and I know how to select a single row from it.

However, I would like to select several of its rows into a new DataFrame under the following conditions:

  1. I don't know in advance how many rows there are.
  2. I have to select the first row and then every n-th row after it, at a fixed interval.

For example, if I have

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
                   
print(df)

the output is

     A      B  C   D
0  foo    one  0   0
1  bar    one  1   2
2  foo    two  2   4
3  bar  three  3   6
4  foo    two  4   8
5  bar    two  5  10
6  foo    one  6  12
7  foo  three  7  14

With an interval of 4, I would like to obtain

     A      B  C   D
0  foo    one  0   0
4  foo    two  4   8

or, with an interval of 3,

     A      B  C   D
0  foo    one  0   0
3  bar  three  3   6
6  foo    one  6  12

CodePudding user response:

Use loc with a stepped slice, as you would for a list or any other sequence:

df.loc[::4]

     A    B  C  D
0  foo  one  0  0
4  foo  two  4  8

df.loc[::3]

     A      B  C   D
0  foo    one  0   0
3  bar  three  3   6
6  foo    one  6  12

NB. df[::4] and df[::3] give the same rows, but loc is more versatile: for example, df.loc[:, ::3] applies the same step to the columns.
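
Since the question is about row positions (the first row, then every n-th one), iloc may express the intent more directly: loc is label-based, while iloc always slices by position. The following is only a sketch; the helper name every_nth and its start parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})


def every_nth(frame, step, start=0):
    """Hypothetical helper: keep row `start`, then every `step`-th row after it."""
    # iloc slices by position, so the index labels do not matter
    # and the number of rows does not need to be known in advance.
    return frame.iloc[start::step]


# Rows at positions 0, 4 and at positions 0, 3, 6.
print(every_nth(df, 4))
print(every_nth(df, 3))

# Also works when the index is no longer 0..n-1, e.g. after sorting.
shuffled = df.sort_values('B')
print(every_nth(shuffled, 3))

# Optionally renumber the result from 0 with reset_index(drop=True).
print(every_nth(df, 3).reset_index(drop=True))
```

The selected rows keep their original index labels unless reset_index(drop=True) is applied.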
