I have a pandas DataFrame, and I know how to select a single row from it.
However, I would like to select several rows into another DataFrame under the following conditions:
- I don't know in advance how many rows there are
- I have to select the first row and then every n-th row after it (a fixed interval).
For example, if I have
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
the output is
     A      B  C   D
0  foo    one  0   0
1  bar    one  1   2
2  foo    two  2   4
3  bar  three  3   6
4  foo    two  4   8
5  bar    two  5  10
6  foo    one  6  12
7  foo  three  7  14
With an interval of 4, I would like to obtain
     A    B  C  D
0  foo  one  0  0
4  foo  two  4  8
or, if the interval is 3,
     A      B  C   D
0  foo    one  0   0
3  bar  three  3   6
6  foo    one  6  12
CodePudding user response:
Use loc with a slice and a step, as you would for a list or any other sequence:
df.loc[::4]
     A    B  C  D
0  foo  one  0  0
4  foo  two  4  8
df.loc[::3]
     A      B  C   D
0  foo    one  0   0
3  bar  three  3   6
6  foo    one  6  12
NB. You can also use df[::4] / df[::3], but loc is more versatile (for example, for columns: df.loc[:, ::3]).
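If you want this as a reusable step, here is a minimal sketch. The helper name every_nth and its step parameter are made up for illustration and are not part of pandas. It uses iloc instead of loc, which is a swap on my part: iloc always slices by position, so the result does not depend on what the index labels look like. The slice works without knowing the number of rows, and .copy() makes the result an independent DataFrame.

```python
import numpy as np
import pandas as pd


def every_nth(frame: pd.DataFrame, step: int) -> pd.DataFrame:
    """Return the first row and every `step`-th row after it (hypothetical helper)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # iloc slices by position, so the index labels do not matter.
    # copy() returns an independent DataFrame rather than a slice of the original.
    return frame.iloc[::step].copy()


df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

subset = every_nth(df, 3)
print(subset)

# The same call on a frame whose index is not 0..n-1 (here the labels from column B).
relabelled = df.set_index('B')
print(every_nth(relabelled, 4))
```

The original index labels are kept in the result (0, 3, 6 for a step of 3). If you would rather have the new DataFrame numbered from 0, call .reset_index(drop=True) on it.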