Remove rows from table by using slice or pandas functions

I have a CSV file with data, but it contains rows I don't need, so the task is to remove those rows from the table. For example:

 0 A
 1 B
 2 C
 3 D
 4 E * to delete
 5 F *
 6 G *
 7 H *
 8 I
 9 J
10 k
11 L
12 M *
13 N *
14 O *
15 P *

So I want to remove the last 4 rows of every 8 rows in the table. The table has 3089 rows.

I tried to slice the table, but did not get a good result.

CodePudding user response:

Use numpy to craft a mask:

import numpy as np

mask = (np.arange(len(df))%8//4) == 0

out = df[mask]

Another option, equivalent but simpler:

mask = np.arange(len(df))%8 < 4

out = df[mask]

output:

   col
0    A
1    B
2    C
3    D
8    I
9    J
10   k
11   L

How it works

We first take each row's position modulo 8 to get its position within its group of 8, then floor-divide by 4 (giving 0 for the first half of the group and 1 for the second half) and compare to 0 to keep only the first 4 rows per group:

   col  arange  %8  //4   mask
0    A       0   0    0   True
1    B       1   1    0   True
2    C       2   2    0   True
3    D       3   3    0   True
4    E       4   4    1  False
5    F       5   5    1  False
6    G       6   6    1  False
7    H       7   7    1  False
8    I       8   0    0   True
9    J       9   1    0   True
10   k      10   2    0   True
11   L      11   3    0   True
12   M      12   4    1  False
13   N      13   5    1  False
14   O      14   6    1  False
15   P      15   7    1  False
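
The table above can be reproduced with a small self-contained sketch. The 16-row frame and the column name col are assumptions made up to mirror the example in the question; the real data would come from the CSV file instead.

```python
import numpy as np
import pandas as pd

# Hypothetical 16-row frame mirroring the example in the question
df = pd.DataFrame({"col": list("ABCDEFGHIJkLMNOP")})

pos = np.arange(len(df))   # row positions 0, 1, 2, ...
mask = pos % 8 < 4         # True for the first 4 rows of every block of 8
out = df[mask]

print(out["col"].tolist())  # ['A', 'B', 'C', 'D', 'I', 'J', 'k', 'L']

# Number of rows the same mask would keep for a 3089-row table
kept_3089 = int((np.arange(3089) % 8 < 4).sum())
print(kept_3089)            # 1545
```

Because the mask is built from row positions rather than index labels, it should behave the same whatever index the frame has. The last block of a 3089-row table is incomplete (3089 = 386 * 8 + 1), so its single row counts as the first row of a new block and is kept.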

CodePudding user response:

Group every eight rows, generate a temporary column (say id) holding each row's position within its group (0 to 7), and keep only the rows whose id is less than 4. This assumes df has the default 0, 1, 2, ... index.

df=df.assign(id=df.groupby(df.index//8).cumcount()).query('id<4').drop(columns='id')

   item
0     A
1     B
2     C
3     D
8     I
9     J
10    k
11    L
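
A possible variant of the groupby idea, sketched here under assumptions: it builds the group key from row positions instead of index labels, so it does not depend on the frame having a default index, and it keeps the first 4 rows of each block of 8 as the question asks. The sample frame, the item column name, and the shifted index are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the index is shifted on purpose to show that
# row positions, not index labels, decide which rows are kept
df = pd.DataFrame({"item": list("ABCDEFGHIJkLMNOP")}, index=range(100, 116))

block = np.arange(len(df)) // 8               # block number of each row: 0,0,...,1,1,...
pos_in_block = df.groupby(block).cumcount()   # 0..7 within each block
out = df[pos_in_block < 4]                    # keep the first 4 rows per block

print(out["item"].tolist())  # ['A', 'B', 'C', 'D', 'I', 'J', 'k', 'L']
```

With this shifted index, grouping on df.index//8 would split the blocks in the wrong places (labels 100 to 103 fall in one group, 104 to 111 in the next), which is why the sketch keys on np.arange(len(df)) instead.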