How to delete every 5 rows in Pandas?

I want to delete particular rows and keep only the row where each machine ID first appears in a run of identical IDs. So I want to keep rows 0, 5, 10 and 15.

0,M_0003
1,M_0003
2,M_0003
3,M_0003
4,M_0003
5,M_0005
6,M_0005
7,M_0005
8,M_0005
9,M_0005
10,M_0007
11,M_0007
12,M_0007
13,M_0007
14,M_0007
15,M_0003
16,M_0003
17,M_0003
18,M_0003
19,M_0003

This is what the result should look like:

0,M_0003
1,M_0005
2,M_0007
3,M_0003

Is there a function in Python that will help? The only thing I found is the following, but it does not work:

y_data = y_data.groupby(np.arange(len(y_data)) // 5)

CodePudding user response:

Use GroupBy.first to keep the first row of each group of five:

y_data = y_data.groupby(np.arange(len(y_data)) // 5).first()
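For reference, here is a minimal runnable sketch of that approach. The question does not show how y_data was built, so the Series below is an assumed reconstruction of the sample data, and the names block and first_per_block are only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: 20 rows, IDs in blocks of five.
y_data = pd.Series(
    ["M_0003"] * 5 + ["M_0005"] * 5 + ["M_0007"] * 5 + ["M_0003"] * 5
)

# Integer-divide each row position by 5: rows 0-4 get key 0, rows 5-9 get key 1, ...
block = np.arange(len(y_data)) // 5

# Keep the first entry of each block; the result is indexed by block number.
first_per_block = y_data.groupby(block).first()
print(first_per_block.tolist())
print(first_per_block.index.tolist())
```

Calling groupby on its own only returns a GroupBy object, which is probably why the attempt in the question appeared not to work; an aggregation such as first() is what turns it back into data.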

CodePudding user response:

You can use boolean indexing:

y_data = y_data[np.arange(len(y_data)) % 5 == 0]

Intermediates:

np.arange(len(y_data)) % 5
# array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4])

np.arange(len(y_data)) % 5 == 0
# array([ True, False, False, False, False,  True, False, False, False,
#        False,  True, False, False, False, False,  True, False, False,
#        False, False])
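As a rough sketch of how this plays out end to end, the snippet below applies the mask to an assumed reconstruction of the question's data (the original construction of y_data is not shown). Boolean indexing keeps the original row labels, so reset_index is used here to get the 0 to 3 numbering from the desired output. The last step is a different technique, not part of the answer above: comparing each row with the previous one via shift, which may be useful if the runs are not always exactly five rows long. The names kept, renumbered and run_starts are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data: 20 rows, IDs in blocks of five.
y_data = pd.Series(
    ["M_0003"] * 5 + ["M_0005"] * 5 + ["M_0007"] * 5 + ["M_0003"] * 5
)

# Mask from the answer: True at positions 0, 5, 10 and 15.
mask = np.arange(len(y_data)) % 5 == 0
kept = y_data[mask]
print(kept.index.tolist())  # the original row labels survive the filter

# Renumber the rows from zero to match the desired output.
renumbered = kept.reset_index(drop=True)
print(renumbered.tolist())

# Alternative (not from the answer): keep a row whenever its value differs
# from the previous row, so the run length does not have to be 5.
run_starts = y_data[y_data != y_data.shift()].reset_index(drop=True)
print(run_starts.tolist())
```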