How can I delete those table rows (DataFrame) - by selection condition?

Time:08-08

How can I delete rows from a table (DataFrame) by a selection condition (for example, rows whose value is > 3)?

I've tried several approaches, including loops over the desired column and built-in Pandas operations.

Nothing works: either nothing changes and no rows are deleted, or I end up in an infinite loop.

I would really appreciate any help.

Thank you.

The table (DataFrame) shown below has 207000 rows; after checking the condition and deleting the matching rows, about 100000 rows remain.

Qnv = 8

lens = len(df_calc_rulet_q.index)


df_calc_rulet_q = df_calc_rulet_q.drop(df_calc_rulet_q[df_calc_rulet_q['summ']<Qnv].index)

# k = 1
# i = 0
# for k in range(lens):
#     if df_calc_rulet_q.at[i, 'summ'] < Qnv:
#         # print(df_calc_rulet_q.loc[i, 'summ'])
#         df_calc_rulet_q.drop(i)
#         df_calc_rulet_q = df_calc_rulet_q.reset_index(drop=True)
#     else:
#         i = i + 1

print(df_calc_rulet_q)

        Qvir1  Qvir2  Qvir3  Qvir4  Qvir5   summ
8         0.9   2.10    0.9   1.60    2.8   8.30
9         1.0   2.35    1.0   1.67    3.1   9.12
10        1.1   2.60    1.1   1.80    3.4  10.00
21        1.0   2.10    1.0   1.40    3.1   8.60
22        1.1   2.35    1.1   1.60    3.4   9.55
...       ...    ...    ...    ...    ...    ...
207608    0.9   2.85    0.9   0.60    2.8   8.05
207621    1.0   2.85    1.0   0.40    3.1   8.35
207631    0.8   2.10    0.8   2.00    2.5   8.20
207632    0.9   2.35    0.9   2.20    2.8   9.15
207634    1.1   2.85    1.1   0.20    3.4   8.65

CodePudding user response:

Use loc with a boolean mask to keep only the rows that satisfy your conditions:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,100, size=(10, 4)), columns=list('ABCD'))
#     A   B   C   D
# 0  66  31  65  89
# 1  34  33  24  53
# 2   5  67  29  54
# 3  68  26  31  36
# 4   3  94   9  51
# 5  62  45  60  24
# 6  60  86  33  86
# 7  65  23  62  70
# 8  12   8  21  41
# 9  49  55  78  23


df = df.loc[(df['A']>20) & (df['B']<50) & (df['C']>60)]

#    A   B   C   D
#0  66  31  65  89
#7  65  23  62  70
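
Applied to the column from the question, the same idea might look like the minimal sketch below. The small frame is made-up stand-in data, not the asker's real table; only the summ column name and the Qnv threshold come from the question, and the names mask and filtered are just illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the asker's df_calc_rulet_q (values are made up)
df_calc_rulet_q = pd.DataFrame({
    "Qvir1": [0.5, 0.9, 1.0, 0.3],
    "summ": [4.2, 8.3, 9.12, 7.99],
})
Qnv = 8

# Boolean mask: True for the rows to KEEP (summ at or above the threshold)
mask = df_calc_rulet_q["summ"] >= Qnv

# .loc with a mask returns a new DataFrame; the original index labels are kept
filtered = df_calc_rulet_q.loc[mask]
print(filtered)
```

Selecting the rows to keep avoids looping entirely, and the original frame is left untouched unless you assign the result back to it.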

CodePudding user response:

Consider DataFrame.query to filter the rows you intend to keep, rather than DataFrame.drop, which is really intended for dropping index or column labels, not rows by condition.

Qnv = 8

df_calc_rulet_q_subset = (
    df_calc_rulet_q.query("summ >= @Qnv")
        .reset_index(drop=True)
)
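
As a rough check, the sketch below compares query against the drop-by-index approach from the question on a tiny made-up frame, and also shows one likely reason the commented-out loop changed nothing: drop returns a new DataFrame and does not modify the original unless the result is assigned. The frame df and the names kept, dropped and same are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data; only the column name 'summ' and Qnv come from the question
df = pd.DataFrame({"summ": [4.2, 8.3, 9.12, 7.99, 8.0]})
Qnv = 8

# drop() returns a new frame; calling it without assignment leaves df as it was
df.drop(0)
unchanged_len = len(df)

# Keep rows at or above the threshold, then renumber the index from 0
kept = df.query("summ >= @Qnv").reset_index(drop=True)

# Equivalent result via drop: remove the index labels of the rows below the threshold
dropped = df.drop(df[df["summ"] < Qnv].index).reset_index(drop=True)

same = kept.equals(dropped)
print(kept)
print(same)
```

Both routes give the same rows here; query simply states the keep-condition directly instead of computing labels to remove.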