How can I delete rows from a table (DataFrame) by a selection condition, for example rows whose value is > 3?
I've tried several approaches: looping over the desired column, and using Pandas operations directly.
Nothing works: either nothing changes and no rows are deleted, or I end up in an infinite loop.
I would really appreciate any help.
Thank you.
The table (DataFrame) shown below has 207000 rows; after checking the condition and deleting the matching rows, about 100000 rows remain.
Qnv = 8
lens = len(df_calc_rulet_q.index)
df_calc_rulet_q = df_calc_rulet_q.drop(df_calc_rulet_q[df_calc_rulet_q['summ']<Qnv].index)
# k = 1
# i = 0
# for k in range(lens):
#     if df_calc_rulet_q.at[i, 'summ'] < Qnv:
#         # print(df_calc_rulet_q.loc[i, 'summ'])
#         df_calc_rulet_q.drop(i)
#         df_calc_rulet_q = df_calc_rulet_q.reset_index(drop=True)
#     else:
#         i = i + 1
print(df_calc_rulet_q)
        Qvir1  Qvir2  Qvir3  Qvir4  Qvir5   summ
8         0.9   2.10    0.9   1.60    2.8   8.30
9         1.0   2.35    1.0   1.67    3.1   9.12
10        1.1   2.60    1.1   1.80    3.4  10.00
21        1.0   2.10    1.0   1.40    3.1   8.60
22        1.1   2.35    1.1   1.60    3.4   9.55
...       ...    ...    ...    ...    ...    ...
207608    0.9   2.85    0.9   0.60    2.8   8.05
207621    1.0   2.85    1.0   0.40    3.1   8.35
207631    0.8   2.10    0.8   2.00    2.5   8.20
207632    0.9   2.35    0.9   2.20    2.8   9.15
207634    1.1   2.85    1.1   0.20    3.4   8.65
CodePudding user response:
Use loc with a boolean mask to keep only the rows you want:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100, size=(10, 4)), columns=list('ABCD'))
# A B C D
# 0 66 31 65 89
# 1 34 33 24 53
# 2 5 67 29 54
# 3 68 26 31 36
# 4 3 94 9 51
# 5 62 45 60 24
# 6 60 86 33 86
# 7 65 23 62 70
# 8 12 8 21 41
# 9 49 55 78 23
df = df.loc[(df['A']>20) & (df['B']<50) & (df['C']>60)]
# A B C D
#0 66 31 65 89
#7 65 23 62 70
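Applied to the DataFrame from the question, the same idea might look like the sketch below. The small frame is made-up stand-in data (only the 'summ' column and the Qnv threshold come from the question), and the names mask and filtered are just illustrative.

```python
import pandas as pd

# Hypothetical stand-in for df_calc_rulet_q; the values are invented for illustration.
df_calc_rulet_q = pd.DataFrame(
    {"Qvir1": [0.9, 1.0, 0.8, 1.1], "summ": [8.30, 7.50, 6.90, 10.00]},
    index=[8, 9, 10, 21],
)
Qnv = 8

# Build a boolean mask of the rows to KEEP (summ >= Qnv),
# which is the same as removing rows where summ < Qnv.
mask = df_calc_rulet_q["summ"] >= Qnv
filtered = df_calc_rulet_q.loc[mask]

print(filtered)
```

The original index labels are preserved; chain .reset_index(drop=True) if you want them renumbered from 0.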
CodePudding user response:
Consider DataFrame.query to filter the rows you intend to keep, rather than DataFrame.drop, which is really intended to drop index or column labels, not rows by condition.
Qnv = 8
df_calc_rulet_q_subset = (
    df_calc_rulet_q.query("summ >= @Qnv")
    .reset_index(drop=True)
)
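As a side note, the commented-out loop in the question probably appears to do nothing because DataFrame.drop returns a new DataFrame by default instead of modifying the original. The sketch below uses a tiny made-up frame (not the asker's data) to show this, and to check that drop with an index, a boolean mask, and query give the same rows here.

```python
import pandas as pd

# Made-up data for illustration; only the column name 'summ' comes from the question.
df = pd.DataFrame({"summ": [8.3, 7.5, 6.9, 10.0]})
Qnv = 8

# drop() returns a new DataFrame; without assigning the result, df is unchanged.
df.drop(0)
unchanged_len = len(df)

# Three ways to remove rows where summ < Qnv.
by_drop = df.drop(df[df["summ"] < Qnv].index)
by_mask = df[df["summ"] >= Qnv]
by_query = df.query("summ >= @Qnv")

# All three yield the same rows for this data.
same = by_drop.equals(by_mask) and by_mask.equals(by_query)
print(unchanged_len, same)
```

One caveat: if 'summ' contains NaN, the drop version keeps those rows (NaN < Qnv is False) while the mask and query versions remove them, so the approaches are only interchangeable when the column has no missing values.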