If I have this value_counts() result (already sorted in descending order):
A 20
B 15
C 15
D 10
E 10
F 10
G 8
H 5
I 5
and I want to get the first 70%, for example, then the result should be:
A
B
C
D
E
Does pandas have a function for that? I have tried groupby, but it does not work. Or should I code it manually, with a for loop or something?
Thanks
Find the biggest coverage in percent with pandas from a dataframe. Is there a shortcut function, or should I code it manually?
CodePudding user response:
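Assuming X and Y are the column names, you can check where the cumulative sum (cumsum) is less than or equal (le) to 70, and slice with boolean indexing. Comparing against 70 directly works here because the counts in the input used below sum to 100; for other totals, divide the cumulative sum by the total first: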
df.loc[df['Y'].cumsum().le(70), 'X']
Alternative by position (first and second column):
df.loc[df.iloc[:, 1].cumsum().le(70), df.columns[0]]
output:
0 A
1 B
2 C
3 D
4 E
Name: X, dtype: object
Used input:
   X   Y
0  A  20
1  B  15
2  C  15
3  D  10
4  E  10
5  F  10
6  G   8
7  H   5
8  I   5
9  G   2
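If the counts do not add up to exactly 100, the same idea can be applied to the cumulative share of the total. The following is only a minimal sketch under that assumption: the helper name top_coverage and its parameters are hypothetical (not a pandas function), and the data is copied from the input used above.

```python
import pandas as pd

# Sample data copied from the "Used input" table in the answer.
df = pd.DataFrame({
    "X": list("ABCDEFGHI") + ["G"],
    "Y": [20, 15, 15, 10, 10, 10, 8, 5, 5, 2],
})

def top_coverage(frame, label_col, count_col, pct):
    """Hypothetical helper: labels whose cumulative share of the total is <= pct (0-100)."""
    # Sort by count, largest first; a stable sort keeps tied rows in their original order.
    ordered = frame.sort_values(count_col, ascending=False, kind="stable")
    # Cumulative share of the total, expressed in percent.
    share = ordered[count_col].cumsum() * 100 / ordered[count_col].sum()
    # Keep only the rows that are still within the requested coverage.
    return ordered.loc[share.le(pct), label_col]

result = top_coverage(df, "X", "Y", 70)
print(result.tolist())
```

With this input the total is 100, so the helper selects the same rows as the one-liner df.loc[df['Y'].cumsum().le(70), 'X']. Note that a row is dropped as soon as the cumulative share exceeds the threshold, so the selected coverage can end up below the requested percentage.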