How to get filtered values of data frame in Python?

I want to find the values in a given column, "type", that occur more than "n" times.

I did this:

n = 5
df = dataf["type"].value_counts() > n

print(df) prints something like this:

Bike           True
Truck          True
Car            False

How can I get the values marked True ("Bike" and "Truck" here)? I want to add them to a set.

CodePudding user response：

You can pass a lambda to .loc for this:

import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})
print(df)
print("\nUsing loc...")
print(df["vehicle"].value_counts().loc[lambda x: x > 5])

gives

   vehicle
0     bike
1     bike
2     bike
3     bike
4     bike
5     bike
6     bike
7    truck
8    truck
9    truck
10   truck
11   truck
12   truck
13   truck
14   truck
15     car
16     car
17     car
18     car

Using loc...
truck    8
bike     7
Name: vehicle, dtype: int64
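
As a rough illustration of why this works: .loc accepts a callable, calls it with the Series being indexed, and uses the returned boolean mask for selection. The sketch below reuses the sample data from this answer; the names counts, via_loc and via_mask are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})
counts = df["vehicle"].value_counts()

# .loc accepts a callable; pandas calls it with the Series being indexed
via_loc = counts.loc[lambda x: x > 5]

# The same selection written as an explicit boolean mask
via_mask = counts[counts > 5]

# Both approaches should select the same rows
print(via_loc.equals(via_mask))
```

The lambda form mainly helps when chaining, since it avoids storing the intermediate counts in a variable.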

CodePudding user response：

Try this

aux = dataf["type"].value_counts()
greater_than_five = aux[aux > 5]

The first line gets the count of each type, and the second line keeps only the types whose count is greater than five.
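
The question also asks for the matching values in a set. After value_counts(), the values themselves sit in the index of the result, so one possible way is to pass the filtered index to set(). A minimal sketch, assuming a made-up dataf that stands in for the asker's data:

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dataf
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

n = 5
aux = dataf["type"].value_counts()   # occurrences of each distinct value
greater_than_n = aux[aux > n]        # keep only counts above n

# The labels are stored in the index, not in the values
frequent_types = set(greater_than_n.index)
print(frequent_types)
```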

CodePudding user response：

Try this,

n = 5
df = dataf["type"].value_counts()[dataf["type"].value_counts() > n]
print(df)

CodePudding user response：

The most efficient way is the lambda approach that @user1717828 posted. Another way:

import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})


df2 = df["vehicle"].agg({'count':'value_counts'})
df2[df2['count'] > 5]

CodePudding user response：

You can add a new column called counter that contains 1 in every row:

df['counter'] = 1

and use groupby:

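# sum the 1s per group to get the number of rows for each type
df = df.groupby(['type']).sum()
# keep only the types that occur more than n times
df = df[df.counter > n]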
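
If the counter column is only needed for counting, groupby(...).size() might be a shorter route, since it returns the number of rows per group directly. This is a variation on the answer above rather than the same method, and the sample dataf is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dataf
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

n = 5
# size() counts the rows in each group, so no helper column is needed
sizes = dataf.groupby("type").size()

# Keep the groups with more than n rows and collect their labels in a set
frequent_types = set(sizes[sizes > n].index)
print(frequent_types)
```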