Python Pandas DataFrame: Count Specific Values from a List

Time:09-21

Say I have a list:

mylist = ['a','b','c']

and a Pandas DataFrame (df) that has a column named "rating". How can I get the number of occurrences of each rating while iterating over my list? For example, here is what I need:

for item in mylist:
   # Do a bunch of stuff in here that takes a long time
   # Want a print statement like the one below to show progress
   # print df['rating'].value_counts().a <- I can do this,
   #     but I want to use the variable 'item'
   # print df['rating'].value_counts().item <- or something like this

I know I can get counts for all distinct values of 'rating', but that is not what I am after.

CodePudding user response:

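If you must do it inside the loop, you can filter the DataFrame with .loc and count the rows of the result. Use len() rather than .size: .size counts cells (rows times columns), so it only equals the row count when the DataFrame has a single column.

import pandas as pd

mylist = ['a','b','c']
df = pd.DataFrame({'rating':['a','a','b','c','c','c','d','e','f']})

for item in mylist:
    # Number of rows whose rating equals the current item
    print(item, len(df.loc[df['rating']==item]))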

Output:

a 2
b 1
c 3
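For the progress print the question asks about, one option is to compute value_counts() once and look up each item with a variable key. The sketch below assumes df does not change inside the loop (if it does, the counts would need to be recomputed); the names counts and progress are illustrative, not from the original post.

```python
import pandas as pd

mylist = ['a', 'b', 'c']
df = pd.DataFrame({'rating': ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e', 'f']})

# Count every distinct rating once, outside the loop.
counts = df['rating'].value_counts()

progress = {}  # illustrative: collects the per-item counts
for item in mylist:
    # .get() accepts a variable key and falls back to 0
    # when the rating never occurs in the column.
    n = int(counts.get(item, 0))
    progress[item] = n
    print(item, n)
```

Attribute access such as value_counts().a only works for a literal name, which is why a variable needs .get(item) or bracket indexing instead.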

CodePudding user response:

Instead of thinking about this problem as going "from the list to the DataFrame", it might be easier to flip it around: count every rating first, then keep only the counts whose label is in the list.

import pandas as pd

mylist = ['a','b','c']
df = pd.DataFrame({'rating':['a','a','b','c','c','c','d','e','f']})

# Count all ratings, then keep only the labels that appear in mylist
value_counts = df['rating'].value_counts()
print(value_counts[value_counts.index.isin(mylist)])

Output:

c    3
a    2
b    1
Name: rating, dtype: int64
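The same result can probably be reached in the opposite order, by filtering the rows first and counting afterwards. This is a minimal sketch of that variant, not part of the original answer; filtered_counts is an illustrative name.

```python
import pandas as pd

mylist = ['a', 'b', 'c']
df = pd.DataFrame({'rating': ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e', 'f']})

# Keep only the rows whose rating is in mylist, then count what is left.
filtered_counts = df.loc[df['rating'].isin(mylist), 'rating'].value_counts()
print(filtered_counts)
```

Ratings in mylist that never occur in the column are simply absent from the result in both variants.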

CodePudding user response:

You don't even need a for loop, just do:

df['rating'].value_counts()[mylist]

Or to make it a dictionary:

df['rating'].value_counts()[['a', 'b', 'c']].to_dict()
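One caveat: in recent pandas versions, indexing with a list raises a KeyError if any label in the list is missing from the column. A hedged alternative is reindex with fill_value=0, sketched below; the label 'z' is a made-up example of a rating that does not occur.

```python
import pandas as pd

# 'z' is a hypothetical label that is absent from the column.
mylist = ['a', 'b', 'z']
df = pd.DataFrame({'rating': ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e', 'f']})

# reindex keeps the order of mylist and reports 0 for missing labels
# instead of raising a KeyError.
safe_counts = df['rating'].value_counts().reindex(mylist, fill_value=0)
result = safe_counts.to_dict()
print(result)
```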