How can I get how many different 'values' are in a dataset for pandas?

Time:11-12

df['value'].value_counts()

This code gives me two columns. On the left I get the unique values in this column (say 15 of them) and on the right the frequency of each (ranging from 1 to 100). However, what I want is the number of unique values as an int (15 in this example). Can't figure this out. Thanks

CodePudding user response:

I don't know if I understood it right, but if you want the values of the left column that appear only once, you could use df.groupby('value').filter(lambda x: len(x)==1)['value'].unique(). Wrap that in len() to count them. Probably there is a better way, but that should do it.
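
To make the difference concrete, here is a minimal sketch on made-up data (the sample DataFrame is an assumption, not the asker's dataset). It shows that the groupby/filter approach returns the values that occur exactly once, which is not the same thing as the number of distinct values:

```python
import pandas as pd

# Hypothetical sample data: 'a' appears once, 'b' twice, 'c' three times.
df = pd.DataFrame({'value': ['a', 'b', 'b', 'c', 'c', 'c']})

# Keep only the rows whose value occurs exactly once, then list those values.
singletons = df.groupby('value').filter(lambda g: len(g) == 1)['value'].unique()

# Number of distinct values overall, regardless of how often each occurs.
n_distinct = df['value'].nunique()

print(list(singletons))  # ['a']
print(n_distinct)        # 3
```

So if the goal is the 15 from the question, the second number is the one to look at.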

CodePudding user response:

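You could add a reset_index() at the end of the code and save the result to a DataFrame with named columns. The default column names produced by value_counts().reset_index() differ between pandas versions, so it is safer to name them explicitly. The number of rows in that DataFrame is then the number of unique values. Something like this:

# One row per unique value, with its frequency in the 'count' column
temp_df = df['value'].value_counts().rename_axis('value').reset_index(name='count')
# Number of unique values
len(temp_df)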

CodePudding user response:

Sounds like you are looking for the number of unique values. Here are a few ways to go about it. Since I'm using range(10) to populate my sample DataFrame, all the methods below result in a value of 10 as we'd expect.

import pandas as pd
import random
random.seed(42)

df = pd.DataFrame({'value': random.choices(range(10), k=1000)})

Inspect the shape of the value_counts() result. The shape attribute is a tuple whose first element is the number of entries in the result, one per distinct value.

df.value_counts().shape[0]

Calling len() on the value_counts() result returns its number of entries, one per distinct value.

len(df.value_counts())

Count the unique values in the column. unique() returns an array of the distinct values, so take its length.

len(df['value'].unique())

Probably the best way to do it: nunique() returns the number of distinct values directly as an int.

df.value.nunique()
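
One caveat worth checking on real data: these methods agree only when the column has no missing values. A small sketch, assuming a made-up Series with one NaN, shows where they diverge (value_counts() and nunique() drop NaN by default, while unique() keeps it):

```python
import numpy as np
import pandas as pd

# Hypothetical column: three distinct values plus one missing entry.
s = pd.Series([1.0, 2.0, 2.0, 3.0, np.nan], name='value')

n_value_counts = len(s.value_counts())         # NaN is dropped by default
n_unique = len(s.unique())                     # NaN is kept as its own entry
n_nunique = s.nunique()                        # NaN is dropped by default
n_nunique_with_nan = s.nunique(dropna=False)   # NaN is counted

print(n_value_counts, n_unique, n_nunique, n_nunique_with_nan)  # 3 4 3 4
```

If missing values should count as their own category, pass dropna=False to nunique() or value_counts().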