Unique values and the corresponding percentage of rows in which they appear

I want to get the unique values of a column together with the percentage of rows in which each value appears. For example, I have a dataframe df with a column prices; to get the list of unique values I would write something like:

print(df.prices.unique().tolist())

which could result in

[1030, 1075, 2010, 3000, 3050, 4050, 4550]

However, I want a percentage mapped to each of the values above. I think value_counts would work, but I don't know how.

CodePudding user response:

Depending on what you expect, you can do as mentioned in the comments. To get the share of occurrences of each price:

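import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550, 1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Share of occurrences of each price, as a fraction of all rows (multiply by 100 for a percentage)
print(df['prices'].value_counts(normalize=True))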

Result:

1030    0.333333
3050    0.333333
1075    0.066667
2010    0.066667
3000    0.066667
4050    0.066667
4550    0.066667
Name: prices, dtype: float64
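Since the question asks for a percentage mapped to each unique value, the fractions above can be scaled and turned into a plain mapping. The following is only a minimal sketch on the same sample data; the names pct and pct_by_price are illustrative and not part of the original answer.

```python
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550,
                   1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Fraction of rows per unique price, scaled to a percentage and rounded.
pct = df['prices'].value_counts(normalize=True).mul(100).round(2)

# Plain dict mapping each unique price to its percentage of rows
# (pct_by_price is a hypothetical name used only for this example).
pct_by_price = pct.to_dict()
print(pct_by_price)
```

Keeping the result as a Series (pct) is usually more convenient for further pandas work; the dict is only useful if a simple lookup by price is wanted.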

Or, if you want the sum of all items at each price as a share of the total sum of all prices:

import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550, 1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Share of occurrences of each price, as a fraction of all rows
print(df['prices'].value_counts(normalize=True))

# Share of the total sum of prices contributed by each price
df['forCumSum'] = df['prices']  # copy of the column, so we can group by 'prices' and sum the copy
dfCumSum = df.groupby('prices')['forCumSum'].sum().reset_index()
dfCumSum["%of totalPrices"] = dfCumSum['forCumSum'] / dfCumSum['forCumSum'].sum()
print(dfCumSum.sort_values("%of totalPrices", ascending=False))

Result of the last print:

   prices  forCumSum  %of totalPrices
4    3050      15250         0.434659
0    1030       5150         0.146786
6    4550       4550         0.129685
5    4050       4050         0.115434
3    3000       3000         0.085507
2    2010       2010         0.057289
1    1075       1075         0.030640
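The same shares can probably be computed without the helper column, by grouping the column by its own values. This is a sketch of an alternative, not the method used in the answer above; the names sums and share are illustrative.

```python
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550,
                   1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Sum of all rows at each unique price (the group key is the price itself).
sums = df.groupby('prices')['prices'].sum()

# Each price's share of the grand total, largest first.
share = (sums / sums.sum()).sort_values(ascending=False)
print(share)
```

The result is a Series indexed by price rather than a DataFrame with three columns, so it fits best when only the share per price is needed.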