I want to get the unique values in a column together with the percentage of rows in which each value appears. For example, I have a dataframe df with a column prices. To get the list of unique values I would do something like:
print(df.prices.unique().tolist())
which could result in
[1030, 1075, 2010, 3000, 3050, 4050, 4550]
However, I want a percentage mapped to each of the values above. I think value_counts would work, but I don't know how.
CodePudding user response:
Depending on what you expect, you can do as mentioned in the comments. To get the percentage of occurrences of each price:
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550, 1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)
# Share of rows for each unique price (a fraction between 0 and 1)
print(df['prices'].value_counts(normalize=True))
Result:
1030    0.333333
3050    0.333333
1075    0.066667
2010    0.066667
3000    0.066667
4050    0.066667
4550    0.066667
Name: prices, dtype: float64
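Note that normalize=True returns fractions, not percentages. If you want each unique price mapped to an actual percentage, a minimal sketch could look like the following. The names pct and pct_by_price and the rounding to two decimals are only illustrative choices, not part of the original answer:

```python
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550,
                   1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Fraction of rows per unique price, scaled to percent and rounded (assumed precision)
pct = df['prices'].value_counts(normalize=True).mul(100).round(2)

# Plain dict mapping each unique price to its percentage of rows
pct_by_price = pct.to_dict()
print(pct_by_price)
```

Looking up pct_by_price[1030] then gives the percentage of rows whose price is 1030.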
Or, if you want the sum of all rows with a given price divided by the total sum of all prices:
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550, 1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)
# Share of rows for each unique price
print(df['prices'].value_counts(normalize=True))
# Share of the total sum of prices:
# copy the column so it can be summed per group while 'prices' is the group key
df['forCumSum'] = df['prices']
dfCumSum = df.groupby('prices')['forCumSum'].sum().reset_index()
# Each group's sum divided by the grand total
dfCumSum["%of totalPrices"] = dfCumSum['forCumSum'] / dfCumSum['forCumSum'].sum()
print(dfCumSum.sort_values("%of totalPrices", ascending=False))
Result:
   prices  forCumSum  %of totalPrices
4    3050      15250         0.434659
0    1030       5150         0.146786
6    4550       4550         0.129685
5    4050       4050         0.115434
3    3000       3000         0.085507
2    2010       2010         0.057289
1    1075       1075         0.030640
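The same share of the total sum can probably be computed without the helper column, since groupby lets you select the grouping column itself. This is only a sketch of an alternative; the names totals and share are made up for illustration:

```python
import pandas as pd

data = {'prices': [1030, 1075, 2010, 3000, 3050, 4050, 4550,
                   1030, 1030, 1030, 1030, 3050, 3050, 3050, 3050]}
df = pd.DataFrame(data)

# Sum of the prices within each group of equal prices (indexed by price)
totals = df.groupby('prices')['prices'].sum()

# Each group's sum as a fraction of the grand total, largest first
share = (totals / totals.sum()).sort_values(ascending=False)
print(share)
```

Here the result is a Series indexed by price rather than a dataframe with a separate prices column, so share.loc[3050] gives the fraction for that price directly.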