Computing a single value for an entire DataFrame column

I am working on a requirement where I need to compute a single value for an entire column from a formula. Here is my code:

import pandas as pd
import numpy as np
s={'Fruits':['Apple','Orange', 'Banana', 'Mango'],
   'month':['201401','201502','201603','201604'],'weight':[2,4,1,6],
   'Quant':[132,178,298,300]}
p=pd.DataFrame(data=s)
n=len(p)
std_dev=((1)/(n-1))*(sum([(p['Quant'] - p['weight']) ** 2 for _ in range(n)]))
alpha=2
std_devf= p['Quant'] + alpha*(std_dev)

The expected result for std_devf is a single value (e.g. 100 or 200).

But the output I'm getting is this, with one value per fruit:

0     45198.666667
1     80914.000000
2    235522.000000
3    230796.000000

How can I get just a single value from the formula? Is my formula the reason I'm getting one value per row?

CodePudding user response：

First of all, the std_dev formula needs to be fixed. The list comprehension never uses its loop variable, so it builds a list of 4 identical Series and sums them element-wise, which still leaves one value per row. The link you provided does not compute it that way. According to the link, it should be like this:

n = len(p)
std_dev = (1/(n-1)*(sum([(p['Quant'][i] - p['weight'][i]) ** 2 for i in range(n)]))) ** 0.5
alpha = 2
std_devf= p['Quant'] + alpha*(std_dev)
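
As a rough illustration of why the two formulas behave differently, here is a minimal sketch using vectorized pandas operations instead of an index loop. The names `per_row` and `diff` are introduced only for this example. It shows that summing a list of Series stays a Series, while calling `.sum()` on the column reduces it to a scalar. Note that std_devf is still a Series in both cases, because p['Quant'] itself has one value per row.

```python
import pandas as pd

p = pd.DataFrame({
    'Fruits': ['Apple', 'Orange', 'Banana', 'Mango'],
    'month': ['201401', '201502', '201603', '201604'],
    'weight': [2, 4, 1, 6],
    'Quant': [132, 178, 298, 300],
})
n = len(p)
alpha = 2

# Formula from the question: n copies of the same Series, added element-wise.
# The result keeps one entry per row.
per_row = (1 / (n - 1)) * sum([(p['Quant'] - p['weight']) ** 2 for _ in range(n)])
print(type(per_row).__name__)   # Series

# Vectorized form of the corrected formula: .sum() collapses the rows to a scalar.
diff = p['Quant'] - p['weight']
std_dev = ((diff ** 2).sum() / (n - 1)) ** 0.5
print(round(float(std_dev), 2))  # 271.92

# Adding a scalar to a Series broadcasts, so this is again one value per row.
std_devf = p['Quant'] + alpha * std_dev
print(std_devf.round(2).tolist())
```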

On the other hand, are you looking for the expected value of std_devf, or for the bound limit? If that's the case, the result will have decimals as in the link, but you can always round it to two decimals.

round(std_devf,2)
Out[33]: 
0    675.84
1    721.84
2    841.84
3    843.84
Name: Quant, dtype: float64
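
If what you actually need is one number rather than one per row, the p['Quant'] term has to be replaced by a scalar as well. The sketch below assumes, purely for illustration, that the column mean is the right centre for the bound; the question does not say which value should be used, so `upper_bound` and the choice of `.mean()` are assumptions, not something taken from the link.

```python
import pandas as pd

p = pd.DataFrame({'weight': [2, 4, 1, 6], 'Quant': [132, 178, 298, 300]})
n = len(p)
alpha = 2

# Scalar deviation, same formula as in the answer above.
std_dev = (((p['Quant'] - p['weight']) ** 2).sum() / (n - 1)) ** 0.5

# Assumption: centre the bound on the mean of Quant so the result is a scalar.
upper_bound = p['Quant'].mean() + alpha * std_dev
print(round(float(upper_bound), 2))  # 770.84
```

Any other scalar summary of the column (for example the last observation) would work the same way; the point is only that every term in the expression must be a scalar for the result to be one.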

CodePudding user response：

To calculate your standard deviation, you can follow my method:
import math
s = 0
# accumulate the squared differences row by row
for i in range(n):
    s += (p['Quant'][i] - p['weight'][i]) ** 2

# divide by n-1 and take the square root to get a single scalar
std_dev = math.sqrt(s / (n - 1))
alpha = 2
std_devf = p['Quant'] + alpha * std_dev

Hope this solves your query. You can find an image of my solution at the link below: [1]: https://i.stack.imgur.com/3HH5b.png