Hi, I want to find outliers in my data using a standard normal distribution.
I'm not used to statistics, so if there is any error in my question, please give me advice.
I have two requirements:
I want to build a standard normal distribution from a list of integers such as [15, 13, 18, 20, 22, 17, 16, 16, 30, 18, 15, 16].
When a new value such as 32 is added to the list, I want to check whether it lies within ±1 sigma of the mean.
Thanks for reading my question.
CodePudding user response:
As explained on Wikipedia, the standard deviation σ can be computed as:
σ = sqrt(E[X^2] - E[X]^2)
where E is the expected value (mean). The following simple class does just that:
class Statistics {
    int count = 0; // number of values so far
    double mean = Double.NaN; // E[X]
    double meanOfSquares = Double.NaN; // E[X^2]

    public void add(final double value) {
        if (count == 0) {
            mean = value;
            meanOfSquares = value * value;
        } else {
            // update both running means incrementally
            mean = (mean * count + value) / (count + 1);
            meanOfSquares = (meanOfSquares * count + value * value) / (count + 1);
        }
        count++;
    }

    public double getMean() {
        // sum of all values divided by count
        return mean;
    }

    public double getVariance() {
        // σ^2 = E[X^2] - E[X]^2
        return meanOfSquares - mean * mean;
    }

    public double getStandardDeviation() {
        // variance is the square of the standard deviation
        return Math.sqrt(getVariance());
    }
}
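One caveat with E[X^2] - E[X]^2: when the values are large compared to their spread, subtracting two nearly equal numbers can lose precision. If that matters for your data, a different technique, Welford's online algorithm, tracks the sum of squared deviations from the running mean instead. The following is only a minimal sketch of it; the class name WelfordStatistics and the field m2 are made up for illustration and are not part of the class above.

```java
// Hypothetical alternative (illustrative names): Welford's online algorithm.
class WelfordStatistics {
    private long count = 0;    // number of values so far
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations from the mean

    void add(double value) {
        count++;
        double delta = value - mean;  // deviation from the old mean
        mean += delta / count;        // update the running mean
        m2 += delta * (value - mean); // uses both the old and the new mean
    }

    double getMean() {
        return count == 0 ? Double.NaN : mean;
    }

    double getVariance() {
        // population variance, matching σ^2 = E[X^2] - E[X]^2
        return count == 0 ? Double.NaN : m2 / count;
    }

    double getStandardDeviation() {
        return Math.sqrt(getVariance());
    }
}
```

It is used the same way: call add for each value, then read getMean and getStandardDeviation.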
You can now check whether a value x is within one σ as follows:
Statistics statistics = new Statistics();
statistics.add(10);
statistics.add(12);
int x = 5;
double mean = statistics.getMean();
double sigma = statistics.getStandardDeviation();
// x is an outlier if it falls outside [mean - sigma, mean + sigma]
if (x < mean - sigma || x > mean + sigma) {
    System.out.println(x + " not within one standard deviation");
}
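Applied to the list from the question, the same check could look roughly like the sketch below. It is self-contained, so it recomputes the mean and the population standard deviation with the formula above rather than reusing the Statistics class. The class name OutlierCheck, its method names, and the multiplier k are assumptions made up for this example.

```java
import java.util.Arrays;

// Hypothetical helper (illustrative names), using sqrt(E[X^2] - E[X]^2).
class OutlierCheck {

    // Mean of the data: sum of values divided by their count.
    static double mean(int[] data) {
        return Arrays.stream(data).average().orElse(Double.NaN);
    }

    // Population standard deviation: sqrt(E[X^2] - E[X]^2).
    static double standardDeviation(int[] data) {
        double mu = mean(data);
        double meanOfSquares = Arrays.stream(data)
                .mapToDouble(v -> (double) v * v)
                .average()
                .orElse(Double.NaN);
        return Math.sqrt(meanOfSquares - mu * mu);
    }

    // True if value lies in [mean - k * sigma, mean + k * sigma].
    static boolean isWithinSigma(int[] data, double value, double k) {
        double mu = mean(data);
        double sigma = standardDeviation(data);
        return Math.abs(value - mu) <= k * sigma;
    }

    public static void main(String[] args) {
        int[] data = {15, 13, 18, 20, 22, 17, 16, 16, 30, 18, 15, 16};
        int newValue = 32;
        System.out.println("mean = " + mean(data));
        System.out.println("sigma = " + standardDeviation(data));
        System.out.println(newValue + " within one sigma: "
                + isWithinSigma(data, newValue, 1.0));
    }
}
```

For this list the mean is 18 and σ is about 4.28, so the one-sigma range is roughly 13.7 to 22.3 and 32 falls outside it. Note that the check is done against the existing data before 32 is added; adding it first would shift both the mean and σ.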