I have 2 dataframes, representing the estimate mean and standard error.
import numpy as np
import scipy.stats as st
rows = (1, 2)
col1 = ["mean_x", "mean_y"]
col2 = ["std_x", "std_y"]
data1 = ([10, 20], [5, 10])
data2 = ([1, 2], [0.5, 1])
df1 = pd.DataFrame(data1, index = rows, columns = col1)
df2 = pd.DataFrame(data2, index = rows, columns = col2)
I want to manipulate these 2 dataframes elementwisely, to construct a dataframe of confidence interval at 95% level
The ideal format is
Running a loop seems to be awkward, I am wondering if there is any method that is more elegant, efficient?
CodePudding user response:
You can use stats.norm.interval
and find confidence interval at 95% level then create DataFrame like below:
>>> from scipy import stats
>>> twoDf = pd.concat([df1, df2], axis=1)
mean_x mean_y std_x std_y
1 10 20 1.0 2
2 5 10 0.5 1
>>> cols = [('mean_x','std_x', 'interval_x'),('mean_y','std_y', 'interval_y')]
>>> for col in cols:
... twoDf[col[2]] = twoDf.apply(lambda row : \
... stats.norm.interval(0.95, loc=row[col[0]], scale=row[col[1]]), axis=1)
>>> twoDf
mean_x mean_y std_x std_y interval_x interval_y
1 10 20 1.0 2 (8.040036015459947, 11.959963984540053) (16.080072030919894, 23.919927969080106)
2 5 10 0.5 1 (4.020018007729973, 5.979981992270027) (8.040036015459947, 11.959963984540053)