Home > Software engineering >  how to use elementwise dataframe manipulation?
how to use elementwise dataframe manipulation?

Time:10-08

I have 2 dataframes, representing the estimate mean and standard error.

import numpy as np
import scipy.stats as st

rows = (1, 2)
col1 = ["mean_x", "mean_y"]
col2 = ["std_x", "std_y"]

data1 = ([10, 20], [5, 10])
data2 = ([1, 2], [0.5, 1])

df1 = pd.DataFrame(data1, index = rows, columns = col1)
df2 = pd.DataFrame(data2, index = rows, columns = col2)

enter image description here

I want to manipulate these 2 dataframes elementwisely, to construct a dataframe of confidence interval at 95% level

The ideal format is

enter image description here

Running a loop seems to be awkward, I am wondering if there is any method that is more elegant, efficient?

CodePudding user response:

You can use stats.norm.interval and find confidence interval at 95% level then create DataFrame like below:

>>> from scipy import stats
>>> twoDf = pd.concat([df1, df2], axis=1)

    mean_x  mean_y  std_x   std_y
1     10      20     1.0    2
2      5      10     0.5    1

>>> cols = [('mean_x','std_x', 'interval_x'),('mean_y','std_y', 'interval_y')]

>>> for col in cols:
...    twoDf[col[2]] = twoDf.apply(lambda row : \
...                                stats.norm.interval(0.95, loc=row[col[0]], scale=row[col[1]]), axis=1)

>>> twoDf

    mean_x   mean_y   std_x     std_y     interval_x                                 interval_y
1   10        20        1.0     2        (8.040036015459947, 11.959963984540053)    (16.080072030919894, 23.919927969080106)
2   5         10        0.5     1        (4.020018007729973, 5.979981992270027)     (8.040036015459947, 11.959963984540053)
  • Related