Generating values based on mean and std listed in a dataframe

Time:11-19

I have a data frame of this format:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    1: {'mean': 1.0, 'std': 0.8},
    2: {'mean': 0.5, 'std': 0.2},
    3: {'mean': 0.2, 'std': 0.1},
    4: {'mean': 0.1, 'std': 0.1},
    5: {'mean': 0.6, 'std': 0.2}
})

df
        1    2    3    4    5
mean  1.0  0.5  0.2  0.1  0.6
std   0.8  0.2  0.1  0.1  0.2

Based on these mean and std values, I am trying to generate a larger data frame of normally distributed random numbers, with the same columns but many more rows:

full_noise = []

for mean, std in enumerate(df):
    noise = np.random.normal(mean, std, [5, 1000]) 
    full_noise.append(noise)

So, each column of the new data frame should contain values drawn using the mean and std listed for that column in the data frame above. I am definitely doing something wrong, though.

Sorry, I am quite new to Python! I hope you can help :(

CodePudding user response:

To create what you want, I would suggest iterating over the data frame df one column at a time (to do so, first transpose the data frame and then use iterrows).

For each column, you can generate a NumPy array of the length you desire from a normal distribution, using the mean and std from that column.

At the end you can concatenate the numpy arrays as columns of a dataframe (so along axis=1).

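full_noise = []
# Transposing turns each original column into a row, so iterrows() yields
# (column label, Series holding "mean" and "std") pairs.
for label, col in df.T.iterrows():
    noise = np.random.normal(loc=col["mean"], scale=col["std"], size=(1000,))
    # Name each Series so the result keeps the original column labels.
    full_noise.append(pd.Series(noise, name=label))

noise_df = pd.concat(full_noise, axis=1)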
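
As for why the loop in the question misbehaves: iterating over a data frame yields its column labels, so enumerate(df) hands back (position, label) pairs such as (0, 1), and the loop ends up using the position as the mean and the label as the std. The short sketch below contrasts that with df.T.iterrows() on a three-column copy of the example data; the names pairs_from_enumerate and pairs_from_iterrows are made up for illustration.

```python
import pandas as pd

# Three-column copy of the example data from the question.
df = pd.DataFrame({
    1: {"mean": 1.0, "std": 0.8},
    2: {"mean": 0.5, "std": 0.2},
    3: {"mean": 0.2, "std": 0.1},
})

# Iterating a DataFrame walks over its column labels, so enumerate(df)
# produces (position, label) pairs, not (mean, std) pairs.
pairs_from_enumerate = list(enumerate(df))

# After transposing, each original column is a row, and iterrows()
# yields (label, Series) where the Series holds that column's mean and std.
pairs_from_iterrows = [
    (col["mean"], col["std"]) for _, col in df.T.iterrows()
]

print(pairs_from_enumerate)
print(pairs_from_iterrows)
```

The size argument matters too: [5, 1000] in the question asks for a 5 x 1000 block per column, whereas size=(1000,) gives one value per row for a single column.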

CodePudding user response:

Alternatively, use .apply, which by default calls the function once per column, to build full_noise:

full_noise = df.apply(
    lambda col: np.random.normal(loc=col["mean"], scale=col["std"], size=(1_000,)),
)

print(full_noise)
            1         2         3         4         5
0    0.900445  0.555275  0.206491  0.161578  0.491196
1    1.555625  0.261742  0.196981 -0.068225  0.770397
2    0.308983  0.256334  0.119617  0.157978  0.453351
3    0.799080  0.255109  0.164719 -0.088953  0.462583
4    1.263621  0.650327  0.217544  0.046004  0.893409
..        ...       ...       ...       ...       ...
995  1.345332  0.827836  0.320708  0.113350  0.789898
996  1.235461  0.464576  0.270596  0.049924  0.708799
997  1.211508  0.751700  0.230916  0.176736  0.661312
998  1.753942  0.941567  0.097372  0.177429  0.810710
999  1.847943  0.240993 -0.006139  0.200517  0.523238

[1000 rows x 5 columns]
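
If neither a loop nor .apply is needed, NumPy broadcasting can probably do the whole thing in one call. This is a minimal sketch rather than the one right way: it assumes the newer Generator API (np.random.default_rng) is acceptable, and n_rows, the seed, and the variable names are placeholders chosen for illustration. The last lines compare the sample mean and std of each column with the requested values as a quick sanity check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    1: {"mean": 1.0, "std": 0.8},
    2: {"mean": 0.5, "std": 0.2},
    3: {"mean": 0.2, "std": 0.1},
    4: {"mean": 0.1, "std": 0.1},
    5: {"mean": 0.6, "std": 0.2},
})

n_rows = 1000                        # assumed number of rows to generate
rng = np.random.default_rng(seed=0)  # seeded only to make the run repeatable

# loc and scale each have shape (5,); they broadcast across the rows of
# size=(n_rows, 5), so column j is drawn with the j-th mean and std.
samples = rng.normal(
    loc=df.loc["mean"].to_numpy(),
    scale=df.loc["std"].to_numpy(),
    size=(n_rows, df.shape[1]),
)
noise_df = pd.DataFrame(samples, columns=df.columns)

# Sanity check: per-column sample statistics should be close to df.
summary = noise_df.agg(["mean", "std"])
print(summary.round(2))
```

With 1000 rows the sample statistics will not match exactly, but they should land close to the values in df.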