I have a data frame of this format:
import pandas as pd
df = pd.DataFrame({
1: {'mean': 1.0, 'std': 0.8},
2: {'mean': 0.5, 'std': 0.2},
3: {'mean': 0.2, 'std': 0.1},
4: {'mean': 0.1, 'std': 0.1},
5: {'mean': 0.6, 'std': 0.2}
})
df
      1    2    3    4    5
mean  1.0  0.5  0.2  0.1  0.6
std   0.8  0.2  0.1  0.1  0.2
Based on these values of mean and std, I am trying to generate a big data frame of normally distributed random numbers, which has the same number of columns but more rows:
full_noise = []
for mean, std in enumerate(df):
    noise = np.random.normal(mean, std, [5, 1000])
    full_noise.append(noise)
So, each column of this new data frame should have values generated from the mean and std listed in the data frame above. I am definitely doing something wrong, though.
Sorry, I am quite new to Python! I hope you can help :(
CodePudding user response:
To create what you want, I would suggest iterating over the dataframe df one column at a time (to do so, first transpose the dataframe and then use iterrows).
For each column you can generate a numpy array of the length you desire from a normal distribution, using the mean and std from that column.
At the end you can concatenate the numpy arrays as columns of a dataframe (so along axis=1).
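import numpy as np

full_noise = []
# df.T has one row per original column, so each `col` holds that column's mean and std
for _, col in df.T.iterrows():
    noise = np.random.normal(loc=col["mean"], scale=col["std"], size=(1000,))
    full_noise.append(pd.Series(noise))
# Place the 1000-element Series side by side as columns
noise_df = pd.concat(full_noise, axis=1)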
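As for why the loop in the question misbehaves: iterating over a DataFrame yields its column labels, so enumerate(df) gives (position, label) pairs, and the loop ends up drawing from normal(0, 1), normal(1, 2) and so on instead of using the stored values. A small sketch to see this, rebuilding the frame from the question (the names pairs and params are only illustrative):

```python
import pandas as pd

# Same frame as in the question
df = pd.DataFrame({
    1: {"mean": 1.0, "std": 0.8},
    2: {"mean": 0.5, "std": 0.2},
    3: {"mean": 0.2, "std": 0.1},
    4: {"mean": 0.1, "std": 0.1},
    5: {"mean": 0.6, "std": 0.2},
})

# Iterating over a DataFrame yields column labels, so enumerate(df)
# produces (position, label) pairs rather than (mean, std) pairs.
pairs = list(enumerate(df))
print(pairs)  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

# After transposing, iterrows() yields (label, Series) pairs that carry
# the actual mean and std of each original column.
params = [(float(col["mean"]), float(col["std"])) for _, col in df.T.iterrows()]
print(params)
```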
CodePudding user response:
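You can also use .apply to build full_noise. The function is called once per column and returns 1,000 samples drawn with that column's mean and std.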
import numpy as np

full_noise = df.apply(
    lambda col: np.random.normal(loc=col["mean"], scale=col["std"], size=(1_000,)),
)
print(full_noise)
1 2 3 4 5
0 0.900445 0.555275 0.206491 0.161578 0.491196
1 1.555625 0.261742 0.196981 -0.068225 0.770397
2 0.308983 0.256334 0.119617 0.157978 0.453351
3 0.799080 0.255109 0.164719 -0.088953 0.462583
4 1.263621 0.650327 0.217544 0.046004 0.893409
.. ... ... ... ... ...
995 1.345332 0.827836 0.320708 0.113350 0.789898
996 1.235461 0.464576 0.270596 0.049924 0.708799
997 1.211508 0.751700 0.230916 0.176736 0.661312
998 1.753942 0.941567 0.097372 0.177429 0.810710
999 1.847943 0.240993 -0.006139 0.200517 0.523238
[1000 rows x 5 columns]
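If you would rather skip the per-column call entirely, NumPy can broadcast the means and stds across the rows in a single draw. This is a different technique from .apply, shown here only as a rough sketch: the seed, the n_rows name and the use of np.random.default_rng are my own assumptions, not something from the question.

```python
import numpy as np
import pandas as pd

# Same frame as in the question
df = pd.DataFrame({
    1: {"mean": 1.0, "std": 0.8},
    2: {"mean": 0.5, "std": 0.2},
    3: {"mean": 0.2, "std": 0.1},
    4: {"mean": 0.1, "std": 0.1},
    5: {"mean": 0.6, "std": 0.2},
})

n_rows = 1_000                  # illustrative name: number of rows to generate
rng = np.random.default_rng(0)  # assumption: seeded only to make the run repeatable

# loc and scale have shape (5,), so they broadcast across every row of the
# (n_rows, 5) output: column j is drawn with the j-th mean and std.
samples = rng.normal(
    loc=df.loc["mean"].to_numpy(),
    scale=df.loc["std"].to_numpy(),
    size=(n_rows, df.shape[1]),
)
noise_df = pd.DataFrame(samples, columns=df.columns)

# Sample statistics per column, to compare against df by eye
print(noise_df.agg(["mean", "std"]).round(2))
```

The sample mean and std of each column should land close to the values in df, though not exactly on them, since these are random draws.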