I want to add Gaussian noise N(0, 0.1 * x), where N(mean, sigma) and x is the element's value, to each element of a pandas DataFrame. Is there a more elegant way to change each element than iterating through them one by one?
CodePudding user response:
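You can create a vector/matrix of Gaussian-distributed noise and add it to your pandas DataFrame. You could do so e.g. by using numpy.random.normal (or numpy.random.randn for standard normal samples; note that numpy.random.rand draws from a uniform distribution, not a Gaussian) and adding the resulting NumPy array to the DataFrame. Since this is a vectorized operation, it is likely more efficient than iterating over the elements.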
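A minimal sketch of that vectorized approach, assuming the noise standard deviation should be 10% of each element's absolute value (the absolute value is my assumption, since a standard deviation cannot be negative). The example DataFrame, the seed, and the variable names are made up for illustration:

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(42)

# Hypothetical example data.
df = pd.DataFrame({"a": [1.0, -2.0, 3.0], "b": [10.0, 0.0, -30.0]})

# Per-element standard deviation: 10% of |x|.
sigma = 0.1 * df.abs().to_numpy()

# One draw per element; `scale` is broadcast element-wise, so the
# result has the same shape as the DataFrame.
noise = rng.normal(loc=0.0, scale=sigma)

# Adding a same-shaped array keeps the index and column labels.
df_noisy = df + noise
```

Elements equal to 0 get a standard deviation of 0 and are therefore left unchanged.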
CodePudding user response:
Creating a dataframe:
import numpy as np
import pandas as pd
# Example DataFrame
df = pd.DataFrame(np.random.randn(10, 4) * 10, columns=['a', 'b', 'c', 'd'])
# a b c d
# 0 5.440693 -5.116895 -18.895540 -0.285117
# 1 7.280997 -5.452700 -12.512881 -5.230587
# 2 -15.925851 -8.838794 -4.783024 3.851198
# 3 -8.028241 -17.885670 4.209227 3.367078
# 4 -7.116592 -12.763987 4.371488 -1.376339
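Then we define a Gaussian noise function and apply it to each cell:
def gaussian_noise(x):
    # The standard deviation must be non-negative, hence abs(x).
    # For sigma = 10% of x, as asked in the question, use 0.1 * abs(x).
    return np.random.normal(0, abs(x))
# Draw one noise value per cell, then add it to the original data.
noise = df.applymap(gaussian_noise)
df_noisy = df + noise
# a b c d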
# 0 5.130622 1.948290 -32.617351 -0.525052
# 1 -3.354433 -5.591920 -17.065476 -7.150039
# 2 6.275060 -19.027546 -3.051764 6.420511
# 3 -12.589584 -10.084946 1.654662 -1.481193
# 4 -9.452465 -10.243213 7.487428 -2.902000
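If you want the standard deviation to be 10% of each element, as in the question, the per-cell version could be sketched roughly as below. The function name gaussian_noise_10pct and the small example DataFrame are hypothetical. DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, so the sketch picks whichever one is available:

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)

def gaussian_noise_10pct(x):
    # Hypothetical helper: one draw from N(0, sigma) with sigma = 10% of |x|.
    return rng.normal(0.0, 0.1 * abs(x))

# Hypothetical example data.
df = pd.DataFrame({"a": [10.0, -20.0], "b": [0.0, 5.0]})

# Use DataFrame.map on newer pandas, DataFrame.applymap on older versions.
elementwise = (
    pd.DataFrame.map if hasattr(pd.DataFrame, "map") else pd.DataFrame.applymap
)

noise = elementwise(df, gaussian_noise_10pct)
df_noisy = df + noise
```

This still calls a Python function once per cell, so on large DataFrames it will probably be noticeably slower than drawing all the noise in a single NumPy call.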