I'm working on a Python framework for training ML models with different noise functions applied to the training data. Here's an example of one such noise function:
import numpy as np

def add_gauss(x, a=0, b=0.1):
    return x + np.random.normal(a, b)
I then build a list of several functions like this:
functions = [add_gauss, add_laplace]
This list is then used in the training function, where each function is vectorized and applied to the training data:
data = [1, 2, 3, 4]
modified_data_list = []
for function in functions:
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))
This results in a list with, in this case, two datasets: one with Gaussian noise and one with Laplace noise. My current problem is that this setup only works because I gave the functions default parameters. I am unsure whether there is a way to specify the parameters when building the list, so that I'd get something like:
functions = [add_gauss(*, 0, 0.1), add_laplace(*, 0, 0.1)]
Where the "*" represents the value of each data entry as it gets modified by the vectorized function.
Is this possible, or should I change my approach?
CodePudding user response:
One possibility is to write a function that returns a function:
def add_gauss_func(a, b):
    def f(x):
        return x + np.random.normal(a, b)
    return f
This uses a concept called a "closure", if you are interested in learning more.
Now you can do:
functions = [add_gauss_func(0, 0.1), add_gauss_func(0, 0.2)]
to get two different functions, each adding Gaussian noise with a different scale.
A similar technique can work for the Laplace noise function.
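Putting the pieces together, here is a minimal sketch of how the factory approach could slot into the loop from the question. The name add_laplace_func is hypothetical (it is simply assumed to mirror add_gauss_func), and np.random.laplace is assumed to be the sampler behind the original add_laplace:

```python
import numpy as np

def add_gauss_func(a, b):
    # Factory: returns a one-argument function with a and b "baked in".
    def f(x):
        return x + np.random.normal(a, b)
    return f

def add_laplace_func(a, b):
    # Hypothetical Laplace counterpart, assumed to mirror add_gauss_func.
    def f(x):
        return x + np.random.laplace(a, b)
    return f

# Each entry is already a function of x only, so no defaults are needed.
functions = [add_gauss_func(0, 0.1), add_laplace_func(0, 0.1)]

data = [1, 2, 3, 4]
modified_data_list = []
for function in functions:
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))
```

Because every element of functions takes a single argument, the vectorizing loop itself stays exactly as it was in the question.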
In fact, this can probably be generalized to something like def add_noise(f, *params) if you want to get really fancy.
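As a rough sketch of what that generalization might look like: the factory below takes any sampling callable plus its parameters and returns a one-argument noise function. The parameter name sampler is an assumption (chosen to avoid clashing with the inner f), and this is only one possible shape for add_noise:

```python
import numpy as np

def add_noise(sampler, *params):
    # sampler: any callable that returns one random draw when called
    # as sampler(*params), e.g. np.random.normal or np.random.laplace.
    def f(x):
        return x + sampler(*params)
    return f

functions = [
    add_noise(np.random.normal, 0, 0.1),   # Gaussian noise
    add_noise(np.random.laplace, 0, 0.1),  # Laplace noise
]

data = [1, 2, 3, 4]
modified_data_list = [np.vectorize(fn)(data) for fn in functions]
```

With this version a new noise type only needs a new sampler passed in, rather than a new factory function.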