Parameter declaration for vectorized functions

I'm working on a Python framework for training ML models with different noise functions applied to the training data. Here is an example of one such noise function:

import numpy as np

def add_gauss(x, a=0, b=0.1):
    return x + np.random.normal(a, b)

I then build a list of several such functions:

functions = [add_gauss, add_laplace]

The training function then vectorizes each of them and applies it to the training data:

data = [1, 2, 3, 4]
modified_data_list = []

for function in functions:
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))

This produces a list with, in this case, two datasets: one with Gaussian noise and one with Laplace noise. My current problem is that this setup only works because I gave the functions default parameters. I am unsure whether there is a way to declare them so that I'd get something like:

functions = [add_gauss(*, 0, 0.1), add_laplace(*, 0, 0.1)]

Where the "*" represents the value of each data entry as it gets modified by the vectorized function.

Is this possible, or should I change my approach?

CodePudding user response:

One possibility is to write a function that returns a function:

def add_gauss_func(a, b):
    def f(x):
        return x + np.random.normal(a, b)
    return f

This relies on a concept called a "closure", in case you want to read more about it: the inner function f remembers the a and b it was created with.

Now you can do

functions = [add_gauss_func(0, 0.1), add_gauss_func(0, 0.2)]

to get two different functions with different Gaussian noise.

A similar technique works for the Laplace noise function.
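To tie this back to the loop in the question, here is a minimal sketch of how the function factories could plug in. The name add_laplace_func is hypothetical (it is not in the question), and it assumes the Laplace noise comes from np.random.laplace with a location and a scale:

```python
import numpy as np

def add_gauss_func(a, b):
    # Returns a one-argument function with a and b baked in.
    def f(x):
        return x + np.random.normal(a, b)
    return f

def add_laplace_func(loc, scale):
    # Hypothetical Laplace counterpart, built the same way.
    def f(x):
        return x + np.random.laplace(loc, scale)
    return f

functions = [add_gauss_func(0, 0.1), add_laplace_func(0, 0.1)]

data = [1, 2, 3, 4]
modified_data_list = []

for function in functions:
    # np.vectorize applies the function element by element,
    # so every entry receives its own noise draw.
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))
```

The training loop itself stays unchanged; only the way the list is built differs, and the default parameters are no longer needed.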

In fact, this can probably be generalized to def add_noise(f, *params) or something similar if you want to get really fancy.
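A rough sketch of what that generalization might look like, assuming f is any sampler that accepts the distribution parameters positionally (np.random.normal and np.random.laplace both do). The inner name noisy is just an illustration:

```python
import numpy as np

def add_noise(f, *params):
    # f draws one random sample when called as f(*params).
    def noisy(x):
        return x + f(*params)
    return noisy

functions = [
    add_noise(np.random.normal, 0, 0.1),   # Gaussian noise
    add_noise(np.random.laplace, 0, 0.1),  # Laplace noise
]

data = [1, 2, 3, 4]
modified_data_list = [np.vectorize(function)(data) for function in functions]
```

If you would rather keep the original add_gauss(x, a, b) signature, functools.partial(add_gauss, a=0, b=0.1) from the standard library binds the parameters in the same way without a hand-written wrapper.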