I've been working on a genetic algorithm in PyTorch, and I've run into an issue while trying to mutate my model's parameters. I've been using the .apply()
function to randomly change a model's weights and biases. Here is the exact function I made:
def mutate(m):
    if type(m) == nn.Linear:
        m.weight = nn.Parameter(m.weight + torch.randn(m.weight.shape))
        m.bias = nn.Parameter(m.bias + torch.randn(m.bias.shape))
I've tested this function and it works on its own, but that isn't the weird part. When I apply it to every model in a list, the same mutation happens to each and every model. I obviously don't want this, since I want variety in my population. Here is a reproducible example:
import torch
import torch.nn as nn
population_size = 5 #Size of the population
population = [nn.Linear(1,1)]*population_size #Creating my population, each agent is a player in this list
dummy_input = torch.rand(1) #Random input
def mutate(m): #Mutation function
    if type(m) == nn.Linear:
        m.weight = nn.Parameter(m.weight + torch.randn(m.weight.shape))
        m.bias = nn.Parameter(m.bias + torch.randn(m.bias.shape))
population = list(x.apply(mutate) for x in population) #This is the line I've been having issues with
for i in population:
    print(i(dummy_input)) #This is here to show that all the models are mutating in the same way and outputting the same thing
This code has the following output:
tensor([-2.0366], grad_fn=<AddBackward0>)
tensor([-2.0366], grad_fn=<AddBackward0>)
tensor([-2.0366], grad_fn=<AddBackward0>)
tensor([-2.0366], grad_fn=<AddBackward0>)
tensor([-2.0366], grad_fn=<AddBackward0>)
As you can see, all the models mutated in the same way, and are yielding the same output.
This is running in Python 3.9. Thank you all in advance.
CodePudding user response:
When x is a mutable object and you write [x]*n, you are essentially creating a list of n references to the same object x.
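A quick way to see the aliasing (the names below are just for illustration):

import torch.nn as nn

layer = nn.Linear(1, 1)
population = [layer] * 3

# All three list entries point at the very same module object
print(population[0] is population[1])  # True
print(population[1] is population[2])  # True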
What you want in your case is something like:
[nn.Linear(1,1) for _ in range(population_size)]
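Putting that together with your mutate function, a minimal corrected version of your example might look like this (assuming the mutation is meant to add Gaussian noise to the existing weights and bias):

import torch
import torch.nn as nn

population_size = 5
# One independent layer per agent instead of five references to the same layer
population = [nn.Linear(1, 1) for _ in range(population_size)]
dummy_input = torch.rand(1)

def mutate(m):
    if type(m) == nn.Linear:
        # add random noise to the existing weights and bias
        m.weight = nn.Parameter(m.weight + torch.randn(m.weight.shape))
        m.bias = nn.Parameter(m.bias + torch.randn(m.bias.shape))

population = [x.apply(mutate) for x in population]

for agent in population:
    print(agent(dummy_input))  # each agent now produces a different output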
CodePudding user response:
It looks like creating the list of linear layers the way you do simply fills it with references to the same initial nn.Linear
object. For example, setting population[0].weight = nn.Parameter()
sets the weights of every linear layer in the population list to an empty parameter value. In your case, the final random weight and bias assigned by the mutate function are shared by all layers in the population list, since every entry refers to the same object.
Changing the fourth line of your code to population = [nn.Linear(1,1) for _ in range(population_size)]
creates five unique linear layers and fixes this problem.
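As a rough sketch of the difference (variable names here are only for the demo):

import torch
import torch.nn as nn

shared = [nn.Linear(1, 1)] * 3                # three references to one layer
unique = [nn.Linear(1, 1) for _ in range(3)]  # three distinct layers

with torch.no_grad():
    shared[0].weight.fill_(42.0)   # touches the one shared layer
    unique[0].weight.fill_(42.0)   # touches only the first distinct layer

print([m.weight.item() for m in shared])  # [42.0, 42.0, 42.0]
print([m.weight.item() for m in unique])  # only the first value is 42.0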