I have a large dataset. I want to split it by unique name and run a genetic algorithm separately on the rows that share each name. To illustrate, assume the following table:
Name  price  quantity
a1    100    6
a2    30     20
a1    250    125
a1    5      20
a2    90     200
a2    50     705
So I want to run the genetic algorithm on a1 and a2 separately to get the best solution for x1-x3 for each name. I have already coded the genetic algorithm for the whole dataset, but I am not sure how to run it for a1 and a2 separately within the same dataset.
Note: I used pandas to import my dataset.
CodePudding user response:
Thanks to @IgnatiusReilly for clarifying the question in the comments.
If you want to slice your DataFrame into chunks for every unique name to perform calculations over them, you may do the following:
# assume there's a function ga() that runs the genetic algorithm on a DataFrame
# and your DataFrame is df
for name in df.Name.unique():
    ga(df.loc[df.Name == name])
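The same per-name split can probably also be written with groupby, which yields each name together with its rows and makes it easy to keep the results. This is only a sketch: the ga() below is a hypothetical stand-in that returns the (min, max) of each column instead of running a real genetic algorithm, and the DataFrame is rebuilt from the sample table in the question.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "Name": ["a1", "a2", "a1", "a1", "a2", "a2"],
    "price": [100, 30, 250, 5, 90, 50],
    "quantity": [6, 20, 125, 20, 200, 705],
})


def ga(group):
    """Hypothetical stand-in for the real genetic algorithm routine.

    It only returns the (min, max) of each numeric column so that the
    example stays runnable without any optimisation library.
    """
    numeric = group[["price", "quantity"]]
    return {col: (int(numeric[col].min()), int(numeric[col].max()))
            for col in numeric.columns}


# groupby yields one (name, sub-DataFrame) pair per unique name
results = {name: ga(group) for name, group in df.groupby("Name")}
print(results)
```

Storing the results in a dict keyed by name keeps the a1 and a2 solutions separate after the loop finishes.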
As for applying this with the geneticalgorithm package, my tentative guess is that it would look something like this:
import numpy as np
from geneticalgorithm import geneticalgorithm as ga

for name in df.Name.unique():
    s = df.loc[df.Name == name].to_numpy()  # convert to ndarray
    # take columns 1 to 3, assuming 'Name' is column 0 and three numeric
    # columns follow it (the sample table shows only two, so adjust the
    # indices and `dimension` to match the real data)
    varbound = np.array([[np.min(s[:, 1]), np.max(s[:, 1])],
                         [np.min(s[:, 2]), np.max(s[:, 2])],
                         [np.min(s[:, 3]), np.max(s[:, 3])]])
    model = ga(function=equ,  # objective function, assumed to be defined earlier
               dimension=3,
               variable_type='int',
               variable_boundaries=varbound)
    model.run()
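If the number of numeric columns is not fixed, the bounds array could be built from column names rather than hard-coded indices. The sketch below assumes the search space is defined by the price and quantity columns of the sample table; bounds_for is a hypothetical helper, not part of any library, and no genetic algorithm is actually run here.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "Name": ["a1", "a2", "a1", "a1", "a2", "a2"],
    "price": [100, 30, 250, 5, 90, 50],
    "quantity": [6, 20, 125, 20, 200, 705],
})


def bounds_for(group, columns):
    """Hypothetical helper: return an (n_columns, 2) array of [min, max] rows."""
    values = group[columns].to_numpy()
    return np.column_stack([values.min(axis=0), values.max(axis=0)])


# Assumption: these are the columns that define the search space
columns = ["price", "quantity"]

# One bounds array per unique name
bounds = {name: bounds_for(group, columns) for name, group in df.groupby("Name")}

for name, varbound in bounds.items():
    # varbound.shape[0] is the number of variables, i.e. the `dimension` value
    print(name, varbound.tolist(), "dimension =", varbound.shape[0])
```

Each array in bounds has the same [min, max] per row layout as varbound above, and its row count gives the matching dimension.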