Context:
I started with a car dataframe containing more than 20 car models. Then I created an array with the number of occurrences of each model. Now I am trying to separate the models with fewer than 500 occurrences from those with >= 500 occurrences into two different 2D arrays.
My code:
unique_models, count_of_models = np.unique(my_data_frame["model"],
                                           return_counts=True)
print(unique_models, count_of_models)
[' A1' ' A2' ' A3' ' A4' ' A5' ' A6' ' A7' ' A8' ' Q2' ' Q3' ' Q5' ' Q7' ' Q8' ' R8'
' RS3' ' RS4' ' RS5' ' RS6' ' RS7' ' S3' ' S4' ' S5' ' S8' ' SQ5' ' SQ7' ' TT']
[1347 1 1929 1381 882 748 122 118 822 1417 877 397 69 28 33 31
29 39 1 18 12 3 4 16 8 336]
representative_models = np.empty((0, 2), int)
other_models = np.empty((0, 2), int)
for models in unique_models:
    for counts in count_of_models:
        if counts < 500:
            other_models = np.append(other_models, np.array([[models, counts]]), axis=0)
        else:
            representative_models = np.append(other_models, np.array([[models, counts]]), axis=0)
        unique_models += 1
Everything works except for one small thing. Somehow, the unique_models += 1 increment doesn't work: the second for loop continues with the same unique_model and only moves on to the next count.
What I want is to go back to the first loop, move on to the next model, then enter the second loop and move on to the next count.
Hope this is clear enough, thanks :)
CodePudding user response:
Here you're looping over the array unique_models with for models in unique_models:, so unique_models += 1 inside that loop makes no sense: it doesn't advance the loop, and it should raise an error because you're trying to add an int to an array of strings.
To move on to the next model you could leave the inner loop with break instead of unique_models += 1; however, that would exit the inner loop after its first iteration, so every model would be paired with the first count only.
What you want is to loop over BOTH unique_models and count_of_models at the same time; you can use zip for that.
for models, counts in zip(unique_models, count_of_models):
    if counts < 500:
        other_models = np.append(other_models, np.array([[models, counts]]), axis=0)
    else:
        # Append to representative_models here; the question's code appended to other_models by mistake.
        representative_models = np.append(representative_models, np.array([[models, counts]]), axis=0)
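If you don't need the explicit loop, the same split can probably be done with a boolean mask instead of zip. This is only a sketch, not the code from the question: the sample arrays below are a handful of values copied from the printed output, and the names pairs and is_representative are made up for illustration. Since a 2D NumPy array holds a single dtype, the counts are converted to strings explicitly before being stacked next to the model names.

```python
import numpy as np

# A few values copied from the question's printed output (assumed sample, not the full data).
unique_models = np.array([" A1", " A2", " A3", " A7", " Q7", " TT"])
count_of_models = np.array([1347, 1, 1929, 122, 397, 336])

# Put each model next to its own count: one row per model, two columns.
# The counts are cast to str so both columns share one dtype.
pairs = np.column_stack((unique_models, count_of_models.astype(str)))

# Boolean mask: True where the model has at least 500 occurrences.
is_representative = count_of_models >= 500

representative_models = pairs[is_representative]
other_models = pairs[~is_representative]

print(representative_models)
print(other_models)
```

Like zip, this keeps every model tied to its own count, and it avoids calling np.append repeatedly, which copies the whole array on each call.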