Here is my code:
unique_models, count_of_models = np.unique(my_data_frame["model"], return_counts=True)
print(unique_models, count_of_models)
[' A1' ' A2' ' A3' ' A4' ' A5' ' A6' ' A7' ' A8' ' Q2' ' Q3' ' Q5' ' Q7' ' Q8' ' R8'
' RS3' ' RS4' ' RS5' ' RS6' ' RS7' ' S3' ' S4' ' S5' ' S8' ' SQ5' ' SQ7' ' TT']
[1347 1 1929 1381 882 748 122 118 822 1417 877 397 69 28 33 31 29
39 1 18 12 3 4 16 8 336]
representative_models = np.empty((0, 2), int)
other_models = np.empty((0, 2), int)
for models, counts in zip(unique_models, count_of_models):
    if counts < 500:
        other_models = np.append(other_models, np.array([[models, counts]]), axis=0)
    else:
        representative_models = np.append(representative_models, np.array([[models, counts]]), axis=0)
print(representative_models[representative_models[:, 1].argsort()])
[[' A1' '1347']
[' A4' '1381']
[' Q3' '1417']
[' A3' '1929']
[' A6' '748']
[' Q2' '822']
[' Q5' '877']
[' A5' '882']]
print(representative_models)
[[' A1' '1347']
[' A3' '1929']
[' A4' '1381']
[' A5' '882']
[' A6' '748']
[' Q2' '822']
[' Q3' '1417']
[' Q5' '877']]
As you can see, everything worked except the sorting: the rows do not come out ordered by count. Does anyone know a way to sort from largest to smallest by the second column?
Example of what it should look like:
[[' A3' '1929']
[' Q3' '1417']
[' A4' '1381']
[' A1' '1347']
[' A5' '882']
[' Q5' '877']
[' Q2' '822']
[' A6' '748']]
Thank you!
CodePudding user response:
Here you go:
import numpy as np
data = [[' A1', '1347'],
        [' A4', '1381'],
        [' Q3', '1417'],
        [' A3', '1929'],
        [' A6', '748'],
        [' Q2', '822'],
        [' Q5', '877'],
        [' A5', '882']]
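# convert the count in each row to int so the sort is numeric, not alphabetical
indices = np.argsort([int(d[1]) for d in data])
# argsort is ascending, so walk the indices in reverse for largest first
sorted_data = [data[i] for i in indices[::-1]]
print(sorted_data)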
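As for why the original argsort looked unsorted: the printed array shows the counts in quotes, so they are stored as strings, and strings compare character by character ('1929' comes before '748' because '1' < '7'). A minimal sketch of the difference, using the counts from the question (the variable names are only for illustration):

```python
import numpy as np

# Counts stored as strings, as in the second column of the question's array
counts_as_text = np.array(['1347', '1381', '1417', '1929', '748', '822', '877', '882'])

# Sorting strings compares character by character, so '1929' lands before '748'
text_order = counts_as_text[np.argsort(counts_as_text)]

# Converting to int before argsort gives the numeric order
numeric_order = counts_as_text[np.argsort(counts_as_text.astype(int))]

print(text_order)
print(numeric_order)
```

The first print reproduces the ordering from the question; the second is ascending by value, so reverse it with [::-1] for largest first.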
CodePudding user response:
Assuming your array is as below, you could do:
import numpy as np
representative_models = np.array([[' A1', '1347'],
                                  [' A4', '1381'],
                                  [' Q3', '1417'],
                                  [' A3', '1929'],
                                  [' A6', '748'],
                                  [' Q2', '822'],
                                  [' Q5', '877'],
                                  [' A5', '882']])
# convert last column to int and arg-sort in decreasing order [::-1]
order = np.argsort(representative_models[:, 1].astype(int))[::-1]
# simply index on the input array
result = representative_models[order, :]
print(result)
Output
[[' A3' '1929']
[' Q3' '1417']
[' A4' '1381']
[' A1' '1347']
[' A5' '882']
[' Q5' '877']
[' Q2' '822']
[' A6' '748']]
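Another option, if you can change the earlier code, is to filter and sort while the counts are still integers and only build the two-column array at the end. This is a sketch, not a drop-in replacement: the two small arrays below are a hypothetical stand-in for what np.unique(..., return_counts=True) returns, and keep, names and counts are names made up for the example.

```python
import numpy as np

# Hypothetical stand-in for the output of np.unique(..., return_counts=True)
unique_models = np.array([' A1', ' A2', ' A3', ' A4', ' Q3', ' TT'])
count_of_models = np.array([1347, 1, 1929, 1381, 1417, 336])

# Keep models with at least 500 rows (same threshold as the loop in the question)
keep = count_of_models >= 500
names, counts = unique_models[keep], count_of_models[keep]

# Sort by count, largest first, while the counts are still integers
order = np.argsort(counts)[::-1]
names, counts = names[order], counts[order]

# Build the two-column array last; the counts become strings only at this point
representative_models = np.column_stack((names, counts.astype(str)))
print(representative_models)
```

A boolean mask replaces the loop with np.append, and because the sort happens before the counts are turned into text, no astype(int) is needed afterwards.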