How to correctly sort a non-integer numpy array?

Time:11-05

Help please! I am trying to sort a numpy array by its second and third columns (year and month) in descending order, but the result comes out in the wrong order. Everything works until the code encounters numbers greater than 9. How can I solve this problem?

import numpy as np
data = [
    ['Other Theft', 2003, 5, 12, 16, 15, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 5, 7, 15, 20, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 23, 16, 40, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 20, 11, 15, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 4, 12, 17, 45, 'Strathcona', 49.269802, -123.083763],
    ['Other Theft', 2003, 3, 26, 20, 45, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2015, 8, 11,'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Break and Enter Residential/Other', 2003, 3, 10, 12, 0, 'Kerrisdale', 49.228051, -123.146610],
    ['Mischief', 2003, 6, 28, 4, 13, 'Dunbar-Southlands', 49.255559, -123.193725],
    ['Mischief', 2017, 3, 26, 23, 0, 'Sunset', 49.21431483, -123.101945],
    ['Other Theft', 2003, 2, 16, 9, 2, 'Strathcona', 49.269802, -123.083763],
    ['Break and Enter Residential/Other', 2003, 7, 9, 18, 15, 'Grandview-Woodland', 49.267734, -123.067654],
    ['Other Theft', 2003, 1, 31, 19, 45, 'Strathcona', 49.269802, -123.083763],
    ['Mischief', 2003, 9, 27, 1, 0, 'Dunbar-Southlands', 49.253762, -123.194407],
    ['Offence Against a Person', 2017, 1 , 24, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Break and Enter Residential/Other', 2003, 4, 19, 18, 0, 'Grandview-Woodland', 49.267814, -123.067441],
    ['Break and Enter Residential/Other', 2003, 9, 24, 18, 30, 'Grandview-Woodland', 49.267731, -123.067302],
    ['Break and Enter Residential/Other', 2003, 11, 5, 8, 12, 'Sunset', 49.226430, -123.085283],
    ['Break and Enter Commercial', 2003, 9, 26, 2, 30, 'West End', 49.284715, -123.122824],
    ['Break and Enter Residential/Other', 2003, 10, 21, 10, 0, 'Grandview-Woodland', 49.267811, -123.067089],
    ['Other Theft', 2003, 1, 25, 12, 30, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2003, 2, 12, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
    ['Other Theft', 2003, 1, 9, 6, 45, 'Strathcona', 49.269802, -123.083763],
    ['Offence Against a Person', 2008, 2, 6, 'unknown', 'unknown', 'unknown', 0.000000, 0.000000],
]
np_array = np.array(data)
bool_column = np.int_(np.empty(np_array.shape[0]))
for line in range(np_array.shape[0]):
    if np_array[line,4] == 'unknown' and np_array[line,5] == 'unknown':
        bool_column[line] = int(0)
    else:
        bool_column[line] = 1
np_array = np.append(np_array, np.reshape(bool_column,(np_array.shape[0],-1)), axis=1) #add bool col
#sort
np_array = np_array[np_array[:,2].argsort()]#month sort
np_array = np_array[np_array[:,1].argsort(kind='stable')[::-1]] #year sort with month
#compute column widths for aligned output
equals_spaces = []
temp_for_cicle = 0
for num_in_line in range(np_array.shape[1]):
    for line in range (np_array.shape[0]):
        if len(np_array[line][num_in_line]) > temp_for_cicle:
            temp_for_cicle = len(np_array[line][num_in_line])
    equals_spaces.append(temp_for_cicle)
    temp_for_cicle = 0
#data output
for line in range(np_array.shape[0]):
    for num_in_line in range (np_array.shape[1]):
        print('{:<{}}'.format(np_array[line][num_in_line],equals_spaces[num_in_line]), end='')
        if num_in_line + 1 == np_array.shape[1]:
            print("")
        else:
            print(" | ", end='')

Answer:

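Because the rows mix strings and numbers, np.array(data) stores every element as a string, so argsort compares the year and month as text and '10' sorts before '2'. If you do the sort before converting to numpy, while the values are still Python ints, and skip the numpy.argsort calls, it works: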

data.sort( key=lambda row: (-row[1], -row[2]) )
np_array = np.array(data)
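
If the data is already in a numpy array, one possible alternative is to convert just the sort columns back to integers and build the ordering with np.lexsort. This is only a sketch: the `rows` sample below is a shortened, hypothetical stand-in for the question's data, and it assumes the year and month columns contain nothing but digits.

```python
import numpy as np

# Hypothetical, shortened sample in the same layout as the question's data
rows = [
    ['Other Theft', 2003, 5, 12],
    ['Break and Enter', 2003, 11, 5],
    ['Mischief', 2017, 3, 26],
    ['Break and Enter', 2003, 10, 21],
    ['Other Theft', 2003, 2, 16],
]
arr = np.array(rows)  # mixed types, so every element is stored as a string

# Sorting strings is lexicographic: '10' and '11' come before '9'
text_order = np.array(['9', '10', '11']).argsort()

# Convert only the key columns to integers for comparison
year = arr[:, 1].astype(int)
month = arr[:, 2].astype(int)

# np.lexsort treats the LAST key as the primary one;
# negating the keys gives descending order
order = np.lexsort((-month, -year))
sorted_arr = arr[order]
print(sorted_arr)
```

The array itself stays a string array; only the keys used to compute the ordering are numeric.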

The advice to use pandas is probably the best advice.
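
A rough idea of what the pandas route could look like, again on a shortened, hypothetical sample. The column names are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical, shortened sample in the same layout as the question's data
rows = [
    ['Other Theft', 2003, 5, 12],
    ['Break and Enter', 2003, 11, 5],
    ['Mischief', 2017, 3, 26],
    ['Break and Enter', 2003, 10, 21],
    ['Other Theft', 2003, 2, 16],
]
# Column names are assumptions chosen for this example
df = pd.DataFrame(rows, columns=['type', 'year', 'month', 'day'])

# Each column keeps its own dtype, so year and month remain integers
df_sorted = df.sort_values(['year', 'month'], ascending=False)
print(df_sorted.to_string(index=False))
```

Unlike a single numpy array, a DataFrame does not force every column to one type, so the numeric columns sort numerically without any conversion.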
