Julia: sort two arrays (like lexsort in numpy)

Time:06-21

Python example


NumPy has lexsort, which sorts one array within another. From its documentation:

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns.

Take the following example:

import numpy as np

a = np.array([1,1,1,2,2,2])
b = np.array([10,8,11,4,8,0])

sorted_idx = np.lexsort((b,a))
print(b[sorted_idx])
# [ 8 10 11  0  4  8]

So this sorts b within each group of equal values in a (lexsort treats the last key passed, here a, as the primary key):

1 1  1      2  2  2
8 10 11     0  4  8
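To make the ordering explicit, here is a minimal sketch that reproduces the same result without NumPy by sorting indices on (a, b) pairs. The helper name lexsort_two is hypothetical and only for illustration; it is not part of NumPy and handles exactly two keys.

```python
def lexsort_two(primary, secondary):
    """Return indices that sort by `primary`, breaking ties with `secondary`.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    if len(primary) != len(secondary):
        raise ValueError("both keys must have the same length")
    # Tuples compare element by element, so (primary, secondary) pairs
    # order by primary first and by secondary within equal primaries.
    return sorted(range(len(primary)), key=lambda i: (primary[i], secondary[i]))


a = [1, 1, 1, 2, 2, 2]
b = [10, 8, 11, 4, 8, 0]

sorted_idx = lexsort_two(a, b)
print(sorted_idx)                  # [1, 0, 2, 5, 3, 4]
print([b[i] for i in sorted_idx])  # [8, 10, 11, 0, 4, 8]
```

The indices are 0-based here, as in NumPy.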

I can't find anything similar in Julia, so I wonder how this can be achieved. In my case, two columns are sufficient.


Julia

So let's take the same data to figure that out:

a = Vector([1,1,1,2,2,2])
b = Vector([10,8,11,4,8,0])

CodePudding user response:

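Use the sort and sortperm functions on a vector of tuples. Tuples compare lexicographically, so the pairs are ordered by a first and by b within equal values of a. sort returns the sorted pairs, sortperm returns the permutation indices, and b[sortperm(x)] gives b sorted within a: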

julia> a = [1, 1, 1, 2, 2, 2];

julia> b = [10, 8, 11, 4, 8, 0];

julia> x = collect(zip(a, b))
6-element Vector{Tuple{Int64, Int64}}:
 (1, 10)
 (1, 8)
 (1, 11)
 (2, 4)
 (2, 8)
 (2, 0)

julia> sort(x)
6-element Vector{Tuple{Int64, Int64}}:
 (1, 8)
 (1, 10)
 (1, 11)
 (2, 0)
 (2, 4)
 (2, 8)

julia> sortperm(x) # 1-based indices that sort x
6-element Vector{Int64}:
 2
 1
 3
 6
 4
 5

CodePudding user response:

With DataFrames.jl it can be a bit shorter to write:

using DataFrames
sortperm(DataFrame(a=a,b=b, copycols=false))

copycols=false avoids an unnecessary copy of the vectors when the data frame is created. If you do not care about performance and want shorter code, you can even write:

sortperm(DataFrame(; a, b))