Efficient way to get index to sorted distinct index array from a huge data array

Time:01-13

To simplify the question, suppose that I have an array of numbers like

[3, 7, 8, 3, 9, 9, ...]

Now I want an array that gives, for each element, its index in

array.Distinct().OrderBy(x=>x)

For the example above, we first get the sorted distinct array [3, 7, 8, 9]. Then we go through the original array, look up each element's index in that sorted array, and finally get

[0, 1, 2, 0, 3, 3, ...]

This can be achieved as

var array = new[] {3, 7, 8, 3, 9, 9};
var sortedArray = array.Distinct().OrderBy(x => x).ToList();
var result = array.Select(x => sortedArray.IndexOf(x)).ToArray();

However, when I have a HUGE array this is extremely slow, since IndexOf scans sortedArray linearly for every element. Is there a more efficient way to get the same result?

Note that sortedArray is huge too (the data range is large).

Thanks.

CodePudding user response:

You can convert the sorted array to a dictionary and search the index from it.

var i = 0;
var sortedDict = array.Distinct().OrderBy(x => x).ToDictionary(x => x, x => i++);
var result = array.Select(x => sortedDict[x]).ToArray();
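
As a minimal sketch of the same dictionary idea, the version below takes the position from the indexed overload of Select instead of a counter captured by the lambda. The class name RankIndex, the method name ToSortedDistinctIndex and the tuple field names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankIndex
{
    // Maps each element to its position in the sorted list of distinct values.
    // (Hypothetical helper name; not part of the original answer.)
    public static int[] ToSortedDistinctIndex(int[] array)
    {
        // Select((value, index) => ...) supplies the position in the sorted
        // sequence, so no external counter has to be mutated.
        Dictionary<int, int> rankOf = array
            .Distinct()
            .OrderBy(x => x)
            .Select((value, index) => (Value: value, Rank: index))
            .ToDictionary(p => p.Value, p => p.Rank);

        // One dictionary lookup per element of the input.
        return array.Select(x => rankOf[x]).ToArray();
    }

    public static void Main()
    {
        var result = ToSortedDistinctIndex(new[] { 3, 7, 8, 3, 9, 9 });
        Console.WriteLine(string.Join(", ", result)); // prints 0, 1, 2, 0, 3, 3
    }
}
```

Each lookup is a hash lookup, at the cost of one dictionary entry per distinct value.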

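If you want to save some memory, you can also use BinarySearch on the sorted list instead of building a dictionary. Each lookup is then logarithmic in the number of distinct values rather than linear.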

var result = array.Select(x => sortedArray.BinarySearch(x)).ToArray();
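
For reference, a self-contained sketch of the binary-search variant, assuming the distinct values are kept in a plain int[] and searched with Array.BinarySearch (the class and method names are hypothetical). Every element of the input is present in the sorted array, so the search never returns a negative "not found" value here.

```csharp
using System;
using System.Linq;

public static class BinarySearchIndex
{
    // Hypothetical helper name; not part of the original answer.
    public static int[] ToSortedDistinctIndex(int[] array)
    {
        // Sorted distinct values; this array is the only extra structure kept.
        int[] sorted = array.Distinct().OrderBy(x => x).ToArray();

        // Values in 'sorted' are unique, so the index found is unambiguous.
        return array.Select(x => Array.BinarySearch(sorted, x)).ToArray();
    }

    public static void Main()
    {
        var result = ToSortedDistinctIndex(new[] { 3, 7, 8, 3, 9, 9 });
        Console.WriteLine(string.Join(", ", result)); // prints 0, 1, 2, 0, 3, 3
    }
}
```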