Home > Enterprise >  What could be the sorting algorithm that leads to this biased sorting behaviour?
What could be the sorting algorithm that leads to this biased sorting behaviour?

Time:12-10

For this problem, we are interested in sorting an array of length up to 7. The entries of the array consists of two components, its index and a sorting key. The sorting key is an integer in the range between 1 and 4 (inclusive). As an example, an array we start with can look like this:

[{'index': 0, 'val': 1}, {'index': 1, 'val': 2}, {'index': 2, 'val': 1}].

Any programmer know how to sort an array of this kind, i.e. by calling a built-in sorting algorithm of the programming language or write one on his/her own. This question is not about that.

In this problem, I have observed a large set of outputs of the sorting algorithm, and my goal is to figure out which sorting algorithm is used.

The outputs I have collected are as follows.

  • The fraction of the first element in the sorted array only: link.
  • Array of length 5: link.
  • Array of length 4: link.

As it can be seen, the sorted array has a very biased distribution when it comes to tie-breakers (entries that have the same value). For example, for array

[{'index': 0, 'val': 2}, {'index': 1, 'val': 1}, {'index': 2, 'val': 1}, {'index': 3, 'val': 1}].

The chance of index 1 being the second item in the sorted array is only 12.5%, whereas the chance of index 2 being the second item is 50%.

I have tried reproducing the bahaviour of this mysterious sorting algorithm using most of the known unstable sorting algorithms but I have no luck.

One key observation I have is that the probabilities in a lot of the cases are powers of two (or sums of them). So it sounds like the underlying algorithm is a comparison based algorithm to me. I tried reproducing the behaviour of the algorithm with many known sorting algorithms but nothing fits so far.

The algorithm with the closest behaviour is QuckSort with the middle element as the pivot. In QuickSort, for an array of length n, if all of the elements are equal in value, the probability of the pivot appearing as the first element is 2-(n-1). But the chance that the middle element appearing first in the observed samples is 2-n instead. So QuickSort cannot be the answer to this.

So my question is: do anyone know what (potentially ill-implemented) sorting algorithm can produce this bebaviour? Thanks!

CodePudding user response:

Maybe the sorting algorithm is just to rearrange the array based on the number of possible combinations, which is N!. The algorithm is as follows:

  • Pick a random number between 0 to N! - 1
  • Unpack this number into a combination of N elements
  • Transform input array using this combination
  • If the array is sorted return, else repeat from the first step

CodePudding user response:

Based on the information provided, it is difficult to say for certain what sorting algorithm is being used. However, it is possible that the sorting algorithm is using a variation of quicksort with a different pivot selection strategy.

In quicksort, the pivot is typically chosen as the middle element of the array. However, there are many other ways to choose the pivot, and it is possible that the algorithm in question is using a different pivot selection strategy that leads to the probabilities you observed. For example, the algorithm may be choosing the pivot as the first or last element of the array, or using a random pivot selection strategy.

Another possibility is that the algorithm is using a stable sorting algorithm, such as insertion sort or merge sort, but with a different tie-breaking strategy. Stable sorting algorithms maintain the relative order of elements with equal values, so the probabilities you observed may be the result of a different tie-breaking strategy.

Without more information or a specific implementation of the sorting algorithm, it is difficult to say for certain what algorithm is being used.

  • Related