How to split a list into 2 unsorted groupings based on the median

Time:04-20

I want to split a list into two halves around its median. Neither half needs to be sorted internally.

Imagine I have a list of length 10 that contains the multiples of 10 from 10 to 100:

arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]

I want to rearrange it so that indices 0 through 4 contain the values 10, 20, 30, 40, and 50 in any order, and indices 5 through 9 contain the remaining values in any order.

For example:

#         SPLIT HERE V
[40, 30, 20, 50, 10,     70, 60, 80, 90, 100]

I've looked into various divide-and-conquer sorting algorithms, but I'm not sure which one would be best in this case.

My current thought is to use quicksort, but I believe there is a better way, since nothing needs to be fully sorted. The values only need to be sorted in a "general" sense: each one must end up on its side of the median, in any order.

CodePudding user response:

The statistics module has a function for finding the median of a list of numbers. From there, you can use a for loop to separate the values into two lists based on whether each one is greater than the median:

from statistics import median

arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]
med = median(arr)

result1 = []
result2 = []

for item in arr:
    if item <= med:
        result1.append(item)
    else:
        result2.append(item)
        
print(result1)
print(result2)

This outputs:

[50, 30, 20, 10, 40]
[90, 100, 70, 60, 80]

CodePudding user response:

If you would like to solve the problem from scratch, you could implement the Median of Medians algorithm to find the median of an unsorted array in linear time. What you do next depends on your goal.

If you want to reorder the array in place, you could use the result of Median of Medians as the pivot for the partition step of quicksort.

Alternatively, in Python you could simply iterate through the array and append each value to a left or right list.
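
The in-place variant described above can be sketched with quickselect. This is a minimal sketch, not the Median of Medians algorithm itself: it swaps in a random pivot, which gives expected (not worst-case) linear time on distinct values, and the helper name partition_around_median is hypothetical. Because it splits by position rather than by comparing against the median value, the split point stays at len(values) // 2 even when several values equal the median.

```python
import random


def partition_around_median(values):
    """Reorder `values` in place so the smaller half comes first.

    Hypothetical helper: quickselect with a random pivot, used here
    instead of Median of Medians to keep the sketch short.
    """
    k = len(values) // 2          # first index of the upper half
    lo, hi = 0, len(values) - 1
    while lo < hi:
        # Move a randomly chosen pivot to the end of the active range.
        p = random.randint(lo, hi)
        values[p], values[hi] = values[hi], values[p]
        pivot = values[hi]
        # Lomuto partition: items smaller than the pivot go to the front.
        store = lo
        for i in range(lo, hi):
            if values[i] < pivot:
                values[i], values[store] = values[store], values[i]
                store += 1
        values[store], values[hi] = values[hi], values[store]
        # Stop once index k is settled; otherwise keep the side containing k.
        if store == k:
            break
        if store < k:
            lo = store + 1
        else:
            hi = store - 1
    return values


arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]
partition_around_median(arr)
print(arr[:5], arr[5:])  # order inside each half varies from run to run
```

The list is modified in place, so no second list is allocated. The order within each half is arbitrary.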

CodePudding user response:

The other answers so far split the list into two separate lists, but based on your example, my impression is that you want two groupings inside a single output list.

import numpy as np

# setup
arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]
# output array
unsorted_grouping = []
# get median
median = np.median(arr)

# loop over the array: values at or above the median are appended,
# which always places them at the end of the list;
# all other values are inserted at position 0, the beginning / left side
for val in arr:
    if val >= median:
        unsorted_grouping.append(val)
    else:
        unsorted_grouping.insert(0, val)

# output
print(unsorted_grouping)

[40, 10, 20, 30, 50, 90, 100, 70, 60, 80]
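
One caveat with the loop above: list.insert(0, val) shifts every existing element, so the loop can become quadratic on long lists. A possible alternative, assuming the same grouping rule, is collections.deque, whose appendleft runs in constant time. statistics.median stands in for np.median here only to keep the sketch free of third-party dependencies.

```python
from collections import deque
from statistics import median

arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]
med = median(arr)

grouped = deque()
for val in arr:
    if val >= med:
        grouped.append(val)      # right end: at or above the median
    else:
        grouped.appendleft(val)  # left end: below the median, O(1)

result = list(grouped)
print(result)  # -> [40, 10, 20, 30, 50, 90, 100, 70, 60, 80]
```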

CodePudding user response:

You can use the statistics module to calculate the median, and then use it to add each value to one group or the other:

import statistics

arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]

median = statistics.median(arr)
bins = [], []  # smaller and bigger values
for value in arr:
    # value > median is False (0) or True (1), which selects the bin
    bins[value > median].append(value)

print(bins[0])  # -> [50, 30, 20, 10, 40]
print(bins[1])  # -> [90, 100, 70, 60, 80]

CodePudding user response:

To me, this seems to do the trick, unless you specifically need the output to be unordered:

arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]

sorted_arr = sorted(arr)
median_index = len(arr)//2
sub_list1, sub_list2 = sorted_arr[:median_index], sorted_arr[median_index:]
print(sub_list1, sub_list2)

This outputs:

[10, 20, 30, 40, 50] [60, 70, 80, 90, 100]

CodePudding user response:

You can do this with NumPy (which is significantly faster if arr is large):

import numpy as np
arr = [50, 30, 20, 10, 90, 40, 100, 70, 60, 80]
arr = np.array(arr)
median = np.median(arr)
result1 = arr[arr <= median]
result2 = arr[arr > median]

Output:

array([50, 30, 20, 10, 40])
array([ 90, 100,  70,  60,  80])
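
If a single partially ordered array is enough, np.partition may fit the question more directly than boolean masks. A sketch, assuming the split point is len(arr) // 2: np.partition returns a copy in which the element at index k sits in its sorted position, with no larger values before it and no smaller values after it. The order inside each half is unspecified, so no expected output is shown.

```python
import numpy as np

arr = np.array([50, 30, 20, 10, 90, 40, 100, 70, 60, 80])
k = len(arr) // 2  # assumed split point: first index of the upper half

# Partial sort: only the element at index k is guaranteed to be in its
# final sorted position; each side is left in arbitrary order.
parted = np.partition(arr, k)
lower, upper = parted[:k], parted[k:]

print(lower)
print(upper)
```

Unlike the mask-based version, this splits by position, so the two halves have fixed sizes even when several values equal the median.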