I've added a complete Fortran program for compilation at the bottom of this post such that the output can be reproduced. The execution times are linear.
More timing data in a clearer format using the code below from a Debian environment in Win 10:
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ for (( i=100000; i<=50000000; i=2*i )); do ./derrelSORT-example.py $i; done | awk 'BEGIN {print "N Time(s)"}; {if ($1=="Creating") {printf $4" "} else if ($1=="Sorting" && $NF=="seconds") {print $3}}'
N Time(s)
100000 0.01
200000 0.02
400000 0.04
800000 0.08
1600000 0.17
3200000 0.35
6400000 0.76
12800000 1.59
25600000 3.02
This code executes linearly with respect to the number of elements (integer example given here). It achieves this by exponentially increasing the size of the sorted chunks as the (merge)sort proceeds. To facilitate the exponentially growing chunks:
- The number of iterations needs to be calculated before the sort begins
- Indices transformations need to be derived for the chunks (language specific depending upon the indexing protocol) for passage to merge()
- Gracefully handle the remainder at the tail of the list when the chunk size is not evenly divisible by a power of 2
With these things in mind and starting, traditionally, by merging pairs of single value arrays, the merged chunks can be grown from 2 to 4 to 8 to 16 to --- to 2^n. This single case is the exception that breaks the speed limit of O(nlogn) time complexity for comparative sorts. This routine sorts linearly with respect to the number of elements to be sorted.
Can anyone sort faster? ;)
Fortran Code (derrelSort.f90):
! Derrel Walters © 2019
! These sort routines were written by Derrel Walters ~ 2019-01-23
SUBROUTINE iSORT(arrA, nA)
! This implementation of derrelSORT is for integers,
! but the same principles apply for other datatypes.
!
! ~ Derrel Walters
IMPLICIT NONE
INTEGER(KIND=8),INTENT(IN) :: nA
INTEGER,DIMENSION(nA),INTENT(INOUT) :: arrA
INTEGER,DIMENSION(nA) :: arrB
INTEGER(KIND=8) :: lowIDX, highIDX, midIDX
INTEGER :: iStat
INTEGER(KIND=8) :: i, j, A, B, C, thisHigh, mergeSize, nLoops
INTEGER,DIMENSION(:),ALLOCATABLE :: iterMark
LOGICAL,DIMENSION(:),ALLOCATABLE :: moreToGo
arrB = arrA
mergeSize = 2
lowIDX = 1 - mergeSize
highIDX = 0
nLoops = INT(LOG(REAL(nA))/LOG(2.0))
ALLOCATE(iterMark(nLoops), moreToGo(nLoops), STAT=iStat)
moreToGo = .FALSE.
iterMark = 0
DO i = 1, nLoops
iterMark(i) = FLOOR(REAL(nA)/2**i)
IF (MOD(nA, 2**i) > 0) THEN
moreToGo(i) = .TRUE.
iterMark(i) = iterMark(i) 1
END IF
END DO
DO i = 1, nLoops
DO j = 1, iterMark(i)
A = 0
B = 1
C = 0
lowIDX = lowIDX mergeSize
highIDX = highIDX mergeSize
midIDX = (lowIDX highIDX 1) / 2
thisHigh = highIDX
IF (j == iterMark(i).AND.moreToGo(i)) THEN
lowIDX = lowIDX - mergeSize
highIDX = highIDX - mergeSize
midIDX = (lowIDX highIDX 1) / 2
A = midIDX - lowIDX
B = 2
C = nA - 2*highIDX midIDX - 1
thisHigh = nA
END IF
CALL imerge(arrA(lowIDX:midIDX-1 A), B*(midIDX-lowIDX), &
arrA(midIDX A:thisHigh), highIDX-midIDX 1 C, &
arrB(lowIDX:thisHigh), thisHigh-lowIDX 1)
arrA(lowIDX:thisHigh) = arrB(lowIDX:thisHigh)
END DO
mergeSize = 2*mergeSize
lowIDX = 1 - mergeSize
highIDX = 0
END DO
END SUBROUTINE iSORT
SUBROUTINE imerge(arrA, nA, arrB, nB, arrC, nC)
! This merge is a faster merge. Array A arrives
! just to the left of Array B, and Array C is
! filled from both ends simultaneously - while
! still preserving the stability of the sort.
! The derrelSORT routine is so fast, that
! the merge does not affect the O(n) time
! complexity of the sort in practice
!
! ~ Derrel Walters
IMPLICIT NONE
INTEGER(KIND=8),INTENT(IN) :: nA, nB , nC
INTEGER,DIMENSION(nA),INTENT(IN) :: arrA
INTEGER,DIMENSION(nB),INTENT(IN) :: arrB
INTEGER,DIMENSION(nC),INTENT(INOUT) :: arrC
INTEGER(KIND=8) :: i, j, k, x, y, z
arrC = 0
i = 1
j = 1
k = 1
x = nA
y = nB
z = nC
DO
IF (i > x .OR. j > y) EXIT
IF (arrB(j) < arrA(i)) THEN
arrC(k) = arrB(j)
j = j 1
ELSE
arrC(k) = arrA(i)
i = i 1
END IF
IF (arrA(x) > arrB(y)) THEN
arrC(z) = arrA(x)
x = x - 1
ELSE
arrC(z) = arrB(y)
y = y - 1
END IF
k = k 1
z = z - 1
END DO
IF (i <= x) THEN
DO
IF (i > x) EXIT
arrC(k) = arrA(i)
i = i 1
k = k 1
END DO
ELSEIF (j <= y) THEN
DO
IF (j > y) EXIT
arrC(k) = arrB(j)
j = j 1
k = k 1
END DO
END IF
END SUBROUTINE imerge
Times using f2py3 to convert the above fortran file (derrelSORT.f90) into something callable in python. Here is the python code and times it produced (derrelSORT-example.py):
#!/bin/python3
import numpy as np
import derrelSORT as dS
import time as t
import random as rdm
import sys
try:
array_len = int(sys.argv[1])
except IndexError:
array_len = 100000000
# Create an array with array_len elements
print(50*'-')
print("Creating array of", array_len, "random integers.")
t0 = t.time()
x = np.asfortranarray(np.array([round(100000*rdm.random(),0)
for i in range(array_len)]).astype(np.int32))
t1 = t.time()
print('Creation time:', round(t1-t0, 2), 'seconds')
# Sort the array using derrelSORT
print("Sorting the array with derrelSORT.")
t0 = t.time()
dS.isort(x, len(x))
t1 = t.time()
print('Sorting time:', round(t1-t0, 2), 'seconds')
print(50*'-')
Output from the command line. Please note the times.
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 1000000
--------------------------------------------------
Creating array of 1000000 random integers.
Creation time: 0.78 seconds
Sorting the array with derrelSORT.
Sorting time: 0.1 seconds
--------------------------------------------------
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 10000000
--------------------------------------------------
Creating array of 10000000 random integers.
Creation time: 8.1 seconds
Sorting the array with derrelSORT.
Sorting time: 1.07 seconds
--------------------------------------------------
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 20000000
--------------------------------------------------
Creating array of 20000000 random integers.
Creation time: 15.73 seconds
Sorting the array with derrelSORT.
Sorting time: 2.21 seconds
--------------------------------------------------
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 40000000
--------------------------------------------------
Creating array of 40000000 random integers.
Creation time: 31.64 seconds
Sorting the array with derrelSORT.
Sorting time: 4.39 seconds
--------------------------------------------------
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 80000000
--------------------------------------------------
Creating array of 80000000 random integers.
Creation time: 64.03 seconds
Sorting the array with derrelSORT.
Sorting time: 8.92 seconds
--------------------------------------------------
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 160000000
--------------------------------------------------
Creating array of 160000000 random integers.
Creation time: 129.56 seconds
Sorting the array with derrelSORT.
Sorting time: 18.04 seconds
--------------------------------------------------
More output:
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ for (( i=100000; i<=500000000; i=2*i )); do
> ./derrelSORT-example.py $i
> done
--------------------------------------------------
Creating array of 100000 random integers.
Creation time: 0.08 seconds
Sorting the array with derrelSORT.
Sorting time: 0.01 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 200000 random integers.
Creation time: 0.16 seconds
Sorting the array with derrelSORT.
Sorting time: 0.02 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 400000 random integers.
Creation time: 0.32 seconds
Sorting the array with derrelSORT.
Sorting time: 0.04 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 800000 random integers.
Creation time: 0.68 seconds
Sorting the array with derrelSORT.
Sorting time: 0.08 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 1600000 random integers.
Creation time: 1.25 seconds
Sorting the array with derrelSORT.
Sorting time: 0.15 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 3200000 random integers.
Creation time: 2.57 seconds
Sorting the array with derrelSORT.
Sorting time: 0.32 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 6400000 random integers.
Creation time: 5.23 seconds
Sorting the array with derrelSORT.
Sorting time: 0.66 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 12800000 random integers.
Creation time: 10.09 seconds
Sorting the array with derrelSORT.
Sorting time: 1.35 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 25600000 random integers.
Creation time: 20.25 seconds
Sorting the array with derrelSORT.
Sorting time: 2.74 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 51200000 random integers.
Creation time: 41.84 seconds
Sorting the array with derrelSORT.
Sorting time: 5.62 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 102400000 random integers.
Creation time: 93.19 seconds
Sorting the array with derrelSORT.
Sorting time: 11.49 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 204800000 random integers.
Creation time: 167.55 seconds
Sorting the array with derrelSORT.
Sorting time: 24.13 seconds
--------------------------------------------------
--------------------------------------------------
Creating array of 409600000 random integers.
Creation time: 340.84 seconds
Sorting the array with derrelSORT.
Sorting time: 47.21 seconds
--------------------------------------------------
When the array size doubles, the time doubles - as demonstrated. Thus, Mr. Mischels initial assessment was incorrect. The reason why is because that, while the outer loop determines the number of cycles at each chunk size (which is log2(n)), the inner loop counter decreases exponentially as the sort proceeds. The proverbial proof is the pudding, however. The times demonstrate the linearity clearly.
If anyone needs any assistance reproducing the results, please let me know. I'm happy to help.
The Fortran program found at the end of this is an as-is copy of that I wrote in 2019. It is meant to be used on the command-line. To compile it:
- Copy the fortran code to a file with an .f90 extension
- Compile the code using a command, such as:
gfortran -o derrelSORT-ex.x derrelSORT.f90
- Give yourself permission to run the executable:
chmod u x derrelSORT-ex.x
- Execute the program from the command-line with or without an integer argument:
./derrelSORT-ex.x
or
./derrelSORT-ex.x 10000000
The output should look something like this (here, I've used a bash c-style loop to call the command repeatedly). Notice that as the array sizes double with each iteration, the execution time also doubles.
SORT-RESEARCH$ for (( i=100000; i<500000000; i=2*i )); do
> ./derrelSORT-2022.x $i
> done
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 100000
Time = 0.0000 seconds
Writing Array to rand-in.txt:
Time = 0.0312 seconds
Sorting the Array
Time = 0.0156 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.0469 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 200000
Time = 0.0000 seconds
Writing Array to rand-in.txt:
Time = 0.0625 seconds
Sorting the Array
Time = 0.0312 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.0312 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 400000
Time = 0.0156 seconds
Writing Array to rand-in.txt:
Time = 0.1250 seconds
Sorting the Array
Time = 0.0625 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.0938 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 800000
Time = 0.0156 seconds
Writing Array to rand-in.txt:
Time = 0.2344 seconds
Sorting the Array
Time = 0.1406 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.2031 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 1600000
Time = 0.0312 seconds
Writing Array to rand-in.txt:
Time = 0.4219 seconds
Sorting the Array
Time = 0.2969 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.3906 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 3200000
Time = 0.0625 seconds
Writing Array to rand-in.txt:
Time = 0.8281 seconds
Sorting the Array
Time = 0.6562 seconds
Writing Array to rand-sorted-out.txt:
Time = 0.7969 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 6400000
Time = 0.0938 seconds
Writing Array to rand-in.txt:
Time = 1.5938 seconds
Sorting the Array
Time = 1.3281 seconds
Writing Array to rand-sorted-out.txt:
Time = 1.6406 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 12800000
Time = 0.2500 seconds
Writing Array to rand-in.txt:
Time = 3.3906 seconds
Sorting the Array
Time = 2.7031 seconds
Writing Array to rand-sorted-out.txt:
Time = 3.2656 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 25600000
Time = 0.4062 seconds
Writing Array to rand-in.txt:
Time = 6.6250 seconds
Sorting the Array
Time = 5.6094 seconds
Writing Array to rand-sorted-out.txt:
Time = 6.5312 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 51200000
Time = 0.8281 seconds
Writing Array to rand-in.txt:
Time = 13.2656 seconds
Sorting the Array
Time = 11.5000 seconds
Writing Array to rand-sorted-out.txt:
Time = 13.1719 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 102400000
Time = 1.6406 seconds
Writing Array to rand-in.txt:
Time = 26.3750 seconds
Sorting the Array
Time = 23.3438 seconds
Writing Array to rand-sorted-out.txt:
Time = 27.0625 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 204800000
Time = 3.3438 seconds
Writing Array to rand-in.txt:
Time = 53.1094 seconds
Sorting the Array
Time = 47.3750 seconds
Writing Array to rand-sorted-out.txt:
Time = 52.8906 seconds
Derrel Walters © 2019
Demonstrating derrelSORT©
WARNING: This program can produce LARGE files!
Generating random array of length: 409600000
Time = 6.6562 seconds
Writing Array to rand-in.txt:
Time = 105.1875 seconds
Sorting the Array
Time = 99.5938 seconds
Writing Array to rand-sorted-out.txt:
Time = 109.9062 seconds
This is the program as-is from 2019 without modification:
SORT-RESEARCH$ cat derrelSORT.f90
! Derrel Walters © 2019
! These sort routines were written by Derrel Walters ~ 2019-01-23
PROGRAM sort_test
! This program demonstrates a linear sort routine
! by generating a random array (here integer), writing it
! to a file 'rand-in.txt', sorting it with an
! implementation of derrelSORT (here for integers -
! where the same principles apply for other applicable
! datatypes), and finally, printing the sorted array
! to a file 'rand-sorted-out.txt'.
!
! To the best understanding of the author, the expert
! concensus is that a comparative sort can, at best,
! be done with O(nlogn) time complexity. Here a sort
! is demonstrated which experimentally runs O(n).
!
! Such time complexity is currently considered impossible
! for a sort. Using this sort, extremely large amounts of data can be
! sorted on any modern computer using a single processor core -
! provided the computer has enough memory to hold the array! For example,
! the sorting time for a given array will be on par (perhaps less than)
! what it takes the same computer to write the array to a file.
!
! ~ Derrel Walters
IMPLICIT NONE
INTEGER,PARAMETER :: in_unit = 21
INTEGER,PARAMETER :: out_unit = 23
INTEGER,DIMENSION(:),ALLOCATABLE :: iArrA
REAL,DIMENSION(:),ALLOCATABLE :: rArrA
CHARACTER(LEN=15) :: cDims
CHARACTER(LEN=80) :: ioMsgStr
INTEGER(KIND=8) :: nDims, i
INTEGER :: iStat
REAL :: start, finish
WRITE(*,*) ''
WRITE(*,'(A)') 'Derrel Walters © 2019'
WRITE(*,*) ''
WRITE(*,'(A)') 'Demonstrating derrelSORT©'
WRITE(*,'(A)') 'WARNING: This program can produce LARGE files!'
WRITE(*,*) ''
CALL GET_COMMAND_ARGUMENT(1, cDims)
IF (cDims == '') THEN
nDims = 1000000
ELSE
READ(cDims,'(1I15)') nDims
END IF
ALLOCATE(iArrA(nDims),rArrA(nDims),STAT=iStat)
WRITE(*,'(A,1X,1I16)') 'Generating random array of length:', nDims
CALL CPU_TIME(start)
CALL RANDOM_NUMBER(rArrA)
iArrA = INT(rArrA*1000000)
CALL CPU_TIME(finish)
WRITE(*,'(A,1X,f9.4,1X,A)') 'Time =',finish-start,'seconds'
DEALLOCATE(rArrA,STAT=iStat)
WRITE(*,'(A)') 'Writing Array to rand-in.txt: '
OPEN(UNIT=in_unit,FILE='rand-in.txt',STATUS='REPLACE',ACTION='WRITE',IOSTAT=iStat,IOMSG=ioMsgStr)
IF (iStat /= 0) THEN
WRITE(*,'(A)') ioMsgStr
ELSE
CALL CPU_TIME(start)
DO i=1, nDims
WRITE(in_unit,*) iArrA(i)
END DO
CLOSE(in_unit)
CALL CPU_TIME(finish)
WRITE(*,'(A,1X,f9.4,1X,A)') 'Time =',finish-start,'seconds'
END IF
WRITE(*,'(A)') 'Sorting the Array'
CALL CPU_TIME(start)
CALL iderrelSORT(iArrA, nDims) !! SIZE(iArrA))
CALL CPU_TIME(finish)
WRITE(*,'(A,1X,f9.4,1X,A)') 'Time =',finish-start,'seconds'
WRITE(*,'(A)') 'Writing Array to rand-sorted-out.txt: '
OPEN(UNIT=out_unit,FILE='rand-sorted-out.txt',STATUS='REPLACE',ACTION='WRITE',IOSTAT=iStat,IOMSG=ioMsgStr)
IF (iStat /= 0) THEN
WRITE(*,'(A)') ioMsgStr
ELSE
CALL CPU_TIME(start)
DO i=1, nDims
WRITE(out_unit,*) iArrA(i)
END DO
CLOSE(out_unit)
CALL CPU_TIME(finish)
WRITE(*,'(A,1X,f9.4,1X,A)') 'Time =',finish-start,'seconds'
END IF
WRITE(*,*) ''
END PROGRAM sort_test
SUBROUTINE iderrelSORT(arrA, nA)
! This implementation of derrelSORT is for integers,
! but the same principles apply for other datatypes.
!
! ~ Derrel Walters
IMPLICIT NONE
INTEGER(KIND=8),INTENT(IN) :: nA
INTEGER,DIMENSION(nA),INTENT(INOUT) :: arrA
INTEGER,DIMENSION(nA) :: arrB
INTEGER(KIND=8) :: lowIDX, highIDX, midIDX
INTEGER :: iStat
INTEGER(KIND=8) :: i, j, A, B, C, thisHigh, mergeSize, nLoops
INTEGER,DIMENSION(:),ALLOCATABLE :: iterMark
LOGICAL,DIMENSION(:),ALLOCATABLE :: moreToGo
arrB = arrA
mergeSize = 2
lowIDX = 1 - mergeSize
highIDX = 0
nLoops = INT(LOG(REAL(nA))/LOG(2.0))
ALLOCATE(iterMark(nLoops), moreToGo(nLoops), STAT=iStat)
moreToGo = .FALSE.
iterMark = 0
DO i = 1, nLoops
iterMark(i) = FLOOR(REAL(nA)/2**i)
IF (MOD(nA, 2**i) > 0) THEN
moreToGo(i) = .TRUE.
iterMark(i) = iterMark(i) 1
END IF
END DO
DO i = 1, nLoops
DO j = 1, iterMark(i)
A = 0
B = 1
C = 0
lowIDX = lowIDX mergeSize
highIDX = highIDX mergeSize
midIDX = (lowIDX highIDX 1) / 2
thisHigh = highIDX
IF (j == iterMark(i).AND.moreToGo(i)) THEN
lowIDX = lowIDX - mergeSize
highIDX = highIDX - mergeSize
midIDX = (lowIDX highIDX 1) / 2
A = midIDX - lowIDX
B = 2
C = nA - 2*highIDX midIDX - 1
thisHigh = nA
END IF
!! The traditional merge can also be used (see subroutine for comment). !!
! !
! CALL imerge(arrA(lowIDX:midIDX-1 A), B*(midIDX-lowIDX), & !
! arrA(midIDX A:thisHigh), highIDX-midIDX 1 C, & !
! arrB(lowIDX:thisHigh), thisHigh-lowIDX 1) !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CALL imerge2(arrA(lowIDX:midIDX-1 A), B*(midIDX-lowIDX), &
arrA(midIDX A:thisHigh), highIDX-midIDX 1 C, &
arrB(lowIDX:thisHigh), thisHigh-lowIDX 1)
arrA(lowIDX:thisHigh) = arrB(lowIDX:thisHigh)
END DO
mergeSize = 2*mergeSize
lowIDX = 1 - mergeSize
highIDX = 0
END DO
END SUBROUTINE iderrelSORT
SUBROUTINE imerge(arrA, nA, arrB, nB, arrC, nC)
! This merge is a traditional merge that places
! the lowest element first. The form that the
! time complexity takes, O(n), is not affected
! by the merge routine - yet this routine
! does not run as fast as the merge used in
! imerge2.
!
! ~Derrel Walters
IMPLICIT NONE
INTEGER(KIND=8),INTENT(IN) :: nA, nB , nC
INTEGER,DIMENSION(nA),INTENT(IN) :: arrA
INTEGER,DIMENSION(nB),INTENT(IN) :: arrB
INTEGER,DIMENSION(nC),INTENT(INOUT) :: arrC
INTEGER(KIND=8) :: i, j, k
arrC = 0
i = 1
j = 1
k = 1
DO
IF (i > nA .OR. j > NB) EXIT
IF (arrB(j) < arrA(i)) THEN
arrC(k) = arrB(j)
j = j 1
ELSE
arrC(k) = arrA(i)
i = i 1
END IF
k = k 1
END DO
IF (i <= nA) THEN
DO
IF (i > nA) EXIT
arrC(k) = arrA(i)
i = i 1
k = k 1
END DO
ELSEIF (j <= nB) THEN
DO
IF (j > nB) EXIT
arrC(k) = arrB(j)
j = j 1
k = k 1
END DO
END IF
END SUBROUTINE imerge
SUBROUTINE imerge2(arrA, nA, arrB, nB, arrC, nC)
! This merge is a faster merge. Array A arrives
! just to the left of Array B, and Array C is
! filled from both ends simultaneously - while
! still preserving the stability of the sort.
! The derrelSORT routine is so fast, that
! the merge does not affect the O(n) time
! complexity of the sort in practice
! (perhaps, making its execution more linear
! at small numbers of elements).
!
! ~ Derrel Walters
IMPLICIT NONE
INTEGER(KIND=8),INTENT(IN) :: nA, nB , nC
INTEGER,DIMENSION(nA),INTENT(IN) :: arrA
INTEGER,DIMENSION(nB),INTENT(IN) :: arrB
INTEGER,DIMENSION(nC),INTENT(INOUT) :: arrC
INTEGER(KIND=8) :: i, j, k, x, y, z
arrC = 0
i = 1
j = 1
k = 1
x = nA
y = nB
z = nC
DO
IF (i > x .OR. j > y) EXIT
IF (arrB(j) < arrA(i)) THEN
arrC(k) = arrB(j)
j = j 1
ELSE
arrC(k) = arrA(i)
i = i 1
END IF
IF (arrA(x) > arrB(y)) THEN
arrC(z) = arrA(x)
x = x - 1
ELSE
arrC(z) = arrB(y)
y = y - 1
END IF
k = k 1
z = z - 1
END DO
IF (i <= x) THEN
DO
IF (i > x) EXIT
arrC(k) = arrA(i)
i = i 1
k = k 1
END DO
ELSEIF (j <= y) THEN
DO
IF (j > y) EXIT
arrC(k) = arrB(j)
j = j 1
k = k 1
END DO
END IF
END SUBROUTINE imerge2
MOAR data using the Fortran version. Anyone into straight lines?
SORT-RESEARCH$ for (( i=100000; i<500000000; i=2*i )); do ./derrelSORT-2022.x $i; done | awk 'BEGIN {old_1="Derrel"; print "N Time(s)"};{if ($1 == "Generating") {printf $NF" "; old_1=$1} else if (old_1 == "Sorting") {print $3; old_1=$1} else {old_1=$1}}'
N Time(s)
100000 0.0000
200000 0.0312
400000 0.0625
800000 0.1562
1600000 0.2969
3200000 0.6250
6400000 1.3594
12800000 2.7500
25600000 5.5625
51200000 11.8906
102400000 23.3750
204800000 47.3750
409600000 96.4531
Appears linear, doesn't it? ;) Fortran sorting times from above plotted.
CodePudding user response:
Your algorithm is not O(n). Your calculated number of loops (nLoops
) is log2(n)
. The number of inner loops (the values in iterMark
) will be essentially n/2, n/4, n/8, etc. But the segment sizes really don't matter because every time through the outer loop you look at every item in the list.
No matter how you obfuscate it, you're doing log2(n) passes over n items: O(n log n).
Your code is a fairly standard merge sort, which is proven to be O(n log n). It's well proven that the general case for comparison sorts is O(n log n). Sure, some algorithms can sort some specific cases more quickly. Conversely, the same algorithms have pathological cases that will take O(n^2). Other comparison sorts (heap sort, merge sort, for example) are not highly subject to the order of items. But in the general case comparison sorts make on the order of n log n comparisons. See https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0913.pdf for a detailed explanation.
But don't take my word for it. You can easily test yourself by doing some simple timings. Time how long it takes to sort, say, 100K items. If your algorithm is indeed O(n), then it should take approximately twice as long to sort 200K items, and ten times as long to sort 1 million items. But if it's O(n log n), as I suspect, then the timings will be somewhat longer.
Consider: log(2) of 100K is 16.61. log(2) of 200K is 17.61. So sorting 100K items (if the algorithm is O(log n)) will take time proportional to 100K * 16.61. Sorting 200K items will take time proportional to 200K * 17.71. Doing the arithmetic:
100K * 16.61 = 1,661,000
200K * 17.61 = 3,522,000
So 200K items will take approximately 2.12 times (3,522,000/1,661,000) as long. Or, about 10% longer than if the algorithm is linear.
If you're still unsure, pump it up to a million items. If the algorithm is linear, then a million items will take 10x the time that 100K items took. If it's O(n log n), then it will take 12 times as long.
1M * 19.93 = 19,930,000
(19,930,000 / 1,661,000) = 11.9987 (call it 12)
CodePudding user response:
My f2py skills are not strong, so I wrote a pure fortran wrapper for your code (posted below, if you want to check it), and the timings I got were:
n time (s) 0.1*n/1e6 0.1*n*log(n)/1e6*log(1e6)
1000000 0.109375000 0.100000001 0.100000001
2000000 0.203125000 0.200000003 0.210034326
4000000 0.453125000 0.400000006 0.440137327
8000000 0.937500000 0.800000012 0.920411944
16000000 1.92187500 1.60000002 1.92109859
32000000 4.01562500 3.20000005 4.00274658
64000000 8.26562500 6.40000010 8.32659149
128000000 17.0468750 12.8000002 17.2953815
256000000 35.1406250 25.6000004 35.8751564
This is... not looking good for your O(n)
theory, I'm afraid.
My wrapper:
module m
contains
! Your code goes here
end module
program p
use m
implicit none
integer(8) :: i,n
real, allocatable :: real_array(:)
integer, allocatable :: int_array(:)
real :: start
real :: stop
real_array = [0]
int_array = [0]
write(*,*) "n time (s) 0.1*n/1e6 0.1*n*log(n)/1e6*log(1e6)"
do i=0,30
n = 2**i*1e6
deallocate(real_array, int_array)
allocate(real_array(n), int_array(n))
call random_number(real_array)
int_array = -huge(0)*real_array 2.0*huge(0)
call cpu_time(start)
call isort(int_array, n)
call cpu_time(stop)
write(*,*) n, stop-start, 0.1*n/1.0e6, 0.1*n*log(1.0*n)/(1.0e6*log(1.0e6))
enddo
end program
CodePudding user response:
I don't doubt that your sort is fast, and I can believe that it compares favorably with the sort
command-line utility. But it is an O(N log(N)) iterative merge sort, not an O(N) sort (nor a novel algorithm).
Observe,
- Your outer loop iterates O(log(N)) times.
- On each of those iterations, the inner loop iterates O(N / 2k) times for some k.
- And the main work of each inner-loop iteration is to split O(2k) items into two halves and merge those together, which involves examining and moving every single item. (And then moving them all again back to the original array.) That costs O(2k) operations per inner-loop iteration.
Those factors all multiply together:
O(log(N)) * O(N / 2k) * O(2k)
The factors of 2k cancel each other, and you're left with O(N log(N)). (The ks are functions of N, so they cannot simply be ignored as constants.)
The logarithm grows very slowly, so if you don't look too closely then it is easy to be fooled into thinking you see linear growth when it's really N log(N). You need to look at wide ranges of values to see the superlinearity, and some is indeed visible in your data.
As for your plot, there's a problem with the result of your curve fitting: the y
intercept is significantly negative (for the scale of the data, and especially for the concentration of points with small y
). Your data may fit a linear model well (if not entirely sensically), but they sure appear to fit an N log(N) model better.