Page 2 of 2

Re: Custom sort for Linked Lists

Posted: Tue Jan 29, 2019 7:10 am
by zikitrake
Thank you very much :!: You saved my day. Really useful functions.

Re: Custom sort for Linked Lists

Posted: Tue Jan 29, 2019 7:54 am
by Little John
Many thanks, Fred! :-)

zikitrake, I'm glad that the functions are useful for you.

Re: Custom sort for Linked Lists

Posted: Thu Jan 31, 2019 7:10 am
by Little John
New version 1.22, 2019-01-31

Changes
  • Simplified and improved the MergeSort routine, so that it's even faster now.
    As a consequence, the default value of sInsertionSortMaxSize was reduced from 50 to 35.
  • The procedure SetInsertionSortMaxSize() now returns the previous value of sInsertionSortMaxSize for convenience.
  • Some cosmetic changes.

Re: Custom sort for Linked Lists

Posted: Thu Jan 31, 2019 9:08 am
by wilbert
Did you consider using a (natural) bottom-up merge sort instead of top-down ?
Bottom-up is non-recursive.

Re: Custom sort for Linked Lists

Posted: Thu Jan 31, 2019 11:20 am
by Little John
wilbert wrote:Did you consider using a (natural) bottom-up merge sort instead of top-down ?
Thanks for your suggestion!
Yes, I considered using a bottom-up merge sort. Comparing its speed with the speed of the existing version is on my ToDo list.

Re: Custom sort for Linked Lists

Posted: Sat Mar 02, 2019 7:59 pm
by wilbert
Little John wrote:Speed test

The following code demonstrates that the hybrid sorting algorithm is considerably faster than using self-written MergeSort alone. On my system, 35 is a good threshold value.
I got some strange results when I ran your speed test.
I found out it had to do with the copies of lists that you are using. Since they are located in different parts of memory, one list may be in a better place compared to another one.
When I changed your code so every test is done with the same list, the results were very different on my computer.

Code: Select all

;-============  Structures  ============

Structure Person
  Idx.i
  Name.s
  Age.i
EndStructure


Procedure CreateRandomStructures (List x.Person(), n.i)
  Protected m$, a, i, r=Int(n/2)
  
  For i = 1 To r
    m$ = RandomString(Random(1000, 200))
    a = Random(80, 20)
    
    AddElement(x())
    x()\Idx = i*2 - 1
    x()\Name = m$
    x()\Age = a
    
    AddElement(x())
    x()\Idx = i*2
    x()\Name = m$
    x()\Age = a + 10
  Next
EndProcedure

Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
  ; -- custom comparison function of type 'ProtoCompare'
  ; in : *pa, *pb: pointers to pointers to structures to be compared
  ;      mode    : mode of comparison:
  ;                #PB_Sort_Ascending/#PB_Sort_Descending
  ; out: return value: #True/#False
  Protected.Person *a = *pa\i, *b = *pb\i     ; dereference the pointers passed as parameters
  
  If (mode & #PB_Sort_Descending)
    ProcedureReturn Bool(*a\Name < *b\Name)
  Else
    ProcedureReturn Bool(*a\Name > *b\Name)
  EndIf
EndProcedure


Define u0, u1, u2, u3, u4

NewList p0.Person()
CreateRandomStructures(p0(), n)

; --------------------------------------------------------------------------

threshold = 35

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4

Re: Custom sort for Linked Lists

Posted: Sat Mar 02, 2019 10:27 pm
by Little John
Hi wilbert!
wilbert wrote: I got some strange results when I ran your speed test.
I found out it had to do with the copies of lists that you are using. Since they are located in different parts of memory, one list may be in a better place compared to another one.
I wasn't aware of that problem. Thank you for your investigations!

Now I also did a speed test the way you proposed (with PB 5.70 x64 on Windows 10). Here is the complete executable code I used:

Code: Select all

EnableExplicit

XIncludeFile "SortListCustom.pbi"

CompilerIf #PB_Compiler_Debugger
   MessageRequester("Error",
                    "Switch the Debugger off!",
                    #PB_MessageRequester_Error)
   End
CompilerEndIf


Procedure.s RandomString (length.i)
   Protected *char.Character, ret$=Space(length)
   
   *char = @ ret$
   While *char\c <> 0
      If Random(1) = 0
         *char\c = Random('Z', 'A')
      Else
         *char\c = Random('z', 'a')
      EndIf
      *char + SizeOf(Character)
   Wend
   
   ProcedureReturn ret$
EndProcedure


;-============  Structures  ============

Structure Person
  Idx.i
  Name.s
  Age.i
EndStructure


Procedure CreateRandomStructures (List x.Person(), n.i)
  Protected m$, a, i, r=Int(n/2)
 
  For i = 1 To r
    m$ = RandomString(Random(1000, 200))
    a = Random(80, 20)
   
    AddElement(x())
    x()\Idx = i*2 - 1
    x()\Name = m$
    x()\Age = a
   
    AddElement(x())
    x()\Idx = i*2
    x()\Name = m$
    x()\Age = a + 10
  Next
EndProcedure

Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
  ; -- custom comparison function of type 'ProtoCompare'
  ; in : *pa, *pb: pointers to pointers to structures to be compared
  ;      mode    : mode of comparison:
  ;                #PB_Sort_Ascending/#PB_Sort_Descending
  ; out: return value: #True/#False
  Protected.Person *a = *pa\i, *b = *pb\i     ; dereference the pointers passed as parameters
 
  If (mode & #PB_Sort_Descending)
    ProcedureReturn Bool(*a\Name < *b\Name)
  Else
    ProcedureReturn Bool(*a\Name > *b\Name)
  EndIf
EndProcedure


Define u0, u1, u2, u3, u4
Define msg$, threshold, n = 500000

NewList p0.Person()
CreateRandomStructures(p0(), n)

; --------------------------------------------------------------------------

threshold = 35

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3

SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4

; --------------------------------------------------------------------------

msg$ + "-- Sorting structures" + #LF$ +
       "u0 = " + StrD(u0/1000,3) + " Sec. (built-in MergeSort)" + #LF$ +
       "u1 = " + StrD(u1/1000,3) + " Sec."   + #LF$ +
       "u2 = " + StrD(u2/1000,3) + " Sec. <" + #LF$ +
       "u3 = " + StrD(u3/1000,3) + " Sec."   + #LF$ +
       "u4 = " + StrD(u4/1000,3) + " Sec. (self-written MergeSort alone)"

MessageRequester("Duration", msg$)
wilbert wrote: When I changed your code so every test is done with the same list, the results were very different on my computer.
Here are the results on my system with both methods.
Results of my original test wrote:-- Sorting structures
u0 = 0.396 Sec. (built-in MergeSort)
u1 = 1.214 Sec.
u2 = 1.177 Sec. <
u3 = 1.197 Sec.
u4 = 2.679 Sec. (self-written MergeSort alone)
Results of your test wrote:-- Sorting structures
u0 = 0.424 Sec. (built-in MergeSort)
u1 = 1.221 Sec.
u2 = 1.198 Sec. <
u3 = 1.234 Sec.
u4 = 2.742 Sec. (self-written MergeSort alone)
The results are pretty similar. Especially using the proposed threshold (u2) is with both methods faster than using a bigger threshold (u1) or a smaller one (u3).

What are the results that you get on macOS with both my original and your new method?

Re: Custom sort for Linked Lists

Posted: Sun Mar 03, 2019 6:55 am
by wilbert
Your original code.
-- Sorting strings
t0 = 0.667 Sec. (built-in MergeSort)
t1 = 2.530 Sec.
t2 = 2.249 Sec. <
t3 = 1.612 Sec.
t4 = 1.829 Sec. (self-written MergeSort alone)

-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)
The changed version
-- Sorting structures
u0 = 0.383 Sec. (built-in MergeSort)
u1 = 1.105 Sec.
u2 = 1.050 Sec. <
u3 = 1.034 Sec.
u4 = 1.255 Sec. (self-written MergeSort alone)
A PB linked list is a chain of small parts of allocated memory.
What I suspect is that when using different lists, for some lists, the assigned memory is more fragmented compared to other lists.
If you have to jump around in memory to do comparisons, this might make a significant difference.
That's the only explanation I can think of :?

The string sorting benchmark might also suffer from this but I wouldn't know how to restore that list to its original presorted order in that case.

Edit:
After some more thinking ...
Modern versions of MacOS can use memory compression.
So another possibility could be that all the duplicated lists of the original version take up so much memory that the OS decides to compress the memory.

Re: Custom sort for Linked Lists

Posted: Sun Mar 03, 2019 1:08 pm
by Little John
wilbert wrote:Your original code.
[...]

-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)
This is really strange. It looks as if the built-in routine were the slowest one. :D
In contrast, the results of your changed version make sense.

I have run my original speed test rather often (but only on Windows), and always got consistent results. So I think the main cause of the strange results that you encountered might be the different memory handling of Windows and MacOS.

I've changed now my original benchmark code, also for sorting the string lists. (I'm storing the order of the elements in a list of pointers, and use the pointers for rearranging the elements of the string list before each new speed test.)

Thanks again for finding this issue and reporting it!

Re: Custom sort for Linked Lists

Posted: Sat May 02, 2020 9:06 am
by kinglestat
It's nice.....I was going to do something similar (simpler!) unfortunately I couldn't translate it to SpiderBasic...which is rather finicky on a good day

Re: Custom sort for Linked Lists

Posted: Sat Aug 22, 2020 1:03 pm
by User_Russian
Why doesn't it sort?

Code: Select all

Structure struct
  x.l
EndStructure

Procedure.i CompareStruc(*pa.struct, *pb.struct, mode.i)
  Res = #False
  
  If *pa\x > *pb\x
    Res = #True
  EndIf
  
  ProcedureReturn Res
EndProcedure

NewList l.struct()

For i=1 To 100
  AddElement(l())
  l()\x = Random(100, 10)
Next

CS::SortListAny(l(), @CompareStruc())

ForEach l()
  Debug l()\x
Next

Re: Custom sort for Linked Lists

Posted: Sat Aug 22, 2020 3:41 pm
by Little John
I have provided a lot of demo code in this thread. It's all there.
User_Russian wrote:Why doesn't it sort?
Because you are not using a proper custom comparison function.

For usage with SortListAny(), the pointers passed to the comparison function must be dereferenced.
With your code, this function works:

Code: Select all

Procedure.i CompareStruc (*pa.Integer, *pb.Integer, mode.i)
   Protected.struct *a = *pa\i, *b = *pb\i    ; dereference the pointers
   Protected.i Res
   
   Res = #False
   If *a\x > *b\x
      Res = #True
   EndIf
   
   ProcedureReturn Res
EndProcedure