Re: Custom sort for Linked Lists
Posted: Tue Jan 29, 2019 7:10 am
Thank you very much You saved my day. Really useful functions.
http://www.purebasic.com
http://forums.purebasic.com/english/
Thanks for your suggestion!wilbert wrote:Did you consider using a (natural) bottom-up merge sort instead of top-down ?
I got some strange results when I ran your speed test.Little John wrote:Speed test
The following code demonstrates that the hybrid sorting algorithm is considerably faster than using self-written MergeSort alone. On my system, 35 is a good threshold value.
Code: Select all
;-============ Structures ============
Structure Person
Idx.i
Name.s
Age.i
EndStructure
Procedure CreateRandomStructures (List x.Person(), n.i)
Protected m$, a, i, r=Int(n/2)
For i = 1 To r
m$ = RandomString(Random(1000, 200))
a = Random(80, 20)
AddElement(x())
x()\Idx = i*2 - 1
x()\Name = m$
x()\Age = a
AddElement(x())
x()\Idx = i*2
x()\Name = m$
x()\Age = a + 10
Next
EndProcedure
Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
; -- custom comparison function of type 'ProtoCompare'
; in : *pa, *pb: pointers to pointers to structures to be compared
; mode : mode of comparison:
; #PB_Sort_Ascending/#PB_Sort_Descending
; out: return value: #True/#False
Protected.Person *a = *pa\i, *b = *pb\i ; dereference the pointers passed as parameters
If (mode & #PB_Sort_Descending)
ProcedureReturn Bool(*a\Name < *b\Name)
Else
ProcedureReturn Bool(*a\Name > *b\Name)
EndIf
EndProcedure
Define u0, u1, u2, u3, u4
NewList p0.Person()
CreateRandomStructures(p0(), n)
; --------------------------------------------------------------------------
threshold = 35
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4
I wasn't aware of that problem. Thank you for your investigations!wilbert wrote: I got some strange results when I ran your speed test.
I found out it had to do with the copies of lists that you are using. Since they are located in different parts of memory, one list may be in a better place compared to another one.
Code: Select all
EnableExplicit
XIncludeFile "SortListCustom.pbi"
CompilerIf #PB_Compiler_Debugger
MessageRequester("Error",
"Switch the Debugger off!",
#PB_MessageRequester_Error)
End
CompilerEndIf
Procedure.s RandomString (length.i)
Protected *char.Character, ret$=Space(length)
*char = @ ret$
While *char\c <> 0
If Random(1) = 0
*char\c = Random('Z', 'A')
Else
*char\c = Random('z', 'a')
EndIf
*char + SizeOf(Character)
Wend
ProcedureReturn ret$
EndProcedure
;-============ Structures ============
Structure Person
Idx.i
Name.s
Age.i
EndStructure
Procedure CreateRandomStructures (List x.Person(), n.i)
Protected m$, a, i, r=Int(n/2)
For i = 1 To r
m$ = RandomString(Random(1000, 200))
a = Random(80, 20)
AddElement(x())
x()\Idx = i*2 - 1
x()\Name = m$
x()\Age = a
AddElement(x())
x()\Idx = i*2
x()\Name = m$
x()\Age = a + 10
Next
EndProcedure
Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
; -- custom comparison function of type 'ProtoCompare'
; in : *pa, *pb: pointers to pointers to structures to be compared
; mode : mode of comparison:
; #PB_Sort_Ascending/#PB_Sort_Descending
; out: return value: #True/#False
Protected.Person *a = *pa\i, *b = *pb\i ; dereference the pointers passed as parameters
If (mode & #PB_Sort_Descending)
ProcedureReturn Bool(*a\Name < *b\Name)
Else
ProcedureReturn Bool(*a\Name > *b\Name)
EndIf
EndProcedure
Define u0, u1, u2, u3, u4
Define msg$, threshold, n = 500000
NewList p0.Person()
CreateRandomStructures(p0(), n)
; --------------------------------------------------------------------------
threshold = 35
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4
; --------------------------------------------------------------------------
msg$ + "-- Sorting structures" + #LF$ +
"u0 = " + StrD(u0/1000,3) + " Sec. (built-in MergeSort)" + #LF$ +
"u1 = " + StrD(u1/1000,3) + " Sec." + #LF$ +
"u2 = " + StrD(u2/1000,3) + " Sec. <" + #LF$ +
"u3 = " + StrD(u3/1000,3) + " Sec." + #LF$ +
"u4 = " + StrD(u4/1000,3) + " Sec. (self-written MergeSort alone)"
MessageRequester("Duration", msg$)
Here are the results on my system with both methods.wilbert wrote: When I changed your code so every test is done with the same list, the results were very different on my computer.
Results of my original test wrote:-- Sorting structures
u0 = 0.396 Sec. (built-in MergeSort)
u1 = 1.214 Sec.
u2 = 1.177 Sec. <
u3 = 1.197 Sec.
u4 = 2.679 Sec. (self-written MergeSort alone)
The results are pretty similar. Especially using the proposed threshold (u2) is with both methods faster than using a bigger threshold (u1) or a smaller one (u3).Results of your test wrote:-- Sorting structures
u0 = 0.424 Sec. (built-in MergeSort)
u1 = 1.221 Sec.
u2 = 1.198 Sec. <
u3 = 1.234 Sec.
u4 = 2.742 Sec. (self-written MergeSort alone)
The changed version-- Sorting strings
t0 = 0.667 Sec. (built-in MergeSort)
t1 = 2.530 Sec.
t2 = 2.249 Sec. <
t3 = 1.612 Sec.
t4 = 1.829 Sec. (self-written MergeSort alone)
-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)
A PB linked list is a chain of small parts of allocated memory.-- Sorting structures
u0 = 0.383 Sec. (built-in MergeSort)
u1 = 1.105 Sec.
u2 = 1.050 Sec. <
u3 = 1.034 Sec.
u4 = 1.255 Sec. (self-written MergeSort alone)
This is really strange. It looks as if the built-in routine were the slowest one.wilbert wrote:Your original code.[...]
-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)
Code: Select all
Structure struct
x.l
EndStructure
Procedure.i CompareStruc(*pa.struct, *pb.struct, mode.i)
Res = #False
If *pa\x > *pb\x
Res = #True
EndIf
ProcedureReturn Res
EndProcedure
NewList l.struct()
For i=1 To 100
AddElement(l())
l()\x = Random(100, 10)
Next
CS::SortListAny(l(), @CompareStruc())
ForEach l()
Debug l()\x
Next
Because you are not using a proper custom comparison function.User_Russian wrote:Why doesn't it sort?
Code: Select all
Procedure.i CompareStruc (*pa.Integer, *pb.Integer, mode.i)
Protected.struct *a = *pa\i, *b = *pb\i ; dereference the pointers
Protected.i Res
Res = #False
If *a\x > *b\x
Res = #True
EndIf
ProcedureReturn Res
EndProcedure