It is currently Tue Sep 29, 2020 11:40 am

All times are UTC + 1 hour




Post new topic Reply to topic  [ 8 posts ] 
Author Message
 Post subject: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Thu Jul 30, 2020 4:17 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
Squint 2.0.5b

Squint is a dynamic compact prefix Trie, It can by used as a Map or Numeric map though Squint has one distinct advantage over a map. Lexicographic enumeration or prefix searches. It's a dynamic structure which needs no predetermined size, and will perform as O(k*2) rather than degrading towards O(n) like a fully loaded map would. While it's Sets are almost 2 times that of a sized map, it's Gets are closer to 1:1. Memory use should also be smaller than a map

String mode functions Set, Get, Delete, Prune, Enumerate and Walk, keys can be Unicode, UTF8, Ascii or Integers, keys are mapped to UTF8
you can mix Unicode, UTF8 and Integer key input types as it's converted to UTF8 but be careful not to mix Ascii if above 126.

The SquintNumeric functions, SetNumeric, GetNumeric, DeleteNumeric, WalkNumeric Behaves just like a map, it's gets are generally faster than a map and runs at a constant speed.

Enumeration or walk callbacks are in the form *key, value.i, *userdata, though keys are UTF8, Ascii or Integers for the fast numeric walk function.

Test of random keys,
Quote:
AMD FX(tm)-6100 Six-Core Processor 64 bit
Items 1000000
Squint Set 932 ms
Map Set 410 ms
Squint Get 487 ms
Map Get 379 ms
Squint Enum 0 ms
Map Enum 256 ms
=========================
Squint Set Ratio 2.273 slower
Squint Get Ratio 1.285 slower
Squint Enum Ratio +Infinity faster
=========================
Key Input Size 7.00 mb
Squint Memory Size 54.67 mb
Overhead 7.81 bytes input
Map Size ~ 49.07mb




Fast Numeric mode test
Quote:
Items 1000000
Squint Numeric Set 572 ms
Map Set 402 ms
Squint numeric Get 244 ms
Map Get 390 ms
=========================
Squint Set Ratio 1.423 slower
Squint Get Ratio 1.598 faster
=========================
Key Input Size 7.63 mb
Squint Memory Size 25.71 mb
Overhead 3.37 bytes input
Map Size ~ 49.70mb



Code:
; SQUINT 2, Sparse Quad Union Indexed Nibble Trie
; Copyright (c) 2020 Andrew Ferguson
; Version 2.0.5b
; PB 5.72 x86 x64 Linux OSX Windows
; Thanks Wilbert for the high low insight and utf8 str asm help.
; Squint is a compact prefix Trie indexed by nibbles into a sparse array with performance metrics close to a map
; It provides O(K*2) performance with a memory size ~32 times smaller than a 256 node Trie
; The overheads are similar to that of a QP Trie, thats a 1/3 smaller than a Critbit.
; In terms of speed SquintSet and SquintGet should be at worst ~2 times slower than a Map.
; Lexographical enumerations are however magnitudes faster than what you could achieve otherwise
; In fast Numeric mode it's gets are closer to 1:1 with a map
;
; see https://en.wikipedia.org/wiki/Trie
;     https://dotat.at/prog/qp/blog-2015-10-04.html
;     https://cr.yp.to/critbit.html
;
; Squint supports Set, Get, Enum, Walk, Delete and Prune with a flag in Delete
; keys can be either Unicode, Ascii, UTF8 or Integers, the type must be specified
; keys are all mapped to UTF8
;
; SquintNumeric (fast numeric) supports, SetNumeric GetNumeric DeleteNumeric and WalkNumeric
; it's provided as a direct subtitute for a numeric map, though lacks enumeration
; keys are returned as Integers 
;
; MIT License
; Permission is hereby granted, SquintFree of charge, to any person obtaining a copy
; of this software and associated documentation files (the "Software"), to deal
; in the Software without restriction, including without limitation the rights
; To use, copy, modify, merge, publish, distribute, sublicense, and/or sell
; copies of the Software, and to permit persons to whom the Software is
; furnished to do so, subject to the following conditions:
;
; The above copyright notice and this permission notice shall be included in all
; copies or substantial portions of the Software.
;
; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS Or
; IMPLIED, INCLUDING BUT Not LIMITED To THE WARRANTIES OF MERCHANTABILITY,
; FITNESS For A PARTICULAR PURPOSE And NONINFRINGEMENT. IN NO EVENT SHALL THE
; AUTHORS Or COPYRIGHT HOLDERS BE LIABLE For ANY CLAIM, DAMAGES Or OTHER
; LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT Or OTHERWISE, ARISING FROM,
; OUT OF Or IN CONNECTION With THE SOFTWARE Or THE USE Or OTHER DEALINGS IN THE
; SOFTWARE.

DeclareModule SQUINT
 
  #SQUINT_MAX_KEY = 1024
 
  Structure squint_node
    *vertex.edge
    StructureUnion
      squint.q
      value.i
    EndStructureUnion
  EndStructure   
 
  Structure edge
    e.squint_node[0]
  EndStructure
 
  Structure squint
    *vt
    *root.squint_node
    size.l
    datasize.l
    count.l
    ib.a[22]
    sb.a[#SQUINT_MAX_KEY]
  EndStructure
 
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    #Squint_Pmask = $ffffffff
  CompilerElse
    #Squint_Pmask = $ffffffffffff
  CompilerEndIf
 
  ;-Squint Callback prototype
  Prototype Squint_CB(*key,*value=0,*userdata=0)
 
  Declare SquintNew()
  Declare SquintFree(*this.Squint)
  Declare SquintDelete(*this.squint,*key,prune=0,mode=#PB_Unicode)
  Declare SquintSet(*this.squint,*key,value.i,mode=#PB_Unicode)
  Declare SquintGet(*this.squint,*key,mode=#PB_Unicode)
  Declare SquintEnum(*this.squint,*key,*pfn.squint_CB,*userdata=0,ReturnMatch=0,mode=#PB_Unicode)
  Declare SquintWalk(*this.squint,*pfn.squint_CB,*userdata=0)
  Declare SquintSetNumeric(*this.squint,key.i,value.i)
  Declare SquintGetNumeric(*this.squint,key.i)
  Declare SquintDeleteNumeric(*this.squint,key.i)
  Declare SquintWalkNumeric(*this.squint,*pfn.squint_CB,*userdata=0)
 
  ;-Squint Inteface iSquint 
  Interface iSquint
    Free()
    Delete(*key,prune=0,mode=#PB_Unicode)
    Set(*key,value.i,mode=#PB_Unicode)
    Get(*key,mode=#PB_Unicode)
    Enum(*key,*pfn.squint_CB,*userdata=0,ReturnMatch=0,mode=#PB_Unicode)
    Walk(*pfn.squint_CB,*userdata=0)
    SetNumeric(key.i,value.i)
    GetNumeric(key.i)
    SquintDeleteNumeric(key.i)
    WalkNumeric(*pfn.Squint_CB,*userdata=0)
  EndInterface
 
  DataSection: vtSquint:
    Data.i @SquintFree()
    Data.i @SquintDelete()
    Data.i @SquintSet()
    Data.i @SquintGet()
    Data.i @SquintEnum()
    Data.i @SquintWalk()
    Data.i @SquintSetNumeric()
    Data.i @SquintGetNumeric()
    Data.i @SquintDeleteNumeric()
    Data.i @SquintWalkNumeric()
  EndDataSection   
 
EndDeclareModule

Module SQUINT
 
  EnableExplicit
  ;-macros
  Macro _SETINDEX(in,index,number)
    in = in & ~(15 << (index << 2)) | (number << (index << 2))
  EndMacro
 
  Macro _GETNODECOUNT()
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
      nodecount = MemorySize(*node\vertex) / SizeOf(squint_node)
    CompilerElse
      nodecount = (*node\vertex >> 48)
    CompilerEndIf
  EndMacro
 
  Macro _POKENHL(in,Index,Number)
    *Mem.Ascii = in
    *Mem + Index >> 1
    If Index & 1
      *Mem\a = (*Mem\a & $f0) | (Number & $f)
    Else
      *Mem\a = (*Mem\a & $0f) | (Number << 4)
    EndIf
  EndMacro
 
  Macro _STR()
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
      ; backup registers
      !mov [rsp - 8], rbx
      !mov [rsp - 16], rdi
      ; get procedure arguments
      !mov rax, [p.v_char]
      !mov rdi, [p.p_out]
      ; determine sign
      !xor ebx, ebx
      !cmp rax, rbx
      !jge .l0
      !neg rax
      !mov rdx, 1
      !mov [rsp - 24],rdx
      !.l0:
      ; convert number to ascii characters
      !add rdi, 22
      !mov byte [rdi], 0
      !mov rbx, 0xcccccccccccccccd
      !.l1:
      !mov rcx, rax
      !mul rbx
      !mov rax, rdx
      !shr rax, 3
      !imul rdx, rax, 10
      !sub rcx, rdx
      !or rcx, 0x30
      !sub rdi, 1
      !mov [rdi], cl
      !test rax, rax
      !jnz .l1
      ;look up sign
      !mov rax,[rsp-24]
      !cmp rax,1
      !jnz .l2
      !sub rdi, 1
      !mov byte [rdi], '-'
      !.l2:
      ;store start of string
      !mov [p.v_offset],rdi
      ;restore registers
      !mov rdi, [rsp - 16]
      !mov rbx, [rsp - 8]
    CompilerElse
      ; backup registers
      !mov [esp - 4], ebx
      !mov [esp - 8], edi
      ; get procedure arguments
      !mov eax, [p.v_char]
      !mov edi, [p.p_out]
      ; determine length
      !xor ebx, ebx
      !cmp eax, ebx
      !jge .l0
      !neg eax
      !mov edx, 1
      !mov [esp - 12],edx
      !.l0:
      ; convert number to ascii characters
      !add edi, 22
      !mov byte [edi], 0
      !mov ebx, 0xcccccccd
      !.l1:
      !mov ecx, eax
      !mul ebx
      !mov eax, edx
      !shr eax, 3
      !imul edx, eax, 10
      !sub ecx, edx
      !or ecx, 0x30
      !sub edi, 1
      !mov [edi], cl
      !test eax, eax
      !jnz .l1
      ;look up sign
      !mov eax,[esp-12]
      !cmp eax,1
      !jnz .l2
      !sub edi, 1
      !mov byte [edi], '-'
      !.l2:
      ;store offset to start of string
      !mov [p.v_offset],edi
      ;restore registers
      !mov edi, [esp - 8]
      !mov ebx, [esp - 4]
    CompilerEndIf
   
  EndMacro
 
  Macro _CONVERTUTF8()
    If mode <> #PB_Integer
      char= PeekU(*key)
    Else   
      char = PeekI(*key)
      *out = @*this\ib[0]
      _STR()
      *key = offset
      mode = #PB_Ascii
      char= PeekU(*key)
    EndIf
   
    If mode = #PB_Unicode 
      !mov eax, [p.v_char]
      !cmp eax, 0x0080
      !jb .l3
      !cmp eax, 0x0800
      !jae .l4
      !shl eax, 2
      !shr al, 2
      !or eax, 1100000010000000b
      !bswap eax
      !shr eax, 16
      !jmp .l3
      !.l4:
      !shl eax, 4
      !shr ax, 2
      !shr al, 2
      !or eax, 111000001000000010000000b
      !bswap eax
      !shr eax, 8
      !.l3:
      !mov [p.v_char],eax
    EndIf
  EndMacro
 
  Macro _MODECHECK()
    _CONVERTUTF8()
    If mode <> #PB_Unicode
      If (char >> ((count&1)<<4) & $ff = 0)
        Break
      EndIf
    Else
      *this\datasize+1
    EndIf
  EndMacro
 
  Macro _SETNODE()
    If *node\vertex
      _GETNODECOUNT()
      If (offset <> 15 Or nodecount = 16)
        *node = *node\Vertex\e[offset] & #Squint_Pmask
      Else 
        *this\size + SizeOf(squint_node)
        offset = nodecount
        nodecount+1
        *node\vertex = ReAllocateMemory(*node\vertex & #Squint_Pmask,(nodecount)*SizeOf(squint_node))
        CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
          *node\vertex | ((nodecount) << 48)
        CompilerEndIf
        _SETINDEX(*node\squint,idx,offset)
        *node = *node\Vertex\e[offset] & #Squint_Pmask
      EndIf
    Else
      *this\size + SizeOf(squint_node)
      *node\vertex = AllocateMemory(SizeOf(squint_Node))
      CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
        *node\vertex | (1 << 48)
      CompilerEndIf
      *node\squint = -1
      _SETINDEX(*node\squint,idx,0)
      *node = *node\Vertex\e[0] & #Squint_Pmask
      *this\count+1
    EndIf
  EndMacro
 
  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    Macro rax : eax : EndMacro
  CompilerEndIf   
  ;-General functions
  Procedure SquintNew()
    Protected *this.squint,a
    *this = AllocateMemory(SizeOf(squint))
    If *this
      *this\vt = ?vtSquint
      *this\root = AllocateMemory(SizeOf(squint_node))
      ProcedureReturn *this
    EndIf
  EndProcedure
 
  Procedure ISquintFree(*this.squint,*node.squint_node=0)
    Protected a,offset,nodecount
    If Not *node
      ProcedureReturn 0
    EndIf
    For a=0 To 15
      offset = (*node\squint >> (a<<2)) & $f
      If *node\vertex
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          ISquintFree(*this,*node\Vertex\e[offset] & #Squint_Pmask)
        EndIf
      EndIf
    Next
    If *node\vertex
      _GETNODECOUNT()
      *this\size - nodecount
      *this\count - 1
      FreeMemory(*node\Vertex & #Squint_Pmask)
      *node\vertex=0
    EndIf
    ProcedureReturn *node
  EndProcedure
 
  Procedure SquintFree(*this.squint)
    Protected a,offset,*node.squint_node,nodecount
    *node = *this\root
    For a=0 To 15
      offset = (*node\squint >> (a<<2)) & $f
      If *node\vertex
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          ISquintFree(*this,*node)
        EndIf
      EndIf
    Next
    FreeMemory(*this\root)
    nodecount = *this\count
    FreeMemory(*this)
    ProcedureReturn nodecount
  EndProcedure
  ;-string functions
  Procedure SquintDelete(*this.squint,*key.Unicode,prune=0,mode=#PB_Unicode)
    Protected *node.squint_node,idx,*mem.Character,offset,nodecount,char.i,count,*out
    *node = *this\root
    _CONVERTUTF8()
    While char
      offset = (*node\squint >> ((char & $f0) >> 2 )) & $f
      If *node\vertex
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          *node = *node\Vertex\e[offset] & #Squint_Pmask
        EndIf
      Else
        ProcedureReturn 0
      EndIf
      If *node
        offset = (*node\squint >> ((char & $0f) << 2)) & $f
        If *node\vertex
          _GETNODECOUNT()
          If (offset <> 15 Or nodecount = 16)
            *node = *node\Vertex\e[offset] & #Squint_Pmask
          EndIf
        Else
          ProcedureReturn 0
        EndIf
      EndIf
      char >> 8
      If char = 0
        *key+2
        _MODECHECK()
      EndIf
    Wend
    If prune
      ISquintFree(*this,*node)
      If (*node\vertex & #Squint_Pmask) = 0
        *node\squint = 0
      EndIf
    Else
      offset = *node\squint & $f
      _GETNODECOUNT()
      If offset <= nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
        If (*node\vertex & #Squint_Pmask) = 0
          *node\squint = 0
        EndIf
      Else
        ProcedureReturn 0
      EndIf
    EndIf
  EndProcedure
 
  Procedure SquintSet(*this.squint,*key,value.i,mode=#PB_Unicode)
    Protected *node.squint_node,idx,offset,nodecount,char.i,count,*out
    *node = *this\root & #Squint_Pmask
    _CONVERTUTF8()
    While char
      idx = (char >> 4) & $f
      offset = (*node\squint >> (idx<<2)) & $f
      _SETNODE()
      idx = char & $0f
      offset = (*node\squint >> (idx<<2)) & $f
      _SETNODE()
      *this\datasize+1
      char >> 8
      count+1
      If char = 0
        *key+2
        _MODECHECK()
      EndIf
    Wend
    idx=0
    offset = *node\squint & $f
    _SETNODE()
    *node\value = value
    ProcedureReturn
  EndProcedure
 
  Procedure SquintGet(*this.squint,*key,mode=#PB_Unicode)
    Protected *node.squint_Node,idx,offset,nodecount,char.i,count,*out
    *node = *this\root & #Squint_Pmask
    _CONVERTUTF8()
    While char
      offset = (*node\squint >> ((char & $f0) >> 2 )) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      offset = (*node\squint >> ((char & $0f) << 2)) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      char >> 8
      count+1
      If char = 0
        *key+2
        _MODECHECK()
      EndIf
    Wend
    offset = *node\squint & $f
    _GETNODECOUNT()
    If offset <= nodecount
      *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      ProcedureReturn *node\value
    Else
      ProcedureReturn 0
    EndIf
  EndProcedure
     
  Procedure IEnum(*this.squint,*node.squint_Node,depth,*pfn.squint_CB,*userdata=0)
    Protected a.i,offset,nodecount,*mem.Ascii
    If Not *node
      ProcedureReturn 0
    EndIf
    For a=0 To 15
      offset = (*node\squint >> (a<<2)) & $f
      If (*node\vertex And *node\squint)
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          _POKENHL(@*this\sb,depth,a)
          IEnum(*this,*node\Vertex\e[offset] & #Squint_Pmask,depth+1,*pfn,*userdata)
        EndIf
      EndIf
    Next
    If *node\vertex=0
      If *pfn
        PokeA(@*this\sb+((depth>>1)),0)
        *pfn(@*this\sb,*node\value,*userdata)
      EndIf
    EndIf
    ProcedureReturn *node
  EndProcedure
 
  Procedure SquintEnum(*this.squint,*key.Unicode,*pfn.squint_CB,*userdata=0,ReturnMatch=0,mode=#PB_Unicode)
    Protected *node.squint_Node,idx,*mem.Ascii,offset,nodecount,depth,char.i,count,*out
    If ReturnMatch
      If SquintGetNumeric(*this,*key)
        If *pfn
          *pfn(*key,0,*userdata)
        EndIf
        ProcedureReturn
      EndIf
    EndIf
    *node = *this\root
    _CONVERTUTF8()
    While char
      idx = (char >> 4) & $f
      offset = (*node\squint >> (idx<<2)) & $f
      If (*node\vertex And *node\squint)
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          ;POKENHL(@*this\sb,depth,idx)
          *mem = @*this\sb+(depth>>1)
          *mem\a = (*mem\a & $0f) | (idx<<4)
          depth+1
          *node = *node\Vertex\e[offset] & #Squint_Pmask
        EndIf
      EndIf
      If Not *node Or *node\vertex = 0
        ProcedureReturn 0
      EndIf
      idx = char & $f
      offset = (*node\squint >> (idx<<2)) & $f
      If (*node\vertex And *node\squint)
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          ;POKENHL(@*this\sb,depth,idx)
          *mem = @*this\sb+(depth>>1)
          *Mem\a = (*Mem\a & $f0) | (idx & $f)
          depth+1
          *node = *node\Vertex\e[offset] & #Squint_Pmask
        EndIf
      EndIf
      If Not *node Or *node\vertex = 0
        ProcedureReturn 0
      EndIf
      char >> 8
      count+1
      If char = 0
        *key+2
        _MODECHECK()
      EndIf
    Wend
    IEnum(*this,*node,depth,*pfn,*userdata)
  EndProcedure
 
  Procedure SquintWalk(*this.squint,*pfn.squint_CB,*userdata=0)
    IEnum(*this,*this\root,0,*pfn,*userdata)
  EndProcedure
 
  ;-Numeric functions
  Procedure SquintSetNumeric(*this.squint,key.i,value.i)
    Protected *node.squint_node,idx,offset,nodecount,char.i,count
    *node = *this\root & #Squint_Pmask
    char = key
    EnableASM
    mov rax, char
    bswap rax;
    mov char,rax
    DisableASM
    While count < SizeOf(Integer)
      idx = (char >> 4) & $f
      offset = (*node\squint >> (idx<<2)) & $f
      _SetNODE()
      idx = (char & $f)
      offset = (*node\squint >> (idx<<2)) & $f
      _SetNODE()
      *this\datasize+1
      char >> 8
      count+1
    Wend
    *node\value = value
    ProcedureReturn
  EndProcedure
 
  Procedure SquintGetNumeric(*this.squint,key.i)
    Protected *node.squint_Node,idx,offset,nodecount,char.i,count,skey.s
    *node = *this\root & #Squint_Pmask
    char = key
    EnableASM
    mov rax, char
    bswap rax;
    mov char,rax
    DisableASM
    While count < SizeOf(Integer)
      offset = (*node\squint >> ((char & $f0) >> 2 )) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      offset = (*node\squint >> ((char & $0f) << 2)) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      char>>8
      count+1
    Wend
    ProcedureReturn *node\value
  EndProcedure
 
  Procedure SquintDeleteNumeric(*this.squint,key.i)
    Protected *node.squint_node,idx,*mem.Character,offset,nodecount,char.i,count
    *node = *this\root & #Squint_Pmask
    char = key
    EnableASM
    mov rax, char
    bswap rax;
    mov char,rax
    DisableASM
    While count < SizeOf(Integer)
      offset = (*node\squint >> ((char & $f0) >> 2 )) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      offset = (*node\squint >> ((char & $0f) << 2)) & $f
      _GETNODECOUNT()
      If offset < nodecount
        *node = (*node\Vertex\e[offset] & #Squint_Pmask)
      Else
        ProcedureReturn 0
      EndIf
      char>>8
      count+1
    Wend
    If (*node\vertex & #Squint_Pmask) = 0
      *node\squint = 0
    EndIf
  EndProcedure
 
  Procedure IEnumNumeric(*this.squint,*node.squint_Node,idx,depth,*pfn.squint_CB,*userdata=0)
    Protected a.i,offset,nodecount,*mem.Ascii,char.i
    If Not *node
      ProcedureReturn 0
    EndIf
    For a=0 To 15
      offset = (*node\squint >> (a<<2)) & $f
      If (*node\vertex And *node\squint)
        _GETNODECOUNT()
        If (offset <> 15 Or nodecount = 16)
          _POKENHL(@*this\sb,depth,a)
          IEnumNumeric(*this,*node\Vertex\e[offset] & #Squint_Pmask,0,depth+1,*pfn,*userdata)
        EndIf
      EndIf
    Next
    If *node\vertex=0
      char = PeekI(@*this\sb)
      EnableASM
      mov rax, char
      bswap rax;
      mov char,rax
      DisableASM
      If *pfn
        *pfn(@char,*node\value,*userdata)
      EndIf
    EndIf
    ProcedureReturn *node
  EndProcedure
 
  Procedure SquintWalkNumeric(*this.squint,*pfn.squint_CB,*userdata=0)
    IEnumNumeric(*this,*this\root,0,0,*pfn,*userdata)
  EndProcedure
 
  DisableExplicit
  ;-End of module   
EndModule

CompilerIf #PB_Compiler_IsMainFile
 
  UseModule SQUINT
 
  Global msg.s
 
  Procedure Map_Enum(Map mp.i(),key.s,len)
    Protected NewList items.s()
    Protected word.s
    ForEach mp()
      word = MapKey(mp())
      If Left(word,len) = key
        AddElement(items())
        items() = word
      EndIf
    Next   
    SortList(items(),#PB_Sort_Ascending)
  EndProcedure   
 
  Procedure CBSquint(*key,*value,*userData)
    If *key
      msg + PeekS(*key,-1,#PB_UTF8) + #LF$
    EndIf   
  EndProcedure
 
  Procedure CBSquintUTF8(*key,*value,*userData)
    Debug PeekS(*key,-1,#PB_UTF8) + " " + PeekS(*userData)
  EndProcedure
 
  Procedure CBSquintAscii(*key,*value,*userData)
    Debug PeekS(*key,-1,#PB_Ascii) + " " + PeekS(*userData)
  EndProcedure
 
  Procedure CBSquintInteger(*key,*value,*userData)
    Debug PeekS(*userData)  + " " + Str(PeekI(*key)) + " / " + Str(*value) 
  EndProcedure
 
  Define ct=1000000
  Define *mt.squint = SquintNew()
  Define NewMap mp.i(ct)
  Define key.s,key1.s,*utf8,ikey
  Define TrieSet,TrieGet,MapSet,MapGet,TrieEnum,MapEnum,TrieSquintSetNum,TriGetNum
  Define st,et,a,seed = 1234
 
  CompilerIf #PB_Compiler_Debugger
   
    ;unicode
    key = Chr($f6) + Chr($20ac) + Chr($e0)
    SquintSet(*mt,@key,@key)
    Debug PeekS(SquintGet(*mt,@key)) + " test unicode SquintGet "
    key = Chr($f6)
    SquintEnum(*mt,@key,@CBSquintUTF8(),@"test unicode SquintEnum")
   
    ;UTF8
    key = Chr($f6) + Chr($20ac) + Chr($e0)
    *utf8 = UTF8(key)
    SquintSet(*mt,*utf8,*utf8,#PB_UTF8) ;overwrite the unicode key
    Debug PeekS(SquintGet(*mt,*utf8,#PB_UTF8),-1,#PB_UTF8) + " test utf8 SquintGet"
    FreeMemory(*utf8)
    *utf8 = UTF8(Chr($f6))
    SquintEnum(*mt,*utf8,@CBSquintUTF8(),@"test utf8 SquintEnum",0,#PB_UTF8)
    FreeMemory(*utf8)
   
    ;Ascii
    *ascii = Ascii("An ascii string")
    SquintSet(*mt,*ascii,*ascii,#PB_Ascii)
    Debug PeekS(SquintGet(*mt,*ascii,#PB_Ascii),-1,#PB_Ascii)
    FreeMemory(*ascii)
    *ascii = Ascii("An")
    SquintEnum(*mt,*ascii,@CBSquintAscii(),@"testing Ascii SquintEnum",0,#PB_Ascii)
    FreeMemory(*ascii)
   
    ;Numeric string
    a = -123456
    SquintSet(*mt,@a,a,#PB_Integer)
    Debug SquintGet(*mt,@a,#PB_Integer)
    a = -123
    SquintEnum(*mt,@a,@CBSquintUTF8(),@"test numeric SquintEnum",0,#PB_Integer)
   
    ;Fast Numeric
    SquintFree(*mt)
    *mt = SquintNew()
   
    For a = 500 To 1000
      SquintSetNumeric(*mt,a,a)
    Next   
   
    For a = 500 To 1000
      Debug SquintGetNumeric(*mt,a)
    Next   
    a=100
    SquintDeleteNumeric(*mt,a)
   
    Debug "test delete " + Str(SquintGetNumeric(*mt,a))
    Debug "test SquintWalk"
    SquintWalkNumeric(*mt,@CBSquintInteger(),@"SquintNumeric Key / Value: ")
   
    SquintFree(*mt)
   
    End
  CompilerEndIf
 
  RandomSeed(seed)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    SquintSet(*mt,@ikey,ikey,#PB_Integer)
  Next
  et = ElapsedMilliseconds()
  TrieSet = et-st
 
  RandomSeed(seed)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    mp(Str(ikey)) = ikey
  Next
  et = ElapsedMilliseconds()
  MapSet = et-st
 
  RandomSeed(seed+1)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    out = SquintGet(*mt,@ikey,#PB_Integer)
    If (ikey <> out And out <> 0)
      MessageRequester("error", Str(ikey) + " " + Str(out))
      End
    EndIf
  Next
  et = ElapsedMilliseconds()
  TrieGet = et-st
 
  RandomSeed(seed+1)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    out = mp(Str(ikey))
    If (ikey <> out And  out <> 0)
      MessageRequester("error", Str(ikey) + " " + Str(out))
      End
    EndIf
  Next
  et = ElapsedMilliseconds()
  MapGet = et-st
 
  st = ElapsedMilliseconds()
  key = "575"
  SquintEnum(*mt,@key,0)
  et = ElapsedMilliseconds()
  TrieEnum = et-st
 
  st = ElapsedMilliseconds()
  key = "575"
  Map_Enum(mp(),key,3)
  et = ElapsedMilliseconds()
  MapEnum = et-st
 
  msg.s = Trim(CPUName()) + " " + Str(SizeOf(integer)*8) + " bit" + #LF$
  msg.s + "Items " + Str(ct) + #LF$
  msg + "Squint Set " + Str(TrieSet) + " ms"+ #LF$
  msg + "Map Set " + Str(MapSet) + " ms" + #LF$
  msg + "Squint Get " + Str(TrieGet) + " ms" + #LF$
  msg + "Map Get " + Str(MapGet) + " ms" + #LF$
  msg + "Squint Enum " + Str(TrieEnum) + " ms" + #LF$
  msg + "Map Enum " + Str(MapEnum) + " ms" + #LF$
  msg + "=========================" + #LF$
  msg + "Squint Set Ratio " + StrF(TrieSet / MapSet,3) + " slower" + #LF$
  msg + "Squint Get Ratio " + StrF(TrieGet / MapGet,3) + " slower " + #LF$
  msg + "Squint Enum Ratio " + StrF(MapEnum / TrieEnum,3) + " faster " + #LF$
  msg + "=========================" + #LF$
  msg + "Key Input Size " + StrF(*mt\datasize / 1024 / 1024,2) + " mb" + #LF$
  msg + "Squint Memory Size " + StrF(*mt\size / 1024 / 1024,2) + " mb" + #LF$
  msg + "Overhead " + StrF(*mt\size / *mt\datasize,2) + " bytes input" + #LF$
  msg + "Map Size ~ " + StrF(((4*SizeOf(Integer)*MapSize(mp())*1.42)+*mt\datasize) / 1024 / 1024,2) + "mb" + #LF$
 
  SetClipboardText(msg)
  MessageRequester("SQUINT strings keys",msg)
 
  msg=""
  in.s = "57567"
  SquintEnum(*mt,@in,@cbsquint())
  SquintDelete(*mt,@in,1)
  msg + "Pruned from " + in + #LF$
  SquintEnum(*mt,@in,@cbsquint())
  msg + "check pune worked  SquintEnum again" + #LF$
  nodecount = SquintFree(*mt)
  msg + "SquintFree check node count is zero " + Str(Nodecount) 
  MessageRequester("Enumeration",msg)
 
  ;numeric test
  Define ikey.i
  msg=""
 
  FreeMap(Mp())
  NewMap mp.i(ct)
  *mt.SQUINT = SquintNew()
 
  RandomSeed(seed)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    SquintSetNumeric(*mt,ikey,ikey)
  Next
  et = ElapsedMilliseconds()
  TrieSet = et-st
 
  RandomSeed(seed)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    mp(Str(ikey)) = ikey
  Next
  et = ElapsedMilliseconds()
  MapSet = et-st
 
  RandomSeed(seed+1)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    out = SquintGetNumeric(*mt,ikey)
    If (ikey <> out And  out <> 0)
      MessageRequester("error", Str(ikey) + " " + Str(out))
      End
    EndIf
  Next
  et = ElapsedMilliseconds()
  TrieGet = et-st
 
  RandomSeed(seed+1)
  st = ElapsedMilliseconds()
  For a = 0 To ct
    ikey = Random($FFFFFF)
    out = mp(Str(ikey))
    If (ikey <> out And  out <> 0)
      MessageRequester("error", Str(ikey) + " " + Str(out))
      End
    EndIf
  Next
  et = ElapsedMilliseconds()
  MapGet = et-st
 
  msg.s + "Items " + Str(ct) + #LF$
  msg + "Squint Numeric Set " + Str(TrieSet) + " ms"+ #LF$
  msg + "Map Set " + Str(MapSet) + " ms" + #LF$
  msg + "Squint numeric Get " + Str(TrieGet) + " ms" + #LF$
  msg + "Map Get " + Str(MapGet) + " ms" + #LF$
  msg + "=========================" + #LF$
  msg + "Squint Set Ratio " + StrF(TrieSet / MapSet,3) + " slower" + #LF$
  msg + "Squint Get Ratio " + StrF(MapGet / TrieGet,3) + " faster" + #LF$
  msg + "=========================" + #LF$
  msg + "Key Input Size " + StrF(*mt\datasize / 1024 / 1024,2) + " mb" + #LF$
  msg + "Squint Memory Size " + StrF(*mt\size / 1024 / 1024,2) + " mb" + #LF$
  msg + "Overhead " + StrF(*mt\size / *mt\datasize,2) + " bytes input" + #LF$
  msg + "Map Size ~ " + StrF(((4*SizeOf(Integer)*MapSize(mp())*1.42)+*mt\datasize) / 1024 / 1024,2) + "mb" + #LF$
 
  SetClipboardText(msg)
  MessageRequester("SQUINT Integer keys",msg)
 
  SquintFree(*mt)
 
CompilerEndIf



Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Thu Jul 30, 2020 9:01 am 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Sun Jun 22, 2003 7:43 pm
Posts: 587
Location: Germany, Saarbrücken
Nice work!
Code:
Intel(R) Core(TM) i7-3820QM CPU @ 2.70GHz 64 bit
Items 1000000
Squint Set 775 ms
Map Set 393 ms
Squint Get 502 ms
Map Get 421 ms
Squint Enum 1 ms
Map Enum 245 ms
=========================
Squint Set Ratio 1.972 slower
Squint Get Ratio 1.192 slower
Squint Enum Ratio 245.000 faster
=========================
Key Input Size 13.99 mb
Squint Memory Size 54.67 mb
Overhead 3.91 bytes input
Map Size ~ 46.49mb

Keep in mind that your speed test also measures the time of Random, Str and Val.

_________________
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.


Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Thu Jul 30, 2020 10:13 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
NicTheQuick wrote:
Nice work!
Keep in mind that your speed test also measures the time of Random, Str and Val.


Thanks Nic, my aim was to have equally representative tests using the same random sequence, the loops are essentially equivalent and the additional code should help smooth out time slices a little, so hopefully it gets a more realistic ratio for the sets and gets. It will vary a bit but it shows that's it fast and surprisingly good for plain pb. Sets are ~2 times slower and Gets ~1.5 time slower. It should also scale better the more you load it.

Quote:
AMD FX(tm)-6100 Six-Core Processor 64 bit
Items 10,000,000
Squint Set 14980 ms
Map Set 8704 ms
Squint Get 14750 ms
Map Get 9036 ms
Squint Enum 3 ms
Map Enum 3701 ms
=========================
Squint Set Ratio 1.721 slower
Squint Get Ratio 1.632 slower
Squint Enum Ratio 1233.667 faster
=========================
Key Input Size 139.96 mb
Squint Memory Size 269.53 mb
Overhead 1.93 bytes input
Map Size ~ 464.97mb



Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Fri Jul 31, 2020 8:24 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
updated 2.0.1a
Added on the fly UTF8 conversion with help from wilbert, with no adverse loss of speed.
fixed bug on x86, adding mode parameter and testing for null bytes for string termination, technically it should also apply to x64, though it seems to work fine without it.


Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Sat Aug 01, 2020 12:12 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
Squint 2.0.2a
fixed x86 utf8 mode, previous attempt didn't work properly.
tested ascii modes
Quote:
ö€à test unicode get
ö€à test unicode enum
ö€à test utf8 get 2
ö€à test utf8 enum
An ascii string
An ascii string test ascii enum


Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Sun Aug 09, 2020 5:24 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
Added Numeric key support so it can be used as a numeric map,

test map sized to 1,000,000 items
Quote:
Items 1000000
Squint Numeric Set 791 ms
Map Set 527 ms
Squint numeric Get 352 ms
Map Get 483 ms
=========================
Squint Set Ratio 1.501 slower
Squint Get Ratio 1.372 faster
=========================
Key Input Size 7.63 mb
Squint Memory Size 25.71 mb
Overhead 3.37 bytes input
Map Size ~ 49.70mb


test vs a dynamically sized map, sized to 10,000 elements, maps not very dynamic!
Quote:
Items 1000000
Squint Numeric Set 554 ms
Map Set 4351 ms
Squint numeric Get 265 ms
Map Get 9174 ms
=========================
Squint Set Ratio 0.127 slower
Squint Get Ratio 34.619 faster
=========================
Key Input Size 7.63 mb
Squint Memory Size 25.71 mb
Overhead 3.37 bytes input
Map Size ~ 49.70mb



Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Thu Aug 13, 2020 8:22 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
Added Integer key support to the string mode functions, It provides a marginally faster conversion than calling str() on input.
If you don't need enumeration functions for numeric inputs the fast numeric functions are a better option.

Quote:
AMD FX(tm)-6100 Six-Core Processor 64 bit
Items 1000000
Squint Set 940 ms
Map Set 414 ms
Squint Get 488 ms
Map Get 380 ms
Squint Enum 1 ms
Map Enum 256 ms
=========================
Squint Set Ratio 2.271 slower
Squint Get Ratio 1.284 slower
Squint Enum Ratio 256.000 faster
=========================
Key Input Size 7.00 mb
Squint Memory Size 54.67 mb
Overhead 7.81 bytes input
Map Size ~ 49.07mb



vs fast numeric
Quote:
Items 1000000
Squint Numeric Set 587 ms
Map Set 401 ms
Squint numeric Get 251 ms
Map Get 391 ms
=========================
Squint Set Ratio 1.464 slower
Squint Get Ratio 1.558 faster
=========================
Key Input Size 7.63 mb
Squint Memory Size 25.71 mb
Overhead 3.37 bytes input
Map Size ~ 49.70mb


Top
 Profile  
Reply with quote  
 Post subject: Re: SQUINT 2, Sparse Quad Union Indexed Nibble Trie
PostPosted: Fri Aug 14, 2020 5:16 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3545
Location: New Zealand
2.0.5b
Removed the string length loop from the _STR macro

Test Squint string mode with numeric input against a sized map
Quote:
AMD FX(tm)-6100 Six-Core Processor 64 bit
Items 1000000
Squint Set 943 ms
Map Set 424 ms
Squint Get 484 ms
Map Get 385 ms
Squint Enum 1 ms
Map Enum 266 ms
=========================
Squint Set Ratio 2.224 slower
Squint Get Ratio 1.257 slower
Squint Enum Ratio 266.000 faster
=========================
Key Input Size 7.00 mb
Squint Memory Size 54.67 mb
Overhead 7.81 bytes input
Map Size ~ 49.07mb


SquintNumeric mode (no string conversion) vs a sized map
Quote:
Items 1000000
Squint Numeric Set 578 ms
Map Set 401 ms
Squint numeric Get 273 ms
Map Get 396 ms
=========================
Squint Set Ratio 1.441 slower
Squint Get Ratio 1.451 faster
=========================
Key Input Size 7.63 mb
Squint Memory Size 25.71 mb
Overhead 3.37 bytes input
Map Size ~ 49.70mb



Trying to accommodate unicode,utf8,ascii and integer input has been a bit of a challenge though it's still performing reasonably well. if you use Integers with Squint string functions, make sure you pass them by address.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 8 posts ] 

All times are UTC + 1 hour


Who is online

Users browsing this forum: No registered users and 42 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  

 


Powered by phpBB © 2008 phpBB Group
subSilver+ theme by Canver Software, sponsor Sanal Modifiye