It is currently Sat Nov 28, 2020 12:58 am

All times are UTC + 1 hour




Post new topic Reply to topic  [ 18 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: autocomplete Trie
PostPosted: Wed Jan 15, 2020 12:53 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3554
Location: New Zealand
Working on a replacement for Autocomplete with a Trie structure bit of a balancing act trying to constrain memory use, though I think I've got a workable compromise between memory and speed by interpreting the strings in nibbles, this reduces the memory by 8 times at the cost of 2 times the lookup, it should be more efficient than current implementation but before I put in place it should get a look over for any bugs. These kind of algorithms do my head in.

Test code loads mozilla GB dictionary from marco's git, not sure if my cursory pass of the file is 100% but it should show it working


Update 0.95
changed structure to array based to facilitate load, save and catch and to keep the memory size the same as x86 on x64
while it will dynamically grow it will incur excessive page faults, so you need to allocate enough slots

Update 0.96
Mashed it into scintiila autocomplete
*note it's still a little buggy
Return enter selection space bar to cancel
Tested on xp with debugger off works better, linux x64 with debugger seems fine

Update 0.96.1

Code:
:note if your rebuilding this between versions please make sure you deleted the file en-GB_trie.tri
 DeclareModule TRIE
 
  ; Nibbled TRIE, for unicode strings; Access time O(m*2)
  ; Array based can be saved loaded and caught from memory with compression
  ; Note node value -1 is reserved for traversal checks 
  ; Author: Idle
  ; Version: 0.96.1
  ; Pb 5.71 lts
  ; Created: 24/01/2020
  ; Licence MIT
 
  #TRIE_MAX = 1024 ;Max key length in bytes for enumeration of Trie
 
  Structure trie_node
    index.l[16]
    value.l
  EndStructure
 
  Structure Trie
    *vt
    size.l
    avail.l
    position.l
    datasize.l
    nodecount.l
    count.l
    sb.a[#TRIE_MAX]
    *root.Trie_node
  EndStructure   
 
  Prototype Trie_CB(*Input,len)
  Declare Trie_New(size=512)
  Declare Trie_Free(*this.Trie)
  Declare Trie_Set(*this.Trie,*key,len,value.l=1,mode=#PB_UTF8)
  Declare Trie_Get(*this.Trie,*key,len,mode=#PB_UTF8)
  Declare Trie_iEnum(*this.Trie,*node.Trie_Node,depth,*pfn.Trie_CB)
  Declare Trie_Enum(*this.Trie,*key,len,*pfn.Trie_CB,ReturnMatch=0,mode=#PB_UTF8)
  Declare Trie_Clear(*this.Trie)
 
  Interface ITrie
    Free()
    Set(*key,StringByteLength,value.l=1,mode=#PB_UTF8)
    Get(*key,StringBytelength,mode=#PB_UTF8)
    Enum(*key,StringBytelength,*pfn.Trie_CB,ReturnMatch=0,mode=#PB_UTF8)
    Clear()
    Load(file.s)
    Save(File.s,CompressLevel=4)
    Catch(Address)
    Stats(item)
  EndInterface   
 
EndDeclareModule

Module TRIE
 
  CompilerIf #PB_Compiler_OS = #PB_OS_Windows
    ImportC "zlib.lib"
      compress2(*dest,*destlen,*source,sourcelen,level)
      uncompress(*dest,*destlen,*source,sourcelen)
    EndImport
  CompilerElse
    ImportC "-lz"   
      compress2(*dest,*destlen,*source,sourcelen,level)
      uncompress(*dest,*destlen,*source,sourcelen)
    EndImport
  CompilerEndIf
 
  EnableExplicit
  ;Thanks Wilbert for a sane version of PeekN and PokeN
  ;Peek Nibble in order High Low
  Macro PeekNHL(in,Index,out)
    *Mem.Ascii = in
    *Mem + Index >> 1
    If Index & 1
      out = *Mem\a & $f
    Else
      out = *Mem\a >> 4
    EndIf
  EndMacro
 
  Macro PokeNHL(in, Number, Index)
    *Mem.Ascii = in
    *Mem + Index >> 1
    If Index & 1
      *Mem\a = (*Mem\a & $f0) | (Number & $f)
    Else
      *Mem\a = (*Mem\a & $0f) | (Number << 4)
    EndIf
  EndMacro
 
  Procedure Trie_Stats(*this.TRIE,item=1)
    Select item
      Case 1
        ProcedureReturn *this\datasize
      Case 2
        ProcedureReturn MemorySize(*this\root)
      Case 3
        ProcedureReturn MemorySize(*this\root) / *this\datasize
      Case 4
        ProcedureReturn *this\nodecount
      Case 5 
        ProcedureReturn *this\count
    EndSelect
  EndProcedure   
 
  Procedure Trie_Clear(*this.Trie)
    Protected idx,*node.Trie_node,size
    size = MemorySize(*this\root)
    While idx < size
      *node = *this\root+idx
      If *node\value
        *node\value = -1
      EndIf   
      idx + SizeOf(Trie_node)     
    Wend     
  EndProcedure
 
  Procedure Trie_New(size=512)
    Protected *obj.Trie = AllocateMemory(SizeOf(Trie))
    If *obj
      *obj\vt=?vt_Trie
      *obj\size = size * SizeOf(Trie_node)
      *obj\avail= *obj\size  - SizeOf(Trie_node)
      *obj\position = SizeOf(Trie_node)
      *obj\root = AllocateMemory(*obj\size)
    EndIf
    ProcedureReturn *obj
  EndProcedure
 
  Procedure Trie_Free(*this.Trie)
    FreeMemory(*this\root)
    FreeMemory(*this)
  EndProcedure
 
  ;Set or Unset a value in the Trie, len must be StringByteLength
  Procedure Trie_Set(*this.Trie,*key,len,value.l=1,mode=#PB_UTF8)
    Protected *node.Trie_node,*ikey,ilen,a,idx,*mem.Ascii,offset
    *node = *this\root
   
    If mode <> #PB_UTF8
      *ikey = UTF8(PeekS(*key,len))
      len = MemorySize(*ikey)
    Else
      *ikey = *key
    EndIf
   
    *this\datasize + len
   
    ilen = (len*2)-1
   
    For a = 0 To ilen
      PeekNHL(*ikey,a,idx)
      If *node\index[idx]     
        offset = *node\index[idx]
        *node = *this\root + offset
      ElseIf *this\avail  > 0
        offset = *this\Position
        *node\index[idx] = offset
        *node = *this\root + offset
        *node\value = -1
        *this\nodecount + 1
        *this\avail - SizeOf(Trie_node)
        *this\Position+SizeOf(Trie_node)
      Else
        *this\avail = 512*SizeOf(Trie_node)
        *this\size + *this\avail
        *this\root = ReAllocateMemory(*this\root,*this\size)
        *node = *this\root + offset
        offset = *this\Position
        *node\index[idx] = offset ;
        *node = *this\root + offset
        *node\value = -1
        *this\nodecount + 1
        *this\avail - SizeOf(Trie_node)
        *this\Position+SizeOf(Trie_node)
      EndIf
    Next   
   
    If *node\value
      *this\count+1
      *node\value = value
    EndIf   
    If mode <> #PB_UTF8
      FreeMemory(*ikey)
    EndIf   
  EndProcedure
 
  Procedure Trie_Get(*this.Trie,*key,len,mode=#PB_UTF8)
    ;Find the value of the key returns the value or 0
    Protected *ikey,ilen,*node.Trie_Node,a,idx.i,*mem.Ascii
   
    If mode <> #PB_UTF8
      *ikey = UTF8(PeekS(*key,len))
      len = MemorySize(*ikey)
    Else
      *ikey = *key
    EndIf
   
    ilen = (len*2)-1
    *node.Trie_node = *this\root
   
    For a = 0 To ilen
      PeekNHL(*ikey,a,idx)
      If *node   
        If *node\index[idx]     
          *node = *this\root + *node\index[idx]
        Else
          If mode <> #PB_UTF8
             FreeMemory(*ikey)
          EndIf
          ProcedureReturn 0
        EndIf
      Else
        FreeMemory(*ikey)
        ProcedureReturn 0
      EndIf
    Next   
    If mode <> #PB_UTF8
      FreeMemory(*ikey)
    EndIf   
    If *node\value > -1     
      ProcedureReturn *node\value
    EndIf
   
  EndProcedure
 
  Procedure Trie_iEnum(*this.Trie,*node.Trie_Node,depth,*pfn.Trie_CB)
    Protected a.i,tkey.s,*mem.Ascii
   
    If *node\value = 0
      ProcedureReturn 0
    EndIf
    ;build the key   
    For a=0 To 15
      If *node\index[a]
        PokeNHL(@*this\sb,a,depth)
        Trie_iEnum(*this,*this\root + *node\index[a],depth+1,*pfn)
      EndIf
    Next
    ;lookup the key value
    If *node\value > 0
      If *pfn
        *pfn(@*this\sb,depth/2-1)
      EndIf
    EndIf   
    ProcedureReturn *node
   
  EndProcedure
 
  Procedure Trie_Enum(*this.Trie,*key,len,*pfn.Trie_CB,ReturnMatch=0,mode=#PB_UTF8)
    ;Enumerate all the entries below the given root key
    Protected *node.Trie_Node,*tnode.Trie_Node,tkey.i,ilen.i,pos.i,*mem.Ascii,a,*ikey,lkey
    If mode <> #PB_UTF8
      *ikey = UTF8(PeekS(*key,len))
      len = MemorySize(*ikey)
      ilen =((len-1)*2)-1
    Else
      *ikey = *key
      ilen =(len*2)-1
    EndIf
   
    FillMemory(@*this\sb[0],#TRIE_MAX,0)
   
    If ReturnMatch
      If Trie_Get(*this,*ikey,len)
        If *pfn
          *pfn(*ikey,len)
        EndIf
        ProcedureReturn
      EndIf
    EndIf   
   
    *node = *this\root   
   
    For a = 0 To ilen
      PeekNHL(*ikey,a,tkey)
      *node = *this\root + *node\index[tkey]
      If *node\value = 0
        If mode <> #PB_UTF8
          FreeMemory(*ikey)
        EndIf   
        ProcedureReturn 0
      EndIf
    Next   
   
    CopyMemory(*ikey,@*this\sb,MemorySize(*ikey))
    Trie_iEnum(*this,*node,a,*pfn)
   
    If mode <> #PB_UTF8
      FreeMemory(*ikey)
    EndIf   
  EndProcedure
 
  Procedure Trie_SaveToFile(*this.trie,file.s,CompressLevel=4)
    Protected fn,len,*dst,destlen,srclen
    fn = OpenFile(#PB_Any,file)
    If fn
      srclen = MemorySize(*this\root)
      destlen = (srclen * 1.5) + SizeOf(trie) + 8
      *dst = AllocateMemory(destlen)
      If *dst                                               
        If Compress2(*dst+(SizeOf(trie)+8),@destlen,*this\root,srclen,CompressLevel) = 0   
          PokeL(*dst,srclen)
          PokeL(*dst+4,destlen)
          CopyMemory(*this,*dst+8,SizeOf(Trie))
          len = WriteData(fn,*dst,destlen+SizeOf(trie)+8)
          CloseFile(fn)
          FreeMemory(*dst)
          ProcedureReturn len   
        EndIf
      EndIf
    EndIf   
  EndProcedure   
 
  Procedure Trie_LoadFile(*this.trie,file.s)
    Protected fn,destlen,srclen,*src,*dst
    fn = OpenFile(#PB_Any,file)
    If fn
      destlen = ReadLong(fn)
      *src = AllocateMemory(destlen)
      srclen = ReadLong(fn)
      *src = AllocateMemory(srclen)
      FreeMemory(*this\root)
      ReadData(fn,*this,SizeOf(trie))
      ReadData(fn,*src,srclen)
      *this\vt=?vt_Trie
      *this\root = AllocateMemory(destlen)
      If uncompress(*this\root,@destlen,*src,srclen) = 0
        FreeMemory(*src)
        CloseFile(fn)
        ProcedureReturn destlen
      EndIf
    EndIf
  EndProcedure   
 
  Procedure Trie_Catch(*this.Trie,*mem,len)
    Protected destlen,srclen,*src
    If *mem
      FreeMemory(*this\root)
      destlen = PeekL(*mem)
      srclen = PeekL(*mem+4)
      CopyMemory(*mem+8,*this,SizeOf(Trie))
      *this\vt=?vt_Trie
      *this\root = AllocateMemory(destlen)
      If uncompress(*this\root,@destlen,*mem+SizeOf(Trie)+8,srclen) = 0
        ProcedureReturn destlen
      EndIf
    EndIf
  EndProcedure   
 
  DataSection: vt_Trie:
    Data.i @Trie_Free()
    Data.i @Trie_Set()
    Data.i @Trie_Get()
    Data.i @Trie_Enum()
    Data.i @Trie_Clear()
    Data.i @Trie_LoadFile()
    Data.i @Trie_SaveToFile()
    Data.i @Trie_Catch()
    Data.i @Trie_Stats()
  EndDataSection
 
EndModule


CompilerIf #PB_Compiler_IsMainFile
  ;set compiler to create exe in current directory
  ;On 1st run it will downloads mozilla GB english dictionary and build the Trie
  ;when you close the window it will compress and save the Trie to file
  ;On 2nd run it will Load the saved Trie
  ;if you uncoment Include binary down the bottom it will catch the Trie from memory
  UseModule TRIE
  EnableExplicit
  #Auto_Start =0
  #Auto_Get =1
  #Auto_Enum =2
 
  Global TmpPath.s = GetTemporaryDirectory() +"en-GB.dic"
  Global TriePath.s = GetPathPart(ProgramFilename()) + "en-GB_trie.tri"
  Global *mt.ITrie = Trie_New(700384) ;specify number of nodes required, realloc can be very slow 
 
  Global index,result
  Global *buf = AllocateMemory(1024)
  Global itemsize=1024*1024
  Global itempos
  Global *items = AllocateMemory(itemsize)  ;utf8 list of words for lookup
  Global *text
 
  Procedure ClearItems()
    FillMemory(*items,itemsize,0)
  EndProcedure 
 
  Procedure cbTrie(*key,len)
    Protected out.s
    If (itempos + len + 1) < MemorySize(*items)
      CopyMemory(*key,*items+itempos,len)
      out = PeekS(*items+itempos,len,#PB_UTF8)
      Debug out
      Debug Str(Len(out)) + " " + Str(len)
      itempos + len
      PokeA(*items+itempos,32)
      itempos + 1
    Else
      itemsize + len + 1
      *items = ReAllocateMemory(*items,itemsize)
      CopyMemory(*key,*items+itempos,len)
      itempos + len
      PokeA(*items+itempos,32)
      itempos + 1
    EndIf   
  EndProcedure   
 
  Procedure LoadDicFile(Dicfile.s,*trie.ITrie)
    Protected fn,fm,pos,word.s,count
    fn = ReadFile(#PB_Any,Dicfile)
    If fn
      fm = ReadStringFormat(fn)
      While Not Eof(fn)
        word = ReadString(fn,fm,-1)
        pos = FindString(word,"/")
        If pos > 0
          word = Left(word,pos-1)
        EndIf
        word = LCase(word)
        *trie\Set(@word,StringByteLength(word),1,#PB_Unicode)
        count+1
      Wend
      CloseFile(fn)
    EndIf
    ProcedureReturn count
  EndProcedure
 
  Procedure ScintillaCallBack(Gadget, *scinotify.SCNotification)
    Static Auto_state.i     
    Static pos.i,space.i
         
    Protected mods = (#SC_MOD_DELETETEXT | #SC_PERFORMED_USER)
    Select *scinotify\nmhdr\code 
       
      Case #SCN_AUTOCSELECTION
        Debug "selected " + PeekS(*scinotify\text,-1,#PB_UTF8)
        Auto_state = #Auto_Start
        ;ScintillaSendMessage(0,#SCI_AUTOCCANCEL)
        pos = 0
      Case #SCN_MODIFIED
        If *scinotify\modificationType = mods Or *scinotify\modificationType =  (mods | #SC_STARTACTION)
          pos - *scinotify\length 
          Debug "back spaced " + *scinotify\length
          If pos < 1
            Debug "cancel from back space"
            Auto_state = #Auto_Start
            ScintillaSendMessage(0,#SCI_AUTOCCANCEL)
            pos = 0
          EndIf
        EndIf   
      Case #SCN_CHARADDED   
        Select *scinotify\ch
          Case $A,$20 
            If Auto_state < #Auto_Enum 
               Debug "by space " + PeekS(*buf,pos,#PB_UTF8)
               If *mt\Get(*buf,pos+1) = 0
                 Debug "added "
                 *mt\Set(*buf,pos+1)
               EndIf
               Auto_state = #Auto_Start
               pos=0
            EndIf   
                     
          Default
            PokeA(*buf+pos,*scinotify\ch)
            If pos > 1
              Auto_state = #Auto_Get
              ClearItems()
              *mt\enum(*buf,pos+1,@cbTrie())
              itempos=0
              If ScintillaSendMessage(0,#SCI_AUTOCACTIVE) = 0
                ScintillaSendMessage(0,#SCI_AUTOCSHOW,pos+1,*items)   
              Else
                ScintillaSendMessage(0,#SCI_AUTOCCANCEL)
                ScintillaSendMessage(0,#SCI_AUTOCSHOW,pos+1,*items) 
              EndIf
            EndIf
            pos+1
        EndSelect       
      Default
       
    EndSelect       
   
  EndProcedure
 
  If ?TRIE_DICTEND - ?TRIE_DICT > 0
    result = *mt\Catch(?TRIE_DICT)
    Debug "Caught Trie :" + Str(result) + " bytes"
  ElseIf FileSize(TriePath) > 0
    result = *mt\Load(TriePath)
    Debug "Loaded Trie :" + Str(result) + " bytes"
  Else
    InitNetwork()
    If ReceiveHTTPFile("https://raw.githubusercontent.com/marcoagpinto/aoo-mozilla-en-dict/master/en_GB%20(Marco%20Pinto)/en-GB.dic",TmpPath)
      loadDicfile(TmpPath,*mt)
      Debug "raw bytes loaded =" + Str(*mt\Stats(1))
      Debug "size of data stored = " + Str(*mt\Stats(2))
      Debug "data / raw = " + Str(*mt\Stats(3))
      Debug "nodes " + Str(*mt\Stats(4))
      Debug "items " + Str(*mt\Stats(5))
    Else
      MessageRequester("OOPS","failed to download")
      End
    EndIf   
  EndIf
 
  If OpenWindow(0, 0, 0, 800, 600, "ScintillaGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
   
    If InitScintilla()
      ScintillaGadget(0,0,0,800, 600,@ScintillaCallBack())
      ScintillaSendMessage(0,#SCI_AUTOCSETIGNORECASE,#True)
      ScintillaSendMessage(0,#SCI_AUTOCSETAUTOHIDE,0)
      ScintillaSendMessage(0,#SCI_AUTOCSETMAXHEIGHT,10)
      ScintillaSendMessage(0,#SCI_AUTOCSETORDER,#SC_ORDER_PRESORTED)
      ScintillaSendMessage(0, #SCI_STYLESETFORE, 0, RGB(0, 0, 0))
      *Text=UTF8(" ")
      ScintillaSendMessage(0, #SCI_AUTOCSTOPS,0,*text)
      FreeMemory(*Text)
      *Text=UTF8("Autocomplete test in ScintillaGadget...")
      ScintillaSendMessage(0, #SCI_SETTEXT, 0,*Text)
      FreeMemory(*Text)
    EndIf
   
    Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
  EndIf
  CloseWindow(0)
 
  If FileSize(TriePath) < 0
    result = *mt\Save(TriePath)
    Debug "Saved Trie :" + Str(result) + " bytes"
  EndIf   
 
  *mt\clear()
  *mt\free()
 
  DataSection : TRIE_DICT:
    ;IncludeBinary #PB_Compiler_FilePath + "en-GB_trie.tri"
    TRIE_DICTEND:
  EndDataSection
 
CompilerEndIf


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Wed Jan 15, 2020 10:04 pm 
Offline
User
User

Joined: Mon Sep 03, 2012 8:52 pm
Posts: 84
Interesting, might try it out one of these days. Have not been programming much lately. But still enjoy see stuff like this. Thanks for sharing. :D


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Wed Jan 15, 2020 11:29 pm 
Offline
Always Here
Always Here

Joined: Mon Sep 22, 2003 6:45 pm
Posts: 7446
Location: Norway
Is this for the IDE?

Can I ask some questions?

What is the current data structured used in the IDE, and does it have any practical problems that justify the need for a more complicated data structure (assuming this is)?

What about removing keys?


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Wed Jan 15, 2020 11:47 pm 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3554
Location: New Zealand
Hello Trond,

Nice to see you back!

Yes that's the plan, current autocomplete is built with a map and lists, it rebuilds each time and sorts the list, it's not ideal, plus it doesn't work half the time on linux but that's a separate issue which needs resolving. While it works a Trie is much better fit for the task, load it once, add to and remove on the fly.
An alternative would be a Fuzzy Search rather than an exact match, that could be better from the onset, though until someones written one a Trie will be an improvement.
I'm still trying to work out what the best strategy is to handle structures and modules, once I've got a handle on how the IDE does it, I should be able to add it using the nodes value to say that the item is a root, like TRIE:: then if UseModule is active you don't want the root in the results, but if you didn't usemodule and typed in TRIE:: it would still enumerate it with the root TRIE::Get() instead of Get() if usemodule was active.


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 12:24 am 
Offline
Always Here
Always Here

Joined: Mon Sep 22, 2003 6:45 pm
Posts: 7446
Location: Norway
idle wrote:
Nice to see you back!
Thank you.

Remember to benchmark size and time. Sometimes performance isn't intuitive.

I did try to make a trie a long time ago, but couldn't make it work.


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 1:21 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3554
Location: New Zealand
It'll be interesting to see how it compares, it's going to take some time to do anyhow and I really want to get the display working on linux before I start butchering it, the enumeration of the Trie could well be slower than the current implementation even once it's optimised.

I'm still not happy with the memory consumption, changing it to utf8 will save ~3rd and it shouldn't impact the enumeration, I could also make it array based rather than heap based which would at least reduce the overhead on x64 by ~1/2


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 1:21 pm 
Offline
Enthusiast
Enthusiast
User avatar

Joined: Sat May 05, 2007 5:31 pm
Posts: 769
Location: Linz, Austria
Works great! (if the debugger is disabled :lol:)

_________________
Et cetera is my worst enemy


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 3:41 pm 
Offline
Addict
Addict
User avatar

Joined: Tue Dec 23, 2003 3:54 am
Posts: 1863
Great start! I'm very excited for IDE contributions. I have a few in progress.

For you I have 1 comment and 1 "bug report"

Comment: If you type a complete word, can you still show other suggestions? Or maybe make that a toggle setting?
If I type "end" I would want to see "EndIf" "EndProcedure" etc.

Bug: I get some wrong results...
type "mur" top result is "mubarak"
"murt" shows "murder" and other similar words with "d" instead of "t"
"ag" shows a lot of words starting with "aw"

_________________
On GitHub: PB Includes - IDE Tools - Color Themes - IDE Branches - TabBarGadget Mods


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 6:45 pm 
Offline
PureBasic Team
PureBasic Team
User avatar

Joined: Fri Apr 25, 2003 5:21 pm
Posts: 5815
Location: Germany
idle wrote:
Yes that's the plan, current autocomplete is built with a map and lists, it rebuilds each time and sorts the list, it's not ideal, plus it doesn't work half the time on linux but that's a separate issue which needs resolving. While it works a Trie is much better fit for the task, load it once, add to and remove on the fly.


The list is rebuilt every time because AutoComplete is context-sensitive (procedure and module scope, With-Statements for structured items etc.). The data is not static.

You could probably get rid of the sorting with a new data structure, but you should do some measurements first to see if this is actually problematic enough to be worth the increased complexity.

_________________
quidquid Latine dictum sit altum videtur


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 7:18 pm 
Offline
Addict
Addict
User avatar

Joined: Sat Feb 13, 2010 3:45 pm
Posts: 1143
@idle
There are many wrong results. If I type 'bb', there are only 4 entries that really start with 'bb', but I get several hundred results.


@freak
freak wrote:
... AutoComplete is context-sensitive (procedure and module scope ...
Really? If I type the following example, then I immediately get a suggestion for 'MyVar'.
Code:
Define MyVar

Procedure MyProc()
  MyV

_________________
sorry for my bad english


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 8:32 pm 
Offline
Addict
Addict
User avatar

Joined: Tue Dec 23, 2003 3:54 am
Posts: 1863
Josh wrote:
Really? If I type the following example, then I immediately get a suggestion for 'MyVar'.


It's definitely procedure-aware. Try below.

But you're right, inside a procedure it suggests a non-global defined outside. That seems like a bug, but a pretty harmless one.

Code:
Global This_Is_Global
Define This_Is_Define

Procedure Test1(This_Is_1.i)
  ; type "this" here
 
EndProcedure

Procedure Test2(This_Is_2.i)
  ; type "this" here
 
EndProcedure


EDIT: Also, hello freak!

_________________
On GitHub: PB Includes - IDE Tools - Color Themes - IDE Branches - TabBarGadget Mods


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 9:02 pm 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3554
Location: New Zealand
kenmo wrote:
Great start! I'm very excited for IDE contributions. I have a few in progress.

For you I have 1 comment and 1 "bug report"

Comment: If you type a complete word, can you still show other suggestions? Or maybe make that a toggle setting?
If I type "end" I would want to see "EndIf" "EndProcedure" etc.

Bug: I get some wrong results...
type "mur" top result is "mubarak"
"murt" shows "murder" and other similar words with "d" instead of "t"
"ag" shows a lot of words starting with "aw"


Thanks and yes I had noticed it was populating the list out of order since I changed it to interpret the key by nibbles, I'm in the process of making it array based to reduce the memory and make it cache friendly, so will sort it out then.

@Josh
Thanks,

@Freak
Yes I understand why it's doing it but it's really not necessary to rebuild it every time to make it context sensitive, each token can have a type identifier and to handle things like modules it can be done a couple of ways. If you have Usemodule TRIE you'd first get to the rootnode of TRIE:: and then enumerate from there, if you don't have usemodule you would just enumerate it from the root as TRIE::Get ..., same will work for structures.
So the Trie can be statically built and added to when needed and the context is dynamic by the node values.

My main concern is to actually fix the display issues on Linux, which has nothing to do with the structure of the code


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 10:53 pm 
Offline
PureBasic Team
PureBasic Team
User avatar

Joined: Fri Apr 25, 2003 5:21 pm
Posts: 5815
Location: Germany
If you don't want to rebuild the structure at display time you have to do more work to keep it up to date when edits are made which is an even more performance critical code path. If you spend too much time there the editing will feel sluggish very quickly. The IDE had this problem a lot with the early implementations of AutoComplete when used with larger source files. This is why the current implementation is a compromise: The parsing work during editing is done only with minimal context which also means that reacting to an edit in the code means that only very little parsing is required to update the data (only the edited line in the best case) but there is work needed to build the list when AutoComplete is displayed. I think the resulting performance is well balanced on both sides.

This problem is a lot bigger than just a question of data structures. Trust me, I worked on this part a lot over the years ;)

_________________
quidquid Latine dictum sit altum videtur


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Thu Jan 16, 2020 11:05 pm 
Offline
PureBasic Team
PureBasic Team
User avatar

Joined: Fri Apr 25, 2003 5:21 pm
Posts: 5815
Location: Germany
@Josh:

You are right, stuff in the main scope is actually treated as "global" even if it isn't. There is still room for improvement there.

It does work well the other way around though:
Code:
Procedure MyProc(ParameterVar)
  Protected ProtectedVar
  Global GlobalVar
  Shared SharedVar 
  Static StaticVar
EndProcedure


The variables inside the procedure are suggested in AutoComplete only when the cursor is actually inside the procedure, except for the Global/Shared one which are visible outside as it should be.

_________________
quidquid Latine dictum sit altum videtur


Top
 Profile  
Reply with quote  
 Post subject: Re: autocomplete Trie
PostPosted: Fri Jan 17, 2020 12:08 am 
Offline
Addict
Addict
User avatar

Joined: Fri Sep 21, 2007 5:52 am
Posts: 3554
Location: New Zealand
freak wrote:
If you don't want to rebuild the structure at display time you have to do more work to keep it up to date when edits are made which is an even more performance critical code path. If you spend too much time there the editing will feel sluggish very quickly. The IDE had this problem a lot with the early implementations of AutoComplete when used with larger source files. This is why the current implementation is a compromise: The parsing work during editing is done only with minimal context which also means that reacting to an edit in the code means that only very little parsing is required to update the data (only the edited line in the best case) but there is work needed to build the list when AutoComplete is displayed. I think the resulting performance is well balanced on both sides.

This problem is a lot bigger than just a question of data structures. Trust me, I worked on this part a lot over the years ;)


I will see if I can improve it, though like I said it might not work out faster, more due to the technicalities of implementing a memory lean trie, reducing it to nibbles and bit indexes in an array should address the memory overhead and eliminate cache thrashing but the added steps to access from nibbles and converting to utf8 will hurt it, though it should still perform close to a map but slower with the added benefit of having sorted output. Also if it's array based it could then be stored and loaded on startup and shutdown eliminating the need to rebuild it.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 18 posts ]  Go to page 1, 2  Next

All times are UTC + 1 hour


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  

 


Powered by phpBB © 2008 phpBB Group
subSilver+ theme by Canver Software, sponsor Sanal Modifiye