Fuzzy Search

Just starting out? Need help? Post your questions and find answers here.
User avatar
em_uk
Enthusiast
Enthusiast
Posts: 366
Joined: Sun Aug 08, 2010 3:32 pm
Location: Manchester UK

Fuzzy Search

Post by em_uk »

Hi guys and gals.

I'm making an app that copies files to an HDF image. One of the things I would like to do is to be able to filter my source files.

To do this I'd like to use a fuzzy search, so for example I have a folder with 10,000 titles inside. I want to find all the "ocean" games, I type Ocean in to my filter box and it will filter the list to show any titles with Ocean in the title.

I'm using an Explorer form at the moment for my source but think I need to write something custom.

Any idea on how this would be achieved (the fuzzy search bit)

Thanks
----

R Tape loading error, 0:1
User avatar
kenmo
Addict
Addict
Posts: 1967
Joined: Tue Dec 23, 2003 3:54 am

Re: Fuzzy Search

Post by kenmo »

A basic start...

Code: Select all

Path.s = GetTemporaryDirectory() + "Fuzzy/"
CreateDirectory(Path)

RandomSeed(0)
For i = 1 To 100
  File.s = Str(Random(1000)) + " " + Str(Random(1000)) + ".dat"
  File = ReplaceString(File, "9", " ocean ")
  If CreateFile(0, Path + File)
    CloseFile(0)
  EndIf
Next i

; (see below)
CreateRegularExpression(0, ".*ocean.*", #PB_RegularExpression_NoCase)

If ExamineDirectory(0, Path, "")
  While NextDirectoryEntry(0)
    File.s = DirectoryEntryName(0)
    
    ; Simple FindString
    If (FindString(File, "ocean", 1, #PB_String_NoCase))
      Debug File
    EndIf
    
    ; RegEx (more powerful)
    If (MatchRegularExpression(0, File))
      Debug File
    EndIf
    
  Wend
  FinishDirectory(0)
EndIf
infratec
Always Here
Always Here
Posts: 6817
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: Fuzzy Search

Post by infratec »

Why not:

Code: Select all

If ExamineDirectory(0, Path, "*ocean*.*")
  While NextDirectoryEntry(0)
    Debug DirectoryEntryName(0)
  Wend
  FinishDirectory(0)
EndIf
But this has nothing todo with Fuzzy :wink:

Bernd
User avatar
kenmo
Addict
Addict
Posts: 1967
Joined: Tue Dec 23, 2003 3:54 am

Re: Fuzzy Search

Post by kenmo »

That works too. I didn't do that in my example because
(a) I wanted to show a RegEx match in the loop
(b) on Linux/Mac, I think ExamineDirectory matches case-sensitive, which I assume em_uk does not want(?)
Deluxe0321
User
User
Posts: 69
Joined: Tue Sep 16, 2008 6:11 am
Location: ger

Re: Fuzzy Search

Post by Deluxe0321 »

A good way to perform a fuzzy search is to use the algo provided by Levenshtein.
Wiki: https://en.wikipedia.org/wiki/Levenshtein_distance
Code: https://github.com/acmeism/RosettaCodeD ... .purebasic

In combination with a prefilled list ( iterate all elements ) or life (check while iterating ), depending on how deep the structure is, the search itself is quite fast --> O(m*n).
You may have to tinker with the distance ( I commonly use a distance of <= 2 ---> means two different letters found in the compared string).
User avatar
Keya
Addict
Addict
Posts: 1891
Joined: Thu Jun 04, 2015 7:10 am

Re: Fuzzy Search

Post by Keya »

em, by "fuzzy search" do you actually mean complex search patterns (like regular expressions etc)? because the "search for ocean in the title" example you gave indicates it could be as simple as having the user enter a search word, and then simply enumerating through the list using FindString(nextword, wordtofind)
User avatar
em_uk
Enthusiast
Enthusiast
Posts: 366
Joined: Sun Aug 08, 2010 3:32 pm
Location: Manchester UK

Re: Fuzzy Search

Post by em_uk »

Hi, thanks for the ideas so far.

Fuzzy search as in simple search terms, not complex regex, just so that titles can be filtered easily.

The source windows is an explorer gadget which the user would browse to a folder with the source files in, but say he wants to add just games by codemasters he could type "codemasters" into the filter and the explorer gadget would filter the rest out.

infratec solution seems to work quite well in my testing and means I wont have to create my own file explorer.

Thanks Keya for the quick framework, if my app needs finer control that will come in very handy.
----

R Tape loading error, 0:1
Post Reply