search for pattern in file (First Attempt)

NicTheQuick
Addict
Posts: 1227
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken

Re: search for pattern in file (First Attempt)

Post by NicTheQuick »

collectordave wrote: Just created a file with lots of spaces and an 'a' at the end, then searched for ' a' and it found it no bother.
Yes, it works. But I was talking about the runtime in worst-case scenarios.

I don't know if you want to perfect your algorithm or not. I am just trying to give you some hints.
collectordave wrote: Changed buffersize to 5 and searched for '90'; it was found with no problem at address 9 in the file.
You're totally right. Unfortunately I completely overlooked the part after the for-loop. Sorry for that.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
collectordave
Addict
Posts: 1309
Joined: Fri Aug 28, 2015 6:10 pm
Location: Portugal

Re: search for pattern in file (First Attempt)

Post by collectordave »

Hi Mijikai

I think that is what I am looking for in this topic: viewtopic.php?f=13&t=73238. Maybe this should be moved there?

I changed it slightly to show Ogg page 2, and there are all the comments in a buffer. Brilliant!

As I understand it, the Ogg documentation says that 'OggS' starts a page, which continues to the next 'OggS', and that within that page there is a variable number of segments.

Is your program showing me the contents of the segments in that page? There is just one segment in the comments page.
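The page layout described above could be sketched roughly like this. It is not the test code from the other topic: the structure, its field names and OggBodySize() are made up for illustration, and the layout is the 27-byte page header from the Ogg framing specification. The sketch also assumes the file is already open for reading.

```purebasic
EnableExplicit

; Ogg page header, 27 bytes, little-endian. PureBasic structures are
; not padded by default, so SizeOf(OggPageHeader) is 27.
; The structure and field names are assumptions made for this sketch.
Structure OggPageHeader
  capture.l     ; 'OggS'
  version.a     ; stream structure version, currently 0
  headerType.a  ; flags: 1 = continued packet, 2 = first page, 4 = last page
  granule.q     ; granule position
  serial.l      ; bitstream serial number
  sequence.l    ; page sequence number
  checksum.l    ; CRC of the page
  segments.a    ; number of entries in the segment table
EndStructure

; Hypothetical helper: returns the size of the page body that starts at
; Offset by summing the segment table, or -1 if no valid page starts there.
Procedure.i OggBodySize(file.i, Offset.q)
  Protected header.OggPageHeader
  Protected Dim lacing.a(255)
  Protected index.i, body.i
  FileSeek(file, Offset)
  If ReadData(file, @header, SizeOf(OggPageHeader)) <> SizeOf(OggPageHeader)
    ProcedureReturn -1
  EndIf
  If header\capture <> 1399285583 ; 'OggS' read as a little-endian long
    ProcedureReturn -1
  EndIf
  If header\segments > 0
    ; The segment table follows the header: one size byte per segment.
    If ReadData(file, @lacing(), header\segments) <> header\segments
      ProcedureReturn -1
    EndIf
    For index = 0 To header\segments - 1
      body + lacing(index)
    Next
  EndIf
  ProcedureReturn body
EndProcedure
```

With this, the next page would start at Offset + 27 + the number of segments + the returned body size, so the page length comes from the segment table rather than from scanning for the next 'OggS'.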

If so I just need to learn how to modify the comments and write back to the file.

Thanks for moving me on quite a bit.
Any intelligent fool can make things bigger and more complex. It takes a touch of genius — and a lot of courage to move in the opposite direction.
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: search for pattern in file (First Attempt)

Post by Mijikai »

The test code gets the complete header and segment table of each ogg page (the whole page).
(*table buffer and ogg struct are reused for each ogg page - *table always contains the complete segment table of the current ogg page)
Last edited by Mijikai on Fri Jul 26, 2019 12:36 pm, edited 1 time in total.
collectordave
Addict
Posts: 1309
Joined: Fri Aug 28, 2015 6:10 pm
Location: Portugal

Re: search for pattern in file (First Attempt)

Post by collectordave »

The worst case I can see is where the pattern spans a block boundary by just one byte.

For every block after the first, it steps the file pointer back by one byte less than the pattern length; the position is calculated from the number of blocks read.

So, for example, when searching for 'quick' the block boundary could be anywhere in the word/pattern, so stepping back this way means the next block starts as shown below.

A worst case scenario could be where two words the same are together in the file such as quickquick

I am using | to show the boundary:-

quick|quick: the next block starts with 'uickquick'. The first 'quick' was already found at the end of the previous block, and the leading 'uick' in the next block does not match, so it is not reported twice.

quickqu|ick: the next block starts with 'ckquick', so the same applies.

and so on

Apart from the first block I see it as each block start begins in the previous block and includes 1 byte less than the pattern.
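
The overlap described above could be sketched roughly as follows. This is not the code from the first post: FindAll() and its parameters are names made up for illustration, and it assumes the block size is at least the pattern length.

```purebasic
EnableExplicit

; Hypothetical helper: reads the file in blocks. Every block after the
; first starts (PatternSize - 1) bytes before the end of the previous one,
; so a match that spans a boundary is seen whole in exactly one block.
Procedure.i FindAll(FileName.s, *Pattern, PatternSize.i, BlockSize.i, List Found.q())
  Protected file.i, *block, bytes.i, index.i
  Protected start.q = 0
  If PatternSize < 1 Or BlockSize < PatternSize
    ProcedureReturn #False
  EndIf
  file = ReadFile(#PB_Any, FileName)
  If file = 0
    ProcedureReturn #False
  EndIf
  *block = AllocateMemory(BlockSize)
  If *block = 0
    CloseFile(file)
    ProcedureReturn #False
  EndIf
  Repeat
    FileSeek(file, start)
    bytes = ReadData(file, *block, BlockSize)
    ; Only test positions where the whole pattern fits in the bytes read.
    For index = 0 To bytes - PatternSize
      If CompareMemory(*block + index, *Pattern, PatternSize)
        AddElement(Found())
        Found() = start + index
      EndIf
    Next
    ; Advance so that the next block overlaps this one by PatternSize - 1.
    start + BlockSize - PatternSize + 1
  Until bytes < BlockSize
  FreeMemory(*block)
  CloseFile(file)
  ProcedureReturn #True
EndProcedure

; Usage with an assumed file name; Ascii() needs PureBasic 5.50 or later.
NewList hits.q()
Define *pattern = Ascii("quick")
If FindAll("test.txt", *pattern, 5, 4096, hits())
  ForEach hits()
    Debug hits()
  Next
EndIf
FreeMemory(*pattern)
```

Each block tests the positions from its start up to BlockSize - PatternSize, and the next block begins at the following position, so no address is checked twice and none is skipped.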

Hope someone else can understand this as well!

Kind regards

CD
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: search for pattern in file (First Attempt)

Post by Mijikai »

Can you try this?

You have to provide the buffer for strings manually.

Code: Select all

EnableExplicit

Global signature.i = 1399285583;-> OggS
Global NewList offsets.i()

Procedure.i SearchFile(File.s,*Buffer,BufferSize.i,List Offset.i())
  Protected handle.i
  Protected size.i
  Protected pointer.i
  Protected result.i
  Protected *slide
  Protected slide_size.i
  Protected bytes.i
  Protected index.i
  If File And *Buffer And (BufferSize > 0) 
    handle = ReadFile(#PB_Any,File)
    If IsFile(handle)
      size = Lof(handle)
      If Not size < BufferSize
        slide_size = BufferSize << 1
        *slide = AllocateMemory(slide_size)
        If *slide
          While Not Eof(handle)
            bytes = ReadData(handle,*slide,slide_size)
            If Not bytes = slide_size
              If bytes = 0
                Break
              EndIf
              FillMemory(*slide + bytes,slide_size - bytes)
            EndIf
            For index = 0 To BufferSize - 1
              If CompareMemory(*Buffer,*slide + index,BufferSize)
                If AddElement(Offset())
                  Offset() = pointer + index
                Else
                  bytes = 0
                  Break
                EndIf
              EndIf 
            Next
            pointer + BufferSize
            FileSeek(handle,pointer)
          Wend
          If bytes = 0
            ClearList(Offset())
          Else
            result = #True
          EndIf
          FreeMemory(*slide)
        EndIf 
      EndIf 
    EndIf 
  EndIf 
  ProcedureReturn result
EndProcedure

If SearchFile("XYZ.ogg",@signature,4,offsets())
  ForEach offsets()
    Debug offsets()
  Next
EndIf

;this example is slow!
;(increase the size of the slide & seek to make it faster)

;the *Buffer is a window over the slide...
;(it ensures that the search covers all bytes)

End
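
In case the signature constant looks like a magic number, this small sketch shows where it comes from. It assumes a little-endian CPU (x86/x64) and PureBasic 5.50 or later for Ascii(); the variable names are only for illustration.

```purebasic
EnableExplicit

; The four bytes 'O' 'g' 'g' 'S' are $4F $67 $67 $53. Read as one
; little-endian long they give $5367674F.
Define *text = Ascii("OggS")
Debug PeekL(*text)                  ; 1399285583
Debug Hex(PeekL(*text))             ; 5367674F
FreeMemory(*text)

; And the other way round: the number back to text.
Define check.l = 1399285583
Debug PeekS(@check, 4, #PB_Ascii)   ; OggS
```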
collectordave
Addict
Posts: 1309
Joined: Fri Aug 28, 2015 6:10 pm
Location: Portugal

Re: search for pattern in file (First Attempt)

Post by collectordave »

Hi Mijikai

This may be an improvement on my search, but I am still working on the first Ogg program you posted. It is very interesting working out how to get the segment sizes individually and display just one segment, and I am having success.

Thanks again
NicTheQuick
Addict
Posts: 1227
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken

Re: search for pattern in file (First Attempt)

Post by NicTheQuick »

Are we talking about OGG files or about searching for patterns in files?
collectordave
Addict
Posts: 1309
Joined: Fri Aug 28, 2015 6:10 pm
Location: Portugal

Re: search for pattern in file (First Attempt)

Post by collectordave »

Hi Nic

In this topic we are talking about searching for a pattern in files. I suggested earlier that the Ogg bits should be moved to the other thread.

I started this because I am researching Ogg files so that I can write Vorbis comments, which is covered in another thread.

I need this to get all the OggS header positions in an Ogg file.

Sorry for any confusion.

CD