Rabin-Karp rolling hash example

Share your advanced PureBasic knowledge/code with the community.
Everything
Enthusiast
Enthusiast
Posts: 224
Joined: Sat Jul 07, 2018 6:50 pm

Re: Rabin-Karp rolling hash example

Post by Everything »

wilbert wrote:more detail

Code: Select all

Procedure GenerateHashTable(*Mem)
  Protected BlocksCount = Size/ChunkSize 
  LastChunk   = Size - (BlocksCount * ChunkSize)
  Protected x  
  BlocksCount - 1
  For x = 0 To BlocksCount 
    HT(x)\Offset   = x*ChunkSize
    HT(x)\WeakHash = RKHash(*Mem+HT(x)\Offset, ChunkSize)  ; <----- About
  Next 
EndProcedure
For each chunk we do now:
HT(x)\WeakHash = RKHash(*Mem+HT(x)\Offset, ChunkSize) ; direct call RKHash()

vs (with a little code additions of course)
HT(x)\WeakHash = HT(x-1)\WeakHash * 257 + *B\b - (*A\b * Power) ; "fast hash calc" way
( like we do in WalkHashTable() )

Can we theoretically do that or 'chunk stepping' may give us a wrong hashes values and direct RKHash() call for every chunk is the only way?
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Rabin-Karp rolling hash example

Post by wilbert »

Everything wrote:Can we theoretically do that or 'chunk stepping' may give us a wrong hashes values and direct RKHash() call for every chunk is the only way?
You can use a rolling hash to fill the hash table but it would be a lot slower.
In that case you calculate the rolling hash after each byte and store the hash in a table after each ChunkSize bytes.
The problem is that in that case you need to add to the hash and remove from the hash while the RKHash procedure starts fresh and only needs to add.
Another advantage of using the RKHash procedure is that the calculation of each hash does not rely on that of the previous chunk. Since modern cpu have multiple cores, you could split the job into for example four threads and let each thread calculate 25% of all the hashes that need to be calculated.
Windows (x64)
Raspberry Pi OS (Arm64)
Everything
Enthusiast
Enthusiast
Posts: 224
Joined: Sat Jul 07, 2018 6:50 pm

Re: Rabin-Karp rolling hash example

Post by Everything »

Is it a normal behavior when hash of block with zeroes is 0 and say hash of 16 byte block:

Code: Select all

00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 01
is 1

Code: Select all

00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 02
is 2 etc ?
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3870
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Rabin-Karp rolling hash example

Post by wilbert »

Everything wrote:Is it a normal behavior when hash of block with zeroes is 0 and say hash of 16 byte block:
Yes, that is normal behavior for this algorithm.
Windows (x64)
Raspberry Pi OS (Arm64)
Post Reply