I didn't want to bother you with code, sorry.
I would like to understand why I am wrong in thinking that x64 code is as fast as x86 code if you just replace the 32-bit registers with 64-bit ones.
My logic was:
Fact (?) :
My CPU's memory bus is 64 bits wide. So are my CPU and my OS.
If I ask for a 32-bit dword, she (yes, my computer is a girl) has to move 64 bits anyway.
So if I replace the 32-bit registers with 64-bit ones, it should be just as fast without any further optimisation.
My results prove me wrong...
I was looking for general rules or strategies for optimising x86 code when porting it to x64.
As suggested, I can use more registers to avoid memory accesses inside loops.
I guess I can also keep a 32-bit data format, move two values in one qword, and process both at once (if my code fits this logic).
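To make the "two values in one qword" idea concrete, here is a minimal sketch of the arithmetic. It is written in Python only to show the bit logic, not the speed, and the names `pack`, `unpack` and `packed_add` are made up for this illustration. It assumes unsigned 32-bit values that should wrap around independently, so a carry out of the low half must not leak into the high half.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
HIGH_BITS = 0x8000000080000000  # top bit of each 32-bit lane


def pack(lo, hi):
    """Store two unsigned 32-bit values in one 64-bit word."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack(q):
    """Return the (lo, hi) 32-bit halves of a 64-bit word."""
    return q & MASK32, (q >> 32) & MASK32


def packed_add(a, b):
    """Add both 32-bit lanes with a single 64-bit addition.

    The top bit of each lane is cleared before adding, so the sum of a
    lane always fits in 32 bits and no carry crosses into the other lane.
    The top bits are then restored with XOR.
    """
    low_sum = ((a & ~HIGH_BITS) + (b & ~HIGH_BITS)) & MASK64
    return low_sum ^ ((a ^ b) & HIGH_BITS)


lo, hi = unpack(packed_add(pack(1, 2), pack(3, 4)))
print(lo, hi)
```

The masking costs a few extra instructions, so this only pays off if the real code can keep data packed for several operations. On x64, SSE2 is always available and its packed integer add handles four 32-bit lanes at once, which may be the simpler route if the code fits.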
Anything else?
Some code... But it's probably pretty indigestible and clumsy as it is. Nobody should force themselves through it.
Warning: it took around 80 seconds on my computer to get a result from my speed test (there are two of them), so be patient if you try this.
It's a homemade hash table for my specific needs. It uses chained clusters of 15 lines (I record 7 values per line and use the X and Y coordinates as the key).
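For readers who want a picture of the structure without reading the real code, here is a rough sketch of what "chained clusters of 15 lines" could look like. It is in Python for readability, and everything beyond the numbers 15 and 7 and the (X, Y) key is an assumption for illustration: the class names, the bucket count, the hash formula, and the fact that duplicate keys are not checked.

```python
LINES_PER_CLUSTER = 15  # from the description above
VALUES_PER_LINE = 7     # from the description above


class Cluster:
    """A block of up to 15 lines, linked to the next block when full."""

    def __init__(self):
        self.keys = []     # (x, y) pairs
        self.values = []   # one tuple of 7 values per key
        self.next = None   # next cluster in the chain


class ClusterTable:
    def __init__(self, bucket_count=1024):  # bucket count is an assumption
        self.buckets = [None] * bucket_count

    def _index(self, x, y):
        # Hypothetical hash: mixes both coordinates with two large primes.
        return ((x * 73856093) ^ (y * 19349663)) % len(self.buckets)

    def add(self, x, y, values):
        if len(values) != VALUES_PER_LINE:
            raise ValueError("expected 7 values per line")
        i = self._index(x, y)
        if self.buckets[i] is None:
            self.buckets[i] = Cluster()
        cluster = self.buckets[i]
        # Walk the chain until a cluster with a free line is found.
        while len(cluster.keys) == LINES_PER_CLUSTER:
            if cluster.next is None:
                cluster.next = Cluster()
            cluster = cluster.next
        cluster.keys.append((x, y))
        cluster.values.append(tuple(values))

    def find(self, x, y):
        cluster = self.buckets[self._index(x, y)]
        while cluster is not None:
            for key, vals in zip(cluster.keys, cluster.values):
                if key == (x, y):
                    return vals
            cluster = cluster.next
        return None  # searched the whole chain without a match
```

A search that does not find its key has to walk the whole chain for that bucket, which is why the "search and don't find" case is measured separately.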
I compare its speed to Pb's for my usage pattern (i.e. 8 adds, one search that finds a result, and one search that doesn't).
I also compare x64 vs x86 speed.
My point is not to compare Pb's general-purpose Map to a specialised hash table, of course.
; Add 8 000 000 elements, search and find 500 000 elements, and search and don't find 500 000 elements
; ____x86________x64
; 26224ms____40305ms ; My Hash add elements
; 80436ms____87073ms ; Pb Hash add elements
; __144ms______585ms ; My Hash searching for elements
; __589ms______890ms ; Pb Hash searching for elements
There are 2 methods to program bugless.
But only the third works fine.
Win10, Pb x64 5.71 LTS