ReadString() performance best practice

Oso · Post by **Oso** » Tue Aug 16, 2022 1:22 am

infratec wrote: ↑Mon Aug 15, 2022 10:38 pm You are right:
Code:
*FileEnd = *File + MemorySize(*File) - 1
If you have enough RAM ... no problem.
I already read 1GB files in RAM without problems.

Otherwise you have to do chunk management, which is not trivial if a remainder of bytes needs to be copied to the beginning of the buffer.
If I remember, I already posted such an example.

It's only 1 byte longer than the size of the file. I was wondering, though: if something else happens to be occupying that memory location (1 byte past the end of our file), then its byte could be included in the processing. It's a minor point, but I just wanted to make sure I was following it correctly.
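To illustrate the boundary being discussed, here is a minimal sketch of a pointer loop that uses the corrected, inclusive end address. The 16-byte demo buffer and the byte being counted are hypothetical stand-ins for a file loaded into memory; only the `*FileEnd` expression comes from the quoted post.

```purebasic
EnableExplicit

; Hypothetical demo buffer standing in for a file loaded into memory.
Define *File = AllocateMemory(16)
If *File
  PokeS(*File, "ABC", -1, #PB_Ascii)   ; writes 'A', 'B', 'C' and a terminating 0

  ; Inclusive end: address of the LAST valid byte of the allocation.
  Define *FileEnd = *File + MemorySize(*File) - 1

  Define *Ptr.Ascii = *File
  Define Count = 0
  While *Ptr <= *FileEnd               ; "<=" because *FileEnd is still inside the buffer
    If *Ptr\a = 'A'
      Count + 1
    EndIf
    *Ptr + 1                           ; advance by one byte
  Wend

  Debug Count                          ; number of 'A' bytes found in the buffer
  FreeMemory(*File)
EndIf
```

Without the `- 1`, the same `<=` loop would read one byte beyond the allocation, which is the stray byte described above.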

jacdelad · Post by **jacdelad** » Tue Aug 16, 2022 3:20 am

Oso wrote: ↑Tue Aug 16, 2022 1:18 am
jacdelad wrote: ↑Mon Aug 15, 2022 11:34 pm Keep in mind that if you don't need to read the full file, you maybe shouldn't read the full file. It's not clear to me whether you need to process the whole file.
Yes, understood. The code that I've written looks for an identifying key in the file, but once it has found it, then the process is complete and it doesn't need to look any further.

I assume this key isn't always in the same position?! For large files I would chunk it into 1MB pieces (or some other useful size) and analyze them. ReadData() would be your friend.

Oso · Post by **Oso** » Wed Aug 17, 2022 6:55 pm

jacdelad wrote: ↑Tue Aug 16, 2022 3:20 am I assume this key isn't always in the same position?! For large files I would chunk it into 1MB pieces (or some other useful size) and analyze them. ReadData() would be your friend.

That's right, there are no fixed positions in the file, only delimited sequences of variable-length data. I might see better performance using ReadData() into fixed blocks, as you say. The only problem is that if I'm searching for a string of bytes that happens to straddle two consecutive blocks, it's difficult to find. Of course there are ways around that, such as saving the last 'n' bytes from the previous block.

But at the moment, using ReadByte() to process small delimited sections at a time is giving me fairly good performance. It takes 15 seconds to find something in a 45 MB file. I don't know if that sounds reasonable.

jacdelad · Post by **jacdelad** » Wed Aug 17, 2022 7:29 pm

Yeah, just subtract the length of the search string (be aware of ASCII and Unicode differences) from the current read position and read the next block.
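A rough sketch of that idea, under some assumptions: `FindBytesInFile()` is a hypothetical helper name, the 1 MB block size is just the figure suggested earlier in the thread, and the file name in the usage lines is a placeholder. After each block the read position is moved back by the key length minus one byte, so a key split across a block boundary is seen whole in the next block.

```purebasic
EnableExplicit

#ChunkSize = 1024 * 1024   ; 1 MB blocks (assumed size, adjust as needed)

; Hypothetical helper: returns the file offset of the first match, or -1.
; *Key points to the raw search bytes; KeyLen is their length in BYTES
; (for a Unicode string that is twice the character count).
Procedure.q FindBytesInFile(FileName$, *Key, KeyLen)
  Protected File, *Buf, BytesRead, i
  Protected BlockStart.q, Result.q = -1

  If KeyLen < 1 Or KeyLen > #ChunkSize
    ProcedureReturn -1
  EndIf

  File = ReadFile(#PB_Any, FileName$)
  If File = 0
    ProcedureReturn -1
  EndIf

  *Buf = AllocateMemory(#ChunkSize)
  If *Buf
    Repeat
      BlockStart = Loc(File)                      ; file offset of this block
      BytesRead = ReadData(File, *Buf, #ChunkSize)
      If BytesRead < KeyLen
        Break                                     ; not enough bytes left for a match
      EndIf

      ; Simple byte-by-byte scan of the current block.
      For i = 0 To BytesRead - KeyLen
        If CompareMemory(*Buf + i, *Key, KeyLen)
          Result = BlockStart + i
          Break 2                                 ; leave both loops
        EndIf
      Next

      If Eof(File)
        Break
      EndIf

      ; Step back so a key straddling the block boundary is read whole next time.
      FileSeek(File, Loc(File) - (KeyLen - 1))
    ForEver
    FreeMemory(*Buf)
  EndIf

  CloseFile(File)
  ProcedureReturn Result
EndProcedure

; Usage with an ASCII key; "data.bin" is a placeholder file name.
Define Key$ = "KEY"
Define *Key = Ascii(Key$)                         ; ASCII copy of the string, must be freed
Debug FindBytesInFile("data.bin", *Key, Len(Key$))
FreeMemory(*Key)
```

Passing the key as raw bytes with an explicit byte length keeps the ASCII versus Unicode question outside the search routine: the caller decides how the search string is encoded.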

PureBasic Forums - English
