PureBasic can consume gigabytes of memory when allocating only a few object IDs

Just starting out? Need help? Post your questions and find answers here.
Mistrel
Addict
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by Mistrel »

There is a very strange performance regression which occurs when allocating object IDs in increasing numerical order. Not only does each subsequent number take an increasingly long time to call a specific function but the memory usage increases significantly.

In this example I will create and free 32 images by doubling the static ID specific when calling CreateImage(). The time needed by this function will on average double for each subsequent call:

** Note all examples were compiled and run using PureBasic for x64:

Code: Select all

id=1

Repeat
  DisableDebugger
  start=ElapsedMilliseconds()
  CreateImage(id,1,1)
  EnableDebugger
  
  FreeImage(id)
  
  Debug "Id/Elapsed: "+Str(id)+"/"+Str(ElapsedMilliseconds()-start)
  id*2
  
  If id=4294967296
    Debug "Done!"
    End
  EndIf
ForEver
This is not the result of a dynamic array being populated as a cache because only 32 IDs were ever specified:

Code: Select all

1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
2147483648
This may be the result of some kind of cache where the numbers in between are also being allocated. This can be confirmed by analyzing memory usage. By placing a loop at the end it can be seen that this program consumed over 8.4GB of memory! These results are consistent between PureBasic 5.7x and PureBasic 6 alpha.

At first I thought that all numbers lower than a specified ID are always allocated but this is not the case. This can be demonstrated by performing the same loop but in reverse. The program runs and finishes almost instantly with none of the delay or large amount of memory consumption:

Code: Select all

id=4294967296

Repeat
  DisableDebugger
  start=ElapsedMilliseconds()
  CreateImage(id,1,1)
  EnableDebugger
  
  FreeImage(id)
  
  Debug "Id/Elapsed: "+Str(id)+"/"+Str(ElapsedMilliseconds()-start)
  id/2
  
  If id=1
    Debug "Done!"
    
    Repeat
      Delay(1)
    ForEver
  EndIf
ForEver
But this isn't the end of the story. Despite only consuming about 1MB of memory according to Task Manager, the program size will balloon to several gigabytes as it ends whether naturally or forcibly with Kill Program.

This is not problem created by CreateImage() and can be demonstrated equally with OpenWinodw()/CloseWindow().

I don't know what the cause of this is but I did not expect to find PureBasic to consuming gigabytes of memory when allocating a few IDs.
User avatar
STARGÅTE
Addict
Addict
Posts: 2067
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke
Contact:

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by STARGÅTE »

If you use static numbers, the object library allocate an array with a size of the larges number.
CreateImage(1000, ...) allocates an array with 1001 elements to realize the fast access of all static numbers until this number.
CreateImage(1000000, ...) allocates an array with 1000001 elements...
Due to the re-allocation with increasing object index, the evaluation take longer with increasing object number, than with decreasing numbers.

Therefore the debugger gives a warning which you ignore. There is no bug!
Last edited by STARGÅTE on Thu Sep 16, 2021 7:41 pm, edited 1 time in total.
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and moreTypeface - Sprite-based font include/module
Mistrel
Addict
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by Mistrel »

Task Manager does not reflect this at runtime. When specifying a very large number the memory usage does not increase. But it does balloon when the program ends; I don't understand why it does that.

Code: Select all

DisableDebugger
CreateImage(4294967296,1,1)

Repeat
  Delay(1)
ForEver
The results are inconsistent depending on whether the numbers are provided in ascending or descending order. I don't know how this is implemented internally but if there is an array that is dynamically resizing itself then that is terrible.
Last edited by Mistrel on Thu Sep 16, 2021 7:44 pm, edited 1 time in total.
User avatar
STARGÅTE
Addict
Addict
Posts: 2067
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke
Contact:

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by STARGÅTE »

Mistrel wrote: Thu Sep 16, 2021 7:40 pm I don't see why a static array needs to be used here.
How else you want to refer an index (static object number) to the allocated object location (somewhere).
The array stores all the links (addresses) to the objects.
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and moreTypeface - Sprite-based font include/module
Mistrel
Addict
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by Mistrel »

It's called a hash or tree map. One has O(n) lookup and the other is O(logn). Either one would be just as fast as a static lookup table on modern computers, would use a tiny fraction of memory, and would be immune to this problem entirely.

A static array is not the only way to store a lookup table.
infratec
Always Here
Always Here
Posts: 6817
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by infratec »

In a map?
Mistrel
Addict
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by Mistrel »

Demivec found where this is defined is in the documentation:
The static, indexed way, allows you to reference an object by a predefined numeric value. The first available index number is 0 and subsequent indexes are allocated sequentially. This means that if you use the number 0 and then the number 1000, 1001 indexes will be allocated and 999 (from 1 to 999) will be unused, which is not an efficient way to use indexed objects.
Maybe this implementation is historical. It also means that anything #PB_Any returns is no longer useful for many of the use cases I had been using it for with PureBasic (x86).
User avatar
jacdelad
Addict
Addict
Posts: 1431
Joined: Wed Feb 03, 2021 12:46 pm
Location: Planet Riesa
Contact:

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by jacdelad »

Mistrel wrote: Thu Sep 16, 2021 8:09 pm Demivec found where this is defined is in the documentation:
The static, indexed way, allows you to reference an object by a predefined numeric value. The first available index number is 0 and subsequent indexes are allocated sequentially. This means that if you use the number 0 and then the number 1000, 1001 indexes will be allocated and 999 (from 1 to 999) will be unused, which is not an efficient way to use indexed objects.
Maybe this implementation is historical. It also means that anything #PB_Any returns is no longer useful for many of the use cases I had been using it for with PureBasic (x86).
I don't know about your programming style, but #PB_Any it the way that ALWAYS works, if done right. The handle is always valid and there's nothing extra being reserved.
PureBasic 6.04/XProfan X4a/Embarcadero RAD Studio 11/Perl 5.2/Python 3.10
Windows 11/Ryzen 5800X/32GB RAM/Radeon 7770 OC/3TB SSD/11TB HDD
Synology DS1821+/36GB RAM/130TB
Synology DS920+/20GB RAM/54TB
Synology DS916+ii/8GB RAM/12TB
User avatar
STARGÅTE
Addict
Addict
Posts: 2067
Joined: Thu Jan 10, 2008 1:30 pm
Location: Germany, Glienicke
Contact:

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by STARGÅTE »

Mistrel wrote: Thu Sep 16, 2021 7:46 pm A static array is not the only way to store a lookup table.
I know. But it is the fastest, without hashing, no collision handling etc.
Having the fastest access is important, while the static index is used at anyplace.
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and moreTypeface - Sprite-based font include/module
User avatar
NicTheQuick
Addict
Addict
Posts: 1224
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by NicTheQuick »

Mistrel wrote: Thu Sep 16, 2021 7:40 pm Task Manager does not reflect this at runtime. When specifying a very large number the memory usage does not increase. But it does balloon when the program ends; I don't understand why it does that.

Code: Select all

DisableDebugger
CreateImage(4294967296,1,1)

Repeat
  Delay(1)
ForEver
The results are inconsistent depending on whether the numbers are provided in ascending or descending order. I don't know how this is implemented internally but if there is an array that is dynamically resizing itself then that is terrible.
That makes sense too. If you allocate a bunch of memory your operating system will not take that into account until you really use it. Your program's memory space exists only virtually and gets mapped to the real physical memory in the background or maybe to your swap device/file if your memory is full. That's why your browser seems to have allocated 50GB of virtual memory but in reality it only uses 2GB or so. It's called Memory overcommitment.

Now if your program ends, Purebasic wants to check if there are objects existing in that huge array of object ids and iterates over it. While it iterates over these 4294967297 IDs it reads the value from the array and because of that your operating system really has to commit that memory to your process and you can see that in the taskmanager. By the way you can also enable the virtual memory column in the task manager to see how much memory a process really tries to use.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
Mistrel
Addict
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Re: PureBasic can consume gigabytes of memory when allocating only a few object IDs

Post by Mistrel »

You're right. I see the memory allocation when I enable "Committed Size" in Task Manager.

It's odd how performance differs when allocating IDs incrementally vs decrementally. I also wonder what PureBasic is doing as it closes that causes the operating system to allocate memory. It shouldn't need to perform any iteration and there is no need to free memory before ending a program as this cleanup is performed automatically; trying to do this yourself can be dangerous if cleanup occurs within a race condition and can lead to your program crashing as it terminates.
Post Reply