ArrayList - Doubly-linked Dynamic Array Data Structure

Share your advanced PureBasic knowledge/code with the community.
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

Hi guys, this is my implementation of a doubly-linked dynamic array data structure. It is a hybrid of an array and a list and has performance somewhere between the two. What benchmarks do you guys get and can the performance/memory usage be improved any further?

For 10 million elements on my system:

List creation: 0ms
List population: 432ms
List iteration: 152ms

Array creation: 196ms
Array population: 95ms
Array iteration: 95ms

ArrayList creation: 0ms
ArrayList population: 1152ms
ArrayList iteration: 99ms

List deletion/insertion: 317ms
ArrayList deletion/insertion: 231ms
List iteration: 185ms
ArrayList iteration: 99ms

List deletion/insertion: 329ms
ArrayList deletion/insertion: 227ms
List iteration: 224ms
ArrayList iteration: 101ms

Here is the code, remember to disable debugger and use C backend with optimisation enabled:

Code: Select all


#ArrayListMaxAllocationBlock = 209715200 ; 200meg

Structure ArrayListItem
    
    nextIndex.l
    prevIndex.l
    
EndStructure

Structure ArrayList
    
    *buffer
    sizeOfStruct.l
    
    size.l
    itemCount.l
    headIndex.l
    tailIndex.l
    nextFreeIndex.l
    
    Array item.ArrayListItem(1)
    
EndStructure

Procedure upper_power_of_two(v)
  v-1
  v | v >> 1
  v | v >> 2
  v | v >> 4
  v | v >> 8
  v | v >> 16
  v+1
  ProcedureReturn v
EndProcedure

Procedure CreateArrayList(sizeOfStruct, size.i = #Null)
    
    *arrayList.ArrayList = AllocateStructure(ArrayList)
    *arrayList\sizeOfStruct = sizeOfStruct
    
    If *arrayList
        
        If size
            Dim *arrayList\item(size)
        Else
            size = 1
        EndIf
        
        *arrayList\buffer = AllocateMemory(sizeOfStruct * (size + 1), #PB_Memory_NoClear)
        
        *arrayList\size = size
        *arrayList\nextFreeIndex = 1
        
        ProcedureReturn *arrayList
        
    EndIf
    
    ProcedureReturn #False
    
EndProcedure

Procedure AddArrayListItem(*arrayList.ArrayList, index = 0)
    
    If Not index
        index = *arrayList\tailIndex
    EndIf
    
    If index < 0 Or index > *arrayList\size Or (index <> *arrayList\headIndex And Not *arrayList\item(index)\prevIndex)
        ProcedureReturn -1
    EndIf
    
    If *arrayList\nextFreeIndex > *arrayList\size
        
        newSize = upper_power_of_two(*arrayList\size + 1)
        If ((*arrayList\sizeOfStruct * (newSize + 1)) - (*arrayList\sizeOfStruct * (*arrayList\size + 1))) > #ArrayListMaxAllocationBlock
            newSize = *arrayList\size + Round(#ArrayListMaxAllocationBlock / *arrayList\sizeOfStruct, #PB_Round_Up)
        EndIf
        
        ReDim *arrayList\item(newSize)
        *arrayList\buffer = ReAllocateMemory(*arrayList\buffer, *arrayList\sizeOfStruct * (newSize + 1), #PB_Memory_NoClear)
        *arrayList\size = newSize
       
    EndIf
    
    If Not *arrayList\item(*arrayList\nextFreeIndex)\nextIndex
        
        *arrayList\item(*arrayList\nextFreeIndex)\nextIndex = *arrayList\nextFreeIndex + 1
    
    EndIf
    
    newIndex = *arrayList\nextFreeIndex
    
    *arrayList\nextFreeIndex = *arrayList\item(*arrayList\nextFreeIndex)\nextIndex
    
    FillMemory(*arrayList\buffer + (newIndex * *arrayList\sizeOfStruct), *arrayList\sizeOfStruct)
    
    If Not *arrayList\headIndex
        *arrayList\headIndex = newIndex
    Else
        *arrayList\item(newIndex)\prevIndex = index
    EndIf
    
    *arrayList\item(newIndex)\nextIndex = #Null
    
    If index And *arrayList\item(index)\nextIndex
        
        *arrayList\item(*arrayList\item(index)\nextIndex)\prevIndex = newIndex
        *arrayList\item(newIndex)\nextIndex = *arrayList\item(index)\nextIndex
        
    Else
        *arrayList\tailIndex = newIndex
    EndIf
    
    If index
        *arrayList\item(index)\nextIndex = newIndex
    EndIf
    
    *arrayList\itemCount + 1
    
    ProcedureReturn newIndex
    
EndProcedure

Procedure RemoveArrayListItem(*arrayList.ArrayList, index = 0)
    
    If Not index
        index = *arrayList\tailIndex
    EndIf
    
    If Not *arrayList\headIndex Or index < 0 Or index > *arrayList\size Or (index <> *arrayList\headIndex And Not *arrayList\item(index)\prevIndex)
        ProcedureReturn -1
    EndIf
    
    If index = *arrayList\headIndex
        
        *arrayList\headIndex = *arrayList\item(index)\nextIndex
        *arrayList\item(*arrayList\item(index)\nextIndex)\prevIndex = #Null
        
        If index = *arrayList\tailIndex
            *arrayList\headIndex = #Null
            *arrayList\tailIndex = #Null
        EndIf
        
    ElseIf index = *arrayList\tailIndex
        
        *arrayList\tailIndex = *arrayList\item(index)\prevIndex
        *arrayList\item(*arrayList\item(index)\prevIndex)\nextIndex = #Null
        
    Else
    
        *arrayList\item(*arrayList\item(index)\prevIndex)\nextIndex = *arrayList\item(index)\nextIndex
        *arrayList\item(*arrayList\item(index)\nextIndex)\prevIndex = *arrayList\item(index)\prevIndex
        
    EndIf
    
    prevIndex = *arrayList\item(index)\prevIndex
    *arrayList\item(index)\prevIndex = #Null
    
    *arrayList\item(index)\nextIndex = *arrayList\nextFreeIndex
    *arrayList\nextFreeIndex = index
    
    *arrayList\itemCount - 1
    
    ProcedureReturn prevIndex
    
EndProcedure

Structure test
    
    value.i
    x.l
    y.l
    width.l
    height.l
    
    idealWidth.l
    idealHeight.l
    minWidth.l
    minHeight.l
    maxWidth.l
    maxHeight.l
    
    flags.b
    
    topMargin.w
    leftMargin.w
    bottomMargin.w
    rightMargin.w
    
    *widget.Widget
    
    *bucketAdress.Integer
    *nextBucketItem
    *nextFreeBucketItem.Integer
    
EndStructure

Structure Item
    index.test[0]
EndStructure

Structure Buffer
    *buffer.Item
EndStructure

OpenConsole()

maxSize = 10000000

;- List test
startTime = ElapsedMilliseconds()
NewList test.test()
endTime = ElapsedMilliseconds()

PrintN("List creation: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
For n = 1 To maxSize
    
    AddElement(test())
    *test.test = test()
    *test\value = n
    
Next
endTime = ElapsedMilliseconds()

PrintN("List population: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
ForEach test()
    
    *test.test = test()
    *test\value / 2
    
Next
endTime = ElapsedMilliseconds()

PrintN("List iteration: " + Str(endTime - startTime) + "ms")

PrintN("")


;- Array test
startTime = ElapsedMilliseconds()
Dim test2.test(maxSize)
endTime = ElapsedMilliseconds()

PrintN("Array creation: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
For n = 1 To maxSize
    
    *test.test = test2(n)
    *test\value = n
    
Next
endTime = ElapsedMilliseconds()

PrintN("Array population: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
For n = 1 To maxSize
    
    *test.test = test2(n)
    *test\value / 2
    
Next
endTime = ElapsedMilliseconds()
PrintN("Array iteration: " + Str(endTime - startTime) + "ms")

PrintN("")


;- ArrayList test
startTime = ElapsedMilliseconds()
*arrayList.ArrayList = CreateArrayList(SizeOf(test))
*item.Buffer = *arrayList
endTime = ElapsedMilliseconds()

PrintN("ArrayList creation: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
For n = 1 To maxSize
    
    i = AddArrayListItem(*arrayList)
    
    *test.test = *item\buffer\index[i]
    *test\value = n
    
Next
endTime = ElapsedMilliseconds()

PrintN("ArrayList population: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
i = *arrayList\headIndex
While i
    
    *test.test = *item\buffer\index[i]
    *test\value / 2
    
    i = *arrayList\item(i)\nextIndex
    
Wend
endTime = ElapsedMilliseconds()

PrintN("ArrayList iteration: " + Str(endTime - startTime) + "ms")

PrintN("")


;- List + ArrayList insertion/deletion tests
RandomSeed(1337)

startTime = ElapsedMilliseconds()
ForEach test()
    
    If Random(2) = 0
        DeleteElement(test())
    EndIf
    
    If Random(2) = 0
        AddElement(test())
        *test.test = test()
        *test\value = 255;Random(255)
    EndIf
    
Next
endTime = ElapsedMilliseconds()

PrintN("List deletion/insertion: " + Str(endTime - startTime) + "ms")

RandomSeed(1337)

startTime = ElapsedMilliseconds()
i = *arrayList\headIndex
While i
   
    If Random(2) = 0
        i = RemoveArrayListItem(*arrayList, i)
    EndIf
    
    If Random(2) = 0
        i = AddArrayListItem(*arrayList, i)
        *test.test = *item\buffer\index[i]
        *test\value = 255;Random(255)
    EndIf
    
    i = *arrayList\item(i)\nextIndex
    
Wend
endTime = ElapsedMilliseconds()

PrintN("ArrayList deletion/insertion: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
ForEach test()
    
    *test.test = test()
    *test\value / 2
    
Next
endTime = ElapsedMilliseconds()

PrintN("List iteration: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
i = *arrayList\headIndex
While i
    
    *test.test = *item\buffer\index[i]
    *test\value / 2
    
    i = *arrayList\item(i)\nextIndex
    
Wend
endTime = ElapsedMilliseconds()

PrintN("ArrayList iteration: " + Str(endTime - startTime) + "ms")

PrintN("")

RandomSeed(1337)

startTime = ElapsedMilliseconds()
ForEach test()
    
    If Random(2) = 0
        DeleteElement(test())
    EndIf
    
    If Random(2) = 0
        AddElement(test())
        *test.test = test()
        *test\value = 255;Random(255)
    EndIf
    
Next
endTime = ElapsedMilliseconds()

PrintN("List deletion/insertion: " + Str(endTime - startTime) + "ms")

RandomSeed(1337)

startTime = ElapsedMilliseconds()
i = *arrayList\headIndex
While i
   
    If Random(2) = 0
        i = RemoveArrayListItem(*arrayList, i)
    EndIf
    
    If Random(2) = 0
        i = AddArrayListItem(*arrayList, i)
        *test.test = *item\buffer\index[i]
        *test\value = 255;Random(255)
    EndIf
    
    i = *arrayList\item(i)\nextIndex
    
Wend
endTime = ElapsedMilliseconds()

PrintN("ArrayList deletion/insertion: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
ForEach test()
    
    *test.test = test()
    *test\value / 2
    
Next
endTime = ElapsedMilliseconds()

PrintN("List iteration: " + Str(endTime - startTime) + "ms")

startTime = ElapsedMilliseconds()
i = *arrayList\headIndex
While i
    
    *test.test = *item\buffer\index[i]
    *test\value / 2
    
    i = *arrayList\item(i)\nextIndex
    
Wend
endTime = ElapsedMilliseconds()

PrintN("ArrayList iteration: " + Str(endTime - startTime) + "ms")

; FirstElement(test())
; 
; startTime = ElapsedMilliseconds()
; i = *arrayList\headIndex
; While i
;     
;     *test.test = *item\buffer\index[i]
;     
;     *test2.test = test()
;     Debug Str(*test\value) + " <-> " + Str(*test2\value)
;     
;     i = *arrayList\item(i)\nextIndex
;     NextElement(test())
;     
; Wend
; endTime = ElapsedMilliseconds()
; 
; PrintN("ArrayList iteration: " + Str(endTime - startTime) + "ms")


Input()
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
User avatar
Janni
Enthusiast
Enthusiast
Posts: 127
Joined: Mon Feb 21, 2022 5:58 pm
Location: Norway

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by Janni »

Interesting stuff, here are my numbers on Linux Mint, cpu:I7-3770K (from 2012). 6.03 Beta 8

List creation: 0ms
List population: 451ms
List iteration: 126ms

Array creation: 266ms
Array population: 91ms
Array iteration: 92ms

ArrayList creation: 0ms
ArrayList population: 380ms
ArrayList iteration: 97ms

List deletion/insertion: 248ms
ArrayList deletion/insertion: 197ms
List iteration: 152ms
ArrayList iteration: 101ms

List deletion/insertion: 267ms
ArrayList deletion/insertion: 197ms
List iteration: 185ms
ArrayList iteration: 100ms
Spec: Linux Mint 20.3 Cinnamon, i7-3770K, 16GB RAM, RTX 2070 Super
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

Thanks for the benchmark Janni, that's really interesting the ArrayList population on your system is hella faster than mine lol although my laptop is pretty old now!
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
User avatar
NicTheQuick
Addict
Addict
Posts: 1519
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by NicTheQuick »

On my system it looks like this:

Code: Select all

List creation: 0ms
List population: 294ms
List iteration: 94ms

Array creation: 284ms
Array population: 75ms
Array iteration: 83ms

ArrayList creation: 0ms
ArrayList population: 354ms
ArrayList iteration: 93ms

List deletion/insertion: 172ms
ArrayList deletion/insertion: 353ms
List iteration: 94ms
ArrayList iteration: 88ms

List deletion/insertion: 179ms
ArrayList deletion/insertion: 379ms
List iteration: 120ms
ArrayList iteration: 91ms
I am running on Ubuntu 22.04.3 LTS with and AMD Ryzen 7 5800X 8-Core processor at 4.8 GHz boost clock.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

Thanks NicTheQuick, is that using the C backend with code optimisation enabled? it seems strange that the ArrayList deletion/insertion tests take as long as the initial ArrayList population (where initial memory is allocated and re-allocated as the list grows)
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
User avatar
NicTheQuick
Addict
Addict
Posts: 1519
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by NicTheQuick »

In my Post above it was the ASM backend. Here are the results with the C compiler vs. ASM compiler vs C optimized compiler in version 6.02 LTS (x64):

Code: Select all

Task                              C    ASM  C opt
-------------------------------------------------
List creation:                  0ms    0ms    0ms
List population:              302ms  340ms  353ms
List iteration:               123ms  173ms  199ms

Array creation:               288ms  308ms  291ms
Array population:              69ms   81ms   69ms
Array iteration:               79ms   88ms   67ms

ArrayList creation:             0ms    0ms    0ms
ArrayList population:         353ms  390ms  267ms
ArrayList iteration:           84ms  103ms   74ms

List deletion/insertion:      189ms  170ms  178ms
ArrayList deletion/insertion: 365ms  356ms  240ms
List iteration:               102ms  105ms  101ms
ArrayList iteration:           87ms  100ms   74ms

List deletion/insertion:      196ms  175ms  186ms
ArrayList deletion/insertion: 397ms  381ms  260ms
List iteration:               121ms  126ms  136ms
ArrayList iteration:           89ms   98ms   78ms
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
User avatar
Janni
Enthusiast
Enthusiast
Posts: 127
Joined: Mon Feb 21, 2022 5:58 pm
Location: Norway

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by Janni »

NicTheQuick:
Interesting and surprising to see how little difference there is on your powerful CPU compared too my old. In some cases I'm even faster...

Would also be interesting to see how this would perform in Python to compare.
Spec: Linux Mint 20.3 Cinnamon, i7-3770K, 16GB RAM, RTX 2070 Super
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

NicTheQuick: Thanks that's great!
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
User avatar
StarBootics
Addict
Addict
Posts: 1006
Joined: Sun Jul 07, 2013 11:35 am
Location: Canada

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by StarBootics »

The result from my test :

Code: Select all

Task										ASM		C 		C opt
------------------------------------------------------------------------
List creation: 								0ms		0ms		0ms
List population:							226ms	307ms	243ms
List iteration:								275ms	144ms	298ms
			
Array creation:								112ms	112ms	112ms
Array population:							80ms	75ms	73ms
Array iteration: 							87ms	79ms	71ms
			
ArrayList creation:							0ms		0ms		0ms
ArrayList population:						265ms	242ms	172ms
ArrayList iteration:						99ms	87ms	78ms
			
List deletion/insertion: 					148ms	231ms	148ms
ArrayList deletion/insertion:				305ms	313ms	214ms
List iteration:								95ms	149ms	106ms
ArrayList iteration:						91ms	87ms	78ms
			
List deletion/insertion:					150ms	235ms	151ms
ArrayList deletion/insertion:				329ms	345ms	232ms
List iteration: 							108ms	166ms	116ms
ArrayList iteration:						88ms	92ms	78ms
Processor : AMD Ryzen™ 7 5800X × 16
Compiler version : 6.03 LTS Beta 8

Best regards
StarBootics
The Stone Age did not end due to a shortage of stones !
jamirokwai
Enthusiast
Enthusiast
Posts: 798
Joined: Tue May 20, 2008 2:12 am
Location: Cologne, Germany
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by jamirokwai »

MacBook Pro M2 with C backend, no changes in configuration (no optimization, etc.)

Code: Select all

List creation: 0ms
List population: 161ms
List iteration: 52ms

Array creation: 79ms
Array population: 26ms
Array iteration: 26ms

ArrayList creation: 0ms
ArrayList population: 207ms
ArrayList iteration: 63ms

List deletion/insertion: 214ms
ArrayList deletion/insertion: 281ms
List iteration: 37ms
ArrayList iteration: 40ms

List deletion/insertion: 215ms
ArrayList deletion/insertion: 269ms
List iteration: 42ms
ArrayList iteration: 40ms
Regards,
JamiroKwai
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

Thanks StarBootics and jamirokwai, it's interesting how it performs on different CPU architecture (caching, branch prediction etc..). Jamirokwai could you show your results for the C backend with code optimisation enabled please?
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
jamirokwai
Enthusiast
Enthusiast
Posts: 798
Joined: Tue May 20, 2008 2:12 am
Location: Cologne, Germany
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by jamirokwai »

PrincieD wrote: Fri Sep 22, 2023 3:32 pm Thanks StarBootics and jamirokwai, it's interesting how it performs on different CPU architecture (caching, branch prediction etc..). Jamirokwai could you show your results for the C backend with code optimisation enabled please?
I would. But how can I enable the optimisation? :oops:
Regards,
JamiroKwai
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

jamirokwai: It's in the "Compiler Options" dialogue, there should be a checkbox called "Optimize generated code"
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
User avatar
NicTheQuick
Addict
Addict
Posts: 1519
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by NicTheQuick »

I have a question about your data types. Why are you using Long instead of Integer for nearly every field in your structure? Is it to save memory?

You are also not using Define and Protected correctly. When I add `EnabledExplicit` at the top of your code there are multiple occurrences where variables were not defined properly.

After I changed everything from Long to Integer and added `Protected` and `Define` where needed, I tested the code again with maxSize = 100000000 but it was not much faster with the optimized C backend. Sometimes it was faster, sometimes it was not, so I guess it makes no difference.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
PrincieD
Addict
Addict
Posts: 861
Joined: Wed Aug 10, 2005 2:08 pm
Location: Yorkshire, England
Contact:

Re: ArrayList - Doubly-linked Dynamic Array Data Structure

Post by PrincieD »

NicTheQuick: Yes it's to save memory (and keep everything in CPU cache as much as possible), as the next/prev fields are indices and not pointers you only need 4 bytes instead of 8. Define, Protected and 'EnabledExplicit' are down to your own personal preference - this was more of a proof of concept. One of the issues I can see is cleaning up the memory properly without invalidating the 'pointers' (indices) e.g. If you add 1,000,000 elements and then add another 100 and then remove the first 1,000,000 that's a huge chunk of memory you can't free without invalidating the last 100 - however this is fine if you are constantly adding and removing elements as each slot gets re-used for example a particle system - does anyone have any good ideas how to get around that problem? :D
ProGUI - Professional Graphical User Interface Library - http://www.progui.co.uk
Post Reply