In many projects, i need to randomly lookup a large set of unknown size dynamically created user data , this make it difficult or impossible
to use fixed size array.
Most the time using linkedlist is the simplest way, but the accessing performance is getting worst and worst with a
very large element size, like say a few hundred thousands items or even a million items.
To overcome this short fall , some way of improvement was developed.
Method 1 : (Indexing Purebasic native linkedlist)
Indexing your purebasic native linkedlist, similar to the (RDBMS) database server index, but not the same way.
Create and populated your main data set linkedlist , then build a index of the main linkedlist element, in a step of 100's or 200's , not for every element.
Rebuild you index when you Add/Delete/Insert Elements.
Pro:
Improve random access time
easy to implement
Con:
Need to take care the index list when modifying or destroying the main linkedlist
Implementation code:
Code: Select all
; Please disable Debugger
; A few Method To improve the performance of ordinary linkedlist
; Method 1 (Indexing Purebasic native linkedlist) :
#Index_Step = 300
; How to Choose INDEX STEP :: #Index_Step = Total Element / (between 2500 ~ 3300 )
Procedure ReBuildIndex(List MainList(), List Indexing() )
Protected k
ClearList(Indexing())
ForEach MainList()
IF k % #Index_Step = 0
AddElement(Indexing()): Indexing()=@MainList()
ENDIF
k+1
Next
EndProcedure
Procedure SelectElementFast(List MainList(), List Indexing(), ID )
Protected k , predict , *Element
; ------------------------------------------
predict = (ID / #Index_Step )
SelectElement(Indexing(),predict)
*Element = Indexing()
ChangeCurrentElement(MainList(),*Element)
K = (Predict) * #Index_Step
While K < ID
IF NextElement(MainList())
K+1
ENDIF
Wend
IF K = ID
ProcedureReturn MainList()
ENDIF
; ------------------------------------------
ProcedureReturn 0 ; Null
EndProcedure
; ================================================
OpenConsole()
; Example :
#DataSize = 999999
NewList MyData()
NewList MyIndex()
For i = 0 To #DataSize
AddElement(MyData())
MyData() = i
Next
ReBuildIndex(MyData(), MyIndex() )
; ---------------------------------------------
PrintN( "Check accuracy :")
SelectElementFast(MyData(), MyIndex() ,2150 )
PrintN("Fast Method 1 Select : Element 2150 , Value = "+ Str(MyData()))
SelectElement(MyData(),2150)
PrintN("Purebasic Select : Element 2150 , Value = "+ Str(MyData()))
; ---------------------------------------------
PrintN(" ======================================================")
PrintN( "benchmark test And comparison")
#TestLoop = 10000
; --------------------------------------------------
; Fast Test
Time_S = ElapsedMilliseconds()
For i = 1 To #TestLoop
idx = Random(#DataSize)
SelectElementFast(MyData(), MyIndex() ,Idx )
Next
Time_E = ElapsedMilliseconds()
Time_Fast = Time_E - Time_S
; --------------------------------------------------
; Ordinary Test
Time_S = ElapsedMilliseconds()
For i = 1 To #TestLoop
idx = Random(#DataSize)
SelectElement(MyData(),Idx )
Next
Time_E = ElapsedMilliseconds()
Time_Slow = Time_E - Time_S
; --------------------------------------------------
PrintN(Str(#TestLoop )+" Random seek (Indexed) Time ="+Str(Time_Fast)+"ms")
PrintN(Str(#TestLoop )+" Random seek (Ordinary) Time ="+Str(Time_Slow)+"ms")
Input()
Code: Select all
Check accuracy :
Fast Method 1 Select : Element 2150 , Value = 2150
Purebasic Select : Element 2150 , Value = 2150
======================================================
benchmark test And comparison
10000 Random seek (Indexed) Time =14ms
10000 Random seek (Ordinary) Time =2565ms
ENJOY !
(coming more method later)