I had the time today to code a very slim and fast hasharray. I understood the following scenario:
a) There is a couple of servers, from which maximal #N at a time are active
b) When they are active, some data for them has to be found very fast for some modifcations
c) An active server might become unactivated, and then its slot in the underlying datastructure is freed for a new server
If all active servers are only accessed by their unique id, then a hashtable would be asymptotically a good solution, but it is usually too slow for small problemsizes.
An alternative would be to use a linked list of active servers. Another would be to use a simple linear array. This idea could be enhanced to an array with more or less intelligent indexing, such as the circular approach.
This idea could be further enhanced to a very slim hashtable by
direct mapping with dynamic addressing, that means:
A server is mapped by its ID to the array-position CURPOS = ID % ArraySize.
If this slot is in use by another server, then try another slot, e.g. by simply incrementing CURPOS or by adding a high prime to the current position: CURPOS = (CURPOS + HIGHPRIME) % ArraySize. Repeat this until a free slot (or the slot we searched for by its ID) is found. Here even more complex (e.g. quadratic, or on the ID itself depending) algorithms can be used to find a free slot with high probability
I implemented this approach and tested it in some scenarios and it seems (as there suspected) to be always faster for random accesses than using the circular array. Of course some further tests should be made, especially for very often deleted and new added server, etc., but I do not suspect there any bad surprise. So this approach is more like the intuitive one of a hashtable, and seems to be very fast. But I don't know if it fits your needs.
Here's the code. CA_* for CycleArray-functions, HA_* for HashArray
Code: Select all
EnableExplicit
#N = 256 ; the maximal number of elements, must be a power of 2
#Nmask = #N-1
Structure Element
id.l ; e.g. the socket-id
active.l ; if 0, then this element might be overwritten
elemData.l ; some arbitrary data
EndStructure
Define *tempElement.Element
Define i.l
;
; CycleArray
;
Dim CycleArray.Element(#N) ; elements at indices 0..#N-1 are used
Define ca_curAddIndex.l = 0
Define ca_curFindIndex.l = 0
Macro IncreaseCircular(var)
var = (var + 1) & #Nmask
EndMacro
Macro DecreaseCircular(var)
var = (var - 1) & #Nmask
EndMacro
Macro CA_AddElement(m_id, m_elemData)
; ASSUMPTION: there will never be #N active elements at a time, otherwise danger of endless loop
While CycleArray(ca_curAddIndex)\active
IncreaseCircular(ca_curAddIndex)
Wend
CycleArray(ca_curAddIndex)\id = m_id
CycleArray(ca_curAddIndex)\active = 1
CycleArray(ca_curAddIndex)\elemData = m_elemData
EndMacro
Macro CA_FindElement(m_id, m_p_elem)
; ASSUMPTION: the m_id to find IS somewhere in the array, otherwise danger of endless loop
ca_curFindIndex = ca_curAddIndex
While CycleArray(ca_curFindIndex)\id <> m_id
; lets search backwards, because higher probability to find in most recent entries
DecreaseCircular(ca_curFindIndex)
Wend
; let _p_elem point to the found element
m_p_elem = @CycleArray(ca_curFindIndex)
EndMacro
Macro CA_RemoveElement(m_id)
; ASSUMPTION: the m_id to remove IS somewhere in the array, otherwise danger of endless loop
CA_FindElement(m_id, *tempElement)
*tempElement\active = 0
EndMacro
Macro CA_DebugElements()
For i=0 To #N-1
Debug "--------------------"
Debug i
Debug CycleArray(i)\id
Debug CycleArray(i)\active
Debug CycleArray(i)\elemData
Next
EndMacro
;
; HashArray
;
Dim HashArray.Element(#N) ; elements at indices 0..#N-1 are used
Define ha_tempIndex.l
Macro StartHashIndex(var, key)
var = (key) & #Nmask
EndMacro
Macro IncreaseHashIndex(var)
var = (var + 123454339) & #Nmask ; 123454339 might be any big prime
EndMacro
Macro HA_AddElement(m_id, m_elemData)
; ASSUMPTION: there will never be #N active elements at a time, otherwise danger of endless loop
StartHashIndex(ha_tempIndex, m_id)
While HashArray(ha_tempIndex)\active
IncreaseHashIndex(ha_tempIndex)
Wend
HashArray(ha_tempIndex)\id = m_id
HashArray(ha_tempIndex)\active = 1
HashArray(ha_tempIndex)\elemData = m_elemData
EndMacro
Macro HA_FindElement(m_id, m_p_elem)
; ASSUMPTION: the m_id to find IS somewhere in the array, otherwise danger of endless loop
StartHashIndex(ha_tempIndex, m_id)
While HashArray(ha_tempIndex)\id <> m_id
IncreaseHashIndex(ha_tempIndex)
Wend
; let _p_elem point to the found element
m_p_elem = @HashArray(ha_tempIndex)
EndMacro
Macro HA_RemoveElement(m_id)
; ASSUMPTION: the m_id to remove IS somewhere in the array, otherwise danger of endless loop
HA_FindElement(m_id, *tempElement)
*tempElement\active = 0
EndMacro
Macro HA_DebugElements()
For i=0 To #N-1
Debug "--------------------"
Debug i
Debug HashArray(i)\id
Debug HashArray(i)\active
Debug HashArray(i)\elemData
Next
EndMacro
;########################################
; SPEEDING-TEST
;########################################
;
; testsuite: creates some actions and works through them once by CA_* and once by HA_*
;
#RandomSeed = 123456789
RandomSeed(#RandomSeed)
Structure Action
type.l ; 1=add, 2=modify data, 3=remove
id.l
newData.l ; only for type 1 and 2
EndStructure
Macro CA_HandleActions()
Define ca_time.l = ElapsedMilliseconds()
Define id.l
Define elemData.l
Define *elem.Element
For i=0 To #ActionsArraySize-1
Debug "CA_HandleActions: " + Str(Actions(i)\type) + " " + Str(Actions(i)\id)
Select Actions(i)\type
Case 1 : id = Actions(i)\id
elemData = Actions(i)\newData
CA_AddElement(id, elemData)
Case 2 : id = Actions(i)\id
CA_FindElement(id, *elem)
*elem\elemData = Actions(i)\newData
Case 3 : id = Actions(i)\id
CA_RemoveElement(id)
EndSelect
Next
ca_time = ElapsedMilliseconds() - ca_time
EndMacro
Macro HA_HandleActions()
Define ha_time.l = ElapsedMilliseconds()
Define id.l
Define elemData.l
Define *elem.Element
For i=0 To #ActionsArraySize-1
Debug "HA_HandleActions: " + Str(Actions(i)\type) + " " + Str(Actions(i)\id)
Select Actions(i)\type
Case 1 : id = Actions(i)\id
elemData = Actions(i)\newData
HA_AddElement(id, elemData)
Case 2 : id = Actions(i)\id
HA_FindElement(id, *elem)
*elem\elemData = Actions(i)\newData
Case 3 : id = Actions(i)\id
HA_RemoveElement(id)
EndSelect
Next
ha_time = ElapsedMilliseconds() - ha_time
EndMacro
;
; EXAMPLE: Access-speed to #N-2 servers
;
#NServer = (#N-2)
#NAccesses = 1000000
#ActionsArraySize = 2*#NServer+#NAccesses
Macro ServerID(m_i)
m_i
;1234567*(m_i)+7654321 ; use this more pseudo-random instead of m_i shows no big difference in speedings
;m_i*#N ; this is the worst-case in theory. All servers are initially mapped to the same entry
;m_i*#N/2 ; still a very bad case
;100000+123*m_i ; in this case the offset (here 123) should be no power of 2
EndMacro
;
; create the actions
;
Dim Actions.Action(#ActionsArraySize)
; add server
For i=0 To #NServer-1
Actions(i)\type = 1
Actions(i)\id = ServerID(i) ; a unique id
Actions(i)\newData = Random(9999)
Next
; access data
For i=#NServer To #NServer + #NAccesses - 1
Actions(i)\type = 2
Actions(i)\id = ServerID(Random(#NServer-1))
Actions(i)\newData = Random(9999)
Next
; remove server
Define count.l = 0
For i=#NServer+#NAccesses To 2*#NServer + #NAccesses - 1
Actions(i)\type = 3
Actions(i)\id = ServerID(count) ; the previously added ids
count + 1
Next
;
; start test
;
DisableDebugger
CA_HandleActions()
EnableDebugger
;CA_DebugElements() ; show the internals of CycleArray
DisableDebugger
HA_HandleActions()
EnableDebugger
;HA_DebugElements() ; show the internals of HashArray
Debug "Timing for CA: " + Str(ca_time)
Debug "Timing for HA: " + Str(ha_time)
[edit]
I have removed PB 4.3's function ArraySize(...) and added some alternative server-id-algorithms
[/edit]