Page 1 of 3

SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jul 10, 2022 1:05 am
by idle
Image
Image
Image

Squint 3.2.2 b2
Added Merge so writes and enumerations aren't blocked from each other
This will later facilitate Acid gets and writes during enumerations, where
gets are immediate
writes are immediate
enumerations are immediate in respect to existing keys being updated during an enumeration

Squint 3.2
Added Binary key functions and an optional Hash in numeric for large keys
added basic example of binary keys and hash (hat)

Squint 3.1.4
PB 6.03 x86/x64 asm and c backend, Arm 32/64 Pi3 and PI4
Added thread safe fences for lock free reads and writes, This has slowed it down by half but there was still a chance it would step on itself and crash, the memory fences have hopefully alleviated it.
It now works on Raspberry PI4 and Pi3
added comments

squint is a versatile structure it can be used to replace both maps and lists. you can even mix numeric keys with string keys in the same squint trie and keys can be Unicode, UtF8, Ascii, integers or binary

Its an ideal structure for implementing a key / pair store, it provides indexing and prefiix queries, with its output in Lexographic sorted order. Tries are widely used in DB engines, IP tables, grep, regex and in languages like python to set and return multiple items from a function.

Squint3 is now reentrant enabling sub tries or collections, so not only can it be used as a replacement for maps and lists it can used like an in mem DB which allows you to to shortcut writes but only in a single writer thread.

Thread tests are lock free readers and a starved writer thread.
Random key look up rates over a set with 4,194,304 (keys may not exist in set)
Squint Numeric lookup items 94,865,337 p/s avg per thread 8,624,121
lookup rate 723.77 mb p/s
lookup time 10.54 ns
Squint Numeric writes items 1,629,607
Write rate 12.43 mb p/s
num items 4,194,304 mem 148.38mb keysize 16.00 mb

vs map
Map lookup items 3,419,641 p/s avg per thread 310,876
lookup rate 26.09 mb p/s
lookup time 292.43 ns
map writes items 577,383 p/s
Write rate 4.41 mb p/s
num items 4,194,304 mem 117.88mb keysize 53.88 mb

string keys

Squint lookup items 15,122,076 p/s avg per thread 1,374,734
lookup rate 158.64 mb p/s
lookup time 66.13 ns
Squint writes items 655,299
Writes rate 6.87 mb p/s
num items 4,194,304 mem 96.98mb keysize 45.87 mb

vs map
Map lookup items 3,137,418 p/s avg per thread 285,219
lookup rate 32.91 mb p/s
lookup time 318.73 ns
map writes items 546,662 p/s
Write rate 5.73 mb p/s
num items 4,194,304 mem 109.87mb keysize 45.87 mb
code
https://github.com/idle-PB/Squint3/blob ... quint3.pbi

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jul 10, 2022 9:51 am
by Little John
Very cool. 8)
Many thanks, idle!

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jul 10, 2022 8:54 pm
by Little John
How can I check whether a given trie or subtrie contains a particular key?
I'm missing a Find() function. :?

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jul 10, 2022 9:27 pm
by idle
Use a get function, a key is the path through the tree. the only way to find it is with a get/find. If the key doesn't exist at all the operation is cheap as it returns early so if you looked for ABC and there was no A in the trie it returns after evaluating A. So it's quicker than a map
;##########################################################################
; Squint\Set(subtrie,@key,value,mode=#PB_Unicode)
;
; functions sets a node in the trie from the given key and retuns the subtrie node address
; parameters
; subtrie: either 0 for the root of the trie or a previously set subtrie node from the return of the function
;*key: adddress To a null terminated string either unicode utf8 Or Ascii
; value: any interger value Or pointer
; mode: one of #PB_Unicode, #PB_UTF8, #PB_Ascii Defaults To unicode
;###########################################################################

Global key.s = "abc"
Global value = 12345
Global subtrie

subtrie = squint\set(0,@key,value) ;sets a key "abc" with value 12345 and returns the root of the subtrie

key.s = "def"
value = 23456
squint\set(subtrie,@key,value) ;sets a key "def" in the subtrie "abc" with value 23456

;#############################################################################
; Squint\get(subtire,*key,mode=#PB_Unicode)
;
; functions looks up the key And either Return the set value Or 0
; parameters
; subtrie: either 0 for the root of squint or a previously obtained subtire from the return of a set
; *key: adddress To a null terminated string either unicode utf8 Or Ascii
; mode: one of #PB_Unicode, #PB_UTF8, #PB_Ascii Defaults To unicode
; ############################################################################

key.s = "abc"
result = squint\Get(0,@key) ;will retrun 12345
Debug result

key = "def"
result = squint\Get(SubTrie,@key) ;will retrun 23456
Debug result

key = "abc"
result = squint\Get(SubTrie,@key) ;will retrun 0 key doesn't exist
Debug result

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sat Jul 23, 2022 4:12 pm
by Little John
Ooops, I almost missed your answer.
Thank you, I am beginning to understand how all this works. ;-)

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sat Jul 23, 2022 8:23 pm
by idle
I'm not really a fan of documentation but I will add some and tidy it up. I'm currently looking into making it thread safe which will require some changes

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Wed Oct 19, 2022 3:42 am
by idle
I've made a thread safe version 3.0.4a though I'm still not convinced it's 100%, it doesn't crash but there may still be an ABA problem to deal with.

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Thu Jun 22, 2023 2:02 am
by idle
updated and added new tests to try and objectively evaluate the performance compared to a map

k=1,048,576 number

Squint Numeric lookup items 31,816,259 Lookup rate 242.74 mb p/s
Squint Numeric writes items 3,262,384 Write rate 24.89 mb p/s
vs
Map lookup items 10,824,576 Lookup rate 82.58 mb p/s
map writes items 1,013,125 Write rate 7.73 mb p/s


Squint lookup items 14,715,065 Lookup rate 126.30 mb p/s
Squint writes items 1,305,016 Writes rate 11.20 mb p/s
vs
Map lookup items 11,158,207 Lookup rate 95.77 mb p/s
map writes items 1,041,803 Write rate 8.94 mb p/s

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Thu Jun 22, 2023 8:21 am
by pjay
Impressive, thanks for sharing.

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jun 25, 2023 12:45 am
by idle
just redid the speed comparison test, removing the delay in read thread and removed the indirection of the interface thunk. The previous test I was wrapping the map read and writes in a function call to match the interface calls.
Initially the test result looked a bit odd as it was reporting around the 200m p/s mark which was a little bit more than expected. The O2 c optimizer pass is quite happy to optimize functions away and in this case there was nothing to stop it vanishing the function away as it wasn't doing anything with the result, so I added a dependency on the result

Code: Select all

 x = SquintGetNumeric(sq,num)
 cx = (1 | x)   
Squint has a distinct advantage over a map for key look ups, when a key doesn't exist squint returns earlier
the maps performance in this circumstance is very poor

Numeric lookup of random keys (keys may not be in the set)
Squint Numeric lookup items 111,102,611
lookup rate 847.65 mb p/s
Squint Numeric writes items 2,663,311
Write rate 20.32 mb p/s
thread 0 8,720,220
thread 1 11,337,695
thread 2 9,009,853
thread 3 9,020,324
thread 4 9,120,695
thread 5 9,941,223
thread 6 9,666,954
thread 7 10,848,351
thread 8 11,317,866
thread 9 11,044,209
thread 10 11,075,221
vs Map lookup of random keys (keys may not be in the set)
Map lookup items 2,885,496
lookup rate 22.01 mb p/s
map writes items 438,514
Write rate 3.35 mb p/s
thread 0 259,823
thread 1 270,904
thread 2 268,788
thread 3 256,248
thread 4 257,183
thread 5 260,441
thread 6 260,613
thread 7 270,424
thread 8 267,392
thread 9 256,486
thread 10 257,194


Squint String keys, lookup of random keys (keys may not be in the set)
Squint lookup items 19,230,687
lookup rate 201.74 mb p/s
Squint writes items 672,811
Writes rate 7.06 mb p/s
thread 0 1,844,471
thread 1 1,723,763
thread 2 1,761,871
thread 3 1,869,259
thread 4 1,726,973
thread 5 1,650,322
thread 6 1,747,549
thread 7 1,673,333
thread 8 1,656,395
thread 9 1,804,356
thread 10 1,772,395
vs map string key, lookup of random keys (keys may not be in the set)
Map lookup items 2,786,959
lookup rate 29.24 mb p/s
map writes items 338,491
Write rate 3.55 mb p/s
thread 0 254,024
thread 1 261,506
thread 2 262,693
thread 3 247,386
thread 4 240,027
thread 5 254,360
thread 6 254,512
thread 7 264,540
thread 8 262,617
thread 9 247,157
thread 10 238,137
in cases where keys most likely exist
Squint Numeric lookup items 50,236,528
lookup rate 383.27 mb p/s
Squint Numeric writes items 1,781,197
Write rate 13.59 mb p/s
thread 0 4,203,384
thread 1 4,776,911
thread 2 4,522,228
thread 3 4,778,232
thread 4 4,780,527
thread 5 4,684,312
thread 6 4,557,221
thread 7 4,519,624
thread 8 4,328,106
thread 9 4,536,076
thread 10 4,549,907
vs map
Map lookup items 10,796,036
lookup rate 82.37 mb p/s
map writes items 762,025
Write rate 5.81 mb p/s
thread 0 991,912
thread 1 952,851
thread 2 998,177
thread 3 984,023
thread 4 1,004,120
thread 5 1,006,966
thread 6 956,947
thread 7 956,277
thread 8 974,503
thread 9 997,559
thread 10 972,701
string keys
Squint lookup items 16,845,957
lookup rate 176.72 mb p/s
Squint writes items 924,734
Writes rate 9.70 mb p/s
thread 0 1,618,648
thread 1 1,549,320
thread 2 1,560,004
thread 3 1,495,213
thread 4 1,576,217
thread 5 1,569,521
thread 6 1,465,790
thread 7 1,456,990
thread 8 1,522,684
thread 9 1,543,842
thread 10 1,487,728
vs map
Map lookup items 13,216,941
lookup rate 138.65 mb p/s
map writes items 659,595
Write rate 6.92 mb p/s
thread 0 1,232,285
thread 1 1,199,693
thread 2 1,243,633
thread 3 1,234,801
thread 4 1,198,946
thread 5 1,186,700
thread 6 1,184,595
thread 7 1,146,145
thread 8 1,230,605
thread 9 1,219,681
thread 10 1,139,857

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jun 25, 2023 4:43 pm
by benubi
Sorry if I am too lazy to analyze the code in it's entirety, it seems to be very optimized already.

I have two questions.

First why use UTF8 for the keys, is it for memory and/or speed?

Second is the depth of the tree equal to the length of the string? i.e. Abcdef is the 6th child in depth of the root node?

I think it would be nice to have a total memory load comparison for the tests. You could use identical random data with a fixed random seed, at least I could imagine that to work.

I did some AVL tree experiments with a ported C code I found on a website. I made around 10 versions with it, one also implements the possibility to make all nodes a linked list that can be sorted and manipulated a little and accessed sequentially without using iterator callbacks. I think SQUINT3 could be faster in nearly everything except AVL lookup maybe, and perhaps sequential access in one very special case were the AVL nodes are stored in an array (which takes advantage of memory paging or something like that, speeds up NextElement pointer reads 3x). But I didn't do any "bare metal" optimizations, just some polish or sugar over the versions that all have their pros and cons.

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Sun Jun 25, 2023 10:09 pm
by idle
Utf8 is used to reduce the memory and depth. The key depth is 2 times byte length. It's implemented as a linked list where each element has an array of 1 to 16 children each node is 16 bytes
It's overall size reduction is around 64 times that of a 256 way trie for unicode input for 2 times the key operations.
I could reduce depth if I stored the remainder of the key but it ads another level of dificulty and alignment issues.
It's geared for fast lookups.

The memory requirements are generally smaller than a hash table for semi ordered data and you can imagine random keys are the worst for trie performance.

One thing I can't do is tell how much memory a pb map uses, I can make a rough estimate from its header in sdk but earlier tests suggested squint used less memory. map will have two pointers per item as it stores the map key

So with memory size
Squint Numeric lookup items 115,458,907 p/s
lookup rate 880.88 mb p/s
Squint Numeric writes items 2,309,246
Write rate 17.62 mb p/s
num items 4,194,304: mem 44.72mb, keysize 32.00 mb
vs map
Map lookup items 21,846,408 p/s
lookup rate 166.67 mb p/s
map writes items 1,088,885 p/s
Write rate 8.31 mb p/s
num items 4,194,304 mem 117.88mb keysize 53.88 mb

string
Squint lookup items 19,860,596 p/s
lookup rate 208.35 mb p/s
Squint writes items 986,432
Writes rate 10.35 mb p/s
num items 4,194,304 mem 90.38mb keysize 45.87 mb
vs map
Map lookup items 10,747,430 p/s
lookup rate 112.75 mb p/s
map writes items 812,536 p/s
Write rate 8.52 mb p/s
num items 4,194,304 mem 109.87mb keysize 45.87 mb

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Mon Jun 26, 2023 1:45 am
by benubi
Wow! That's indeed excellent. Hats off. :D

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Mon Jun 26, 2023 2:39 am
by idle
benubi wrote: Mon Jun 26, 2023 1:45 am Wow! That's indeed excellent. Hats off. :D
Thanks, it could certainly be made faster at the expense of memory size but it's pretty good for what it does. When you need fast look ups or sorting, that's when it really comes into its own.

Re: SQUINT 3, Sparse Quad Union Indexed Nibble Trie

Posted: Wed Aug 30, 2023 1:23 am
by idle
I've updated to v3.1.1a

Added thread safe fences for lock free reads and writes, This has slowed it down by half but there was still a chance it would step on itself and crash, the memory fences have hopefully alleviated it.

It now works on Raspberry PI4 and Pi3

I've changed the numeric functions to accept an arbitrary length it might need a bit more explanation and thought perhaps on how to implement it as its a little odd. Here's an example of storing a structure like a vector. maybe you wanted to make a Spatial Hashing you could use a the numeric functions to store a point in a 3D grid

Code: Select all

IncludeFile "squint3.pbi"

UseModule Squint
      
  Structure vec3 
    x.f
    y.f
    z.f 
  EndStructure   
  
  Global sq.isquint = SquintNew()
  Global *t.vec3,*t1.vec3,a.i,index.a 
  
  Procedure CBwalk(*key.Integer,*value.Integer,*usedata) 
    
    Protected *v.vec3 
                      ;as we have both (index / vector) and (vector / index)    
    If *value <= 100  ;need to check if the value is the index or vector       
      *v = *key  
      Debug "index val " + Str(*value)   
    Else 
      *v = *value    
      Debug "index key " + Str(*key\i & $ff) ;I indexed it with a byte so need to mask off  
    EndIf 
    
    If *v 
      Debug "vector " 
      Debug *v\x 
      Debug *v\y 
      Debug *v\z 
    EndIf  
       
    
  EndProcedure   
  
  
  For a = 1 To 100 
    
    *t = AllocateMemory(SizeOf(vec3)) 
    
    *t\x = (5000 - Random(10000)) 
    *t\y = (5000 - Random(10000)) 
    *t\z = (5000 - Random(10000)) 
       
    sq\SetNumeric(*t,a,SizeOf(vec3))   ;-set the vector as the key and item as the index   
    index = a                              ; set index variable to a as you cant use @a directly     
    sq\SetNumeric(@index,*t,1)         ;-set the index a as the key and the item as the pointer   
   
  Next 
  
  For a = 1 To 100 
   index = a                       ;you cant use @a diectly look up index  
   *t = sq\GetNumeric(@index,1)    ;look up the index 
   If *t   
    Debug *t\x 
    Debug *t\y 
    Debug *t\z 
    Debug sq\GetNumeric(*t,SizeOf(vec3))  ;look up the vector to return the index  
    Debug "----"  
  EndIf  
  Next   
  
  sq\WalkNumeric(@CBwalk())   ;dump the trie 
   
  sq\Free()