Optimization of a sorting algorithm

Just starting out? Need help? Post your questions and find answers here.
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Optimization of a sorting algorithm

Post by yuki »

wilbert wrote: Tue Oct 10, 2023 12:25 pm I updated my previous post.
I believe I could improve a tiny bit on the speed of your V8 by marking the elements without a reference a bit earlier.
Added to results table! (renamed from V8A → V8B to match convention of others)

I considered marking ref'd entries ahead of time, and didn't bother actually testing, but … wow that squeezes out quite some gains! (always equal or better to V8, with ~12.6% decrease in time spent @ 1000 elements!)
pjay
Enthusiast
Enthusiast
Posts: 281
Joined: Thu Mar 30, 2006 11:14 am

Re: Optimization of a sorting algorithm

Post by pjay »

It is my opinion that this thread is one part genius & one part crazy. Whilst it's hard to not admire the cleverness & resourcefullness on display here, there's one important factor missing.

On what I consider to be a realistically scoped UI of 50 gadgets, it takes ~20ms for my laptop to reposition, resize and redraw; ~0.4ms per gadget.

So, in two weeks, 5 pages of forum discussions and 12 iterations of code, a 50 gadget resize process has been sped up from 20.0113ms to 20.0024ms according to the timing table posted :lol:

Having said this, I do appreciate that some people just like a coding challenge, regardless of the practical purposes.
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

yuki wrote: Tue Oct 10, 2023 12:43 pm I considered marking ref'd entries ahead of time, and didn't bother actually testing, but … wow that squeezes out quite some gains! (always equal or better to V8, with ~12.6% decrease in time spent @ 1000 elements!)
Does using a pointer to navigate to next make a difference with your timings ?

Code: Select all

Procedure.i SortList_V8c()
  Protected *next.Integer
  Protected *parent.MyStruct
  Protected *current.MyStruct
  
  ; Associate each list entry with its corresponding map entry.
  ;
  ; You could avoid this, and thus increase both init and sorting performance by changing
  ; the map from StructMap.MyStruct() -> *StructMap.MyStruct().
  ;
  ; In order to fit within the existing framework without structural or init changes, the
  ; "Sort" field is used to hold a pointer to the map entry's relevant entry. Such values
  ; are never actually sorted (because they're ambiguous) but are used to accelerate sort
  ; operations below.
  ForEach StructList()
    StructMap(StructList()\Entry)\Sort = @StructList()
    If StructList()\RefEntry = #Empty$
      ; Element has no reference so is considered already sorted.
      StructList()\Sort = #True
    EndIf
  Next
  
  *next = FirstElement(StructList())
  If *next
    *next - SizeOf(Integer)<<1
  EndIf
  While *next
    *current = *next + SizeOf(Integer)<<1
    *next = *next\i
    
    ; Skip if element is already sorted.
    If Not *current\Sort
      *parent = StructMap(*current\RefEntry)\Sort
      ; Detect missing reference or direct cycle, returning -1 error code.
      If Not *parent Or *parent = *current
        DebuggerWarning("Invalid reference (missing or cyclic) detected: " + *current\Entry + " -> " + *current\RefEntry)
        ProcedureReturn -1
      EndIf
      ; Set the element being inspected as active list entry as we must now move it.
      ChangeCurrentElement(StructList(), *current)
      ; If parent is already sorted, move into position by it. Otherwise, defer for later.
      If *parent\Sort
        *current\Sort = #True
        MoveElement(StructList(), #PB_List_After, *parent)
      Else
        MoveElement(StructList(), #PB_List_Last)
      EndIf
    EndIf
    
  Wend
EndProcedure
Windows (x64)
Raspberry Pi OS (Arm64)
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

pjay wrote: Tue Oct 10, 2023 1:20 pm Having said this, I do appreciate that some people just like a coding challenge, regardless of the practical purposes.
That's it. It' just like a puzzle.
And for me I guess it might have to do with my age.
A habit from back in the 80's with a Zilog Z80a cpu running at 3.5 Mhz.
Windows (x64)
Raspberry Pi OS (Arm64)
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Optimization of a sorting algorithm

Post by yuki »

wilbert wrote: Tue Oct 10, 2023 1:35 pm
pjay wrote: Tue Oct 10, 2023 1:20 pm Having said this, I do appreciate that some people just like a coding challenge, regardless of the practical purposes.
That's it. It' just like a puzzle.
And for me I guess it might have to do with my age.
A habit from back in the 80's with a Zilog Z80a cpu running at 3.5 Mhz.
So much this! It's a nice brain-teaser, and seeing the performance numbers go up and up is always satisfying :mrgreen:
pjay wrote: Tue Oct 10, 2023 1:20 pm So, in two weeks, 5 pages of forum discussions and 12 iterations of code, a 50 gadget resize process has been sped up from 20.0113ms to 20.0024ms according to the timing table posted :lol:
There is for sure some premature optimisation going on, at least for the original use case :lol:

Though, for use cases like games (where every fraction of a millisecond counts), minimising UI draw time is handy. Spending 2.5687ms as part of UI rendering is 15% of frame-budget (at 60FPS), while 0.0732ms is only 0.4% of frame-budget. (Admittedly, having to handle 1000 UI elements and repeated re-layout is pretty rare for a game)
wilbert wrote: Tue Oct 10, 2023 1:31 pm Does using a pointer to navigate to next make a difference with your timings ?
It does — that trick is really handy! Table with V8C:

Code: Select all

| Size |  V1 (µs) |  V2 (µs) |  V3 (µs) |  V4 (µs) | V4B (µs) |  V5 (µs) |  V6 (µs) |  V7 (µs) | V7B (µs) | V7C (µs) |  V8 (µs) | V8B (µs) | V8C (µs) |
|------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
|   10 |      1.1 |      0.9 |      1.0 |      2.6 |      0.6 |      1.9 |      0.4 |      0.8 |      0.7 |      2.1 |      0.5 |      0.5 |      0.4 |
|   20 |      2.8 |      2.0 |      1.9 |      3.7 |      1.1 |      2.9 |      0.8 |      1.6 |      1.4 |      3.1 |      1.1 |      1.0 |      0.9 |
|   30 |      5.2 |      3.4 |      2.9 |      5.1 |      1.7 |      4.2 |      1.1 |      2.4 |      2.2 |      4.1 |      1.6 |      1.5 |      1.3 |
|   50 |     11.3 |      6.8 |      5.2 |      7.5 |      2.8 |      6.5 |      1.9 |      4.0 |      3.8 |      6.2 |      2.7 |      2.4 |      2.2 |
|  100 |     34.4 |     19.1 |     10.5 |     13.5 |      5.7 |     12.4 |      4.0 |      7.7 |      7.2 |     10.9 |      5.2 |      4.7 |      4.2 |
|  200 |    112.8 |     54.4 |     20.8 |     22.6 |     10.8 |     21.7 |      7.9 |     14.8 |     14.1 |     18.4 |      9.8 |      9.1 |      7.9 |
|  300 |    244.3 |    113.6 |     33.9 |     33.5 |     17.6 |     31.7 |     12.1 |     24.2 |     22.6 |     28.4 |     15.8 |     14.1 |     13.3 |
|  500 |    663.7 |    298.4 |     69.0 |     59.4 |     35.5 |     56.4 |     23.5 |     51.4 |     48.7 |     52.9 |     32.1 |     28.6 |     27.6 |
| 1000 |   2568.7 |   1138.0 |    177.1 |    125.8 |     91.3 |    120.4 |     60.2 |    138.5 |    133.1 |    110.9 |     83.8 |     73.2 |     71.8 |
Though, I do get a little scared about PB internal changes having potential to break that sort of approach.
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

yuki wrote: Tue Oct 10, 2023 1:58 pmThough, I do get a little scared about PB internal changes having potential to break that sort of approach.
It is internal but on the other hand PB commands themselves also do occasionally change and the structure of a listheader is documented in the SDK that comes with the PureBasic application.
\PureBasic\SDK\VisualC\PureLibraries\LinkedList\LinkedList.h
Windows (x64)
Raspberry Pi OS (Arm64)
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Optimization of a sorting algorithm

Post by yuki »

wilbert wrote: Tue Oct 10, 2023 2:11 pm It is internal but on the other hand PB commands themselves also do occasionally change and the structure of a listheader is documented in the SDK that comes with the PureBasic application.
\PureBasic\SDK\VisualC\PureLibraries\LinkedList\LinkedList.h
Super handy to know, thanks for that tip!
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

V9; I couldn't resist :shock:

Code: Select all

Procedure.i SortList_V9()
  Protected.MyStruct *cur, *ins, *ref
  
  ; Associate each list entry with its corresponding map entry.
  ForEach StructList()
    StructList()\Sort = #False  
    StructMap(StructList()\Entry)\Sort = @StructList()
  Next
  
  ; Sort the list
  ForEach StructList()
    *cur = StructList()
    *cur\Sort = #True
    If *cur\RefEntry
      *ins = *cur
      *ref = StructMap(*ins\RefEntry)\Sort
      While *ref And Not *ref\Sort
        *ref\Sort = #True
        ChangeCurrentElement(StructList(), *ref)
        MoveElement(StructList(), #PB_List_Before, *ins)
        *ins = *ref
        If *ins\RefEntry
          *ref = StructMap(*ins\RefEntry)\Sort
        Else
          Break
        EndIf
      Wend
      If Not *ref
        DebuggerWarning("Invalid reference detected: " + *ins\Entry + " -> " + *ins\RefEntry)
        ProcedureReturn -1
      EndIf      
      ChangeCurrentElement(StructList(), *cur)
    EndIf
  Next
  
EndProcedure
Last edited by wilbert on Wed Oct 11, 2023 2:23 pm, edited 2 times in total.
Windows (x64)
Raspberry Pi OS (Arm64)
User avatar
Kurzer
Enthusiast
Enthusiast
Posts: 693
Joined: Sun Jun 11, 2006 12:07 am
Location: Near Hamburg

Re: Optimization of a sorting algorithm

Post by Kurzer »

Wow, the thread has developed rapidly. Aweseome! :D

Even if, for professional reasons, I can no longer contribute so much this week, I would like to answer a few questions.
wilbert wrote:Is there a specific reason for using this reference based approach ?
No, not really. LOL

When I started developing the WinHandler module, I only envisioned that the gadgets could be anchored to the sides of the window. But over time I found it interesting if you could anchor the gadets to other gadgets as well. And so I started to write the code for that and to extend the internal WinHandler structure.

I honestly didn't even think about it, if there is an advantage compared to just anchoring the gadget to the window. But at least it saves typing, because the anchoring of a gadget is inherited to the reference gadgets this way.

wilbert wrote:can't you use group numbers and assign all of them the same group number instead of referencing each other?
I don't know if that would make it any easier. The gadgets in a group may need to be repositioned differently.

Reference gadgets can be anchored left, right, above or below a parent gadget. If the parent gadget changes in width or height, the gadgets that are anchored to the right or below must be repositioned. Gadgets that are anchored to the left or top of the parent gadget do not need to be repositioned in this case.
This is only an incomplete example, I can imagine that RefGadgets can also "inherit" height or width of the parent gadget.

yuki wrote:Ooh, if the sort field's value isn't strictly required to convey order (but instead used purely for accelerating sorting), we can squeeze more gains!
No, in my use case the sort field is not necessary for processing the gadgetlist. If the list for the gadgets is sorted correctly, the gadget names are simply read in order in a loop.
Btw: Great progress with V8()

wilbert wrote:Timings are so inconsistent on my computer.
I think the current form of comparing might not be accurate enough when you are counting microseconds.
I cannot confirm this on my approx. 7 year old computer either. Even with only a few items in the list, the results are very very similar in each run.
Wilbert, have you measured times with debugger disabled? I made this mistake in the beginning.

pjay wrote:Having said this, I do appreciate that some people just like a coding challenge, regardless of the practical purposes.
LOL, Yes as far as I'm concerned, that's exactly how it is. Sure, I needed the code for my project, but my ambition is also in other areas that I would like to optimize processes very much and strive to keep things simple and smart.

Which, by the way, does not exempt me from sometimes overlooking the simplest things and consequently programming complete nonsense. :D

wilbert wrote:That's it. It' just like a puzzle.
And for me I guess it might have to do with my age. A habit from back in the 80's with a Zilog Z80a cpu running at 3.5 Mhz.
yuki wrote:So much this! It's a nice brain-teaser, and seeing the performance numbers go up and up is always satisfying :mrgreen:
Yes, that's exactly how it is for me too - Brothers in the spirit. :D
Me: 56 years old and started programming with a VIC-20 in the 80s.

wilbert wrote:V9; I couldn't resist :shock:
You guys are really crazy! And that's very sympathetic - LOL :D

I'm sorry that I currently can not write so much, but the Reallife.exe strikes again. I'll probably be able to get back to code next week.
Last edited by Kurzer on Wed Oct 11, 2023 4:00 pm, edited 1 time in total.
PB 6.12 x64, OS: Win 11 24H2 x64, Desktopscaling: 150%, CPU: I7 12700 H, RAM: 32 GB, GPU: Intel(R) Iris(R) Xe Graphics | NVIDIA GeForce RTX 3070, User age in 2025: 57y
"Happiness is a pet." | "Never run a changing system!"
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Optimization of a sorting algorithm

Post by yuki »

wilbert wrote: Wed Oct 11, 2023 5:45 am V9; I couldn't resist :shock:

Code: Select all

Procedure.i SortList_V9()
  Protected.MyStruct *cur, *ins, *ref
  
  ; Associate each list entry with its corresponding map entry.
  ForEach StructList()
    StructMap(StructList()\Entry)\Sort = @StructList()
  Next
  
  ; Sort the list
  ForEach StructList()
    *cur = StructList()
    *cur\Sort = #True
    If *cur\RefEntry
      *ins = *cur
      *ref = StructMap(*ins\RefEntry)\Sort
      While *ref And Not *ref\Sort
        *ref\Sort = #True
        ChangeCurrentElement(StructList(), *ref)
        MoveElement(StructList(), #PB_List_Before, *ins)
        *ins = *ref
        If *ins\RefEntry
          *ref = StructMap(*ins\RefEntry)\Sort
        Else
          Break
        EndIf
      Wend
      If Not *ref
        DebuggerWarning("Invalid reference detected: " + *ins\Entry + " -> " + *ins\RefEntry)
        ProcedureReturn -1
      EndIf      
      ChangeCurrentElement(StructList(), *cur)
    EndIf
  Next
  
EndProcedure
Very clever!

Some dramatic performance gains (PB 6.03b10 – ASM Release, Windows 11 x64, 5800X):

Code: Select all

| Size |  V1 (µs) |  V2 (µs) |  V3 (µs) |  V4 (µs) | V4B (µs) |  V5 (µs) |  V6 (µs) |  V7 (µs) | V7B (µs) | V7C (µs) |  V8 (µs) | V8B (µs) |  V9 (µs) |
|------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
|   10 |      1.1 |      0.9 |      1.0 |      2.6 |      0.6 |      1.9 |      0.4 |      0.8 |      0.7 |      2.1 |      0.5 |      0.5 |      0.4 |
|   20 |      2.8 |      2.0 |      1.9 |      3.7 |      1.1 |      2.9 |      0.8 |      1.6 |      1.4 |      3.1 |      1.1 |      1.0 |      0.8 |
|   30 |      5.2 |      3.4 |      2.9 |      5.1 |      1.7 |      4.2 |      1.1 |      2.4 |      2.2 |      4.1 |      1.6 |      1.5 |      1.2 |
|   50 |     11.3 |      6.8 |      5.2 |      7.5 |      2.8 |      6.5 |      1.9 |      4.0 |      3.8 |      6.2 |      2.7 |      2.4 |      1.9 |
|  100 |     34.4 |     19.1 |     10.5 |     13.5 |      5.7 |     12.4 |      4.0 |      7.7 |      7.2 |     10.9 |      5.2 |      4.7 |      3.6 |
|  200 |    112.8 |     54.4 |     20.8 |     22.6 |     10.8 |     21.7 |      7.9 |     14.8 |     14.1 |     18.4 |      9.8 |      9.1 |      6.6 |
|  300 |    244.3 |    113.6 |     33.9 |     33.5 |     17.6 |     31.7 |     12.1 |     24.2 |     22.6 |     28.4 |     15.8 |     14.1 |     10.7 |
|  500 |    663.7 |    298.4 |     69.0 |     59.4 |     35.5 |     56.4 |     23.5 |     51.4 |     48.7 |     52.9 |     32.1 |     28.6 |     22.2 |
| 1000 |   2568.7 |   1138.0 |    177.1 |    125.8 |     91.3 |    120.4 |     60.2 |    138.5 |    133.1 |    110.9 |     83.8 |     73.2 |     59.0 |
(methodology: average across 100 runs)
(each sample for each method is isolated in a separate max-priority process. i.e., 1300 sequential RunProgram(...) calls to account for 100 tests × 13 methods)
(processes report their own timings, and only sorting is measured, excluding launch/teardown as well as effects from other tests)
(for each sample, 10 sort rounds of the respective method are run before re-initialising the list and beginning measurement)


The StructList()\Sort will have to be cleared either when invalidating changes or as part of the first loop over the elements (which adds little time).
Kurzer wrote: Wed Oct 11, 2023 8:30 am I'm sorry that I currently can not write so much, but the Reallife.exe strikes again. I'll probably be able to get back to code next week.
Aah, that program... sometimes I wonder: where's my Task Manager to pause it? :lol:
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

Kurzer wrote: Wed Oct 11, 2023 8:30 am I cannot confirm this on my approx. 7 year old computer either. Even with only a few items in the list, the results are very very similar in each run.
Wilbert, have you measured times with debugger disabled? I made this mistake in the beginning.
I do disable the debugger when measuring times.
Maybe something is running in the background or maybe it has to do with the fact that I'm using Bootcamp to run windows on my mac. :?
yuki wrote: Wed Oct 11, 2023 1:18 pm The StructList()\Sort will have to be cleared either when invalidating changes or as part of the first loop over the elements (which adds little time).
You are right. I changed it in my V9 code above.
It indeed adds little time.
Windows (x64)
Raspberry Pi OS (Arm64)
Olli
Addict
Addict
Posts: 1267
Joined: Wed May 27, 2020 12:26 pm

Re: Optimization of a sorting algorithm

Post by Olli »

wilbert wrote: Wed Oct 11, 2023 5:45 am V9; I couldn't resist :shock:
It looks like my ordinal diagram page 3, but smashed like a rubix cube ! Are you sure it's only puzzles that excite you? :mrgreen:
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

Olli wrote: Sat Oct 14, 2023 1:05 pm Are you sure it's only puzzles that excite you? :mrgreen:
I like the logic of logic. :shock: :)

I thought about ways to speed up the V9 code some more but the most important thing which slows things down at the moment is the map.
Hashing those strings and looking them up takes time (a numeric reference combined with an array would be much faster).
Windows (x64)
Raspberry Pi OS (Arm64)
Olli
Addict
Addict
Posts: 1267
Joined: Wed May 27, 2020 12:26 pm

Re: Optimization of a sorting algorithm

Post by Olli »

I think a binary search would do the affair, but if this targets big integers only. The string management takes lots of time else.

We could note that 2 string-to-quad conversions are interesting :
- litteral converting to MSB of quad

Code: Select all

"Gadget" ->
'G' << 56 + 
'a' << 48 +
'd' << 40 +
'g' << 32 +
'e' << 24 +
't' << 16
- and numeric converting to LSB quad

Code: Select all

"1FE00" ->
'1' << 32 +
'F' << 24 +
'E' << 16 +
'0' << 8 +
'0'
This technic (ASCII) is absolutely compatible with a binary search, on one hand, and it is humanly readable, on the other hand.

Basically, an expression would look like this :
'Window' << 16 + '51' + 'Gadget' << 16 + '1FE00'
And a dump would look like this :

Code: Select all

256 bits integer :
'Window        51Gadget     1FE00'
;3322222222221111111111
;10987654321098765432109876543210 ASCII byte range unit
Einstein said 2% of total people could imagine 5 things on the same time.
Here we have two things (a window and a gadget)
I think 4 nested things are enough.
So a key of 512 bits containing 4 litteral expressions alternating with 4 numeric values should do the affair, and should build a right, quick, easily ordered and readable key in a quads array visited by a binary search.

We could note that adopting a name whom the length passes over the 8 bytes is not a technical problem, and is absolutely transparent for a binary search, while we let the right field length for the next numeric value :

Code: Select all

'longNameWindow51Gadget     1FE00'
wilbert
PureBasic Expert
PureBasic Expert
Posts: 3943
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Optimization of a sorting algorithm

Post by wilbert »

Olli wrote: Sat Oct 14, 2023 6:29 pm I think a binary search would do the affair, but if this targets big integers only. The string management takes lots of time else.
Most ideal would be a simple 16 bit number between 0 and 65535 but a binary search could also help.
There are still ways to improve performance like this but the current approach with a map might already be fast enough for Kurzer.
Windows (x64)
Raspberry Pi OS (Arm64)
Post Reply