Radix Sort

Share your advanced PureBasic knowledge/code with the community.
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Radix Sort

Post by pdwyer »

Note: This is still a work in progress and needs work.

In another thread here there was a link to a sort page that mentioned a "radix" sort that I had not heard of so I looked it up. Wiki had some info but no pseudo code so I kept looking around. Some sites raved about it, some sites trashed it due to not working well with floats, strings or negative numbers.

I found a paper on how to make it work with floats and negatives and some suggestions for strings but I haven't tried that here yet, I've just attempted the basic sort. It blitzes my quicksort and sometimes beats PB's SortArray() depending on the size and distribution.

There's an ugly CopyMemory statement in there that can go if I change the structure but I haven't got that working yet so this should get faster with some work.

Here is the proc with some quicksort comparisons, remember to turn the debugger off! Perhaps others here would like to have a play too to beef this up a bit. I think it has potential :)

One more note on performance, this is for longs so it's 4 passes, if it were words it would be 2 and bytes it would be 1 so it would be 4 times faster to sort the same array size if it were just bytes!

ADDED (sorry, still playing with it :) ) I found a tweak in another article, if you set the distribution to a number like 200 so that three bytes of the long are empty it skips them, or under 65k so that two bytes are empty. This distribution is calced in the counters area already so using this tweak is cheap. This will definately speed it over the PB sort and in these cases with lots more doubleups quicksort gets really slow. (it shows too that PB sort is a great all round sort) :)

Code: Select all

Structure Bptr 
    Byte.c[0] 
EndStructure 

Procedure RadixSort(Array ArrayToSort.l(1))

    Inputcount = ArraySize(ArrayToSort())
    
    Dim RawOut.l(Inputcount)
    Dim InputValues.c(Inputcount)
    Dim OutputValues.c(Inputcount)
    Protected CanSkipLayer.l

    For ByteLayer = 0 To 3 

        Dim Counters.l(255) ;256
        Dim OffsetTable.l(255) ;256 
        CanSkipLayer.l = #False
        
        *Ptr.Bptr 
    
        ;load input bytes for layer
        For i =  0 To Inputcount ;-1 
            *Ptr = @ArrayToSort(i)  
            InputValues(i) = *Ptr\Byte[ByteLayer]
            Counters(InputValues(i)) = Counters(InputValues(i)) + 1
        Next

        If Counters(0) = Inputcount + 1  ;if data is all the same for that byte, skip sorting the whole layer
            CanSkipLayer = #True  
        EndIf

        ;Sort
        If CanSkipLayer = #False
        
            For i = 1 To 255            
                OffsetTable(i) = OffsetTable(i-1) + Counters(i-1)
            Next
        
            For i = 0 To Inputcount ;-1
                RawOut(OffsetTable(InputValues(i))) = ArrayToSort(i)
                OffsetTable(InputValues(i)) = OffsetTable(InputValues(i)) + 1
            Next
            
            ;Prepnext layer
            CopyMemory(@Rawout(),@ArrayToSort(),(Inputcount + 1) * 4)
        EndIf
    Next 

EndProcedure


Procedure QuickSort(Array List.l(1), First.l, last.l)

    MedVal.l
    hi.l
    lo.l

    If First >= Last 
        ProcedureReturn
    EndIf
    
    MedVal = List(first)
    lo = first
    hi = last

    While 1
        While List(hi) >= medval
            hi = hi - 1
            If hi <= lo
                Break
            EndIf
        Wend

        If hi <= lo 
            List(lo) = medval
            Break
        EndIf

        List(lo) = List(hi) ;swap Vals

        lo = lo + 1

        While List(lo) < medval
            lo = lo + 1
            If lo >= hi
                Break 
            EndIf
        Wend

        If lo >= hi 
            lo = hi
            List (hi) = medVal
            Break
        EndIf

        List(hi) = List(lo) ;Swap Vals
    Wend

    QuickSort(List(), first, lo -1)
    QuickSort(List(), lo +1, last)

EndProcedure

OpenConsole()
    
    Itemcount = 10000000
    Distribution = 10000000 ; also try with a lower number to see the difference
    
    Dim RawVals.l(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) 
    Next
    
    PrintN("Radix Sort")
    timer = ElapsedMilliseconds()       
    RadixSort(RawVals())      
    PrintN(Str(ElapsedMilliseconds() - timer))
    
    Dim RawVals.l(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) 
    Next
    
    PrintN("PB Sort")
    timer = ElapsedMilliseconds()       
    SortArray(RawVals(),#PB_Sort_Ascending)      
    PrintN(Str(ElapsedMilliseconds() - timer))
    
    Dim RawVals.l(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) 
    Next
    
    PrintN("Quick Sort")
    timer = ElapsedMilliseconds()       
    QuickSort(RawVals(), 0,Itemcount)     
    PrintN(Str(ElapsedMilliseconds() - timer))
    
    Input()


CloseConsole()
    
    

Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

This next version handles negative numbers too. I'm not updating the code above as it's good as an unsigned sort and so is useful as is. If the numbers are higher that 31bits then they may look like negative numbers at the end if displayed that way. This next code will fix the negatives and still be about 15-20% faster than pb sort for large sorts

Code: Select all



Structure Bptr 
    Byte.c[0] 
EndStructure 

Procedure RadixSort(Array ArrayToSort.l(1))

    Inputcount = ArraySize(ArrayToSort())
    
    Dim RawOut.l(Inputcount)
    Dim InputValues.c(Inputcount)
    Dim OutputValues.c(Inputcount)
    Protected CanSkipLayer.l
    Protected NegValCount.l

    For ByteLayer = 0 To 3 

        Dim Counters.l(255) ;256
        Dim OffsetTable.l(255) ;256 
        CanSkipLayer.l = #False
        
        *Ptr.Bptr 
    
        ;load input bytes for layer
        For i =  0 To Inputcount ;-1 
            *Ptr = @ArrayToSort(i)  
            InputValues(i) = *Ptr\Byte[ByteLayer]
            Counters(InputValues(i)) = Counters(InputValues(i)) + 1
        Next

        ;if data is zeros for that byte, skip sorting the whole layer
        If Counters(0) = Inputcount + 1  
            CanSkipLayer = #True  
        EndIf
        
        ;fix negatives
        If ByteLayer = 3 And CanSkipLayer = #False
            For i = 127 To 255    
                NegValCount.l = NegValCount + Counters(i)
            Next
        EndIf
            
        ;Sort
        If CanSkipLayer = #False
        
            For i = 1 To 255            
                OffsetTable(i) = OffsetTable(i-1) + Counters(i-1)
            Next
        
            For i = 0 To Inputcount ;-1
                RawOut(OffsetTable(InputValues(i))) = ArrayToSort(i)
                OffsetTable(InputValues(i)) = OffsetTable(InputValues(i)) + 1
            Next
            
            ;Prepnext layer
            If ByteLayer = 3 And NegValCount > 0
                CopyMemory(@Rawout() + ((Inputcount + 1) * 4) - (NegValCount * 4),@ArrayToSort(),(NegValCount * 4))
                CopyMemory(@Rawout(),@ArrayToSort() + (NegValCount * 4),((Inputcount + 1) * 4) - (NegValCount * 4))
            Else
                CopyMemory(@Rawout(),@ArrayToSort(),(Inputcount + 1) * 4)
            EndIf
            
        EndIf
    Next 

EndProcedure

OpenConsole()
    
    Itemcount = 20
    Distribution = 4000000000 ;(should generate some negative numbers too
    
    Dim RawVals.l(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) 
    Next

    For i = 0 To Itemcount ;-1
        Debug RawVals(i) 
    Next
    
    
    Debug " SORT !"    

    RadixSort(RawVals())      
    
    For i = 0 To Itemcount ;-1
        Debug RawVals(i) 
    Next
    


CloseConsole()
    
    
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Well, I'm still having fun with this. :D

This one sorts floats, positive and negative. It runs in about 2/3rds the time of PBs SortArray() so its a little more significant. Below is the speed test so turn off your debugger for it.

Probably the last things I need to do are cleaning up, build procs for different data types then try removing that .c type to make it unicode safe and chuck it all in an include to use.

If I'm feeling creative I'll try strings but I suspect they won't be as fast...

Code: Select all

Structure Bptr 
    Byte.c[0] 
EndStructure 

Procedure RadixSort(Array ArrayToSort.f(1))

    Inputcount = ArraySize(ArrayToSort())
    
    Dim RawOut.f(Inputcount)
    Dim InputValues.c(Inputcount)
    Dim OutputValues.c(Inputcount)
    Protected CanSkipLayer.l
    Protected NegValCount.l

    For ByteLayer = 0 To 3 

        Dim Counters.l(255) ;256
        Dim OffsetTable.l(255) ;256 
        CanSkipLayer.l = #False
        
        *Ptr.Bptr 
    
        ;load input bytes for layer
        For i =  0 To Inputcount ;-1 
            *Ptr = @ArrayToSort(i)  
            InputValues(i) = *Ptr\Byte[ByteLayer]
            Counters(InputValues(i)) = Counters(InputValues(i)) + 1
        Next

        ;if data is zeros for that byte, skip sorting the whole layer
        If Counters(0) = Inputcount + 1  
            CanSkipLayer = #True  
        EndIf
        
        ;calc negatives
        If ByteLayer = 3 And CanSkipLayer = #False
            For i = 127 To 255    
                NegValCount.l = NegValCount + Counters(i)
            Next
        EndIf
            
        ;Sort
        If CanSkipLayer = #False
            ;fix negative offsets in last byte
            If ByteLayer = 3            
                OffsetTable(0) = NegValCount;												
                For i=1 To 127
    	            OffsetTable(i) = OffsetTable(i-1) + Counters(i-1)
                Next
    
                OffsetTable(128) = 0;
                For i=129 To 255			
                    OffsetTable(i) = OffsetTable(i-1) + Counters(i-1);
                Next          
            Else 
                For i = 1 To 255            
                    OffsetTable(i) = OffsetTable(i-1) + Counters(i-1)
                Next        
            EndIf
                   
            For i = 0 To Inputcount ;-1
                RawOut(OffsetTable(InputValues(i))) = ArrayToSort(i)
                OffsetTable(InputValues(i)) = OffsetTable(InputValues(i)) + 1
            Next
            
            CopyMemory(@Rawout(),@ArrayToSort(),(Inputcount + 1) * 4)        
        EndIf
    Next 

EndProcedure


OpenConsole()
    
    Itemcount = 20000000
    Distribution = 4000000000
    
    Dim RawVals.f(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) / Random(Distribution) * -1
    Next

    For i = 0 To Itemcount ;-1
        Debug RawVals(i) 
    Next
;     
;     
     Debug " SORT !"    
    PrintN("Radix Sort")
    timer = ElapsedMilliseconds()       
    RadixSort(RawVals())      
    PrintN(Str(ElapsedMilliseconds() - timer))
    
    For i = 0 To Itemcount ;-1
        Debug RawVals(i) 
    Next
;     
    
    Dim RawVals.f(Itemcount)
    For i = 0 To Itemcount ;-1
        RawVals(i) = Random(Distribution) / Random(Distribution) 
    Next
    
    PrintN("PB Sort")
    timer = ElapsedMilliseconds()       
    SortArray(RawVals(),#PB_Sort_Ascending)      
    PrintN(Str(ElapsedMilliseconds() - timer))
 
    Input()


CloseConsole()
    
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Post Reply