Page 1 of 1

Fastest unsorting (randomize) array method?

Posted: Mon Dec 04, 2006 11:54 pm
by zikitrake
I am seeking a fast method to 'randomize' an array of millions of data.

Example:

Code: Select all

Dim number.l(5000000)
for n = 0 to 5000000
   number(n) = n
next n
and later, to be able to 'randomize' it all the times that I want.

Can anyone help me?.

Regards!

:oops: sorry for my bad-bad-bad english :oops:

Posted: Tue Dec 05, 2006 12:17 am
by utopiomania

Code: Select all

Dim number.l(5) 
for n = 0 to 4
   number(n) = n
next n

dim rnd.l(5)
for n = 0 to 4 
   rnd(n) = number(random(4))
   debug rnd(n)
next n
end
:?:

Posted: Tue Dec 05, 2006 1:04 am
by Xombie
I think he means he wants to fill the array with his own data and then do some kind of random access to it? Like randomly swapping around the elements. ... right? Or no?

Posted: Tue Dec 05, 2006 2:49 am
by Guimauve
I guess something like this will do the job.

Code: Select all

Dim number.l(10)

For Index = 0 To 10
   number(Index) = Index
Next 

Debug "Sorted Array"

For Index = 0 To 10
   Debug number(Index)
Next 

Debug ""

For Index = 0 To 10

   IndexB = Random(10)
   
   If IndexB <> Index 
      Swap number(Index), number(IndexB)
   EndIf
   
Next
 
Debug "UnSorted Array"

For Index = 0 To 10
   Debug number(Index)
Next 

Debug ""
Regards
Guimauve

Posted: Tue Dec 05, 2006 8:28 am
by zikitrake
Xombie wrote:I think he means he wants to fill the array with his own data and then do some kind of random access to it? Like randomly swapping around the elements. ... right? Or no?
Yes!...

Guimauve yes... I use these method... but I'm searching a faster method :?

Posted: Tue Dec 05, 2006 9:04 am
by Hurga
Not sure if it´s faster, but maybe...

1. Create your array and fill in the data

2. Allocate a memory array 4 bytes per Array-item

3. store the adress of each item in the mem

4. "swap" to pointer in mem (this will do the same as swapping the elements)

NOTE: To access the scrambled data, you have to access it via the mem-pointers. With this way, the array itself kepps unchanged.
Or in other words: You create an index to your list and scramble the index...

Hope this would help a bit

Posted: Tue Dec 05, 2006 1:50 pm
by zikitrake
Thank you Hurga I'll test your 'code'

Posted: Thu Dec 07, 2006 4:32 pm
by Xombie
Just curious but rather than randomize the data itself, why not make a wrapper function to use when accessing the array?

Like...

Code: Select all

;-
Global Dim number.l(5000000) 
For n = 0 To 5000000 
   number(n) = n 
Next n 
;-
Procedure.l GetNumber()
   ;
   Protected a.l
   ;
   a = Random(5000000)
   ;
   ProcedureReturn number(a)
   ;
EndProcedure
;-
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(number(i))
   ;
Next i
Debug "-------... Randomizing ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber())
   ;
Next i
;-
Debug "Done"
;-
End
... what do you think? If you're going to be continuously randomizing the elements wouldn't it make more sense to just randomize your access to the elements? I'm not sure since I don't know exactly what you're doing with the information.

Posted: Thu Dec 07, 2006 4:54 pm
by Xombie
Or, as an alternative to my code above, you can try this...

Code: Select all

;-
#MaxElements = 5000000
;-
Global Dim number.l(#MaxElements) 
For n = 0 To #MaxElements 
   number(n) = n 
Next n 
Global Dim keeper_number.l(#MaxElements)
;-
Procedure UnsortNumber()
   ;
   Protected iLoop.l
   ;
   For iLoop = 0 To #MaxElements
      ;
      keeper_number(iLoop) = 0
      ;
   Next iLoop
   ;
EndProcedure
Procedure.l GetNumber(i)
   ;
   Protected a.l
   ;
   Protected iLoop.l
   ;
   Protected CaughtMatch.b
   ; 
   If keeper_number(i) = -1
      ;
      ProcedureReturn number(0)
      ;
   Else
      ;
      If keeper_number(i) = 0
         ;
         Repeat
            ;
            a = Random(#MaxElements)
            ;
            For iLoop = 0 To #MaxElements
               ;
               If keeper_number(iLoop) = a : CaughtMatch = #True : EndIf
               ;
            Next iLoop
            ;
         Until CaughtMatch = #False
         ;
         keeper_number(i) = a
         ;
         ProcedureReturn number(keeper_number(i))
         ;
      Else
         ;
         ProcedureReturn number(keeper_number(i))
         ;
      EndIf
      ;
   EndIf
   ;
EndProcedure
;-
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(number(i))
   ;
Next i
Debug "-------... Randomizing ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
Debug "-------... Extending ... ----------"
For i.l = 0 To 20
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
;
UnsortNumber()
;
Debug "-------... Unsorted ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
;
;-
Debug "Done"
MessageRequester("", "Done")
;-
End
You get persistence of the array indexes until UnsortNumber is called. At the expense of speed and memory, unfortunately. It's not as noticeable if you have the debugger turned off. With it enabled, it's horribly slow.

Just some different ideas for you to think about.

..: Edit :.. I just re-read what Hurga said and that's almost exactly what I did :oops:

Posted: Thu Dec 07, 2006 5:26 pm
by Hurga
Hi Xombi

Not sure, but I think (or better: hope :D ) my method should be faster....

if you use this to access the data:

Code: Select all

*l.Long

*l = (Itemtoaccess) <<2

*l\l (=item in the array)


should be a fast access. If you have more dimensions or a structure, you can access it this way, too.
Just ...

Code: Select all

*l.Long

*l = (Itemtoaccess) <<2 + OffsetOf (...)  - for structured arrays or linked lists)




*l\l (=item in the array)

Maybe zikitrake test both ways and post the results here... just curious...

Posted: Thu Dec 07, 2006 5:58 pm
by Xombie
I meant to say "nearly" what you had suggested rather than "exactly".

Hurga - I'm not following your code very well. So, you allocate a block of pointers (4 times the size of the count of items in the array). I get that part but I really don't get it after that.

Posted: Thu Dec 07, 2006 8:28 pm
by Xombie
I did some quick research online and stumbled upon an algorithm that seems to do the trick quickly. I think I have it implemented correctly.

Code: Select all

;-
#MaxElements = 5000000
;-
Global Dim number.l(#MaxElements) 
For n = 0 To #MaxElements 
   number(n) = n 
Next n 
;-
Procedure FisherYates()
   ;
   Protected i.l
   ;
   Protected j.l
   ;
   i = #MaxElements
   ;
   While i
      ;
      j = Random(i + 1)
      ;
      Swap number(i), number(j)
      ;
      i - 1
      ;
   Wend
   ;
EndProcedure
;-
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
Debug " ....................: Shuffle :...................."
FisherYates()
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
Debug " ....................: Shuffle :...................."
FisherYates()
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
;-
Debug "Done..."
;-
End
;-
The Fisher-Yates Shuffle algorithm. Even with debug turned on, it's quick. However, I'd really suggest it being tested thoroughly first.

Posted: Mon Dec 11, 2006 2:26 pm
by Hurga
I made some procs for the main handling of the mem-items, hope you can see what I mean - (showing the main idea behind) :-)

Code: Select all

; General Procs for Mem-Item handling
Procedure.l AddMemItem(Mem_P.l, Item_P.l) ; returns the new pointer to mem and add the pointer to the Mem-list
  *l.Long
  If Mem_P = 0 ; no mem reserved 
    Mem_P = AllocateMemory(4) ; a long
  Else
    Mem_P = ReAllocateMemory(Mem_P, MemorySize(Mem_P) + 4)
  EndIf
  *l = Mem_P + MemorySize(Mem_P) - 4
  *l\l = Item_P
  
  ProcedureReturn Mem_P
EndProcedure
Procedure.l CountMemItems(Mem_P.l) ; returns the number of Items here
  Back_L.l
  If Mem_P <> 0
    Back_L = MemorySize(Mem_P)>>2
  Else
    Back_L = 0
  EndIf
  ProcedureReturn Back_L
  
EndProcedure
Procedure.l SearchItemInIndex(Mem_P.l, Item_P.l) ; returns the itemnumber of these item in this index, else -1
  Back_L.l = -1
  *l.Long
  If Mem_P
    For x = 0 To MemorySize(Mem_P)>>2 -1
      *l = Mem_P + x<<2
      If *l\l = Item_P
        Back_L = x
        Break
      EndIf
    Next
  EndIf
  ProcedureReturn Back_L
EndProcedure
Procedure.l ReleaseItemMem(Mem_P.l) ; release the memory and returns 0
  *l.Long
  If Mem_P ; mem reserved 
    FreeMemory(Mem_P)
  EndIf
  ProcedureReturn 0
EndProcedure
Procedure.l DeleteMemItem(Mem_P.l, Item_L.l) ; deletes the Item
  *l.Long
  If Item_L > -1 And Item_L <= CountMemItems(Mem_P)
    *l = Mem_P + Item_L<<2
    
    ; shuffle the Items up
    For x = Item_L To MemorySize(Mem_P)>>2 - 2
      CopyMemory(Mem_P + (x+1)<<2, Mem_P + x<<2, 4)
    Next
    
    ; realloc the mem
    If MemorySize(Mem_P) - 4 = 0
      FreeMemory(Mem_P)
      Mem_P = 0
    Else
      Mem_P = ReAllocateMemory(Mem_P, MemorySize(Mem_P) - 4)
    EndIf
  EndIf
  ProcedureReturn Mem_P
EndProcedure
If you need it faster, maybe better to use the code directly (or make macros out of them)