Fastest unsorting (randomize) array method?

Just starting out? Need help? Post your questions and find answers here.
zikitrake
Addict
Addict
Posts: 868
Joined: Thu Mar 25, 2004 2:15 pm
Location: Spain

Fastest unsorting (randomize) array method?

Post by zikitrake »

I am seeking a fast method to 'randomize' an array of millions of data.

Example:

Code: Select all

Dim number.l(5000000)
for n = 0 to 5000000
   number(n) = n
next n
and later, to be able to 'randomize' it all the times that I want.

Can anyone help me?.

Regards!

:oops: sorry for my bad-bad-bad english :oops:
PB 6.21 beta, PureVision User
User avatar
utopiomania
Addict
Addict
Posts: 1655
Joined: Tue May 10, 2005 10:00 pm
Location: Norway

Post by utopiomania »

Code: Select all

Dim number.l(5) 
for n = 0 to 4
   number(n) = n
next n

dim rnd.l(5)
for n = 0 to 4 
   rnd(n) = number(random(4))
   debug rnd(n)
next n
end
:?:
Xombie
Addict
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA
Contact:

Post by Xombie »

I think he means he wants to fill the array with his own data and then do some kind of random access to it? Like randomly swapping around the elements. ... right? Or no?
User avatar
Guimauve
Enthusiast
Enthusiast
Posts: 742
Joined: Wed Oct 22, 2003 2:51 am
Location: Canada

Post by Guimauve »

I guess something like this will do the job.

Code: Select all

Dim number.l(10)

For Index = 0 To 10
   number(Index) = Index
Next 

Debug "Sorted Array"

For Index = 0 To 10
   Debug number(Index)
Next 

Debug ""

For Index = 0 To 10

   IndexB = Random(10)
   
   If IndexB <> Index 
      Swap number(Index), number(IndexB)
   EndIf
   
Next
 
Debug "UnSorted Array"

For Index = 0 To 10
   Debug number(Index)
Next 

Debug ""
Regards
Guimauve
zikitrake
Addict
Addict
Posts: 868
Joined: Thu Mar 25, 2004 2:15 pm
Location: Spain

Post by zikitrake »

Xombie wrote:I think he means he wants to fill the array with his own data and then do some kind of random access to it? Like randomly swapping around the elements. ... right? Or no?
Yes!...

Guimauve yes... I use these method... but I'm searching a faster method :?
PB 6.21 beta, PureVision User
Hurga
Enthusiast
Enthusiast
Posts: 148
Joined: Thu Jul 17, 2003 2:53 pm
Contact:

Post by Hurga »

Not sure if it´s faster, but maybe...

1. Create your array and fill in the data

2. Allocate a memory array 4 bytes per Array-item

3. store the adress of each item in the mem

4. "swap" to pointer in mem (this will do the same as swapping the elements)

NOTE: To access the scrambled data, you have to access it via the mem-pointers. With this way, the array itself kepps unchanged.
Or in other words: You create an index to your list and scramble the index...

Hope this would help a bit
zikitrake
Addict
Addict
Posts: 868
Joined: Thu Mar 25, 2004 2:15 pm
Location: Spain

Post by zikitrake »

Thank you Hurga I'll test your 'code'
PB 6.21 beta, PureVision User
Xombie
Addict
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA
Contact:

Post by Xombie »

Just curious but rather than randomize the data itself, why not make a wrapper function to use when accessing the array?

Like...

Code: Select all

;-
Global Dim number.l(5000000) 
For n = 0 To 5000000 
   number(n) = n 
Next n 
;-
Procedure.l GetNumber()
   ;
   Protected a.l
   ;
   a = Random(5000000)
   ;
   ProcedureReturn number(a)
   ;
EndProcedure
;-
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(number(i))
   ;
Next i
Debug "-------... Randomizing ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber())
   ;
Next i
;-
Debug "Done"
;-
End
... what do you think? If you're going to be continuously randomizing the elements wouldn't it make more sense to just randomize your access to the elements? I'm not sure since I don't know exactly what you're doing with the information.
Xombie
Addict
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA
Contact:

Post by Xombie »

Or, as an alternative to my code above, you can try this...

Code: Select all

;-
#MaxElements = 5000000
;-
Global Dim number.l(#MaxElements) 
For n = 0 To #MaxElements 
   number(n) = n 
Next n 
Global Dim keeper_number.l(#MaxElements)
;-
Procedure UnsortNumber()
   ;
   Protected iLoop.l
   ;
   For iLoop = 0 To #MaxElements
      ;
      keeper_number(iLoop) = 0
      ;
   Next iLoop
   ;
EndProcedure
Procedure.l GetNumber(i)
   ;
   Protected a.l
   ;
   Protected iLoop.l
   ;
   Protected CaughtMatch.b
   ; 
   If keeper_number(i) = -1
      ;
      ProcedureReturn number(0)
      ;
   Else
      ;
      If keeper_number(i) = 0
         ;
         Repeat
            ;
            a = Random(#MaxElements)
            ;
            For iLoop = 0 To #MaxElements
               ;
               If keeper_number(iLoop) = a : CaughtMatch = #True : EndIf
               ;
            Next iLoop
            ;
         Until CaughtMatch = #False
         ;
         keeper_number(i) = a
         ;
         ProcedureReturn number(keeper_number(i))
         ;
      Else
         ;
         ProcedureReturn number(keeper_number(i))
         ;
      EndIf
      ;
   EndIf
   ;
EndProcedure
;-
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(number(i))
   ;
Next i
Debug "-------... Randomizing ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
Debug "-------... Extending ... ----------"
For i.l = 0 To 20
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
;
UnsortNumber()
;
Debug "-------... Unsorted ... ----------"
For i.l = 0 To 10
   ;
   Debug Str(i)+" : "+Str(GetNumber(i))
   ;
Next i
;
;-
Debug "Done"
MessageRequester("", "Done")
;-
End
You get persistence of the array indexes until UnsortNumber is called. At the expense of speed and memory, unfortunately. It's not as noticeable if you have the debugger turned off. With it enabled, it's horribly slow.

Just some different ideas for you to think about.

..: Edit :.. I just re-read what Hurga said and that's almost exactly what I did :oops:
Hurga
Enthusiast
Enthusiast
Posts: 148
Joined: Thu Jul 17, 2003 2:53 pm
Contact:

Post by Hurga »

Hi Xombi

Not sure, but I think (or better: hope :D ) my method should be faster....

if you use this to access the data:

Code: Select all

*l.Long

*l = (Itemtoaccess) <<2

*l\l (=item in the array)


should be a fast access. If you have more dimensions or a structure, you can access it this way, too.
Just ...

Code: Select all

*l.Long

*l = (Itemtoaccess) <<2 + OffsetOf (...)  - for structured arrays or linked lists)




*l\l (=item in the array)

Maybe zikitrake test both ways and post the results here... just curious...
Xombie
Addict
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA
Contact:

Post by Xombie »

I meant to say "nearly" what you had suggested rather than "exactly".

Hurga - I'm not following your code very well. So, you allocate a block of pointers (4 times the size of the count of items in the array). I get that part but I really don't get it after that.
Xombie
Addict
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA
Contact:

Post by Xombie »

I did some quick research online and stumbled upon an algorithm that seems to do the trick quickly. I think I have it implemented correctly.

Code: Select all

;-
#MaxElements = 5000000
;-
Global Dim number.l(#MaxElements) 
For n = 0 To #MaxElements 
   number(n) = n 
Next n 
;-
Procedure FisherYates()
   ;
   Protected i.l
   ;
   Protected j.l
   ;
   i = #MaxElements
   ;
   While i
      ;
      j = Random(i + 1)
      ;
      Swap number(i), number(j)
      ;
      i - 1
      ;
   Wend
   ;
EndProcedure
;-
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
Debug " ....................: Shuffle :...................."
FisherYates()
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
Debug " ....................: Shuffle :...................."
FisherYates()
For i.l = 0 To 20
   Debug Str(i)+" : "+Str(number(i))
Next i
;-
Debug "Done..."
;-
End
;-
The Fisher-Yates Shuffle algorithm. Even with debug turned on, it's quick. However, I'd really suggest it being tested thoroughly first.
Hurga
Enthusiast
Enthusiast
Posts: 148
Joined: Thu Jul 17, 2003 2:53 pm
Contact:

Post by Hurga »

I made some procs for the main handling of the mem-items, hope you can see what I mean - (showing the main idea behind) :-)

Code: Select all

; General Procs for Mem-Item handling
Procedure.l AddMemItem(Mem_P.l, Item_P.l) ; returns the new pointer to mem and add the pointer to the Mem-list
  *l.Long
  If Mem_P = 0 ; no mem reserved 
    Mem_P = AllocateMemory(4) ; a long
  Else
    Mem_P = ReAllocateMemory(Mem_P, MemorySize(Mem_P) + 4)
  EndIf
  *l = Mem_P + MemorySize(Mem_P) - 4
  *l\l = Item_P
  
  ProcedureReturn Mem_P
EndProcedure
Procedure.l CountMemItems(Mem_P.l) ; returns the number of Items here
  Back_L.l
  If Mem_P <> 0
    Back_L = MemorySize(Mem_P)>>2
  Else
    Back_L = 0
  EndIf
  ProcedureReturn Back_L
  
EndProcedure
Procedure.l SearchItemInIndex(Mem_P.l, Item_P.l) ; returns the itemnumber of these item in this index, else -1
  Back_L.l = -1
  *l.Long
  If Mem_P
    For x = 0 To MemorySize(Mem_P)>>2 -1
      *l = Mem_P + x<<2
      If *l\l = Item_P
        Back_L = x
        Break
      EndIf
    Next
  EndIf
  ProcedureReturn Back_L
EndProcedure
Procedure.l ReleaseItemMem(Mem_P.l) ; release the memory and returns 0
  *l.Long
  If Mem_P ; mem reserved 
    FreeMemory(Mem_P)
  EndIf
  ProcedureReturn 0
EndProcedure
Procedure.l DeleteMemItem(Mem_P.l, Item_L.l) ; deletes the Item
  *l.Long
  If Item_L > -1 And Item_L <= CountMemItems(Mem_P)
    *l = Mem_P + Item_L<<2
    
    ; shuffle the Items up
    For x = Item_L To MemorySize(Mem_P)>>2 - 2
      CopyMemory(Mem_P + (x+1)<<2, Mem_P + x<<2, 4)
    Next
    
    ; realloc the mem
    If MemorySize(Mem_P) - 4 = 0
      FreeMemory(Mem_P)
      Mem_P = 0
    Else
      Mem_P = ReAllocateMemory(Mem_P, MemorySize(Mem_P) - 4)
    EndIf
  EndIf
  ProcedureReturn Mem_P
EndProcedure
If you need it faster, maybe better to use the code directly (or make macros out of them)
Post Reply