RandomBit() or how to get random yes or no behavior.

Share your advanced PureBasic knowledge/code with the community.
freak
PureBasic Team
PureBasic Team
Posts: 5929
Joined: Fri Apr 25, 2003 5:21 pm
Location: Germany

Post by freak »

Randomness in information theory is all about entropy (more specifically Shannon entropy). Entropy basically means the "information content" of a source of data. If every bit of output is completely unrelated to the previous one, then its entropy is 1 bit. As soon as there is any bias in the source or a pattern etc, then the source provides less information and the entropy goes down.

Now you may think your source provides perfect entropy, but it doesn't. The output by your source is determined by external factors. There are many factors as you mentioned but still, if these factors are constant for a period of time then your source may observe some bias or pattern and with that its entropy goes down. If the entropy goes down, then the output is not truly random anymore in an information theoretic sense. (it may still "look random" of course). The only way to bring up the randomness in the output is to either find a new source for entropy or just output less data.

Of course this is not visible when extracting small data sets, but if you want to claim it to be "truly random" then small data sets are not enough. So what real random number generators (not pseudo-random, the comparison with PB's random is of course nonsense) do is they try to measure the amount of entropy that their source provides, and if the caller requests more random data than there is entropy available then they either try to find a new source, or block until they can provide the data.

See also here:
http://en.wikipedia.org/wiki/TRNG#Using_observed_events

Don't mind the talk about attackers, because controlling the output of your source from the outside would indeed be a hard task but look at the way such generators are constructed generally. So yes, a source like yours can be used to produce random data. But the source on its own is not perfect or "truly random" on its own. You have to collect the data it provides and estimate the amount of entropy it provides to be able to know how much random data you can actually produce. Its not a "random indefinite stream of bits" just because the CPU never stops running.

Thats what i was saying initially. Its not as simple as you put it.
> I'd never advise using RandomBit() as it is for crypto stuff, for that it's far to random, erm, unpredictable.
Sorry, but thats just the joke of the day
And why is that? It's obvious that using the TSC as a RNG would be a horrible choice, you can't have a unpredictable RNG, that's why crypto uses PRNG's instead to avoid anything resembling patterns (even naturally occurring random patterns).
Even a temporary pattern could make a key stream weak at a moment and thus allow an attacker to crack the stream or key.
Saying a source is "too random" for cryptography is just a joke, sorry.

Cryptographic key generation does not use pseudo-random number generators. Key generation uses external sources of entropy as much as possible (such as your source and others), then on top of that input runs a cryptographic save pseudo random number generator or hash function to provide the statistical properties that are needed (such as avoiding all 0's etc). Do you remember the OpenSSL problems in debian a while back ? The reason for all the trouble was that somebody removed the primary source of entropy for the key generation, leaving only the current process id as input. The generated key still "looked random" because of the processing that followed, but in truth the total pool of generated keys was very very small.
Just saying "haha your wrong" and not backing it up rubs me the wrong way, as far as I can recall there's never been any beef between us before, so where this is coming from I have no idea.
I wasn't trying to hurt your feelings. As I said, this may well fit your needs and those of a lot more people around here, but don't claim it to be perfect when it isn't.

peace :)
quidquid Latine dictum sit altum videtur
User avatar
NoahPhense
Addict
Addict
Posts: 1999
Joined: Thu Oct 16, 2003 8:30 pm
Location: North Florida

Post by NoahPhense »

freak wrote:However unpredictable you think this may be, it still not "truly random". :P

If it were that easy, everybody would be doing it this way ;)
Nothing can ever be truly random and no one can ever prove something
to be random. The appearance of random is everywhere. But that's all
about perception. I don't even think we can truly fathom randomization,
just as we cannot fathom our own non-existence. 8)

Ok, back to my porn.

- np
User avatar
Kaeru Gaman
Addict
Addict
Posts: 4826
Joined: Sun Mar 19, 2006 1:57 pm
Location: Germany

Post by Kaeru Gaman »

Nothing can ever be truly random and no one can ever prove something
to be random. The appearance of random is everywhere. But that's all
about perception.
if coincidence exist at all is still unanswered by philosophy.
chaos maths produces highly random-seeming results by highly restricted constructors.
so it truely is all about perception and appearence.
oh... and have a nice day.
SFSxOI
Addict
Addict
Posts: 2970
Joined: Sat Dec 31, 2005 5:24 pm
Location: Where ya would never look.....

Post by SFSxOI »

To me, random means no discernable repeatable predictable pattern possible to the method or outcome of the method.

To math, random is something completly different because its quantifiable in some way that even the outcome was predictable simply because an outcome was expected but we just didn't know what it would be until the math was done. But in the real world we discerne things as random simply because we don't understand them sometimes, but it doesn't mean there isn't really a predictable pattern or method somewhere and it could mean that there is no predictable pattern or method somewhere . Then you could probably drive yourself nuts trying to figure out if there was some sort of pattern or method or if the pattern or method its self was random. Then again consider that randomness affects our lives in so many ways, that car accident for example might not have happened if we turned left at the light before we stopped for coffee that morning, or had we just dropped two coins in that slot machine instead of one, or had we left the house two minutes earlier then that movie that was the last one at the video store might still have been there when we got there...etc.....

We just don't know, and never will know. Even if we could apply some theory or calculation or method and produce some sort of predictable outcome in the sense that a truely random thing could be produced even though we can't predict what that random thing will be, how do we know that the outcome is not a result of some random fluke even if we can repeat it? And, if by some chance we were to find out, by math or any other way, would we ever know that its was not just a random fluke that we did find out? I tend to think that the closest we will ever get to true randomness is to become so strange in trying to find it that we simply can no longer comprehend even our own selfs.

OK, gonna go watch that movie I rented because that one guy out there is pissed because I got the last copy before him. Now, where to stop for coffee along the way..... :)

Just a few random thoughts of my own. :)
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

You know what freak, I have to apologize because I think we are actually arguing from the same side just different viewpoints. O.o

Oh and SFSxOI?
That video you rented was registered in the store's computer, which synced up to the corporate server using the internet most likely, which caused a bunch of routers to handle the traffic, which influence the timing of other routers, which might have delayed your computer's clock update by a fraction of a second, which caused a change in the CPU cycles, which might have caused the LSB of the TSC to go 1 or 0 when it could have been the other way around. :P

And freak, as far as entropy goes of the LSB of the TSC, in theory it should be 1bit, but as at least one person noted in this thread (besides me) it is not.
In fact a 1bit with a entropy of 1bit would be 0,1,0,1,0,1,0,1,0,1,0,1 BTW! :P

So it's a good thing the LSB of the TSC is not truly random then, to avoid further clashing of the minds here freak, would you accept it if I said the RandomBit() procedure returns a "natural" random bit? (haven't checked if anything is actually called that previously though)
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

I just searched Google for: natural random bit
The second result points to the first post in this thread....
*scrambles to type the next line as quickly as possible*

Natural Random Bit (tm), you heard it here first folks, on the PureBasic (tm) forum *wink*.
User avatar
idle
Always Here
Always Here
Posts: 5097
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Post by idle »

It's easy enough to test, just do a 1000 coin tosses a few hundred times and if the count is consistently between 475 and 525 then it's a perfect RANDOM normal distribution!

I still say pin a tail on it and call it a weasel! :D

Code: Select all

EnableExplicit
 

Procedure.i RandomBit()
 !RDTSC ;Read the CPU Time Stamp Counter
 !AND EAX,$1 ;We only want the least significant bit as that changes the most.
 ProcedureReturn ;Return the EAX register content. (0 or 1 in this case)
EndProcedure

Define bit.i, ct.i,a.i,b.i

For b = 1 To 100
ct=0
For a = 0 To 1000
bit=RandomBit()
Delay(1) 
 
If bit
 ct+1
EndIf

Next 

Debug "Count " + Str(ct) 

Next 

Count 521
Count 475
Count 489
Count 501
Count 503
Count 488
Count 517
Count 491
Count 500
Count 518
Count 467
Count 521
Count 496
Count 506
Count 495
Count 498
Count 520
Count 520
Count 487
Count 486
Count 477
Count 525
Count 510
Count 499
Count 503
Count 496
Count 519
Count 476
Count 503
Count 471
Count 498
Count 515
Count 486
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

Like this one? :P

Code: Select all

DisableDebugger
EnableExplicit

Procedure.i RandomBit()
 !RDTSC ;Read the CPU Time Stamp Counter
 !AND EAX,$1 ;We only want the least significant bit as that changes the most.
 ProcedureReturn ;Return the EAX register content. (0 or 1 in this case)
EndProcedure


Define.i bitone,bitzero,randone,randzero,i,n

For n=1 To 100
DisableDebugger
bitone=0
For i=1 To 1000
 If Not RandomBit()
  bitzero+1
 Else
  bitone+1
 EndIf
Next
EnableDebugger
Debug bitone
Next

Debug ""

For n=1 To 100
DisableDebugger
randone=0
For i=1 To 1000
 If Random(1)
  randone+1
 Else
  randzero+1
 EndIf
Next
EnableDebugger
Debug randone
Next
PureBasic's Random() range from 460 to 534.
RandomBit() range from 269 to 665.

So Random() has better entropy (statistical), while RandomBit() has more unpredictability (natural). Or at least should, in theory.
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

@idle: your example is better as one would not normally do it like in my post above.

I also read something else, it seems RDTSC is rather expensive and uses like 200 cpu cycles. I don't know about you, but I can't help but think that PureBasic's Random() are less than 200 cycles. (looks at freak/fred)
So for speed reasons alone, Random() might be the better choice (if one plan to generate thousands of coin tosses in a row).
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

Using idle's test code I got this: (min count 458, max count 543)
Count 499
Count 479
Count 518
Count 537
Count 529
Count 496
Count 502
Count 514
Count 503
Count 503
Count 526
Count 487
Count 516
Count 489
Count 504
Count 479
Count 509
Count 517
Count 504
Count 498
Count 505
Count 478
Count 504
Count 501
Count 509
Count 481
Count 511
Count 501
Count 492
Count 494
Count 464
Count 487
Count 484
Count 470
Count 496
Count 500
Count 508
Count 513
Count 521
Count 514
Count 496
Count 502
Count 490
Count 502
Count 543
Count 515
Count 486
Count 487
Count 504
Count 501
Count 497
Count 503
Count 475
Count 511
Count 497
Count 511
Count 523
Count 491
Count 508
Count 509
Count 458
Count 491
Count 471
Count 514
Count 499
Count 500
Count 509
Count 518
Count 497
Count 495
Count 490
Count 480
Count 504
Count 464
Count 496
Count 485
Count 504
Count 504
Count 534
Count 506
Count 501
Count 484
Count 517
Count 516
Count 475
Count 488
Count 477
Count 521
Count 486
Count 497
Count 503
Count 495
Count 521
Count 499
Count 507
Count 530
Count 508
Count 509
Count 507
Count 497
Nice spread.
User avatar
idle
Always Here
Always Here
Posts: 5097
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Post by idle »

that was quick mine is still running!
freak
PureBasic Team
PureBasic Team
Posts: 5929
Joined: Fri Apr 25, 2003 5:21 pm
Location: Germany

Post by freak »

If you want to know about PB's random, here is a paper that describes its theory: http://www.agner.org/random/theory/chaosran.pdf
There is some code for it here: http://oldmill.uchicago.edu/~wilder/Cod ... andomc.htm

PB's is a RANROT type W generator. The PB Random() is my own implementation because the code by the paper author is under the GPL.
This type of generator is quite fast because it only uses addition and bit-rotation to compute the next random number. (a multiplication is required to fit the result to the maximum though)

A tip: If you use Random() with the unsigned maximum for an integer type (so $FFFFFFFF on x86, $FFFFFFFFFFFFFFFF on x64) then even this multiplication step is skipped and you get plain 32 or 64 pseudo-random bits which actually gives you better statistical properties than using another maximum. (because using a maximum that is not a power of 2 makes some numbers slightly less/more likely). Of course this may give you negative numbers, but if you only care about the bits then that isn't a problem. All this works since 4.30 only.
quidquid Latine dictum sit altum videtur
User avatar
idle
Always Here
Always Here
Posts: 5097
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Post by idle »

haven't thought about this yet but looks good, plotting the results and doing a linear regression turned out as f(x) =0.01x + 126.8
test of using random() resulted in f(x) = -0.08x + 131.87

would have to do loads of tests but it does look good.
the only issue is that it needs a delay between the probes.

Code: Select all

EnableExplicit

Procedure SetBit(*num,bit)
   Protected of.i
   of = bit >> 3
   PokeB(*num+of,(PeekB(*num+of) | (1 << (bit % 8))))
EndProcedure

Procedure.i RandomBit()
 !RDTSC ;Read the CPU Time Stamp Counter
 !AND EAX,$1 ;We only want the least significant bit as that changes the most.
 ProcedureReturn ;Return the EAX register content. (0 or 1 in this case)
EndProcedure

Procedure RRandom(val)
 
 Protected n.f, num, a,bit   
 
  n = Log(val)/ Log(2)
  For a = 1 To n
  bit=RandomBit()
   Delay(1)
   If bit
    setbit(@num,a)  
   EndIf
 Next 
 
 ProcedureReturn num % val 

EndProcedure

Define bit.i, num.i,a.i,b.i

For b = 1 To 100

Debug RRandom(255)

Next 
Windows 11, Manjaro, Raspberry Pi OS
Image
AND51
Addict
Addict
Posts: 1040
Joined: Sun Oct 15, 2006 8:56 pm
Location: Germany
Contact:

Post by AND51 »

freak wrote:A tip: If you use Random() with the unsigned maximum for an integer type (so $FFFFFFFF on x86, $FFFFFFFFFFFFFFFF on x64) then even this multiplication step is skipped and you get plain 32 or 64 pseudo-random bits
That's very interesting! :)

Could you post more of these tips'n'hints? Me and others, I think, would be very interested in looking over PureBasics shoulder. The best would be to add such precious knowledge to the manual, although it is 'only' a reference. This ensures that the knowledge won't be forgotten and can easily be accessed by everybody. A less good alternative would be to collect these hints at a central point, like your PureBasic blog or an extra forum's thread (where only you have access to (to avoid spam/discussing)).

Having such a background knowledge would gretaly help us to understand PureBasic better and to optimize our programs.
PB 4.30

Code: Select all

onErrorGoto(?Fred)
User avatar
Rescator
Addict
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

freak wrote:A tip: If you use Random() with the unsigned maximum for an integer type (so $FFFFFFFF on x86, $FFFFFFFFFFFFFFFF on x64) then even this multiplication step is skipped and you get plain 32 or 64 pseudo-random bits which actually gives you better statistical properties
Holy cow freak, that is kickass.

Wait a sec, that means I could just do a:

Code: Select all

Define.i num,i
For i=1 To 100
 num=Random($FFFFFFFFFFFFFFFF)
 Debug num&$1 ;get LSB
Next
To get coin toss.
And for larger numbers, all I would need to do is change the mask to get the "n" number of LSB's...
Hang on, would using a bitmask be a speedier way to return the result than Random() currently does? (it could then always use the max 64bits internally, and remove that multiplication step)



@idle: you might like this test example. Note! the entropy comments are as seen on my system, yours might vary. I got 1.0 most of the time, and once in a while 0.99
Although with freak revealing the "trick", combined with bitmasking the result to get the random bits/bytes you want, for potential speed reasons alone I'm leaning towards Random() now though. (haven't done speed comparison tests yet on RandomBit() and Random(), RandomBit() must be 64 times faster than Random() to actually have a speed gain. ugh)

Code: Select all

EnableExplicit
DisableDebugger
Procedure.i RandomByte()
 !XOR ECX,ECX ;Set ECX to 0.

 !RDTSC ;get TSC (and put it in EDX:EAX, we only want the LSB in EAX though)
 !AND EAX,$1 ;discard all but the LSB in EAX (and put the result in EAX)
 !OR ECX,EAX ;OR EAX with ECX to build our bit collection

 !RDTSC
 !AND EAX,$1
 !SHL EAX,1 ;shift the bit to the left in preparation for the OR
 !OR ECX,EAX
 ;All this should probably be turned into a loop for speed and scalability (1 to 64bit).
 ;Any ASM gurus wanna give a tight loop a try?

 !RDTSC
 !AND EAX,$1
 !SHL EAX,2
 !OR ECX,EAX

 !RDTSC
 !AND EAX,$1
 !SHL EAX,3
 !OR ECX,EAX

 !RDTSC
 !AND EAX,$1
 !SHL EAX,4
 !OR ECX,EAX

 !RDTSC
 !AND EAX,$1
 !SHL EAX,5
 !OR ECX,EAX

 !RDTSC
 !AND EAX,$1
 !SHL EAX,6
 !OR ECX,EAX

 !RDTSC
 !AND EAX,$1
 !SHL EAX,7
 !OR ECX,EAX

 !MOV EAX,ECX ;copy our bit collection to EAX
 ProcedureReturn ;return EAX
EndProcedure

EnableDebugger
Define.i i,sum,num
Define.d calc
sum=0
For i=1 To 100;0000
 num=RandomByte()+1 ;we want 1-256 in this case and not 0-255 which it really returns.
 sum+num
 ;Debug RSet(Bin(num,#PB_Byte),8,"0")
 ;Debug num
Next
calc=(i*128)/sum ;calculate the summed spread.
Debug StrD(calc,2)
;If result is 1.0 this equals perfect entropy,
;you'll most likely see it showing 0.99,1.0,1.01 if you run this test multiple times,
;so it's not perfect, it all depends on number of loops and
;rounding errors of the float math used to calculate the sum,
;and the actual entropy as well.
;There is no rounding issues with RandomByte() obviously as it's integer.
Post Reply