You can use the index of coincidence (IC), for example, to help filter out noise, detect text, or break simple ciphers.
https://en.wikipedia.org/wiki/Index_of_coincidence
https://www.dcode.fr/index-coincidence#q6
Code:
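Procedure.f CoincidenceIndex(*buf, len, Normalize=#False)
  ;Returns the index of coincidence of the bytes in *buf: a value between 0.0 and 1.0 (if Normalize=#False), or -1 if invalid input
  ;At least 2 bytes are required, otherwise the denominator len*(len-1) is zero
  If (len < 2) Or (Not *buf): ProcedureReturn -1: EndIf
  ;Get character counts (distribution), one counter per byte value 0..255
  Protected Dim Cnt(255)
  Protected *nextchar.Ascii
  For *nextchar = *buf To *buf + (len-1)
    Cnt(*nextchar\a) + 1
  Next
  ;Calculate IC = sum( n*(n-1) ) / ( len*(len-1) ); doubles keep precision on large buffers
  Protected num.d, den.d, coefficient.i, i.i
  For i = 0 To 255
    If Cnt(i)
      coefficient + 1 ;number of distinct byte values actually present
      num + ( Cnt(i) * (Cnt(i)-1) )
      den + Cnt(i)
    EndIf
  Next i
  Protected IC.f = (num / ( den * (den - 1) ) )
  ;Normalizing multiplies by the number of distinct symbols seen in the buffer (not a fixed alphabet size such as 26).
  ;In the Wikipedia article's terms, the un-normalized value is the 'kappa-plaintext' and the normalized one is the IC.
  If Normalize = #True: IC * coefficient: EndIf
  ProcedureReturn IC
EndProcedure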
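;TEST ... (note: this assumes an ASCII build with 1 byte per character; in a Unicode build @buf$ points to 2-byte characters, so the string must be converted to an ASCII buffer first)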
;buf$ = "QPWKALVRXCQZIKGRBPFAEOMFLJMSDZVDHXCXJYEBIMTRQWNMEAIZRVKCVKVLXNEICFZPZCZZHKMLVZVZIZRRQWDKECHOSNYXXLSPMYKVQXJTDCIOMEEXDQVSRXLRLKZHOV"
buf$ = "This is a really long English sentence. I dont really have much to discuss here other than making up long boring sentences. For example, I like sports and fruit and music and guitars and all things that are interesting and I love science and rockets and the universe and quantum physics and various youtube videos"
IdxOfCoinc.f = CoincidenceIndex(@buf$, Len(buf$))
Debug StrF(IdxOfCoinc)
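Since the test above relies on an ASCII build, here is a minimal sketch of a wrapper for Unicode builds. It assumes PureBasic 5.50 or later (where Ascii() is available) and that CoincidenceIndex() from the listing above is already defined. ICOfString is a hypothetical helper name, not part of the original code.

```purebasic
; Sketch only: ICOfString is a hypothetical helper, not part of the original post.
; Assumes CoincidenceIndex() is defined above and PureBasic 5.50+ for Ascii().
Procedure.f ICOfString(text$)
  Protected result.f = -1
  ; Ascii() allocates a null-terminated copy with 1 byte per character
  Protected *ascii = Ascii(text$)
  If *ascii
    ; Len(text$) is the character count, which equals the byte count
    ; of the converted buffer (the terminating null is not included)
    result = CoincidenceIndex(*ascii, Len(text$))
    FreeMemory(*ascii) ; the buffer returned by Ascii() must be freed
  EndIf
  ProcedureReturn result
EndProcedure

Debug StrF(ICOfString("AAAAAAAAAAA"))
Debug StrF(ICOfString("ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"))
```

Characters outside the ASCII range are not handled meaningfully by this conversion, so the sketch is only suitable for plain ASCII text.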
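Typical values for letter-only text: English 0.0667, French 0.0778, German 0.0762, Spanish 0.0770, Italian 0.0738, Russian 0.0529
(The procedure counts every byte, including spaces, punctuation and upper/lower case, so results on raw text will differ from these.)
Random "A-Z" is around 0.0385 (1/26)
Well-encrypted ciphertext (e.g. a uniform random distribution over all byte values 0-255) is around 0.0039 (1/256)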
"AAAAAAAAAAA" = 1.0
"ABCDEFGHIJKLMNOPQRSTUVWXYZ" = 0.0
"ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ" = 0.0196078438
"ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ" = 0.0259740259
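The fixed-string values above can be checked by hand against the formula the procedure computes. The symbols are chosen here for illustration: n_i is the count of byte value i, N is the total length, and c is the number of equally likely symbols.

```latex
\begin{align*}
\mathrm{IC} &= \frac{\sum_{i} n_i (n_i - 1)}{N (N - 1)} \\
\mathrm{IC}(\text{11 identical letters}) &= \frac{11 \cdot 10}{11 \cdot 10} = 1 \\
\mathrm{IC}(\text{alphabet once}) &= \frac{26 \cdot (1 \cdot 0)}{26 \cdot 25} = 0 \\
\mathrm{IC}(\text{alphabet twice}) &= \frac{26 \cdot (2 \cdot 1)}{52 \cdot 51} = \frac{1}{51} \approx 0.0196 \\
\mathrm{IC}(\text{alphabet three times}) &= \frac{26 \cdot (3 \cdot 2)}{78 \cdot 77} = \frac{2}{77} \approx 0.0260 \\
\mathrm{IC}(\text{uniform over } c \text{ symbols}) &\approx \frac{1}{c}: \quad \frac{1}{26} \approx 0.0385, \quad \frac{1}{256} \approx 0.0039
\end{align*}
```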