Using PB to detect fraud in the Mexican elections?
Maybe some of you have read about the Mexican elections. One party claims that fraud was committed.
Now, the interesting part is that a Mexican journalist friend of mine asked me to invite some coders to develop small tools (because the task is so easy) to verify whether that fraud was real.
I invite as many members of this community as possible to develop some small code to achieve this.
The task is this one:
-Download the database (4 MB) of the Mexican election from the Federal Electoral Institute (IFE). It's a txt file with one voting place per line.
-Read it using PB and check whether there are any arithmetical errors, as one party claims.
The idea is not to read the whole database each time, but maybe some random voting places (don't know how to say it in English), check whether there are mathematical errors and show the results. (Or if you want to verify the whole data, that's okay.)
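The random-sampling idea could be sketched roughly as follows. This is only a sketch under assumptions: the file name PREP2006-Presidente.txt, the two header lines that are skipped, the sample size of 10 and the use of ListSize() (PureBasic 4.40 or later) are not part of the task description. Columns 9, 10 and 12 are ballots received, ballots left over and ballots deposited, as in the column list further down.

```purebasic
; Sketch: check a random sample of voting places instead of the whole file.
; Assumed, not given in the post: file name, two header lines, sample size.
#SampleSize = 10

NewList Rows.s()

If ReadFile(0, "PREP2006-Presidente.txt")
  ReadString(0) : ReadString(0)          ; skip the two header lines (assumption)
  While Eof(0) = 0
    AddElement(Rows())
    Rows() = ReadString(0)               ; keep every voting place in memory
  Wend
  CloseFile(0)

  If ListSize(Rows()) > 0
    For i = 1 To #SampleSize
      ; Random(n) returns 0..n and SelectElement() is 0-based
      SelectElement(Rows(), Random(ListSize(Rows()) - 1))
      Received  = Val(StringField(Rows(), 9, "|"))
      LeftOver  = Val(StringField(Rows(), 10, "|"))
      Deposited = Val(StringField(Rows(), 12, "|"))
      If Received - LeftOver <> Deposited
        Debug "Mismatch: " + Rows()
      EndIf
    Next
  EndIf
Else
  Debug "Could not open the database."
EndIf
```

The sample is drawn with replacement, so the same voting place can come up twice; for a 4 MB file, checking every line is just as practical.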
-------------------------------
If you are interested in participating:
The database can be downloaded from
http://www.ife.org.mx/documentos/proces ... ep2006.htm
Go to the PREP2006-Presidente.zip link and you will get the 4 MB file (it's the official one!!)
The file has the data in these columns:
ESTADO|NOM_ESTADO|DISTRITO|SECCION|ID_CASILLA|TIPO_CASILLA|EXT_CONTIGUA|NUM_ACTA_IMPRESO|NUM_BOLETAS_RECIBIDAS|NUM_BOLETAS_SOBRANTES|TOTAL_CIUDADANOS_VOTARON|NUM_BOLETAS_DEPOSITADAS|PAN|PAN_VCAP|ALIANZA_POR_MEXICO|APM_VCAP|POR_EL_BIEN_DE_TODOS|PBT_VCAP|NUEVA_ALIANZA|NA_VCAP|ALTERNATIVA_SOCIALDEMOCRATA|ASDC_VCAP|NUM_VOTOS_CAN_NREG|NUM_VOTOS_NULOS|TOTAL_VOTOS|CASILLA|LISTA_NOMINAL|HORA_RECEPCION_CEDAT|HORA_CAPTURA_CEDAT|HORA_REGISTRO
That means: (the ones with * are the ones whose arithmetic must be verified)
State
Name of the state
District
Section
ID of the voting place
Kind of voting place
Exterior
Number of the printed tally sheet (acta)
*Number of ballots received at that voting place (the papers on which people vote)
*Number of ballots that were not used at the end of the day
*Total number of citizens who voted at that place
*Number of ballots found in the ballot box
The next 10 columns (PAN to ASDC_VCAP) are the votes for the different parties
*Number of votes for unregistered candidates
*Number of null votes
*Total votes
Voting place
*Number of people allowed to vote at that voting place
Time of reception
Time of capture
Time of registration
Okay, that's the info.
The important thing here is NOT to verify how many votes each party received, but to detect whether there were MORE or FEWER ballots in the ballot box than there were supposed to be, based on the number of ballots received at the beginning of the day MINUS the number of ballots that were not used at the end of the day, etc.
If you look carefully, many columns must match for each voting place. IF NOT, it's because something strange happened.
Okay, one party claims that there are plenty of errors and that it is very easy for anyone to notice them with any elementary analysis.
I've done this myself and the result was that IT'S A MESS: something like 30% of the voting places have MANY MANY errors.
I hope some of you get interested, develop some easy algorithm and publish it here, and, if you want, also publish here the results you find in your tests verifying the Mexican election numbers.
*The database is stored as a .txt file, each voting place is one line, and the columns are separated by |
So it's easy to split.
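As a small illustration of the split: StringField() returns the n-th field of a delimited string, counting from 1, so the column numbers of the header line can be used directly. The sample row and the constant names below are made up for illustration; they are not taken from the real file.

```purebasic
; Column numbers follow the header line quoted above (1-based).
; The constant names are hypothetical.
#COL_RECEIVED  = 9    ; NUM_BOLETAS_RECIBIDAS
#COL_LEFTOVER  = 10   ; NUM_BOLETAS_SOBRANTES
#COL_VOTERS    = 11   ; TOTAL_CIUDADANOS_VOTARON
#COL_DEPOSITED = 12   ; NUM_BOLETAS_DEPOSITADAS
#COL_COUNT     = 30   ; number of columns in the header

; Made-up row with 30 fields, NOT real election data
Row.s = "1|AGUASCALIENTES|1|338|1|B|0|1|400|170|230|230|100|0|50|0|60|0|5|0|5|0|2|8|230|X|450|00:00|00:00|00:00"

; A row with the wrong number of separators should not be trusted
If CountString(Row, "|") + 1 = #COL_COUNT
  Received  = Val(StringField(Row, #COL_RECEIVED, "|"))
  LeftOver  = Val(StringField(Row, #COL_LEFTOVER, "|"))
  Voters    = Val(StringField(Row, #COL_VOTERS, "|"))
  Deposited = Val(StringField(Row, #COL_DEPOSITED, "|"))
  Debug "Received - left over: " + Str(Received - LeftOver)
  Debug "Deposited:            " + Str(Deposited)
  Debug "Citizens who voted:   " + Str(Voters)
Else
  Debug "Unexpected number of columns, skipping this row"
EndIf
```

In this made-up row all three values are 230, which is what a consistent voting place would look like.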
I hope my explanation is not too bad in English :roll:
This is a good cause and PB coders could do a good job.
If the results are interesting they will be published in a newspaper and credit will be given to the PB community!!
Thanks a lot!!
-
- Enthusiast
- Posts: 731
- Joined: Wed Apr 21, 2004 7:12 pm
Killswitch wrote: I'd be interested in writing an application for this, but I'm not really sure what needs to be checked (sorry), could you clarify?
Good!!
Yes, what needs to be checked is whether the elementary math for each voting place is okay.
For example, verify whether the number of ballots that were not used at the end of the day (column 10), the number of ballots received at the beginning of the day (column 9) and the total number of ballots counted as votes (column 12) match.
Also whether the number of citizens who voted matches all these numbers.
Believe it or not, in that basic arithmetic I found a big percentage of errors. For example, many voting places have MORE VOTES at the end of the day than they are supposed to have if we follow the arithmetic.
For example:
One voting place receives 400 ballots at the beginning of election day.
230 citizens go to vote.
But they have 190 unused ballots at the end of the day (and they should have only 170!!). Where do those 20 extra votes come from? The numbers don't match!!!
That shows that something went wrong there... if this happens in a significant number of voting places, then we can suspect that the whole election was not very clean.
What needs to be verified is only the elementary math, checking whether there are errors or not.
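The example above, written out as arithmetic in PB. The variable names are made up; the three input numbers are the ones from the example.

```purebasic
; Worked example with the numbers from the post
Received         = 400   ; ballots received at the beginning of the day
Voters           = 230   ; citizens who went to vote
LeftOverReported = 190   ; unused ballots reported at the end of the day

LeftOverExpected = Received - Voters            ; 400 - 230 = 170
BallotsUsed      = Received - LeftOverReported  ; 400 - 190 = 210

Debug "Expected unused ballots: " + Str(LeftOverExpected)
Debug "Ballots actually used:   " + Str(BallotsUsed)
Debug "Votes without a ballot:  " + Str(Voters - BallotsUsed)  ; 230 - 210 = 20
```

Only 210 ballots left the pile, yet 230 people are recorded as voting, which is the 20-vote gap the example points at.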
-
Here we go. Place in the same folder as the database.
I think it's working correctly - it outputs to a new file called Results.txt.
I hope I got this right, but this is what it checks:
(Column 9 - Column 10) <> Column 12 ;If true, it's wrong
This code is easily modifiable: GetColumn returns the value of any particular column, so you can extend it to check as many different results as you want.
Code: Select all
Procedure.s GetColumn(Column,String.s)
  ;Returns the text of column number Column (1-based) of a "|"-separated line
  Pos=1 ;string positions start at 1
  For t=1 To Column-1
    Pos=FindString(String,"|",Pos)+1 ;jump past the next separator
  Next t
  ProcedureReturn Mid(String,Pos,FindString(String,"|",Pos)-Pos)
EndProcedure
Structure Votes
  ID.l
  Is.l
  ShouldBe.l
EndStructure
NewList BoxTally.Votes()
If ReadFile(0,"PREP2006-Presidente.txt") ;ReadFile fails if the database is missing
  ReadString(0) : ReadString(0) ;Ignore first two lines
  t=2
  While Eof(0)=#False
    t+1
    String.s=ReadString(0)
    ID=t ;line number in the database
    Received=Val(GetColumn(9,String))
    NotUsed=Val(GetColumn(10,String))
    InBox=Val(GetColumn(12,String))
    If (Received-NotUsed)<>InBox
      Wrong+1
      AddElement(BoxTally())
      BoxTally()\ID=ID
      BoxTally()\Is=InBox
      BoxTally()\ShouldBe=(Received-NotUsed)
    EndIf
  Wend
  CloseFile(0) ;done reading, file number 0 is reused for the results
  If CreateFile(0,"Results.txt")
    WriteStringN(0,"RESULTS")
    WriteStringN(0,"")
    WriteStringN(0," Box Errors")
    WriteStringN(0," ;Number of votes in box does not match ballots received minus ballots not used")
    WriteStringN(0," ;Line: InBox / Should Be")
    ForEach BoxTally()
      WriteStringN(0," "+Str(BoxTally()\ID)+": "+Str(BoxTally()\Is)+" / "+Str(BoxTally()\ShouldBe))
    Next
    CloseFile(0)
  Else
    MessageRequester("Error","Could not create Results.txt")
    End
  EndIf
  MessageRequester("Done!",Str(Wrong)+"/"+Str(t-2)+" incorrect.")
Else
  MessageRequester("Error","Could not open database.")
EndIf
Last edited by Killswitch on Mon Sep 11, 2006 10:52 pm, edited 1 time in total.
~I see one problem with your reasoning: the fact is thats not a chicken~
-
Killswitch wrote: I'll upload them later, they're on my laptop.
Yes, I know.
I don't think this is conclusive though, I mean ballots can get lost, or get thrown out etc, etc.
but if you check, many times ballots appear from nowhere!!
Not all differences mean lost ballots; many, many times they mean that ballots magically appear!! he he
However, close to 50% errors sounds very bad.
The winning party wins by less than a 0.5% difference... sounds very suspicious!
-
Here's the results I got running the above code:
http://www.btinternet.com/~douglas.marsh/Results.txt
But, I just ran a test for when there are more ballots in the box than were used and the results are still pretty bad: 17849 - 15% - that's terrible.
BTW I realised a small error in the code I pasted above, when it reports the number of incorrect items / total items the first two lines of the file count (which are just formatting, really) so I've edited it above. Nothing major.
-
- User
- Posts: 11
- Joined: Thu Aug 31, 2006 2:29 pm
I probably made some mistakes in this, but if I understood the structure of the database correctly, and my code is right, there are close to two million votes more than there are people who have voted?
I must've made a mistake somewhere.
Code: Select all
If OpenFile(0, "G:\PREP2006-Presidente\PREP2006-Presidente.txt")
  RecordsRead.l = 0
  WrongRecords.l = 0
  Discrepancy = 0
  Repeat
    RecordsRead + 1
    Record.s = ReadString(0)
    If (Val(StringField(Record, 9, "|"))-Val(StringField(Record, 10, "|"))) <> Val(StringField(Record, 12, "|"))
      WrongRecords + 1
      ;Debug Record
      Discrepancy + (Val(StringField(Record, 11, "|"))-Val(StringField(Record, 12, "|")))
    EndIf
  Until Eof(0)
  Debug RecordsRead
  Debug WrongRecords
  Debug Discrepancy
  CloseFile(0)
EndIf
John Bedlam wrote: I must've made a mistake somewhere.
When I run the program below on Killswitch's results, it seems that the cumulative difference in lines where the number of ballots were greater than it should have been is about 1 million and the cumulative difference in lines where the number of ballots were less than it should have been is about 3 million. The net is about 2 million.
Code: Select all
Enumeration
  #votes
EndEnumeration
OpenFile(#votes, "votes.txt")
intot = 0: shouldtot = 0: posdiff = 0: negdiff = 0: poscount = 0: negcount = 0
While Eof(#votes) = 0
  linein = 0: lineshould = 0
  line.s = ReadString(#votes)
  colpos = FindString(line, ":", 1)
  slashpos = FindString(line, "/", 1)
  ; value between ":" and "/" is the number of ballots found in the box
  linein = Val(Mid(line, colpos + 1, slashpos - colpos - 1))
  intot = intot + linein
  ; value after "/" is the number of ballots there should have been
  lineshould = Val(Mid(line, slashpos + 1, Len(line) - slashpos))
  shouldtot = shouldtot + lineshould
  If linein > lineshould
    poscount + 1: posdiff = posdiff + linein - lineshould
  Else
    negcount + 1: negdiff = negdiff + lineshould - linein
  EndIf
Wend
CloseFile(#votes)
Debug shouldtot
Debug intot
Debug intot - shouldtot
Debug posdiff
Debug negdiff
Debug poscount
Debug negcount
Last edited by mike74 on Mon Sep 11, 2006 11:49 pm, edited 2 times in total.
Using John's idea, but trying to be more precise about the different kinds of error, I found there could be 4 different kinds of error at each voting place:
-More deposited ballots than there were supposed to be (received - left)
-Fewer deposited ballots than there were supposed to be (received - left)
-More ballots deposited than citizens who voted at that place
-Fewer ballots deposited than citizens who voted at that place
Then I changed the code to this (I don't know if I made some mistake, since the results show TOO MANY errors in the results of the election!!):
Code: Select all
If OpenFile(0, "PREP2006-Presidente.txt")
  RecordsRead.l = 0
  Record.s = ReadString(0):Record.s = ReadString(0) ;skip the first two lines
  Repeat
    RecordsRead + 1
    Record.s = ReadString(0)
    uBallotsReceived = Val(StringField(Record, 9, "|"))
    uBallotsLeft = Val(StringField(Record, 10, "|"))
    uCitizensThatVote = Val(StringField(Record, 11, "|"))
    uDepositedBallots = Val(StringField(Record, 12, "|"))
    If uBallotsReceived-uBallotsLeft < uDepositedBallots
      ;More ballots than there were supposed to be
      WrongRecords1 + 1
      Discrepancy1 + (uDepositedBallots-(uBallotsReceived-uBallotsLeft))
    ElseIf uBallotsReceived-uBallotsLeft > uDepositedBallots
      ;Fewer ballots than there were supposed to be
      WrongRecords2 + 1
      Discrepancy2 + ((uBallotsReceived-uBallotsLeft)-uDepositedBallots)
    EndIf
    If uCitizensThatVote > uDepositedBallots
      ;More citizens than deposited ballots
      WrongRecords3 + 1
      Discrepancy3 + (uCitizensThatVote-uDepositedBallots)
    ElseIf uCitizensThatVote < uDepositedBallots
      ;Fewer citizens than deposited ballots
      WrongRecords4 + 1
      Discrepancy4 + (uDepositedBallots-uCitizensThatVote)
    EndIf
  Until Eof(0)
  Debug RecordsRead
  Debug ""
  Debug "More Ballots thats suppoused to be: " + Str(WrongRecords1 ) + " places"
  Debug "Less Ballots thats suppoused to be: " + Str(WrongRecords2) + " places"
  Debug "More citizens that ballots: " + Str(WrongRecords3 ) + " places"
  Debug "Less citizens that ballots: " + Str(WrongRecords4) + " places"
  Debug ""
  Debug "More ballots than difference (received - left): " + Str(Discrepancy1) + " ballots"
  Debug "Less ballots than difference (received - left): " + Str(Discrepancy2) + " ballots"
  Debug "Less deposited ballots than citizens: " + Str(Discrepancy3) + " ballots"
  Debug "More deposited ballots than citizens: " + Str(Discrepancy4) + " ballots"
  Debug ""
  CloseFile(0)
EndIf
117296
More Ballots thats suppoused to be: 17849 places
Less Ballots thats suppoused to be: 39808 places
More citizens that ballots: 32129 places
Less citizens that ballots: 19905 places
More ballots than difference (received - left): 966559 ballots
Less ballots than difference (received - left): 3127090 ballots
Less deposited ballots than citizens: 3419291 ballots
More deposited ballots than citizens: 1597709 ballots
Can anybody see if I'm wrong here, please?
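To put the place counts in proportion, they can be divided by the number of records read. This is only a sketch: it assumes the first number printed above (117296) is the number of voting-place records, and the Percent() helper is made up for this example.

```purebasic
; Hypothetical helper: Count as a percentage of Total, one decimal place
Procedure.s Percent(Count.l, Total.l)
  Protected Share.f
  Share = 100.0 * Count / Total
  ProcedureReturn StrF(Share, 1) + "%"
EndProcedure

Total = 117296   ; records read (assumed to be the number of voting places)

; The first two cases exclude each other (If / ElseIf), so they can be added
Debug "More ballots than (received - left):  " + Percent(17849, Total)
Debug "Fewer ballots than (received - left): " + Percent(39808, Total)
Debug "Either of the two:                    " + Percent(17849 + 39808, Total)
Debug "More citizens than ballots:           " + Percent(32129, Total)
Debug "Fewer citizens than ballots:          " + Percent(19905, Total)
```

The first line comes out at about 15%, the figure Killswitch reported, and the first two cases together at roughly 49%, which is the "close to 50%" mentioned earlier in the thread.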
mike74 wrote: Wait I think got it backwards.
mike74 wrote: Yeah, I did. I just edited my first post and the program for clarity. It appears that there were 17,849 cases where there were more ballots than there should have been, and 39,808 cases where there were less.
Same as I get using the modified John's code!!
Did you count how many ballots we get in each case (not only how many cases have a discrepancy, but the total number of ballots in each discrepancy case)?
My results are very big; it's hard for me to believe that such an amount of ballots could be missing or added.