Using PB to detect fraud in the Mexican elections?
Posted: Mon Sep 11, 2006 4:57 pm
Maybe some of you have read about the Mexican elections. One party claims that fraud was committed.
Now, the interesting part is that a Mexican journalist friend of mine asked me to invite some coders to develop small tools (because the task is so easy) to verify whether that fraud was real.
I invite as many members of this community as possible to develop some small code to achieve this.
The task is this one:
-Download the database (4 MB) of the Mexican election from the Federal Electoral Institute (IFE). It's a txt file with one voting place per line.
-Read it using PB and check whether there are any arithmetical errors, as one party claims.
The idea is not to read the whole database each time, but maybe to pick some random voting places (I don't know the exact term in English; they are called "casillas" in the file), check whether there are mathematical errors, and show the results. (Or if you want to verify the whole data, that's okay.)
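To illustrate the random-sample idea, here is a rough sketch. It is written in Python only to show the logic, and the same steps should port to PB easily. It assumes the file may start with a header line beginning with ESTADO| (I have not confirmed that the real file does); the function name and the seed parameter are made up for the example.

```python
import random

def sample_lines(lines, k, seed=None):
    """Pick up to k random voting-place lines.

    Assumption: a header line, if present, starts with "ESTADO|".
    Blank lines are ignored.
    """
    rows = [ln for ln in lines
            if ln.strip() and not ln.startswith("ESTADO|")]
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(rows, min(k, len(rows)))
```

You would load the lines from the unzipped txt file first and then call something like sample_lines(lines, 100). Passing a seed lets other people reproduce exactly the same sample when comparing results.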
-------------------------------
If you are interested in participating:
The database can be downloaded from
http://www.ife.org.mx/documentos/proces ... ep2006.htm
Go to the PREP2006-Presidente.zip link and you will get the 4 MB file (it's the official one!!)
The file has the data in these columns:
ESTADO|NOM_ESTADO|DISTRITO|SECCION|ID_CASILLA|TIPO_CASILLA|EXT_CONTIGUA|NUM_ACTA_IMPRESO|NUM_BOLETAS_RECIBIDAS|NUM_BOLETAS_SOBRANTES|TOTAL_CIUDADANOS_VOTARON|NUM_BOLETAS_DEPOSITADAS|PAN|PAN_VCAP|ALIANZA_POR_MEXICO|APM_VCAP|POR_EL_BIEN_DE_TODOS|PBT_VCAP|NUEVA_ALIANZA|NA_VCAP|ALTERNATIVA_SOCIALDEMOCRATA|ASDC_VCAP|NUM_VOTOS_CAN_NREG|NUM_VOTOS_NULOS|TOTAL_VOTOS|CASILLA|LISTA_NOMINAL|HORA_RECEPCION_CEDAT|HORA_CAPTURA_CEDAT|HORA_REGISTRO
That means (the ones marked with * are the ones whose arithmetic must be verified):
State
Name of the state
District
Section
ID of the voting place
Kind of voting place
Exterior
Number of the printed tally sheet (acta)
*Number of ballots received at that voting place (the papers on which people vote)
*Number of ballots that were not used at the end of the day
*Total number of citizens who voted at that place
*Number of ballots found in the ballot box
The next 10 columns are the votes for the different parties
*Number of votes for unregistered candidates
*Number of null votes
*Total votes
Voting place
*Number of people allowed to vote at that voting place
Time of reception
Time of capture
Time of registration
Okay, that's the info.
The important thing here is NOT to verify how many votes each party received, but to detect whether there were MORE or FEWER ballots in the ballot box than there were supposed to be, based on the number of ballots received at the beginning of the day MINUS the number of ballots that were not used at the end of the day, etc.
If you look carefully, many columns must match for each voting place. IF NOT, it's because something strange happened.
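A minimal sketch of that kind of check, in Python just to show the logic (it ports to PB easily). Only the first rule (ballots in the box = received minus unused) comes straight from the description above. The other two rules (voters = ballots in the box, total votes = ballots in the box) are my assumptions about which columns should match, so adjust them as needed. The function names are made up, and the row is assumed to be a mapping from column name to text.

```python
def to_int(text):
    """Return the field as an int, or None if it is blank or not a number."""
    try:
        return int(text.strip())
    except (ValueError, AttributeError):
        return None

def check_casilla(row):
    """Return a list of problems for one voting place (empty list = consistent)."""
    received = to_int(row.get("NUM_BOLETAS_RECIBIDAS"))
    unused = to_int(row.get("NUM_BOLETAS_SOBRANTES"))
    voted = to_int(row.get("TOTAL_CIUDADANOS_VOTARON"))
    in_box = to_int(row.get("NUM_BOLETAS_DEPOSITADAS"))
    total = to_int(row.get("TOTAL_VOTOS"))
    if None in (received, unused, voted, in_box, total):
        return ["missing or non-numeric field"]

    problems = []
    expected = received - unused  # rule taken from the post
    if in_box != expected:
        problems.append(f"ballots in box {in_box} != received - unused {expected}")
    if voted != in_box:           # assumed rule
        problems.append(f"citizens who voted {voted} != ballots in box {in_box}")
    if total != in_box:           # assumed rule
        problems.append(f"total votes {total} != ballots in box {in_box}")
    return problems
```

Rows with blank fields are reported separately instead of being counted as arithmetic errors, since a missing number is a different kind of problem from a mismatch.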
Okay, one party claims that there are plenty of errors and that it is very easy for anyone to notice them with any elementary analysis.
I've done this myself and the result was that IT'S A MESS: something like 30% of the voting places have MANY MANY errors.
I hope some of you get interested, develop some easy algorithm and publish it here, and if you want, also publish the results you find in your tests verifying the Mexican election numbers.
*The database is stored as a .txt file; each voting place is one line, and the columns are separated by |
So it's easy to split.
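For example, splitting one line into named fields could look roughly like this (Python used only as an illustration; in PB the same thing can be done field by field). The column names are copied from the list in this post, and I am assuming every data line has exactly those 30 fields. The function name is made up.

```python
COLUMNS = (
    "ESTADO|NOM_ESTADO|DISTRITO|SECCION|ID_CASILLA|TIPO_CASILLA|EXT_CONTIGUA|"
    "NUM_ACTA_IMPRESO|NUM_BOLETAS_RECIBIDAS|NUM_BOLETAS_SOBRANTES|"
    "TOTAL_CIUDADANOS_VOTARON|NUM_BOLETAS_DEPOSITADAS|PAN|PAN_VCAP|"
    "ALIANZA_POR_MEXICO|APM_VCAP|POR_EL_BIEN_DE_TODOS|PBT_VCAP|NUEVA_ALIANZA|"
    "NA_VCAP|ALTERNATIVA_SOCIALDEMOCRATA|ASDC_VCAP|NUM_VOTOS_CAN_NREG|"
    "NUM_VOTOS_NULOS|TOTAL_VOTOS|CASILLA|LISTA_NOMINAL|HORA_RECEPCION_CEDAT|"
    "HORA_CAPTURA_CEDAT|HORA_REGISTRO"
).split("|")

def parse_line(line):
    """Split one '|'-separated line into a dict keyed by column name."""
    fields = line.rstrip("\r\n").split("|")
    if len(fields) != len(COLUMNS):
        # Assumption: a line with a different field count is malformed.
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))
```

The values stay as text here, so blank fields are not lost; convert to numbers only when you do the arithmetic.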
I hope my explanation is not too bad in English :roll:
This is a good cause and PB coders could do a good job.
If the results are interesting they will be published in a newspaper and credit will be given to the PB community!!
Thanks a lot!!