Database storage or flat file?
Database storage or flat file?
Okay guys, question...
My website is owning the vps server it's on. I never expected the site to get this big (around 40k on alexa) and it receives millions of views a month... anyway I'm trying to optimize the code and everything to handle all of this.
It basically is a picture/video gallery website with a forum. It allows users to register, comment on pictures, vote on pictures, and save favorites. (voting on pictures does not require user registration)
Anyway, voting on pictures requires storing each ip address that votes for each picture so the same ip cannot vote twice. There are a total of approximately 60,000 ratings. Right now it is a mysql database table with each rating value, its corresponding ip address, and its corresponding picture id. (I totally never expected the site to get this large and contain this much data). Now each time a picture page is loaded, the database searches through the table to find if the row with the ip address for the current picture id exists. If it does, don't allow the user to rate the picture again. If it doesn't exist, allow that ip address to vote.
Now here's the problem - should I switch to flat files to store the ip address, by having a file name with the picture id in it (ex: ratingsipID.txt) and then on each line in the file store an ip address. Then when the picture is loaded search through the correct file for the specific ip address to see if they already rated the picture? Or should I just stick to the mysql database? I guess what I'm asking is the limits on using a database vs file in certain cases and what would be more efficient?
Thanks
My website is owning the vps server it's on. I never expected the site to get this big (around 40k on alexa) and it receives millions of views a month... anyway I'm trying to optimize the code and everything to handle all of this.
It basically is a picture/video gallery website with a forum. It allows users to register, comment on pictures, vote on pictures, and save favorites. (voting on pictures does not require user registration)
Anyway, voting on pictures requires storing each ip address that votes for each picture so the same ip cannot vote twice. There are a total of approximately 60,000 ratings. Right now it is a mysql database table with each rating value, its corresponding ip address, and its corresponding picture id. (I totally never expected the site to get this large and contain this much data). Now each time a picture page is loaded, the database searches through the table to find if the row with the ip address for the current picture id exists. If it does, don't allow the user to rate the picture again. If it doesn't exist, allow that ip address to vote.
Now here's the problem - should I switch to flat files to store the ip address, by having a file name with the picture id in it (ex: ratingsipID.txt) and then on each line in the file store an ip address. Then when the picture is loaded search through the correct file for the specific ip address to see if they already rated the picture? Or should I just stick to the mysql database? I guess what I'm asking is the limits on using a database vs file in certain cases and what would be more efficient?
Thanks
- Joakim Christiansen
- Addict
- Posts: 2452
- Joined: Wed Dec 22, 2004 4:12 pm
- Location: Norway
- Contact:
I don't have the money to hire someone to program it for me. I never planned on the site getting so large where I would be having these problems, but now I have to recode most of the site to optimize it to fix these problems.
Current I use clicksor for ads but they do not pay out well at all, so I need to switch to another company. I cannot use adsense because they will not accept the website into the program.
Current I use clicksor for ads but they do not pay out well at all, so I need to switch to another company. I cannot use adsense because they will not accept the website into the program.
- Rook Zimbabwe
- Addict
- Posts: 4322
- Joined: Tue Jan 02, 2007 8:16 pm
- Location: Cypress TX
- Contact:
Well... flat file is certainly a way to go. It would require extensive reworking of your filebase though. You would have to set up a file for each image. That would take a bit of time with a large image base
Are the images grouped in any way? You might create smaller databases if you have some form of arrangement.
I am supposed to have adsense, but they never sent me the links to use!
Are the images grouped in any way? You might create smaller databases if you have some form of arrangement.
I am supposed to have adsense, but they never sent me the links to use!
- DoubleDutch
- Addict
- Posts: 3220
- Joined: Thu Aug 07, 2003 7:01 pm
- Location: United Kingdom
- Contact:
Store the IP addresses as a 32-bit number (or bugger for ipv6) and store them sequentially. When a new IP address hits a picture - search the list using a binary search and store it in the missing position.
Most VPS will allow you to write code that on the server, so you shouldn't have a problem with a tiny linux program.
Most VPS will allow you to write code that on the server, so you shouldn't have a problem with a tiny linux program.
https://deluxepixel.com <- My Business website
https://reportcomplete.com <- School end of term reports system
https://reportcomplete.com <- School end of term reports system
- Kaeru Gaman
- Addict
- Posts: 4826
- Joined: Sun Mar 19, 2006 1:57 pm
- Location: Germany
- DoubleDutch
- Addict
- Posts: 3220
- Joined: Thu Aug 07, 2003 7:01 pm
- Location: United Kingdom
- Contact:
prob porn. 

https://deluxepixel.com <- My Business website
https://reportcomplete.com <- School end of term reports system
https://reportcomplete.com <- School end of term reports system
lol
well what I did for now was store the ips in a file related to the picture id (a primary key in the pictures table) and then have the ips stored in that file. If a picture is viewed and the file doesn't exist for that picture already it will be generated then... I'll see how this works and I have to do with for other data as well. fun fun
well what I did for now was store the ips in a file related to the picture id (a primary key in the pictures table) and then have the ips stored in that file. If a picture is viewed and the file doesn't exist for that picture already it will be generated then... I'll see how this works and I have to do with for other data as well. fun fun
why not simple store in a cookie?
User can change IP and can eat cookie, so no difference
User can change IP and can eat cookie, so no difference

PureBasic 5.73 | SpiderBasic 2.30 | Windows 10 Pro (x64) | Linux Mint 20.1 (x64)
Old bugs good, new bugs bad! Updates are evil: might fix old bugs and introduce no new ones.

Old bugs good, new bugs bad! Updates are evil: might fix old bugs and introduce no new ones.

Re: Database storage or flat file?
I'm pretty amazed how good porn actually still works on the internet.
Congrats to your success!
Taken that aside, do you think saving IPs over a longer period of time than
let's say a day is a good way at all? I mean, seeing that a lot of people are
having dynamic IPs, wouldn't you prevent votes from different users?
I mean... are you having 60.000 votes an hour/a day/week/month?
While a DB would most definitely give you a decent a speed increase, you'll
have to take into account that each user on the DB needs memory to be
reserved. I'm not sure know if that's a change for the better considering the
amount of traffic on your site.
Congrats to your success!
Taken that aside, do you think saving IPs over a longer period of time than
let's say a day is a good way at all? I mean, seeing that a lot of people are
having dynamic IPs, wouldn't you prevent votes from different users?
I mean... are you having 60.000 votes an hour/a day/week/month?
While a DB would most definitely give you a decent a speed increase, you'll
have to take into account that each user on the DB needs memory to be
reserved. I'm not sure know if that's a change for the better considering the
amount of traffic on your site.
Good programmers don't comment their code. It was hard to write, should be hard to read.
I agree, IP's aren't the way to go, the flat file will also keep getting bigger with stale IP's.
Also, if serveral people go through a proxy (they look at por..er..pics in their lunch hour) then they will not all be able to vote anyway as they have 1 IP shared.
I would use a single cookie to assign each person a unique ID (as you said they don't register) then have a real database and track whatever you like by ID. It would also give you a better idea or your visitors, return rates etc.
A table for the user profiles, a linked table for votes, a nother for access log, another to link them to a registered user if they are one etc
With this many visitors it's probably time to get a bit more serious, you might not have much money now but with that many visitors it's likely you will have money soon if you play your cards right
Also, if serveral people go through a proxy (they look at por..er..pics in their lunch hour) then they will not all be able to vote anyway as they have 1 IP shared.
I would use a single cookie to assign each person a unique ID (as you said they don't register) then have a real database and track whatever you like by ID. It would also give you a better idea or your visitors, return rates etc.
A table for the user profiles, a linked table for votes, a nother for access log, another to link them to a registered user if they are one etc
With this many visitors it's probably time to get a bit more serious, you might not have much money now but with that many visitors it's likely you will have money soon if you play your cards right

Paul Dwyer
“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein