Page 1 of 2

Database storage or flat file?

Posted: Fri Oct 31, 2008 5:21 am
by Matt
Okay guys, question...

My website is owning the vps server it's on. I never expected the site to get this big (around 40k on alexa) and it receives millions of views a month... anyway I'm trying to optimize the code and everything to handle all of this.

It basically is a picture/video gallery website with a forum. It allows users to register, comment on pictures, vote on pictures, and save favorites. (voting on pictures does not require user registration)

Anyway, voting on pictures requires storing each ip address that votes for each picture so the same ip cannot vote twice. There are a total of approximately 60,000 ratings. Right now it is a mysql database table with each rating value, its corresponding ip address, and its corresponding picture id. (I totally never expected the site to get this large and contain this much data). Now each time a picture page is loaded, the database searches through the table to find if the row with the ip address for the current picture id exists. If it does, don't allow the user to rate the picture again. If it doesn't exist, allow that ip address to vote.

Now here's the problem - should I switch to flat files to store the ip address, by having a file name with the picture id in it (ex: ratingsipID.txt) and then on each line in the file store an ip address. Then when the picture is loaded search through the correct file for the specific ip address to see if they already rated the picture? Or should I just stick to the mysql database? I guess what I'm asking is the limits on using a database vs file in certain cases and what would be more efficient?

Thanks

Posted: Fri Oct 31, 2008 1:33 pm
by Kale
tbh, if i had this problem (and it is a nice problem to have) i would probably weight the cost of hiring a DB guru against the savings made on bandwidth, storage, etc.. £500 spent getting a guy to do it properly might save you £1000's in the year. Just a thought mind. :)

Posted: Fri Oct 31, 2008 2:33 pm
by Joakim Christiansen
Sorry for being off topic here :P But do you use adsense or anything like that? If not, then try it, you'll earn a lot with that many visitors!

Btw, having an "index" on the IPs might speed it up. But I'm not sure.

Posted: Fri Oct 31, 2008 2:50 pm
by Matt
I don't have the money to hire someone to program it for me. I never planned on the site getting so large where I would be having these problems, but now I have to recode most of the site to optimize it to fix these problems.

Current I use clicksor for ads but they do not pay out well at all, so I need to switch to another company. I cannot use adsense because they will not accept the website into the program.

Posted: Fri Oct 31, 2008 2:56 pm
by Rook Zimbabwe
Well... flat file is certainly a way to go. It would require extensive reworking of your filebase though. You would have to set up a file for each image. That would take a bit of time with a large image base

Are the images grouped in any way? You might create smaller databases if you have some form of arrangement.

I am supposed to have adsense, but they never sent me the links to use!

Posted: Fri Oct 31, 2008 3:47 pm
by DoubleDutch
Store the IP addresses as a 32-bit number (or bugger for ipv6) and store them sequentially. When a new IP address hits a picture - search the list using a binary search and store it in the missing position.

Most VPS will allow you to write code that on the server, so you shouldn't have a problem with a tiny linux program.

Posted: Fri Oct 31, 2008 4:28 pm
by Kaeru Gaman
[OT] .. what website is it? I'd like to view some pics....

Posted: Fri Oct 31, 2008 6:22 pm
by DoubleDutch
prob porn. ;)

Posted: Fri Oct 31, 2008 6:45 pm
by Matt
lol
well what I did for now was store the ips in a file related to the picture id (a primary key in the pictures table) and then have the ips stored in that file. If a picture is viewed and the file doesn't exist for that picture already it will be generated then... I'll see how this works and I have to do with for other data as well. fun fun

Posted: Fri Oct 31, 2008 6:46 pm
by ts-soft
why not simple store in a cookie?
User can change IP and can eat cookie, so no difference :wink:

Posted: Fri Oct 31, 2008 8:09 pm
by Matt
Yeah I was thinking about that, but then thats gonna be a crap load of cookies on their part if they rate a lot of pictures though.

Re: Database storage or flat file?

Posted: Sat Nov 01, 2008 12:16 am
by traumatic
I'm pretty amazed how good porn actually still works on the internet.
Congrats to your success!

Taken that aside, do you think saving IPs over a longer period of time than
let's say a day is a good way at all? I mean, seeing that a lot of people are
having dynamic IPs, wouldn't you prevent votes from different users?
I mean... are you having 60.000 votes an hour/a day/week/month?

While a DB would most definitely give you a decent a speed increase, you'll
have to take into account that each user on the DB needs memory to be
reserved. I'm not sure know if that's a change for the better considering the
amount of traffic on your site.

Posted: Sat Nov 01, 2008 3:42 am
by pdwyer
I agree, IP's aren't the way to go, the flat file will also keep getting bigger with stale IP's.

Also, if serveral people go through a proxy (they look at por..er..pics in their lunch hour) then they will not all be able to vote anyway as they have 1 IP shared.

I would use a single cookie to assign each person a unique ID (as you said they don't register) then have a real database and track whatever you like by ID. It would also give you a better idea or your visitors, return rates etc.

A table for the user profiles, a linked table for votes, a nother for access log, another to link them to a registered user if they are one etc

With this many visitors it's probably time to get a bit more serious, you might not have much money now but with that many visitors it's likely you will have money soon if you play your cards right :)

Posted: Sat Nov 01, 2008 5:05 pm
by Matt
Well right now the server seems to be handling the files much better than using the database. And no it's not porn. :lol:

Posted: Wed Nov 05, 2008 3:00 am
by Matt
Now I have no idea. server looks like its exploding.