Download the forum?

For everything that's not in any way related to PureBasic. General chat etc...
techjunkie
Addict
Posts: 1126
Joined: Wed Oct 15, 2003 12:40 am
Location: Sweden


Post by techjunkie »

Hi guys and gals!

Has anyone downloaded the whole PB Forum for offline browsing? Is this possible? I'm switching ISP at home and really NEED the forum - it's a great resource of info.

Or is it all PHP and a database? :cry:
(\__/)
(='.'=) This is Bunny. Copy and paste Bunny into your
(")_(") signature to help him gain world domination.
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Re: Download the forum?

Post by PB »

Funnily enough, I was just trialling this today: http://www.surfoffline.com/

It's not free, but it's easy to use and I just tested it with the "Tips and Tricks"
section of these forums. I set it to go one link deep (you can set more) and
the results were as follows:

The offline "Tips and Tricks" page:

Image

After clicking the first tip (about how to bold a gadget):

Image

So, as you can see, it certainly seems possible to download the entire
forums with it. In fact, I may even register this app because of this. :)

@Fred: Just how big are these forums (in GB) anyway?
I compile using 5.31 (x86) on Win 7 Ultimate (64-bit).
"PureBasic won't be object oriented, period" - Fred.
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

Hiya,

Before you spend your dosh -

- you can get free (and open source) software that does this, e.g. Httrack or winHttrack. There are also free asp/php/etc. scripts that scrape sites. dmoz is always being scraped.

But, better check if these forums have a bandwidth choke, or lock, or charge, because I saw somewhere that there are considerable megs of databased stuff, let alone all the bumph.

So if just a few of us did this now, it might be curtains for the forums for the rest of the month!

Probably better to ask to get just the articles database made available somewhere. That won't bring down all the gumph as well.

Or write a spider that pulls down just the threads/posts.

Just a thought.

:)



Edit:

In fact, scattered about the forums there are code snippets that touch on this and I think I recall someone has actually started or shown a bot. Anyhow they are fairly easy to write.
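
A minimal sketch of the first step of such a spider, in Python: take one downloaded page and keep only the links that point at threads, dropping profile pages, images and other extras. The `viewtopic.php?t=<id>` URL pattern, the `ThreadLinkParser` and `extract_thread_links` names, and the `forum.example` host are assumptions for illustration, not details confirmed in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs


class ThreadLinkParser(HTMLParser):
    """Collect links that look like thread pages and ignore everything else."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.thread_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)
        parsed = urlparse(url)
        # Assumption: threads live at viewtopic.php?t=<id>.
        is_thread = parsed.path.endswith("viewtopic.php") and "t" in parse_qs(parsed.query)
        if is_thread and url not in self.thread_urls:
            self.thread_urls.append(url)


def extract_thread_links(html, base_url):
    """Return the unique thread URLs found in one page, in document order."""
    parser = ThreadLinkParser(base_url)
    parser.feed(html)
    return parser.thread_urls


page = (
    '<a href="viewtopic.php?t=12">A tip</a>'
    '<a href="profile.php?u=3">A profile</a>'
    '<img src="smile.gif">'
)
print(extract_thread_links(page, "http://forum.example/index.php"))
```

A real spider would also need to fetch the pages, pause between requests, and stay within whatever bandwidth limits the host applies; none of that is shown here.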
@}--`--,-- A rose by any other name ..
techjunkie
Addict
Posts: 1126
Joined: Wed Oct 15, 2003 12:40 am
Location: Sweden

Post by techjunkie »

Dare2 wrote:Hiya,

Before you spend your dosh -

- you can get free (and open source) software that does this, e.g. Httrack or winHttrack. There are also free asp/php/etc. scripts that scrape sites. dmoz is always being scraped.
Thanks! Yeah - I've used Httrack before, I'll give it a shot - and see if it can handle forums like this.
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Post by PB »

> I've used Httrack before, I'll give it a shot - and see if it can handle forums like this

I just tried it, and yep, it worked fine with the same "Tips and Tricks" test
that I did with SurfOffline. It's not as easy to use as SurfOffline, but being
free makes it very attractive. :)
MrMat
Enthusiast
Enthusiast
Posts: 762
Joined: Sun Sep 05, 2004 6:27 am
Location: England

Post by MrMat »

This has been asked before but it would be great if the forums could be purchased on CD. Saves bandwidth downloading it and might make Fred a little money on the side.
Last edited by MrMat on Fri Sep 02, 2005 2:44 pm, edited 1 time in total.
Mat
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

I wonder how big they are. :)

I had a potential client - note the potential, lost it :( - who was unhappy with his service and solutions provider and wanted to know if we would take him on as a client.

So I Httrack-ed his site, so I could get a feel for what we were in for.

Zillions of bytes later .... his bandwidth allocation hammered ... :)

But that's not why we lost the deal. He sold his business and the new owner was quite happy to keep things as they were.
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Post by PB »

> it would be great if the forums could be purchased on CD

Probably need a DVD, I reckon (unless zipped for extraction). :)

The problem is that the forums change constantly, so whatever is saved now
is going to be obsolete after just one week. So then what? We can't download
the site again, and again, etc... the bandwidth costs would be huge. I guess
that's why it's never been seriously considered as an option.
Brice Manuel

Post by Brice Manuel »

Nice to know the forums will be undergoing DoS attacks by certain users :?

I agree with Paul at IBasic in permanently banning people who do such things and turning them in to their ISP and letting the datacenter who hosts the forums take whatever legal action they can against these people.

Very abusive, mean and selfish behavior no matter how you try and spin it.
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

:?:
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

Hi PB,

If you write your own scraper, then after the initial DL or CD purchase or whatever you just get the latest posts. So run it, say, once a week and pick up the posts made since the last DLed post.

Aside: Which is no more bandwidth consuming than browsing. It is the initial get that is going to clobber your bandwidth. I assume, being an Aussie, that you have some limit before choking or cut-off is applied? I know I do.

If not, I want your ISP's name, now! :)
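
A rough sketch of the "pick up posts since the last DLed post" idea, in Python. It assumes each post carries a numeric id that only ever increases, which is typical for forum software but not confirmed here; the `select_new_posts` and `next_checkpoint` names and the sample data are made up for illustration.

```python
def select_new_posts(posts, last_seen_id):
    """Return only the posts newer than the checkpoint, oldest first."""
    return sorted(
        (p for p in posts if p["id"] > last_seen_id),
        key=lambda p: p["id"],
    )


def next_checkpoint(posts, last_seen_id):
    """Return the id to remember for the next run (never moves backwards)."""
    return max([last_seen_id, *(p["id"] for p in posts)])


# Hypothetical listing fetched on this week's run; 100 was the last id saved.
listing = [
    {"id": 105, "title": "Re: Download the forum?"},
    {"id": 99, "title": "An older post"},
    {"id": 101, "title": "A newer post"},
]
fresh = select_new_posts(listing, last_seen_id=100)
checkpoint = next_checkpoint(fresh, last_seen_id=100)
print([p["id"] for p in fresh], checkpoint)
```

The checkpoint would be written to disk between runs, so each weekly run fetches only what was posted after the previous one.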
MrMat
Enthusiast
Posts: 762
Joined: Sun Sep 05, 2004 6:27 am
Location: England

Post by MrMat »

PB wrote:Probably need a DVD, I reckon (unless zipped for extraction). :)
Could be! It might just fit on CD though (90k articles according to the main page).
PB wrote:The problem is that the forums change constantly, so whatever is saved now is going to be obsolete after just one week. So then what?
That's true but if people want up to date information they should visit the site. If they want a handy backup with thousands of useful posts that are available offline, available when the forums are down and faster for people with slow connections then an archive would be very handy. Whether on CD or DVD I think it is much preferable to people downloading the forums individually...
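
As a back-of-the-envelope check on CD versus DVD, here is the size budget per article in Python. The 90,000 figure is the main-page count quoted above; the capacities are the nominal 700 MB CD-R and 4.7 GB single-layer DVD sizes, and compression is ignored. The function name is made up for illustration.

```python
CD_BYTES = 700 * 1024 * 1024   # nominal 700 MB CD-R
DVD_BYTES = 4_700_000_000      # nominal 4.7 GB single-layer DVD
ARTICLES = 90_000              # article count quoted from the main page


def budget_per_article_kib(capacity_bytes, article_count):
    """Average space available per article, in KiB, if the disc is filled exactly."""
    return capacity_bytes / article_count / 1024


cd_budget = budget_per_article_kib(CD_BYTES, ARTICLES)
dvd_budget = budget_per_article_kib(DVD_BYTES, ARTICLES)
print(f"CD: {cd_budget:.1f} KiB per article, DVD: {dvd_budget:.1f} KiB per article")
```

That comes to roughly 8 KiB per article on a CD and about 51 KiB on a DVD, so a CD only works if the average stored article, with any markup, stays under about 8 KiB or the archive is zipped.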
Mat
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Post by PB »

> Nice to know the forums will be undergoing DoS attacks by certain users

When I said I tested it with the "Tips and Tricks" section, I did not download
the ENTIRE section -- I only downloaded a few pages of links and no more
than 1 link deep (or did you miss that in my original post?). Also look at the
screenshots I provided -- I didn't even download all graphics -- mainly just
the text, for testing purposes as stated.

I would NEVER download this entire site without prior permission. I agree
that Fred should make the forums available on DVD once per month or so,
he'd make a killing on selling them! :)
techjunkie
Addict
Posts: 1126
Joined: Wed Oct 15, 2003 12:40 am
Location: Sweden

Post by techjunkie »

PB wrote:> I've used Httrack before, I'll give it a shot - and see if it can handle forums like this

I just tried it, and yep, it worked fine with the same "Tips and Tricks" test
that I did with SurfOffline. It's not as easy to use as SurfOffline, but being
free makes it very attractive. :)
Yeah... It works ok! :)

Image

but I think I'll take a small part each time... It's 87 MB now...
Dare2
Moderator
Posts: 3321
Joined: Sat Dec 27, 2003 3:55 am
Location: Great Southern Land

Post by Dare2 »

hehe. And that was just the AmigaOS forum! :)