ReceiveHTTPFile() truncates file? & how to validate image?

jassing
Addict
Posts: 1765
Joined: Wed Feb 17, 2010 12:00 am

Post by jassing »

I'm using ReceiveHTTPFile() to download images. However, out of 55,000 images downloaded, more than half were "truncated", i.e. incomplete downloads.

The download is called from a thread, and nothing terminates that thread, yet the thread finishes before the download is complete.

Is there an alternative to download a file?

Also, I tried using this to validate the image:

Code:

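; Returns #True if PureBasic can load cFile as an image.
; The matching decoder (e.g. UseJPEGImageDecoder()) must be enabled beforehand.
Procedure IsValidImage( cFile.s )
	Protected bValid.b = #False
	Protected nImage = LoadImage(#PB_Any, cFile)
	If nImage ; with #PB_Any, LoadImage() returns 0 on failure
		bValid = #True
		FreeImage(nImage)
	EndIf
	
	ProcedureReturn bValid
EndProcedure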
Unfortunately, it will load an incomplete image, so it's not a good way to validate.

Any ideas?
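
One cheap check that does not rely on LoadImage(), assuming the files are JPEGs: a complete JPEG starts with the SOI marker ($FF $D8) and normally ends with the EOI marker ($FF $D9), so a download that was cut off will usually be missing the EOI bytes. The sketch below tests only those two markers. HasJpegMarkers() is a hypothetical helper name, and the check is a heuristic: a valid file with extra bytes appended after EOI would be rejected, and a file damaged in the middle would still pass.

```purebasic
; Heuristic completeness check for a JPEG file:
; SOI marker ($FF $D8) in the first two bytes, EOI marker ($FF $D9) in the last two.
; HasJpegMarkers() is a hypothetical helper name, not a PureBasic function.
Procedure.i HasJpegMarkers(cFile.s)
  Protected bValid.i = #False
  Protected nFile.i, nSize.q
  Protected b0.a, b1.a, e0.a, e1.a

  nFile = ReadFile(#PB_Any, cFile)
  If nFile
    nSize = Lof(nFile)
    If nSize >= 4
      b0 = ReadAsciiCharacter(nFile)
      b1 = ReadAsciiCharacter(nFile)
      FileSeek(nFile, nSize - 2)      ; jump to the last two bytes
      e0 = ReadAsciiCharacter(nFile)
      e1 = ReadAsciiCharacter(nFile)
      If b0 = $FF And b1 = $D8 And e0 = $FF And e1 = $D9
        bValid = #True
      EndIf
    EndIf
    CloseFile(nFile)
  EndIf

  ProcedureReturn bValid
EndProcedure
```

A file that fails this test can be deleted and queued for another download attempt.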
luis
Addict
Posts: 3876
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: ReceiveHTTPFile() truncates file? & how to validate imag

Post by luis »

I can't say much about the PB function. I used it recently in a quick and dirty program to automate a job I was doing. It worked well with more than 300,000 files, but that does not prove anything. Without seeing your multithreaded code I don't have any suggestions.

But yes, to answer your question: there are a lot of HTTP download routines posted in this forum, and some are a lot more versatile than the straightforward one that comes with PB itself. You can try them to see whether the problem is NOT in your code. One is my HTTPGetFromWeb, and you can find the link on my page, but there are more, and cross-platform ones too.

As for the validity check, I have no idea how to do it in a simple way if PB gives you the OK for a truncated image. I haven't checked that myself. I would like to know a quick, general method.
jassing
Addict
Posts: 1765
Joined: Wed Feb 17, 2010 12:00 am

Re: ReceiveHTTPFile() truncates file? & how to validate imag

Post by jassing »

Thanks. It's not the thread: I removed all threads and the problem remains, so that's not the cause.
I have tried several of the code snippets I found here, but they left some non-image data in the file (for instance header info, which I stripped away), and I still saw downloads with $0d $0a 2000 $0d $0a inserted in various spots.
More of them didn't work (written for an earlier PB?) than did.
Right now, I've modified the code so that any downloaded file smaller than 5000 bytes is considered "bad" and is deleted and redownloaded. This seems to be working...
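
The $0d $0a 2000 $0d $0a pattern is consistent with HTTP/1.1 chunked transfer encoding, where every chunk of the body is preceded by its length in hexadecimal plus CRLF and followed by another CRLF ($2000 = 8192 bytes). If that is what those routines were returning, the body has to be de-chunked before it is written to disk. A minimal sketch, assuming the raw body (everything after the response headers) is already in memory; DecodeChunked() and its parameters are hypothetical names, and chunk extensions and trailers are simply ignored:

```purebasic
; Sketch: decode a chunked HTTP body held in memory.
; *Src / SrcLen : raw body bytes as received (after the response headers)
; *Dst          : separate output buffer of at least SrcLen bytes
; Returns the decoded length, or -1 if the data is malformed or ends
; before the terminating zero-length chunk (i.e. a truncated download).
; DecodeChunked() is a hypothetical helper name, not a PureBasic function.
Procedure.i DecodeChunked(*Src, SrcLen.i, *Dst)
  Protected nPos.i = 0, nOut.i = 0
  Protected nLineEnd.i, nChunk.i, bFound.i
  Protected cSize.s

  While nPos < SrcLen
    ; Locate the CRLF that ends the chunk-size line.
    nLineEnd = nPos
    bFound = #False
    While nLineEnd < SrcLen - 1
      If PeekA(*Src + nLineEnd) = 13 And PeekA(*Src + nLineEnd + 1) = 10
        bFound = #True
        Break
      EndIf
      nLineEnd + 1
    Wend
    If bFound = #False Or nLineEnd = nPos
      ProcedureReturn -1              ; missing or empty size line
    EndIf

    ; The size is hexadecimal; anything after ";" is a chunk extension.
    cSize = PeekS(*Src + nPos, nLineEnd - nPos, #PB_Ascii)
    nChunk = Val("$" + Trim(StringField(cSize, 1, ";")))
    nPos = nLineEnd + 2

    If nChunk = 0
      ProcedureReturn nOut            ; zero-length chunk marks the end
    EndIf
    If nPos + nChunk > SrcLen
      ProcedureReturn -1              ; chunk announced more data than we have
    EndIf

    CopyMemory(*Src + nPos, *Dst + nOut, nChunk)
    nOut + nChunk
    nPos + nChunk + 2                 ; skip the chunk data and its trailing CRLF
  Wend

  ProcedureReturn -1                  ; ran out of data before the final chunk
EndProcedure
```

A return value of -1 doubles as a truncation test: a chunked response that was cut off never reaches its zero-length final chunk.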
Trond
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Re: ReceiveHTTPFile() truncates file? & how to validate imag

Post by Trond »

Are you checking that the result of ReceiveHTTPFile() is not 0?

A solution would be to install wget and use RunProgram() to make wget download the files. Wget is known for working well.
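
A rough sketch of the wget approach, assuming wget is installed and reachable through the PATH. DownloadWithWget() is a hypothetical helper name; -q (quiet) and -O (output file) are standard wget options, and wget exits with code 0 when it reports no problems.

```purebasic
; Sketch: download one file by running wget and checking its exit code.
; Assumes wget is installed and on the PATH.
; DownloadWithWget() is a hypothetical helper name, not a PureBasic function.
Procedure.i DownloadWithWget(cUrl.s, cTarget.s)
  Protected bOk.i = #False
  Protected nProgram.i
  ; Quote both arguments so spaces in the path or URL survive.
  Protected cArgs.s = "-q -O " + Chr(34) + cTarget + Chr(34) + " " + Chr(34) + cUrl + Chr(34)

  ; #PB_Program_Open is needed so the exit code can be read afterwards.
  nProgram = RunProgram("wget", cArgs, "", #PB_Program_Open | #PB_Program_Hide)
  If nProgram
    WaitProgram(nProgram)             ; block until wget has finished
    If ProgramExitCode(nProgram) = 0
      bOk = #True
    EndIf
    CloseProgram(nProgram)
  EndIf

  ProcedureReturn bOk
EndProcedure
```

If the helper returns #False, the target file should be treated as unusable and deleted before retrying.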
jassing
Addict
Posts: 1765
Joined: Wed Feb 17, 2010 12:00 am

Re: ReceiveHTTPFile() truncates file? & how to validate imag

Post by jassing »

Trond wrote: Are you checking that the result of ReceiveHTTPFile() is not 0?

A solution would be to install wget and use RunProgram() to make wget download the files. Wget is known for working well.
Of course I am... that's what was frustrating: it reported the file as downloaded, but the resulting file was truncated...

using wget is an excellent idea...

I'm a bit surprised that there doesn't seem to be a 'validate jpg format' function/dll/api out there...
Indeed, a few image viewers I tried even attempted to load the invalid images, but then hung.

thanks
-j
djes
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Re: ReceiveHTTPFile() truncates file? & how to validate imag

Post by djes »

Sorry to revive this thread, but I ran into the same problem. It can be resolved by checking the file size, see here: viewtopic.php?f=13&t=72424&p=539293#p539293
Sadly, that requires a second HTTP request, so I will not use it for now and will only delete the incomplete file.
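
For reference, one way such a size check could look (not necessarily what the linked post does): fetch the response headers with GetHTTPHeader(), read the Content-Length value, and compare it with FileSize() of the downloaded file. RemoteSizeMatches() is a hypothetical helper name. The sketch assumes the server sends a Content-Length header at all, which is not the case for chunked responses, and that InitNetwork() has already been called on PureBasic versions that still require it.

```purebasic
; Sketch: compare the local file size with the server's Content-Length header.
; Costs one extra HTTP request for the headers.
; Returns #True only if a Content-Length was found and matches the file size;
; #False means "mismatch" or "size unknown".
; RemoteSizeMatches() is a hypothetical helper name, not a PureBasic function.
Procedure.i RemoteSizeMatches(cUrl.s, cFile.s)
  Protected cHeader.s, cLine.s
  Protected i.i, nLines.i
  Protected nRemote.q = -1

  cHeader = GetHTTPHeader(cUrl)
  nLines = CountString(cHeader, #LF$) + 1

  For i = 1 To nLines
    ; Header lines end in CRLF; strip the CR left over after splitting on LF.
    cLine = Trim(RemoveString(StringField(cHeader, i, #LF$), #CR$))
    If LCase(Left(cLine, 15)) = "content-length:"
      nRemote = Val(Trim(Mid(cLine, 16)))
      Break
    EndIf
  Next

  ; FileSize() returns -1 if the file does not exist.
  If nRemote >= 0 And FileSize(cFile) = nRemote
    ProcedureReturn #True
  EndIf

  ProcedureReturn #False
EndProcedure
```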