Retrieving a web page?

BackupUser
PureBasic Guru
Posts: 16777133
Joined: Tue Apr 22, 2003 7:42 pm

Post by BackupUser »

Restored from previous forum. Originally posted by PB.

Can somebody please show me how to load a web page or document?
For example, to load the contents of http://www.google.com/index.html
into a variable? Thanks!

Post by BackupUser »

Restored from previous forum. Originally posted by Mr.Skunk.

Do you absolutely need to load it into a VARIABLE, or would a memory location/pointer be OK?


Mr Skunk

Mr Skunk's PureBasic Web Page
http://www.skunknet.fr.st

Post by BackupUser »

Restored from previous forum. Originally posted by PB.
"Do you absolutely need to load it into a VARIABLE, or would a memory location/pointer be OK?"
Either way would be fine.

Post by BackupUser »

Restored from previous forum. Originally posted by Mr.Skunk.

Here is a program that reads the data into a variable...

Code:

;
; ------------------------------------------------------------
;
;   Internet access - example file version 1
;
;
; ------------------------------------------------------------
;




initgadget(1)


If OpenWindow(0, 200, 200, 1000, 500, #PB_Window_SystemMenu | #PB_Window_MinimizeGadget | #PB_Window_MaximizeGadget, "Network example")


  If CreateGadgetList(windowid())
    listviewgadget(1,10,10,970,470)
  EndIf


  ; The buffer to read the web page
  a$=space(4999) ; 4999 is the maximum; larger values crash PB (internal PB or Windows limit?)

  initnetwork()
  DrawingOutput(WindowID())


  host$="www.google.com"
  webpage$="http://"+host$+"/index.html"


  If OpenNetworkConnection(host$,80)


    com$="GET "+webpage$+" HTTP/1.1"+chr(13)+chr(10)
    com$=com$+"Accept: */*"+chr(13)+chr(10)
    com$=com$+"Accept: text/html"+chr(13)+chr(10)
    com$=com$+"Host: "+host$+chr(13)+chr(10)
    com$=com$+"User-Agent: HTTP-For-PureBasic"+chr(13)+chr(10)
    com$=com$+chr(13)+chr(10)


    sendnetworkdata(0,@com$,len(com$))


    receivenetworkdata(0,@a$,4999) ;First read the cookie (not always a cookie but google has one...)
    receivenetworkdata(0,@a$,4999) ; then read the web page itself (else we will display the cookie :-p)


    a$=striptrail(a$)
    Repeat
      a=findstring(a$,chr(10),0)
      c$=left(a$,a-1)
      a$=right(a$,len(a$)-a)
      addgadgetitem(1,-1,c$)
    Until a$="" 


  Else
    addgadgetitem(1,-1,"connection error...")
  EndIf


  closenetworkconnection()


  waitguievent()


EndIf
StopDrawing()
End 

To use a memory bank, use this modified listing.
(The way I read and display the text from memory is far from the fastest; it was just written quickly for the example.)

Code:

;
; ------------------------------------------------------------
;
;   Internet access - example file version 2
;
;
; ------------------------------------------------------------
;





initgadget(1)


If OpenWindow(0, 200, 200, 1000, 500, #PB_Window_SystemMenu | #PB_Window_MinimizeGadget | #PB_Window_MaximizeGadget, "Network example")


  If CreateGadgetList(windowid())
    listviewgadget(1,10,10,970,470)
  EndIf


  ; The buffer to read the web page (10kb)
  *bank.l=allocatememorybank(0,10000,0)

  initnetwork()
  DrawingOutput(WindowID())


  host$="www.google.com"
  webpage$="http://"+host$+"/index.html"


  If OpenNetworkConnection(host$,80)


    com$="GET "+webpage$+" HTTP/1.1"+chr(13)+chr(10)
    com$=com$+"Accept: */*"+chr(13)+chr(10)
    com$=com$+"Accept: text/html"+chr(13)+chr(10)
    com$=com$+"Host: "+host$+chr(13)+chr(10)
    com$=com$+"User-Agent: HTTP-For-PureBasic"+chr(13)+chr(10)
    com$=com$+chr(13)+chr(10)


    sendnetworkdata(0,@com$,len(com$))


    receivenetworkdata(0,*bank,10000) ;First read the cookie
    receivenetworkdata(0,*bank,10000) ; then read the web page itself (else we will display the cookie :-p)


    For i=*bank To *bank+10000
      If peekb(i)=10
        a$=striptrail(a$)
        If a$<>""
          addgadgetitem(1,-1,a$)
        EndIf
        a$=""
        i=i+1
      EndIf  
      a$=a$+chr(peekb(i))
    Next


  Else
    addgadgetitem(1,-1,"connection error...")
  EndIf


  closenetworkconnection()


  waitguievent()


EndIf
StopDrawing()
End  
Hope it helps...


Edited by - mr.skunk on 10 September 2001 05:03:06
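
A note on the request both listings send: OpenNetworkConnection() and the "Host:" header expect a bare host name (no "http://" prefix), and the request line usually carries only the path. The sketch below builds such a request as a plain string. BuildGetRequest() is a hypothetical helper, not a PureBasic command, and the sketch assumes a current PureBasic release where #CRLF$ is defined. Using HTTP/1.0 with "Connection: close" is an assumption made here so that the server does not use chunked transfer encoding and closes the connection once the page has been sent.

```purebasic
; Hypothetical helper (not part of PureBasic): builds a minimal HTTP GET request.
; host$ is the bare host name, path$ the document path on that host.
Procedure.s BuildGetRequest(host$, path$)
  Protected req$
  ; Make sure the path starts with a slash
  If Left(path$, 1) <> "/"
    path$ = "/" + path$
  EndIf
  req$ = "GET " + path$ + " HTTP/1.0" + #CRLF$
  req$ = req$ + "Host: " + host$ + #CRLF$
  req$ = req$ + "Accept: text/html" + #CRLF$
  req$ = req$ + "User-Agent: HTTP-For-PureBasic" + #CRLF$
  req$ = req$ + "Connection: close" + #CRLF$
  req$ = req$ + #CRLF$   ; the empty line ends the header block
  ProcedureReturn req$
EndProcedure

Debug BuildGetRequest("www.google.com", "index.html")
```

The returned string can be sent as-is over the opened connection; the host name passed in is the same one given to OpenNetworkConnection().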

Post by BackupUser »

Restored from previous forum. Originally posted by PB.

Thanks Mr Skunk -- it works great! :)

Post by BackupUser »

Restored from previous forum. Originally posted by Paul.

How can you detect when the data stream has finished?
This code is fine if the web page is small but if it is large, not all the data is collected.

I have tried using a loop to grab data a number of times, but if all the data is collected before the loop is finished, my app seems to freeze up.
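
One common way to tell when an HTTP response is complete: the server either closes the connection when it is done, or it announces the body size in a "Content-Length" header, so the client can stop reading once it has received that many bytes after the blank line that ends the headers. The sketch below covers only the header parsing. ContentLengthFromHeaders() is a hypothetical helper, not a PureBasic command, and it assumes a current PureBasic release (#CRLF$, LCase(), Trim()). It returns -1 when the header is absent, in which case a client would typically keep reading until the server disconnects.

```purebasic
; Hypothetical helper (not part of PureBasic): returns the value of the
; Content-Length header, or -1 if the header block does not contain one.
Procedure.i ContentLengthFromHeaders(headers$)
  Protected lower$, key$, pos.i, lineEnd.i, value$
  ; Prefix a CRLF so a match is always anchored at the start of a header line
  headers$ = #CRLF$ + headers$
  lower$ = LCase(headers$)              ; header names are case-insensitive
  key$ = #CRLF$ + "content-length:"
  pos = FindString(lower$, key$)
  If pos = 0
    ProcedureReturn -1
  EndIf
  pos = pos + Len(key$)                 ; first character after the colon
  lineEnd = FindString(lower$, #CRLF$, pos)
  If lineEnd = 0
    lineEnd = Len(headers$) + 1         ; header was the last line, no CRLF after it
  EndIf
  value$ = Trim(Mid(headers$, pos, lineEnd - pos))
  ProcedureReturn Val(value$)
EndProcedure

h$ = "HTTP/1.1 200 OK" + #CRLF$
h$ = h$ + "Content-Type: text/html" + #CRLF$
h$ = h$ + "Content-Length: 5120" + #CRLF$
Debug ContentLengthFromHeaders(h$)      ; shows 5120 in the debug window
```

With that number known, a receive loop can stop as soon as the bytes collected after the header block reach it, instead of calling ReceiveNetworkData() a fixed number of times.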

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.

Hi friend, hope you remember me : )

I tried these listings, but they no longer seem to compile with the new PB version. Could you help me?

I want to read the content of a web page into a variable: the result of a search in a search engine. More specifically, the content of the search results on audiogalaxy (http://www.audiogalaxy.com)

Thanks

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.

Now I can compile it (your strin2 lib was missing),
but the variable is too small.

Let me try your other example : )

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.


ReceiveNetworkData doesn't accept *Bank as a parameter (it expects a string instead).
My whole problem is where to store the result for a large web page.

Can you help me?

Thanks
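
One way to store a page of unknown size is a memory block that grows as chunks arrive. The sketch below is written for a current PureBasic release, not the version discussed in this thread. The GrowBuffer structure and AppendBytes() are hypothetical names, the doubling strategy is an assumption, and the demo feeds in two ASCII strings where a real program would pass the chunk buffer and the byte count returned by ReceiveNetworkData().

```purebasic
; Hypothetical growable byte buffer (names are illustrative, not PureBasic API).
Structure GrowBuffer
  *mem          ; start of the allocated block (0 until first use)
  used.i        ; bytes currently stored
  capacity.i    ; bytes currently allocated
EndStructure

; Appends chunkLen bytes from *chunk to the buffer; returns #True on success.
Procedure.i AppendBytes(*b.GrowBuffer, *chunk, chunkLen.i)
  Protected needed.i, newCap.i, *new
  If chunkLen <= 0
    ProcedureReturn #True               ; nothing to append
  EndIf
  needed = *b\used + chunkLen
  If needed > *b\capacity
    newCap = *b\capacity * 2            ; doubling keeps reallocations rare
    If newCap < needed
      newCap = needed
    EndIf
    If *b\mem
      *new = ReAllocateMemory(*b\mem, newCap)
    Else
      *new = AllocateMemory(newCap)
    EndIf
    If *new = 0
      ProcedureReturn #False            ; allocation failed, stored data is untouched
    EndIf
    *b\mem = *new
    *b\capacity = newCap
  EndIf
  CopyMemory(*chunk, *b\mem + *b\used, chunkLen)
  *b\used = *b\used + chunkLen
  ProcedureReturn #True
EndProcedure

; Demo: two ASCII chunks stand in for two received network packets.
Define page.GrowBuffer
*tmp = AllocateMemory(16)
PokeS(*tmp, "Hello, ", -1, #PB_Ascii)
AppendBytes(@page, *tmp, 7)
PokeS(*tmp, "world", -1, #PB_Ascii)
AppendBytes(@page, *tmp, 5)
Debug PeekS(page\mem, page\used, #PB_Ascii)
FreeMemory(*tmp)
FreeMemory(page\mem)
```

Because the buffer tracks its own length in `used`, no fixed-size string and no trailing-space trimming is needed to know how much data has been collected.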

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.

This code of mine is horrible, but I'm posting it as an example of
what the problem is:

host$ = "www.audiogalaxy.com"
webpage$="http://www.audiogalaxy.com/list/searche ... tr=beatles"
a$ = Space(4999)
b$ = Space(4999)
c$ = Space(4999)
d$ = Space(4999)
e$ = Space(4999)
f$ = Space(4999)
g$ = Space(4999)
h$ = Space(4999)
i$ = Space(4999)
j$ = Space(4999)
k$ = Space(4999)
l$ = Space(4999)
m$ = Space(4999)






Result = InitNetwork()

If OpenNetworkConnection(host$,80)

com$="GET "+webpage$+" HTTP/1.1"+Chr(13)+Chr(10)
com$=com$+"Accept: */*"+Chr(13)+Chr(10)
com$=com$+"Accept: text/html"+Chr(13)+Chr(10)
com$=com$+"Host: "+host$+Chr(13)+Chr(10)
com$=com$+"User-Agent: HTTP-For-PureBasic"+Chr(13)+Chr(10)
com$=com$+Chr(13)+Chr(10)


SendNetworkData(0,@com$,Len(com$))

ReceiveNetworkData(0,@a$,Len(a$))
ReceiveNetworkData(0,@b$,Len(b$))
ReceiveNetworkData(0,@c$,Len(c$))
ReceiveNetworkData(0,@d$,Len(d$))
ReceiveNetworkData(0,@e$,Len(e$))
ReceiveNetworkData(0,@f$,Len(f$))
ReceiveNetworkData(0,@g$,Len(g$))
ReceiveNetworkData(0,@h$,Len(h$))
ReceiveNetworkData(0,@i$,Len(i$))
ReceiveNetworkData(0,@j$,Len(j$))
ReceiveNetworkData(0,@k$,Len(k$))
ReceiveNetworkData(0,@l$,Len(l$))
ReceiveNetworkData(0,@m$,Len(m$))

a$ = StripTrail(a$)
b$ = StripTrail(b$)
c$ = StripTrail(c$)
d$ = StripTrail(d$)
e$ = StripTrail(e$)
f$ = StripTrail(f$)
g$ = StripTrail(g$)
h$ = StripTrail(h$)
i$ = StripTrail(i$)
j$ = StripTrail(j$)
k$ = StripTrail(k$)
l$ = StripTrail(l$)
m$ = StripTrail(m$)


;MessageRequester("",b$,0)
Result = OpenFile(1, "c:\windows\escritorio\rrr.html")
WriteString(a$ + b$ + c$ + d$ + e$ + f$ + g$ + h$ + i$ + j$ + k$ + l$ + m$)
CloseNetworkConnection()
CloseFile(1)


EndIf
End


And even this CRAZY code sometimes loses some data.


Mr.Skunk, help me!!!!!

Ricardo

Post by BackupUser »

Restored from previous forum. Originally posted by Paul.

I think there are problems with the ReceiveNetworkData() commands.
Data is either missing, garbled, or just not received when I try to work with it.

I can do the exact same things in BlitzBasic using the TCPStream command and not have any problems at all.

I have mentioned this to Fred but have never had any answers that resolved the problem.

There are other problems too: I can't open an address like '64.114.97.126', but I can open 'host-126.ken-64-114-97.norcomcable.ca' (which is the same machine).

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.

Hi Paul,

But it seems to me to be something to do with the string, because if I use many strings I can get the whole web page.
In fact, I could write my code this way, but it's a lame solution, and if for some reason a web page is bigger than my total string size, something will still be lost.

So I'm looking for a more advanced solution.
I'm sure that Mr.Skunk or Fred could help us!!

Post by BackupUser »

Restored from previous forum. Originally posted by Paul.

Well... Now I'm confused :(
This code would not work for me before but I thought I'd try it again (using PB 2.60a now) and to my surprise, it somewhat works now.

Go figure ?!
Of course data is still missing here and there.
My guess is that it's because we are unable to read a constant stream in one pass?

(I substituted your addresses into the code)

Code:

host$ = "www.audiogalaxy.com"
webpage$="http://www.audiogalaxy.com/list/searches.php?SID=865364193c248002b7be0f7657b8ef96&searchType=0&searchStr=beatles"

InitNetwork() 
NewList temp.s()

If OpenNetworkConnection(host$,80)

  com$="GET "+webpage$+" HTTP/1.1"+Chr(13)+Chr(10) 
  com$=com$+"Accept: */*"+Chr(13)+Chr(10) 
  com$=com$+"Accept: text/html"+Chr(13)+Chr(10) 
  com$=com$+"Host: "+host$+Chr(13)+Chr(10) 
  com$=com$+"User-Agent: HTTP-For-PureBasic"+Chr(13)+Chr(10) 
  com$=com$+Chr(13)+Chr(10)
  
  SendNetworkData(0,@com$,Len(com$))

  Repeat 
    a$ = Space(4999)
    result=ReceiveNetworkData(0,@a$,Len(a$))
    AddElement(temp())
    temp()=StripTrail(a$)
  Until result=0
  
  If CreateFile(0,"temp.txt")
    ResetList(temp())
    While NextElement(temp())
      WriteStringN(temp())
    Wend
    CloseFile(0)
  EndIf
  MessageRequester("","Done",0)
EndIf
  
End


Edited by - paul on 25 November 2001 03:09:38
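
In current PureBasic releases, ReceiveNetworkData() returns the number of bytes it actually read, so the length of each chunk does not have to be guessed by trimming trailing spaces. A sketch of the same loop under that assumption follows. It is not the code from this thread: FetchPage() is a hypothetical helper, and the NetworkClientEvent() polling, the 4096-byte chunk size, the 10-second idle timeout, the HTTP/1.0 request and the treatment of the response as ASCII text are all assumptions. The version check around InitNetwork() assumes that releases from 6.00 on initialise the network library automatically.

```purebasic
CompilerIf #PB_Compiler_Version < 600
  InitNetwork()   ; assumption: only needed by releases before 6.00
CompilerEndIf

#ChunkSize = 4096

; Hypothetical helper: returns headers + body as one string, "" on failure.
Procedure.s FetchPage(host$, path$)
  Protected conn.i, *chunk, got.i, page$, request$
  Protected done.i, serverClosed.i, lastData.q

  conn = OpenNetworkConnection(host$, 80)
  If conn = 0
    ProcedureReturn ""
  EndIf
  *chunk = AllocateMemory(#ChunkSize)
  If *chunk = 0
    CloseNetworkConnection(conn)
    ProcedureReturn ""
  EndIf

  request$ = "GET " + path$ + " HTTP/1.0" + #CRLF$
  request$ = request$ + "Host: " + host$ + #CRLF$
  request$ = request$ + "Connection: close" + #CRLF$ + #CRLF$
  SendNetworkString(conn, request$, #PB_Ascii)

  lastData = ElapsedMilliseconds()
  Repeat
    Select NetworkClientEvent(conn)
      Case #PB_NetworkEvent_Data
        got = ReceiveNetworkData(conn, *chunk, #ChunkSize)
        If got > 0
          ; Append exactly the bytes received, no trimming needed
          page$ = page$ + PeekS(*chunk, got, #PB_Ascii)
          lastData = ElapsedMilliseconds()
        EndIf
      Case #PB_NetworkEvent_Disconnect
        serverClosed = #True            ; server finished sending
        done = #True
      Default
        If ElapsedMilliseconds() - lastData > 10000
          done = #True                  ; give up after 10 s of silence
        Else
          Delay(10)                     ; avoid a busy loop while waiting
        EndIf
    EndSelect
  Until done

  FreeMemory(*chunk)
  If serverClosed = #False
    CloseNetworkConnection(conn)        ; only close if the server did not
  EndIf
  ProcedureReturn page$
EndProcedure
```

A call such as `page$ = FetchPage("www.audiogalaxy.com", "/")` would return the response headers followed by the page text. The loop ends when the server disconnects rather than after a fixed number of reads, which is what keeps it from freezing once all the data has arrived.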

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.


Well, I made some small changes to Mr.Skunk's code because at first it didn't compile for me.
After reviewing it, I got the idea that the problem was the variable (the buffer), specifically the size of the buffer.
That's why we need to store the data in some kind of packets, but again, this is not a satisfactory solution.
Rapid-Q does it in one pass, with all the data stored in one buffer.
Even Power Basic has some trouble with it; only in VB (and maybe Delphi) can it be done easily.
I think PureBasic compiles better executables than all of these compilers (maybe on par with Power Basic), and the Network functions are easy to use, but... maybe we need more detailed help.

In fact, my main suggestion to Fred was this one: more detailed help.
I know he is busy coding, but in some ways the help files are critical if he wants more registered users. Most users abandon an application if they don't understand something...

Thanks

Ricardo

Post by BackupUser »

Restored from previous forum. Originally posted by ricardo.

To get the web page we can also use wininet.dll
(the InternetOpen, InternetOpenUrl, InternetReadFile and InternetCloseHandle functions), BUT the problem remains the same... the buffer.
How can we work around it?