Read the contents of a Web Page
Posted: Sat Apr 14, 2012 1:31 pm
by charvista
I was trying to load an existing web page into a variable, but no luck.
The manual says under WebGadget():
- GetGadgetItemText(): The following constants can be used to get information (Windows only):
#PB_Web_HtmlCode : Get the html code from the gadget.
Plus:
Note: The following features do not work with the Mozilla ActiveX on windows (#PB_Web_Mozilla flag)
So, I have two questions:
1. How do I get the content of a web page? (syntax...)
2. How do I know which browser is used? (IE, Firefox, Safari,...)
So far, I have:
Code: Select all
OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
WebGadget(0, 10, 10, 580, 280, "http://www.purebasic.com")
WebPage.s=GetGadgetItemText(0,#PB_Web_HtmlCode|#PB_Web_Mozilla)
Debug WebPage
Repeat
Until WaitWindowEvent() = #PB_Event_CloseWindow
The Debug does not return what I expected. I am using Firefox on Windows 7.
Thanks for any suggestion

Re: Read the contents of a Web Page
Posted: Sat Apr 14, 2012 2:04 pm
by charvista
I already found out that the flag #PB_Web_Mozilla has to be used in the WebGadget() function, not in GetGadgetItemText() !
But still no luck.....
Code: Select all
OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
WebGadget(0, 10, 10, 580, 280, "http://www.purebasic.com",#PB_Web_Mozilla)
WebPage.s=GetGadgetItemText(0,#PB_Web_HtmlCode)
Debug WebPage
Debug Len(WebPage)
Repeat
Until WaitWindowEvent() = #PB_Event_CloseWindow
Re: Read the contents of a Web Page
Posted: Sat Apr 14, 2012 2:30 pm
by Kiffi
You have to wait until the page has loaded completely:
Code: Select all
WebGadget(0, 10, 10, 580, 280, "http://www.purebasic.com")
While GetGadgetAttribute(0, #PB_Web_Busy) <> 0
WindowEvent()
Wend
WebPage.s=GetGadgetItemText(0,#PB_Web_HtmlCode)
Greetings ... Kiffi
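The wait loop above can also be wrapped in a small helper with a timeout, so the program does not hang forever if a page never finishes loading. This is only a sketch: WaitForWebGadget() and its TimeoutMs parameter are made-up names, not part of PureBasic, and the 10 ms pause is an arbitrary choice to keep the CPU from spinning.

```purebasic
; Hypothetical helper (not a PureBasic built-in): waits until the WebGadget
; no longer reports #PB_Web_Busy, or until TimeoutMs milliseconds have passed.
; Returns #True if the gadget became idle, #False on timeout.
Procedure.i WaitForWebGadget(Gadget.i, TimeoutMs.i = 10000)
  Protected Start.q = ElapsedMilliseconds()
  Repeat
    While WindowEvent() : Wend          ; process pending events first
    If GetGadgetAttribute(Gadget, #PB_Web_Busy) = 0
      ProcedureReturn #True
    EndIf
    Delay(10)                           ; arbitrary short pause between checks
  Until ElapsedMilliseconds() - Start > TimeoutMs
  ProcedureReturn #False
EndProcedure

If OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  WebGadget(0, 10, 10, 580, 280, "http://www.purebasic.com")
  If WaitForWebGadget(0, 15000)
    Debug Len(GetGadgetItemText(0, #PB_Web_HtmlCode))
  Else
    Debug "Timed out while waiting for the page"
  EndIf
EndIf
```

Note that the gadget may report "not busy" for a moment right after it is created, before the navigation has really started, so an early #True is possible; treat the result as a hint rather than a guarantee.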
Re: Read the contents of a Web Page
Posted: Sat Apr 14, 2012 3:20 pm
by charvista
Indeed Kiffi ! You are right that the page was not yet downloaded completely, so the information could not be retrieved.
I felt that it would be highly logical that GetGadgetItemText had an embedded waiting-until-ready function... hence I did not think about that!
Thank you Kiffi !!!

Re: Read the contents of a Web Page
Posted: Sat Apr 14, 2012 3:33 pm
by charvista
OK, now that it works, let me share with you the procedure I was busy writing.
The procedure simply gets the webpage, in a transparent way.
Code: Select all
Procedure.s GetHtmlCode(URL.s)
GhostWin=OpenWindow(#PB_Any,0,0,600,300,"",#PB_Window_Invisible)
WebGad=WebGadget(#PB_Any,10,10,580,280,URL.s)
While GetGadgetAttribute(WebGad,#PB_Web_Busy)<>0
WindowEvent()
Wend
WebPage.s=GetGadgetItemText(WebGad,#PB_Web_HtmlCode)
CloseWindow(GhostWin)
ProcedureReturn WebPage.s
EndProcedure
Debug GetHtmlCode("http://www.purebasic.com")
Have fun!

Re: Read the contents of a Web Page
Posted: Sat Apr 14, 2012 4:16 pm
by charvista
Hmm, it works very well, but not with all webpages, why not?
Please try with:
http://ip.xxoo.net/
Cheers
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 1:39 am
by MachineCode
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 11:39 am
by charvista
I checked again... and still no luck with ip.xxoo.net (nor with some other pages).
With *and* without the flag #PB_Web_Mozilla in the WebGadget().
MachineCode, are you using Mozilla Firefox as well?
My testcomputer: Windows 7 32-bit, PB 4.60, Firefox 11.0
Code: Select all
Procedure.s GetHtmlCode(URL.s)
GhostWin=OpenWindow(#PB_Any,0,0,600,300,"",#PB_Window_Invisible)
WebGad=WebGadget(#PB_Any,10,10,580,280,URL.s,#PB_Web_Mozilla)
While GetGadgetAttribute(WebGad,#PB_Web_Busy)<>0
WindowEvent()
Wend
WebPage.s=GetGadgetItemText(WebGad,#PB_Web_HtmlCode)
CloseWindow(GhostWin)
ProcedureReturn WebPage.s
EndProcedure
C$=GetHtmlCode("http://ip.xxoo.net")
Debug C$
Debug Len(C$)
It still returns a blank line, Len = 0.
Kiffi's "wait-until-ready" addition checks #PB_Web_Busy, so I don't see what I am missing here, because it seems to work on MachineCode's computer....
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 12:58 pm
by MachineCode
charvista wrote:MachineCode, are you using Mozilla Firefox as well?
I just used the code snippet from the post by you at 3:33 pm. Here's a copy of the "Debug Output" window:
Code: Select all
<html>
<head>
<meta name="viewport" content="initial-scale=1.0, user-scalable=no" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="keywords" content="IP,NSLOOKUP,IP Address,IP City,IP Country" />
<meta name="description" content="IP Information" />
<meta name="robots" content="all">
<meta name="programmed" content="C.K. Yang" />
<meta name="copyright" content="C.K. Yang" />
<title>IP Information</title>
<style type="text/css">
<!--
body,td,th {
font-family: Verdana, Arial, Georgia, 微軟正黑體, sans-serif;
FONT-SIZE: 13px;
}
input, textarea, select, button {
FONT-SIZE: 13px;
font-family: Verdana, Arial, Georgia, 微軟正黑體, sans-serif;
}
.table {
border-top: thin solid #CCCCCC;
border-right: thick solid #CCCCCC;
border-bottom: thick solid #CCCCCC;
border-left: thin solid #CCCCCC;
}
a:link {
color: #003247;
}
a:visited {
color: #003247;
}
a:hover {
color: #10212C;
}
a:active {
color: #10212C;
}
-->
</style>
<script type="text/javascript" src="http://maps.google.com/maps/api/js?sensor=false"></script>
<script type="text/javascript">
var map;
function initialize() {
var myLatlng = new google.maps.LatLng(-27, 133);
var myOptions = {
zoom: 4,
center: myLatlng,
mapTypeId: google.maps.MapTypeId.ROADMAP
}
map = new google.maps.Map(document.getElementById("gMap"), myOptions);
var infowindow = new google.maps.InfoWindow({
content: '<font size="1"><B>Australia</B><BR><BR>123.200.192.77</font>'
});
var marker = new google.maps.Marker({
position: myLatlng,
map: map,
title:"Australia"
});
google.maps.event.addListener(marker, 'click', function() {
infowindow.open(map,marker);
});
}
</script>
<script type="text/javascript">
window.google_analytics_uacct = "UA-359219-7";
</script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-359219-7']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body onload="initialize()">
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tr>
<td align="center" valign="middle">
<table width="900" border="0" cellpadding="5" cellspacing="1" bgcolor="#ffffff">
<tr height="30">
<td bgcolor="#ffffff" align="left"><div align="left"><img src="./images/flags/us.png" align="absmiddle" border="0"> <a href="?L=en">English</a> | <img src="./images/flags/tw.png" align="absmiddle" border="0"> <a href="?L=tw">正體中文</a> | <img src="./images/flags/cn.png" align="absmiddle" border="0"> <a href="?L=cn">简体中文</a></div></td>
<td width="500" bgcolor="#ffffff" align="right"><div align="right">
<!-- AddThis Button BEGIN -->
<div class="addthis_toolbox addthis_default_style " addthis:url="http://ip.xxoo.net">
<a class="addthis_button_facebook_like" fb:like:layout="button_count"></a>
<a class="addthis_button_tweet"></a>
<a class="addthis_button_google_plusone" g:plusone:size="medium"></a>
<a class="addthis_counter addthis_pill_style"></a>
</div>
<script type="text/javascript">var addthis_config = {"data_track_addressbar":true};</script>
<script type="text/javascript" src="http://s7.addthis.com/js/300/addthis_widget.js#pubid=chikaeyang"></script>
<!-- AddThis Button END -->
</div></td>
</tr>
</table>
<table width="900" border="0" cellpadding="5" cellspacing="1" bgcolor="#cccccc">
<tr height="50"><form method="POST" action="">
<td width="40%" bgcolor="#ffffff"><div align="right"><B>Search:</B></div></td>
<td width="60%" bgcolor="#ffffff"><div align="left"><input type="text" name="ip" size="20" value=123.200.192.77> <input type="submit" name="Mode" value="Go"></div></td></form>
</tr>
<tr height="30">
<td bgcolor="#ffffff"><div align="right"><B>IP Address:</B></div></td>
<td bgcolor="# [...]
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 1:28 pm
by Danilo
Works here, too.
Maybe insert "While WindowEvent():Wend" to make sure all events are processed.
Code: Select all
Procedure.s GetHtmlCode(URL.s)
GhostWin=OpenWindow(#PB_Any,0,0,600,300,"",#PB_Window_Invisible)
WebGad=WebGadget(#PB_Any,10,10,580,280,URL.s,#PB_Web_Mozilla)
While WindowEvent():Wend
While GetGadgetAttribute(WebGad,#PB_Web_Busy)<>0
While WindowEvent():Wend
Wend
While WindowEvent():Wend
WebPage.s=GetGadgetItemText(WebGad,#PB_Web_HtmlCode)
CloseWindow(GhostWin)
ProcedureReturn WebPage.s
EndProcedure
C$=GetHtmlCode("http://ip.xxoo.net")
If OpenConsole()
PrintN(C$)
PrintN(Str(Len(C$)))
Input()
EndIf
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 1:33 pm
by Nubcake
I've noticed GetGadgetItemText(#PB_Web_HtmlCode) doesn't return everything in the webgadget. Anyone care to explain why ?
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 5:36 pm
by charvista
@MachineCode
Yes, that is what I am supposed to obtain.
@Danilo
I tried with your modified code. Still no luck, see picture.
@Nubcake
Correct, the contents of the page that MachineCode pasted are not complete. But if they were copied from the Debug window, then that is normal, because the debug window cuts off very long lines.
The problem lies in PureBasic, because a function from another language which does exactly the same is working well...
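Since the debug window cuts off long lines, one way to inspect the complete string is to write it to a file and open that in an editor. A minimal sketch, assuming a made-up helper name SaveText() and a made-up file name in the temporary directory:

```purebasic
; Hypothetical helper (not from the thread): writes Text to FileName.
; Returns #True on success, #False if the file could not be created.
Procedure.i SaveText(FileName.s, Text.s)
  Protected File.i = CreateFile(#PB_Any, FileName)
  If File
    WriteString(File, Text)
    CloseFile(File)
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

; Example: the file name is an assumption, any writable path will do.
Define Html.s = "<html><body>example</body></html>"
If SaveText(GetTemporaryDirectory() + "webpage.html", Html)
  Debug "Saved " + Str(Len(Html)) + " characters"
EndIf
```

In practice Html would be the string returned by GetHtmlCode(); comparing Len() of the string with the size of the saved file is a quick check that nothing was lost.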

Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 7:31 pm
by Nubcake
charvista wrote:@Nubcake
The problem lies in PureBasic, because a function from another language which does exactly the same is working well...

Will anyone look into this issue, if it is one? Anyway, I was searching and found a very useful thread showing how to get the exact HTML code of the WebGadget, instead of the altered version that GetGadgetItemText() returns.
http://www.purebasic.fr/english/viewtop ... +html+code
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 11:07 pm
by MachineCode
charvista wrote:The problem lies in PureBasic
What problem? It works fine for two of us.
Re: Read the contents of a Web Page
Posted: Sun Apr 15, 2012 11:20 pm
by Foz
What is wrong with using ReceiveHTTPFile(url, filename)?
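For reference, a minimal sketch of that approach: download the page to a temporary file with ReceiveHTTPFile() and read it back into a string. DownloadHtml() and the file name "page.html" are made up for this example. InitNetwork() was required before the HTTP commands in the 4.x versions discussed in this thread; newer versions may not need it.

```purebasic
InitNetwork()  ; needed in PB 4.x before using the HTTP commands (assumption: may be unnecessary in newer versions)

; Hypothetical helper: returns the downloaded source of URL, or "" on failure.
Procedure.s DownloadHtml(URL.s)
  Protected File.i, Html.s
  Protected TempFile.s = GetTemporaryDirectory() + "page.html"  ; made-up file name
  If ReceiveHTTPFile(URL, TempFile)
    File = ReadFile(#PB_Any, TempFile)
    If File
      While Eof(File) = 0
        Html + ReadString(File) + #CRLF$   ; rebuild the text line by line
      Wend
      CloseFile(File)
    EndIf
    DeleteFile(TempFile)
  EndIf
  ProcedureReturn Html
EndProcedure

Debug Len(DownloadHtml("http://www.purebasic.com"))
```

This returns the source as sent by the server, without opening a window or waiting for a gadget. It will not contain anything that scripts add to the page after loading, so the result can differ from what #PB_Web_HtmlCode gives.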