Get current webgadget HTML?

Seymour Clufley
Addict
Posts: 1264
Joined: Wed Feb 28, 2007 9:13 am
Location: London

Post by Seymour Clufley »

Does anyone know a way to get the current HTML code inside the webgadget?

I believe it could be done by inserting a JS function that would return the document.body.innerHTML, but that would require the whole JS<>PB framework and I'm wondering if there's a simpler way.

Thanks for any help,
Seymour.
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

Code: Select all

GetGadgetItemText(#WebGadget, #PB_Web_HtmlCode)

Post by Seymour Clufley »

Thanks Sparkie, but GetGadgetItemText doesn't return the current HTML. It returns the most recently loaded HTML.

This example should show what I mean. It contains three different kinds of "editable elements". If you type text into any of them, it is not returned by GetGadgetItemText.

Code: Select all

c13.s = Chr(13)
c34.s = Chr(34)

html.s = "<html><head></head>" +c13
html + "<body>" +c13

html + "<FORM name="+c34+"form"+c34+" ACTION="+c34+c34+" ENCTYPE="+c34+"application/x-www-form-urlencoded"+c34+" METHOD="+c34+"POST"+c34+">" +c13
html + "Form INPUT:<BR>" +c13
html + "<INPUT TYPE=Text Value="+c34+c34+" style="+c34+"width:400px; "+c34+">" +c13
html + "<BR><BR>" +c13

html + "Form TEXTAREA:<BR>" +c13
html + "<TEXTAREA style="+c34+"width:400px; height:100px; "+c34+"></TEXTAREA>" +c13
html + "</FORM>" +c13

html + "ContentEditable DIV:" +c13
html + "<DIV contenteditable=true style="+c34+"width:400px; height:100px; border:1px solid #000; "+c34+">&nbsp;</DIV>" +c13
html + "</body></html>"




ww=450
wh=400
win = OpenWindow(#PB_Any,0,0,ww,wh,"Test",#PB_Window_ScreenCentered)
wg = WebGadget(#PB_Any,0,0,ww,360,"")
SetGadgetItemText(wg,#PB_Web_HtmlCode,html)

button = ButtonGadget(#PB_Any,20,wh-30,150,20,"Get webgadget HTML")

Repeat
    we=WindowEvent()
    If we=#PB_Event_Gadget
        If EventGadget()=button
            MessageRequester("Webgadget contents",GetGadgetItemText(wg,#PB_Web_HtmlCode),0)
        EndIf
    EndIf
    Delay(10)
Until GetAsyncKeyState_(#VK_ESCAPE)

Also, returning the most recently loaded HTML instead of the current state of the code doesn't account for any changes the browser makes to it. I can only speak for IE: it reorders style attributes and capitalises all the attribute names. And if this line is included in the code:
<LINK rel="stylesheet" type="text/css" href="css.css" />
IE will change it to:
<LINK href=css.css type=text/css rel=stylesheet>
So I wonder if there is a way to get the current state of the code?
netmaestro
PureBasic Bullfrog
Posts: 8451
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Post by netmaestro »

You could use JavaScript to get the contents, or get the web browser interface with:

Code: Select all

wb2.IWEBBROWSER2 = GetWindowLongPtr_(GadgetID(wg), #GWL_USERDATA)

where one of the methods available should do the trick. I'm no expert on this interface so I couldn't say which one, but there are some on the forum who could. Also, because this interface is used by many other programming languages, the answer could be found by surfing places like codeguru.com etc.

Post by Seymour Clufley »

Thanks Lloyd.

It seems you can get the current HTML using IPersistStreamInit. Here is some code I found:

Code: Select all

IHTMLDocument2* pDoc = ...;
IStream* pMyStream = ...;

IPersistStreamInit* pPersist = 0;
HRESULT hr = pDoc->QueryInterface(IID_IPersistStreamInit, (void**)&pPersist);
if (SUCCEEDED(hr) && pPersist) {
    hr = pPersist->Save(pMyStream, true);
    pPersist->Release();
}

Does anyone have any idea what to do with that?! I've never heard of IPersist stuff before, but it seems to be the only way apart from JavaScript, and JS may not be viable for what I'm trying to do.

I can't find any examples of working with IPersist in this forum, so if anyone could help I'd be very grateful.

Also, it occurs to me that this may just duplicate what GetGadgetItemText already does. (Until I get it working, there's no way of knowing whether it retrieves the current HTML or, like GetGadgetItemText, the most recently loaded HTML.) Does anyone know if GetGadgetItemText uses the IPersist method?

Post by Seymour Clufley »

Another way may be through using:
> IWebBrowser2
>> IHTMLDocument3
>>> getElementsByTagName (either "html", or "body" and "head" - will have to get the DocType separately)
>>>> outerHTML.
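
A minimal sketch of that chain in PureBasic, assuming the WebGadget is the Windows IE-based control and that its IWebBrowser2 pointer is stored in #GWL_USERDATA, as netmaestro's line suggests. Instead of getElementsByTagName it uses IHTMLDocument3's documentElement property to reach the <html> element, which is a swapped-in shortcut rather than the exact route listed above. The procedure name and the data label are made up for illustration; treat it as a starting point rather than a tested implementation.

```purebasic
; Sketch only: Windows-only, assumes the IE-based WebGadget.
DataSection
  IID_IHTMLDocument3_Sketch: ; {3050F485-98B5-11CF-BB82-00AA00BDCE0B} (label name is hypothetical)
    Data.l $3050F485
    Data.w $98B5, $11CF
    Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B
EndDataSection

Procedure.s GetLiveHTMLViaDocument3(g.i) ; hypothetical helper name
  Protected Browser.IWebBrowser2, Disp.IDispatch, Doc3.IHTMLDocument3
  Protected Root.IHTMLElement, bstr.i, html.s

  If IsGadget(g)
    ; Assumption: the gadget keeps its IWebBrowser2 pointer in the window user data
    Browser = GetWindowLongPtr_(GadgetID(g), #GWL_USERDATA)
    If Browser
      If Browser\get_Document(@Disp) = #S_OK And Disp
        If Disp\QueryInterface(?IID_IHTMLDocument3_Sketch, @Doc3) = #S_OK And Doc3
          ; documentElement is the root <html> element of the live DOM
          If Doc3\get_documentElement(@Root) = #S_OK And Root
            If Root\get_outerHTML(@bstr) = #S_OK And bstr
              html = PeekS(bstr, -1, #PB_Unicode) ; BSTRs are UTF-16
              SysFreeString_(bstr)
            EndIf
            Root\Release()
          EndIf
          Doc3\Release()
        EndIf
        Disp\Release()
      EndIf
    EndIf
  EndIf

  ProcedureReturn html
EndProcedure
```

Calling GetLiveHTMLViaDocument3(wg) after the page has finished loading should return the <html> element's markup as the browser currently holds it. The DOCTYPE is not part of that element, so it would still have to be fetched separately.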

Post by Seymour Clufley »

I was about to embark on this when I decided to look through some old .pb files.

Unbelievably, I tackled this very problem less than a year ago and got the solution. What a diddy.

Code: Select all

Global c13.s = Chr(13)
Global c34.s = Chr(34)


Procedure EGWordWrapping(g.i,wrap.b)
  
  If IsGadget(g)
      If wrap
          SendMessage_(GadgetID(g), #EM_SETTARGETDEVICE, #Null, 0)
      Else
          SendMessage_(GadgetID(g), #EM_SETTARGETDEVICE, #Null, $FFFFFF)
      EndIf
  EndIf
  
EndProcedure

Procedure.b RemoveEGBorder(g.i)
  
  If IsGadget(g)
      egid = GadgetID(g)
      style = GetWindowLong_(egid, #GWL_EXSTYLE)
      newstyle = style &(~#WS_EX_CLIENTEDGE)
      SetWindowLong_(egid, #GWL_EXSTYLE, newstyle)
      SetWindowPos_(egid, 0, 0, 0, 0, 0, #SWP_SHOWWINDOW | #SWP_NOSIZE | #SWP_NOMOVE | #SWP_FRAMECHANGED) ; required for this to work on my Win98
  EndIf
  
EndProcedure



Procedure.b R(str.s)
  MessageRequester("Report",str,0)
EndProcedure




DataSection 
  
  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119} 
    Data.l $332C4425 
    Data.w $26CB, $11D0 
    Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19    
  
EndDataSection 




Procedure GetBrowser(g.i) ; By Zapman Inspired by Fr34k
  If IsGadget(g)
    If GetGadgetText(g) = ""
      SetGadgetText(g, "about:blank") ; to avoid error when using Browser
      While WindowEvent():Wend
    EndIf
    ;
    Browser.IWebBrowser2 = GetWindowLongPtr_(GadgetID(g), #GWL_USERDATA) ; Ptr variant keeps the full pointer on x64
    If Browser
      Ready = 0
      ct = 0
      While Ready < 4 And ct<200
        WindowEvent()
        If Browser\get_ReadyState(@BrowserState.l) = #S_OK ; READYSTATE is a 32-bit enum; 4 = complete
          If BrowserState = 4
            Ready + 1
          EndIf
        EndIf
        If Ready = 0 : Delay(5) : EndIf
        ct + 1
      Wend
    EndIf
    ProcedureReturn Browser
  EndIf
EndProcedure


Procedure GetDocumentDispatch(g.i) ;  By Zapman Inspired by Fr34k
  ; Example: DocumentDispatch.IDispatch = GetDocumentDispatch(WebGadget)
  ; Do not forget to release DocumentDispatch when you have finished using it
  Browser.IWebBrowser2 = GetBrowser(g)
  If Browser
    If Browser\get_Document(@DocumentDispatch.IDispatch) = #S_OK
      ProcedureReturn DocumentDispatch
    EndIf
  EndIf
EndProcedure

Procedure.s WBGetHTML(g.i)
  ; Retrieve all the HTML content of the document currently in the WebGadget
  
  DocumentDispatch.IDispatch = GetDocumentDispatch(g)
  If DocumentDispatch
    If DocumentDispatch\QueryInterface(?IID_IHTMLDocument2, @Document.IHTMLDocument2) = #S_OK And Document
      Document\get_body(@Element.IHTMLElement); Get the <BODY> Element
      If Element
        If Element\get_parentElement(@Parent.IHTMLElement) = #S_OK And Parent; Get the <HTML> Element
          Parent\get_outerHTML(@bstr)
          Parent\Release()
        EndIf
        Element\Release()
      EndIf
      Document\Release()
    EndIf
    DocumentDispatch\Release()
  EndIf
  
  HTML.s
  If bstr
    HTML = PeekS(bstr, -1, #PB_Unicode) ; get the whole text of the document
    SysFreeString_(bstr)
  EndIf
  
  ;SetClipboardText(HTML)
  ProcedureReturn HTML
  
EndProcedure


html.s = "<html><head></head>" +c13
html + "<body>" +c13

html + "<FORM name="+c34+"form"+c34+" ACTION="+c34+c34+" ENCTYPE="+c34+"application/x-www-form-urlencoded"+c34+" METHOD="+c34+"POST"+c34+">" +c13
html + "Form INPUT:<BR>" +c13
html + "<INPUT TYPE=Text Value="+c34+c34+" style="+c34+"width:400px; "+c34+">" +c13
html + "<BR><BR>" +c13

html + "Form TEXTAREA:<BR>" +c13
html + "<TEXTAREA style="+c34+"width:400px; height:100px; "+c34+"></TEXTAREA>" +c13
html + "</FORM>" +c13

html + "ContentEditable DIV:" +c13
html + "<DIV contenteditable=true style="+c34+"width:400px; height:100px; border:1px solid #000; "+c34+">&nbsp;</DIV>" +c13
html + "</body></html>"


ww=600
wgw=450
wgh=360
wh=wgh+100
win = OpenWindow(#PB_Any,0,0,ww,wh,"Test",#PB_Window_ScreenCentered)
wg = WebGadget(#PB_Any,0,0,wgw,wgh,"")
SetGadgetItemText(wg,#PB_Web_HtmlCode,html)

EG=EditorGadget(#PB_Any,5,wgh+10,ww-10,wh-40-(wgh+10))
RemoveEGBorder(EG)
EGWordWrapping(EG,#True)
t.s = "Above is a webgadget containing three different 'inputtable elements'."+c13+"Type some stuff into these elements then press the two buttons below to see the difference in the HTML returned."+c13+"Unfortunately this solution is Windows-only."
SetGadgetText(EG,t)

button1 = ButtonGadget(#PB_Any,20,wh-30,150,20,"GetGadgetItemText")
button2 = ButtonGadget(#PB_Any,20+150+20,wh-30,150,20,"API method")

Repeat
    we=WindowEvent()
    If we=#PB_Event_Gadget
        Select EventGadget()
            Case button1
                MessageRequester("Webgadget contents",GetGadgetItemText(wg,#PB_Web_HtmlCode),0)
            Case button2
                R(WBGetHTML(wg))
        EndSelect
    EndIf
    Delay(10)
Until GetAsyncKeyState_(#VK_ESCAPE)

As far as I can see, it gets everything except the DOCTYPE. I'll work on that.