Page 1 of 3

IE IHTMLDocument2 interface

Posted: Thu Sep 08, 2005 8:41 pm
by DevilDog
Is it possible to get the IHTMLDocument2 interface to an open IE window or is that only applicable to a webgadget?

How can I do that?

Posted: Thu Sep 08, 2005 11:50 pm
by fweil
Would it not be easier to DL the page ? Using URLDownloadToFile_() ?

IE IHTMLDocument2 interface

Posted: Fri Sep 09, 2005 12:18 am
by DevilDog
well I need to use the get_innerhtml and put_innerhtml as well as the mousemove event of the browser.

All of these I can do when using the webgadget and getting the htmldocument2 interface to it but I can't figure out how to do the same with an open window of internet explorer.

I've seen references to doing that with C++ in newsgroup discussions but I don't know C++ in order to convert the code.

more info

Posted: Fri Sep 09, 2005 8:51 pm
by DevilDog
OK. I have this code below which is modified from the IETool source code.
First I get a handle to IE and call EnumChildWindows with it.

Code: Select all

hWnd.l = FindWindow_("IEFrame", vbNullString) 
EnumChildWindows_(hWnd, @EnumChildProc(), 1)
This code below loops through the various windows in IE until it finds one with a class of "Internet Explorer_Server".

Then with the next two lines, the "GetWindowLong_" function works fine as far as I can tell, but the "get_document" line crash and closes the application with an error of "Invalid Memory Access". Any one know what I'm doing wrong? :cry:

Code: Select all

Procedure EnumChildProc(hwnd, lParam)
  class.s = Space(40)
  GetClassName_(hwnd, @class, 40) 
  If Class = "Internet Explorer_Server" 
     WebObject.IWebBrowser2 = GetWindowLong_(hwnd, #GWL_USERDATA) 
     If WebObject\get_Document(@DocumentDispatch.IDispatch) = #S_OK 
          MessageRequester("Test","found!")
     EndIf
   EndIf

....

Re: more info

Posted: Fri Sep 09, 2005 9:32 pm
by ricardo
DevilDog wrote:OK. I have this code below which is modified from the IETool source code.
First I get a handle to IE and call EnumChildWindows with it.

Code: Select all

hWnd.l = FindWindow_("IEFrame", vbNullString) 
EnumChildWindows_(hWnd, @EnumChildProc(), 1)
This code below loops through the various windows in IE until it finds one with a class of "Internet Explorer_Server".

Then with the next two lines, the "GetWindowLong_" function works fine as far as I can tell, but the "get_document" line crash and closes the application with an error of "Invalid Memory Access". Any one know what I'm doing wrong? :cry:

Code: Select all

Procedure EnumChildProc(hwnd, lParam)
  class.s = Space(40)
  GetClassName_(hwnd, @class, 40) 
  If Class = "Internet Explorer_Server" 
     WebObject.IWebBrowser2 = GetWindowLong_(hwnd, #GWL_USERDATA) 
     If WebObject\get_Document(@DocumentDispatch.IDispatch) = #S_OK 
          MessageRequester("Test","found!")
     EndIf
   EndIf

....
I tried to do the same some time ago with same results. Cant make it works.

Posted: Fri Sep 09, 2005 11:45 pm
by freak
Your code only works with the PB webgadget, because it puts the Interface pointer in the #GWL_USERDATA
field of the window on gadget creation. (for the other gadget functions to access it)
IE does not do that.

I have put together a code from the page here: http://support.microsoft.com/kb/q249232/

Code: Select all

Procedure.l GetIHTMLDocument(ExplorerServerWindow)

  HtmlDoc.IHTMLDocument = 0
     
  OleAcc = OpenLibrary(#PB_Any, "OLEACC.DLL")
  If OleAcc And IsFunction(OleAcc, "ObjectFromLresult")

    Message = RegisterWindowMessage_("WM_HTML_GETOBJECT")
    SendMessageTimeout_(ExplorerServerWindow, Message, 0, 0, #SMTO_ABORTIFHUNG, 1000, @MessageResult)
      
    CallFunction(OleAcc, "ObjectFromLresult", MessageResult, ?IID_IHTMLDocument, 0, @HtmlDoc)    
  
    CloseLibrary(OleAcc)
  EndIf
  
  ProcedureReturn HtmlDoc
EndProcedure

Procedure EnumChildProc(hwnd, *hServer.LONG)
  Class$ = Space(100)
  GetClassName_(hwnd, @Class$, 100)
  If Class$ = "Internet Explorer_Server"
    *hServer\l = hwnd ; server window found.
    ProcedureReturn #False
  Else
    ProcedureReturn #True
  EndIf 
EndProcedure

DataSection

  IID_IHTMLDocument: ; {626FC520-A41E-11CF-A731-00A0C9082637}
    Data.l $626FC520
    Data.w $A41E, $11CF
    Data.b $A7, $31, $00, $A0, $C9, $08, $26, $37
    
  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
    Data.l $332C4425
    Data.w $26CB, $11D0
    Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19    

EndDataSection


; CoInitialize around the whole thing is important as otherwise it won't work.
CoInitialize_(0)

; find the IE server window:
;
hWnd.l = FindWindow_("IEFrame", 0)
EnumChildWindows_(hWnd, @EnumChildProc(), @ServerWindow)

; get the interface:
;
Document.IHTMLDocument = GetIHTMLDocument(ServerWindow)
If Document 

  If Document\QueryInterface(?IID_IHTMLDocument2, @Document2.IHTMLDocument2) = #S_OK
  
    ; successfully retrieved the IHTMLDocument2 interface here,
    ; a small example follows.
  
    If Document2\get_title(@bstr_title) = #S_OK
    
      length = WideCharToMultiByte_(#CP_ACP, 0, bstr_title, -1, 0, 0, 0, 0)
      Title$ = Space(length)
      WideCharToMultiByte_(#CP_ACP, 0, bstr_title, -1, @Title$, length, 0, 0)
      
      MessageRequester("Document Title", Title$)
      
      SysFreeString_(bstr_title)
    EndIf    
  
    Document2\Release()
  EndIf  

  Document\Release()
EndIf

CoUninitialize_()  

you da man freak!

Posted: Fri Sep 09, 2005 11:53 pm
by DevilDog
I was about to post some vb 6 code I found using oleacc which I hadn't been able to get to work when you posted this showing the answer to what I was about to ask. How cool is that? Answers before asking!

Thanks freak, that does seem to work and I know there's been quite a few people asking how to do this.

COOL! :P

Posted: Sun Sep 11, 2005 2:44 am
by ricardo
I dont understand how to do this, per example (VB6 code):

Code: Select all

For X = SpanNumber To doc.body.All.tags("span").length
      If doc.body.All.tags("span")(X).ClassName = "text" Then
        	innerText$ = doc.body.All.tags("span")(X).innerHTML
	innertextnew$ = "something " + innerText$ 
	doc.body.All.tags("span")(X).insertAdjacentHTML "afterbegin", innertextnew$
                SpanNumber = X 'agregue
              End If
      End If
Next
Its possible?

Posted: Sun Sep 11, 2005 5:37 am
by DevilDog
Ricardo,
I'm not sure but I think you need the interface IHTMLDocument3 and then use getElementsByName but beyond that I can't say more.

I have my own question for anyone. I have the following commands which seem to work as they return S_OK.

Code: Select all

  If Document2\get_selection(@Selection.IHTMLSelectionObject) = #S_OK
      If Selection\createRange(@SelectedText.TextRange) = #S_OK
I'm not sure about the second line if I'm using @SelectedText.TextRange correct although it does return #S_OK. Assuming it's fine, how do I retrieve the selected text from the TextRange object?

The createRange method is defined as follows:

Code: Select all

HRESULT createRange(
          IDispatch **range
);
The TextRange object has a property called "text" but what's the proper syntax to retrieve it's contents? The "text" property syntas is defined as:

Code: Select all

TextRange.text [ = sTxt ]
How do I implement that in PB?

Posted: Tue Sep 13, 2005 1:15 am
by freak
TextRange is a scripting object. Thats why you do not get a direct interface pointer, but the IDispatch interface to access it.
You have to use the Invoke() method to actually do something with the object.

Here is a working example: (i hope the code is understandable)

Code: Select all

Structure DISPPARAMS
  *rgvarg;            // Array of arguments.
  *rgdispidNamedArgs;   // Dispatch IDs of named arguments.
  cArgs.l;            // Number of arguments.
  cNamedArgs.l;         // Number of named arguments.
EndStructure

#DISPATCH_METHOD = $1
#DISPATCH_PROPERTYGET = $2
#DISPATCH_PROPERTYPUT = $4
#DISPATCH_PROPERTYPUTREF = $8


Enumeration   ; VARENUM
  #VT_EMPTY = 0
  #VT_NULL = 1
  #VT_I2 = 2
  #VT_I4 = 3
  #VT_R4 = 4
  #VT_R8 = 5
  #VT_CY = 6
  #VT_DATE = 7
  #VT_BSTR = 8
  #VT_DISPATCH = 9
  #VT_ERROR = 10
  #VT_BOOL = 11
  #VT_VARIANT = 12
  #VT_UNKNOWN = 13
  #VT_DECIMAL = 14
  #VT_I1 = 16
  #VT_UI1 = 17
  #VT_UI2 = 18
  #VT_UI4 = 19
  #VT_I8 = 20
  #VT_UI8 = 21
  #VT_INT = 22
  #VT_UINT = 23
  #VT_VOID = 24
  #VT_HRESULT = 25
  #VT_PTR = 26
  #VT_SAFEARRAY = 27
  #VT_CARRAY = 28
  #VT_USERDEFINED = 29
  #VT_LPSTR = 30
  #VT_LPWSTR = 31
  #VT_RECORD = 36
  #VT_INT_PTR = 37
  #VT_UINT_PTR = 38
  #VT_FILETIME = 64
  #VT_BLOB = 65
  #VT_STREAM = 66
  #VT_STORAGE = 67
  #VT_STREAMED_OBJECT = 68
  #VT_STORED_OBJECT = 69
  #VT_BLOB_OBJECT = 70
  #VT_CF = 71
  #VT_CLSID = 72
  #VT_VERSIONED_STREAM = 73
  #VT_BSTR_BLOB = $fff
  #VT_VECTOR = $1000
  #VT_ARRAY = $2000
  #VT_BYREF = $4000
  #VT_RESERVED = $8000
  #VT_ILLEGAL = $ffff
  #VT_ILLEGALMASKED = $fff
  #VT_TYPEMASK = $fff
EndEnumeration

Structure VARIANT
  vt.w
  wReserved1.w
  wReserved2.w
  wReserved3.w
  StructureUnion
    llVal.LARGE_INTEGER
    lVal.l
    bVal.b
    iVal.w
    fltVal.f
    dblVal.LARGE_INTEGER
    boolVal.l
    bool.l
    scode.l
    cyVal.l
    date.l
    bstrVal.l
   *punkVal.IUnknown
   *pdispVal.IDispatch
   *parray.l
   *pbVal.BYTE
   *piVal.WORD
   *plVal.LONG
   *pllVal.LARGE_INTEGER
   *pfltVal.FLOAT
   *pdblVal.LARGE_INTEGER
   *pboolVal.LONG
   *pbool.LONG
   *pscode.LONG
   *pcyVal.LONG
   *pdate.LONG
   *pbstrVal.LONG
   *ppunkVal.LONG
   *ppdispVal.LONG
   *pparray.LONG
   *pvarVal.VARIANT
    byref.l
    cVal.b
    uiVal.w
    ulVal.l
    ullVal.LARGE_INTEGER
    intVal.l
    uintVal.l
   *pdecVal.LONG
   *pcVal.BYTE
   *puiVal.WORD
   *pulVal.LONG
   *pullVal.LARGE_INTEGER
   *pintVal.LONG
   *puintVal.LONG                      
    decVal.l
  EndStructureUnion
EndStructure

Procedure ErrorMessage(Value)
  Message$ = Space(3000)
  FormatMessage_(#FORMAT_MESSAGE_IGNORE_INSERTS|#FORMAT_MESSAGE_FROM_SYSTEM, 0, Value, 0, @Message$, 3000, 0)
  MessageRequester("Error","Error:"+Chr(13)+Message$, #MB_ICONERROR)
EndProcedure

Procedure MakeBSTR(String$)
  Unicode$ = Space(Len(String$)*2+2)
  MultiByteToWideChar_(#CP_ACP, 0, @String$, -1, @Unicode$, Len(String$)*2+2)
  ProcedureReturn SysAllocString_(@Unicode$)
EndProcedure

Procedure.s ReadBSTR(bstr)
  length = WideCharToMultiByte_(#CP_ACP, 0, bstr, -1, 0, 0, 0, 0)
  Text$ = Space(length)
  WideCharToMultiByte_(#CP_ACP, 0, bstr, -1, @Text$, length, 0, 0)   
  ProcedureReturn Text$
EndProcedure

Procedure.l GetIHTMLDocument2(ExplorerServerWindow)

  HtmlDoc.IHTMLDocument2 = 0
     
  OleAcc = OpenLibrary(#PB_Any, "OLEACC.DLL")
  If OleAcc And IsFunction(OleAcc, "ObjectFromLresult")

    Message = RegisterWindowMessage_("WM_HTML_GETOBJECT")
    SendMessageTimeout_(ExplorerServerWindow, Message, 0, 0, #SMTO_ABORTIFHUNG, 1000, @MessageResult)
      
    CallFunction(OleAcc, "ObjectFromLresult", MessageResult, ?IID_IHTMLDocument2, 0, @HtmlDoc)    
  
    CloseLibrary(OleAcc)
  EndIf
  
  ProcedureReturn HtmlDoc
EndProcedure

Procedure EnumChildProc(hwnd, *hServer.LONG)
  Class$ = Space(100)
  GetClassName_(hwnd, @Class$, 100)
  If Class$ = "Internet Explorer_Server"
    *hServer\l = hwnd ; server window found.
    ProcedureReturn #False
  Else
    ProcedureReturn #True
  EndIf 
EndProcedure

DataSection

  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
    Data.l $332C4425
    Data.w $26CB, $11D0
    Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19  
    
  IID_NULL: ; {00000000-0000-0000-0000-000000000000}
    Data.l $00000000
    Data.w $0000, $0000
    Data.b $00, $00, $00, $00, $00, $00, $00, $00      

EndDataSection


; CoInitialize around the whole thing is important as otherwise it won't work.
CoInitialize_(0)

; find the IE server window:
;
hWnd.l = FindWindow_("IEFrame", 0)
EnumChildWindows_(hWnd, @EnumChildProc(), @ServerWindow)

; get the interface:
;
Document.IHTMLDocument2 = GetIHTMLDocument2(ServerWindow)
If Document 
  
  If Document\get_selection(@Selection.IHTMLSelectionObject) = #S_OK
  
    ; first we should check the type of the selection, because if a control
    ; is selected, you will get a different object from createRange()!
    ;
    If Selection\get_type(@bstr_string) = #S_OK
      Type$ = ReadBSTR(bstr_string)
      SysFreeString_(bstr_string)
    Else
      Type$ = ""
    EndIf
    
    Select LCase(Type$)
      Case "none"
        MessageRequester("", "Nothing is selected.")
      
      Case "text"
        
        ; ok, get the IDispatch for the TextRange object
        ;
        If Selection\createRange(@TextRangeDispatch.IDispatch) = #S_OK
                  
          ; bstr_name = MakeBSTR("htmlText") ; use this to get html code
          bstr_name = MakeBSTR("text")          
          
          ; get the dispid of the property we want to get
          ;          
          If TextRangeDispatch\GetIDsOfNames(?IID_NULL, @bstr_name, 1, 0, @dispid.l) = #S_OK
          
            arguments.DISPPARAMS\cArgs = 0
          
            ; now read the actual property.
            ;
            result= TextRangeDispatch\Invoke(dispid, ?IID_NULL, 0, #DISPATCH_PROPERTYGET, @arguments, @varResult.VARIANT, 0, 0)
            If result = #S_OK And varResult\vt = #VT_BSTR
                 
              SelectedText$ = ReadBSTR(varResult\bstrVal)
              SysFreeString_(varResult\bstrVal)
              
              MessageRequester("Selected Text:", SelectedText$)
              
            Else
              ErrorMessage(result)
            EndIf
            
            SysFreeString_(bstr_name)

          EndIf
        
          TextRangeDispatch\Release()
        EndIf        
      
      Case "control"
        MessageRequester("", "A control is selected.")
      
      Default
        MessageRequester("", "Error!")
        
    EndSelect
  
    Selection\Release()
  EndIf

  Document\Release()
EndIf

CoUninitialize_()  
[edit]
You can even replace the selected text by inserting this after the messagerequester that displays the text ;) :

Code: Select all

              newtext.VARIANT\vt = #VT_BSTR
              newtext\bstrVal = MakeBSTR("Hello World!")
              
              arguments\cArgs = 1
              arguments\cNamedArgs = 0
              arguments\rgvarg = @newtext
              
              TextRangeDispatch\Invoke(dispid, ?IID_NULL, 0, #DISPATCH_PROPERTYPUT, @arguments, 0, 0, 0)
@ricardo: what exactly is your code supposed to do?

Cool Solution!

Posted: Tue Sep 13, 2005 1:53 am
by DevilDog
Freak, that's awesome. There's stuff in that code that I've been wondering how to do since I started looking into this.

I hope others can benefit from this as I believe there's quite a bit of good stuff here for those of use interested in interacting with IE!

Thanks again :D

Posted: Tue Sep 13, 2005 3:05 am
by ricardo
freak wrote:@ricardo: what exactly is your code supposed to do?
Hi Freak!!

My code

Code: Select all

For X = SpanNumber To doc.body.All.tags("span").length 
      If doc.body.All.tags("span")(X).ClassName = "text" Then 
           innerText$ = doc.body.All.tags("span")(X).innerHTML 
   innertextnew$ = "something " + innerText$ 
   doc.body.All.tags("span")(X).insertAdjacentHTML "afterbegin", innertextnew$ 
                SpanNumber = X 'agregue 
              End If 
      End If 
Next 
Looks all the tags (in this case 'span') to find which ones have some specific classname if so, get the innerHTML, add something to it (or replace it) and put it back with the changes.

The 'afterbegin' is some of the possibilities when inserting some text into a tag.

Lets say i want to walk the dom Tree to all the 'spam' tag elements, find which ones have the classname 'text' and read its inner HTML, and based on its contecnt replace the inner HTML of each one with some specific text.

I will REALLY appreciate if you help me with this, im trying to understand how DOM could be managed from PB since i use a lot DOM in many of my apps :D

Thanks in advance!

Posted: Tue Sep 13, 2005 5:51 pm
by DevilDog
Freak,
Your code works great! I'm still trying to digest some of what you're doing but I got a question dealing with mouse events on IE.

I'm looking at capturing the mousemove event in IE and checking the element that is under it by using ElementFromPoint, which all worked great when I was using the WebGadget. But for IE, should I use

Code: Select all

HRESULT IHTMLDocument2::put_onmousemove(VARIANT v);
to capture the mousemove in IE? The VARIANT is suppossed to be a pointer to the procedure I have to handle the mousemove. How do I pass it the pointer to my procedure?

I don't know where to look for help in figuring out this stuff. MSDN gives you the general outline but expects you to already know about pointers, etc.

Don't get me wrong, I love learning about all of this and I know it will open all kinds of doors in the kinds of applications I will write in the future, I just hope I don't alienate you or anyone else by asking for help.

Posted: Wed Sep 14, 2005 12:18 am
by freak
Ok, here is some code for enumerating and reading the DOM tree.

First some structures and procedures needed by the two following
examples:

Code: Select all

Enumeration   ; VARENUM
  #VT_EMPTY = 0
  #VT_NULL = 1
  #VT_I2 = 2
  #VT_I4 = 3
  #VT_R4 = 4
  #VT_R8 = 5
  #VT_CY = 6
  #VT_DATE = 7
  #VT_BSTR = 8
  #VT_DISPATCH = 9
  #VT_ERROR = 10
  #VT_BOOL = 11
  #VT_VARIANT = 12
  #VT_UNKNOWN = 13
  #VT_DECIMAL = 14
  #VT_I1 = 16
  #VT_UI1 = 17
  #VT_UI2 = 18
  #VT_UI4 = 19
  #VT_I8 = 20
  #VT_UI8 = 21
  #VT_INT = 22
  #VT_UINT = 23
  #VT_VOID = 24
  #VT_HRESULT = 25
  #VT_PTR = 26
  #VT_SAFEARRAY = 27
  #VT_CARRAY = 28
  #VT_USERDEFINED = 29
  #VT_LPSTR = 30
  #VT_LPWSTR = 31
  #VT_RECORD = 36
  #VT_INT_PTR = 37
  #VT_UINT_PTR = 38
  #VT_FILETIME = 64
  #VT_BLOB = 65
  #VT_STREAM = 66
  #VT_STORAGE = 67
  #VT_STREAMED_OBJECT = 68
  #VT_STORED_OBJECT = 69
  #VT_BLOB_OBJECT = 70
  #VT_CF = 71
  #VT_CLSID = 72
  #VT_VERSIONED_STREAM = 73
  #VT_BSTR_BLOB = $fff
  #VT_VECTOR = $1000
  #VT_ARRAY = $2000
  #VT_BYREF = $4000
  #VT_RESERVED = $8000
  #VT_ILLEGAL = $ffff
  #VT_ILLEGALMASKED = $fff
  #VT_TYPEMASK = $fff
EndEnumeration

Structure VARIANT
  vt.w
  wReserved1.w
  wReserved2.w
  wReserved3.w
  StructureUnion
    llVal.LARGE_INTEGER
    lVal.l
    bVal.b
    iVal.w
    fltVal.f
    dblVal.LARGE_INTEGER
    boolVal.l
    bool.l
    scode.l
    cyVal.l
    date.l
    bstrVal.l
   *punkVal.IUnknown
   *pdispVal.IDispatch
   *parray.l
   *pbVal.BYTE
   *piVal.WORD
   *plVal.LONG
   *pllVal.LARGE_INTEGER
   *pfltVal.FLOAT
   *pdblVal.LARGE_INTEGER
   *pboolVal.LONG
   *pbool.LONG
   *pscode.LONG
   *pcyVal.LONG
   *pdate.LONG
   *pbstrVal.LONG
   *ppunkVal.LONG
   *ppdispVal.LONG
   *pparray.LONG
   *pvarVal.VARIANT
    byref.l
    cVal.b
    uiVal.w
    ulVal.l
    ullVal.LARGE_INTEGER
    intVal.l
    uintVal.l
   *pdecVal.LONG
   *pcVal.BYTE
   *puiVal.WORD
   *pulVal.LONG
   *pullVal.LARGE_INTEGER
   *pintVal.LONG
   *puintVal.LONG                      
    decVal.l
  EndStructureUnion
EndStructure

Structure VARIANT_SPLIT
  StructureUnion
    Variant.VARIANT
    Split.l[4]
  EndStructureUnion
EndStructure

Interface IHTMLElementCollection_FIXED
  QueryInterface(a,b)
  AddRef()
  Release()
  GetTypeInfoCount(a)
  GetTypeInfo(a,b,c)
  GetIDsOfNames(a,b,c,d,e)
  Invoke(a,b,c,d,e,f,g,h)
  toString(a)
  put_length(a)
  get_length(a)
  get__newEnum(a)
  item(a1,a2,a3,a4,b1,b2,b3,b4,c)
  tags(a,b)
EndInterface

Procedure MakeBSTR(String$)
  Unicode$ = Space(Len(String$)*2+2)
  MultiByteToWideChar_(#CP_ACP, 0, @String$, -1, @Unicode$, Len(String$)*2+2)
  ProcedureReturn SysAllocString_(@Unicode$)
EndProcedure

Procedure.s ReadBSTR(bstr)
  length = WideCharToMultiByte_(#CP_ACP, 0, bstr, -1, 0, 0, 0, 0)
  Text$ = Space(length)
  WideCharToMultiByte_(#CP_ACP, 0, bstr, -1, @Text$, length, 0, 0)   
  ProcedureReturn Text$
EndProcedure

Procedure.l GetIHTMLDocument2(ExplorerServerWindow)

  HtmlDoc.IHTMLDocument2 = 0
     
  OleAcc = OpenLibrary(#PB_Any, "OLEACC.DLL")
  If OleAcc And IsFunction(OleAcc, "ObjectFromLresult")

    Message = RegisterWindowMessage_("WM_HTML_GETOBJECT")
    SendMessageTimeout_(ExplorerServerWindow, Message, 0, 0, #SMTO_ABORTIFHUNG, 1000, @MessageResult)
      
    CallFunction(OleAcc, "ObjectFromLresult", MessageResult, ?IID_IHTMLDocument2, 0, @HtmlDoc)    
  
    CloseLibrary(OleAcc)
  EndIf
  
  ProcedureReturn HtmlDoc
EndProcedure

Procedure EnumChildProc(hwnd, *hServer.LONG)
  Class$ = Space(100)
  GetClassName_(hwnd, @Class$, 100)
  If Class$ = "Internet Explorer_Server"
    *hServer\l = hwnd ; server window found.
    ProcedureReturn #False
  Else
    ProcedureReturn #True
  EndIf 
EndProcedure

DataSection

  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
    Data.l $332C4425
    Data.w $26CB, $11D0
    Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19  
    
  IID_IHTMLElement: ; {3050F1FF-98B5-11CF-BB82-00AA00BDCE0B}
    Data.l $3050F1FF
    Data.w $98B5, $11CF
    Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B    
    
  IID_IHTMLElementCollection: ; {3050F21F-98B5-11CF-BB82-00AA00BDCE0B}
    Data.l $3050F21F
    Data.w $98B5, $11CF
    Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B        
    
  IID_NULL: ; {00000000-0000-0000-0000-000000000000}
    Data.l $00000000
    Data.w $0000, $0000
    Data.b $00, $00, $00, $00, $00, $00, $00, $00      

EndDataSection
A few words:

To enumerate the objects (in whatever order), you always have to deal with the
IHTMLElementCollection interface, which represents a set of elements.
Well, here is a problem: The much needed IHTMLElementCollection::item() method
expects two VARIANT structures (which are 16 bytes) not as a reference, but
directly on the stack. The problem is that PB does not support that.

So here is the workaround:
See the "VARIANT_SPLIT" structure above. It contains a normal VARIANT structure
in union with 4 long values.
Also there is the "IHTMLElementCollection_FIXED" interface, which is identical to IHTMLElementCollection, except that in the place
where the variants must be, there are 4 separate long values.
Now to call the item() method, you fill the VARIANT part of the VARIANT_SPLIT structure
normally and then call the method with the 4 separate Split[x] values. (in order from 0 to 3)
This will put the VARIANT correctly on the stack as the method expects it.
(See below for an example)

Ok now lets go for a simple enumeration example as a start:

Code: Select all

; CoInitialize around the whole thing is important as otherwise it won't work.
CoInitialize_(0)

; find the IE server window:
;
hWnd.l = FindWindow_("IEFrame", 0)
EnumChildWindows_(hWnd, @EnumChildProc(), @ServerWindow)

; get the interface:
;
Document.IHTMLDocument2 = GetIHTMLDocument2(ServerWindow)
If Document 
  
  ; There are a number of methods that return IHTMLElementCollection.
  ; use the get_all() method to get a full list of elements. 
  ; (this does not tell you which element is contained in which other one)
  ; For now, lets just get all the links.
  ;
  ;If Document\get_all(@LinkCollection.IHTMLElementCollection_FIXED) = #S_OK
  If Document\get_links(@LinkCollection.IHTMLElementCollection_FIXED) = #S_OK
       
    ; read the number of elements in this collection
    ;
    LinkCollection\get_length(@Total)
    Debug "Number of Links: "+Str(Total)
    
    ; the attribute we want to read from the links    
    bstr_attribute = MAkeBSTR("href") 
    
    ; now go through the elements
    For index = 0 To Total-1
    
      ; To simply get an element by index in the collection, all we need is one
      ; VARIANT of long type with the index, and use it twice in the item() call.
      ;
      varIndex.VARIANT_SPLIT\Variant\vt = #VT_I4
      varIndex\Variant\lVal = index      
      ElementDispatch.IDispatch = 0
      
      ; The whole VARIANT must be used twice (as 1st and 2nd param) so there are
      ; 8 longs to use here:
      ;
      If LinkCollection\item(varIndex\split[0], varIndex\split[1], varIndex\split[2], varIndex\split[3], varIndex\split[0], varIndex\split[1], varIndex\split[2], varIndex\split[3], @ElementDispatch.IDispatch) = #S_OK
        If ElementDispatch ; must check this value according to the docs, as even on failure, #S_OK is returned
        
          Text$ = ""
          Target$ = ""
          
          ; From the item() method, you always get an IDispatch interface representing the
          ; element. From it, you can query for another interface.
          ; The IHTMLElement type of interfaces are supported by all elements, and provide
          ; allready much information. Additionally, elements have special interfaces to support
          ; their specific functionality. (Like IHTMLFormElement, IHTMLFontElement, ...)
          ;
          ; Lets get the generic IHTMLElement interface:
          ;
          If ElementDispatch\QueryInterface(?IID_IHTMLElement, @Element.IHTMLElement) = #S_OK              
          
            ; lets read the text inside the element. This is quite simple.
            ; there are also methods for innerHTML, outerHTML, outerText, and those to set these values
            ;          
            If Element\get_innerText(@bstr_text) = #S_OK
              Text$ = ReadBSTR(bstr_text)
              SysFreeString_(bstr_text) ; make sure returned strings are freed
            EndIf
            
            ; This reads an attribute, the bstr string was created above.
            ; The result can be a VARIANT_BSTR or VARIANT_BOOL depending on the attribute.
            ;
            If Element\getAttribute(bstr_attribute, 0, @varResult.VARIANT) = #S_OK
              If varResult\vt = #VT_BSTR
                Target$ = ReadBSTR(varResult\bstrVal)
                SysFreeString_(varResult\bstrVal)
              EndIf
            EndIf                       
            
            Element\Release() ; make sure all interface pointers are properly released.
          EndIf          
          
          Debug Text$+"  =>  "+Target$        
                  
          ElementDispatch\Release()
        EndIf
      EndIf            
    
    Next index
    
    SysFreeString_(bstr_attribute)
  
    LinkCollection\Release()
  EndIf
           
  Document\Release()
EndIf

CoUninitialize_()
This second one is a more complex example. It scans the DOM tree in the
order of the elements and fills a treegadget. You can edit the HTML code
of an element and modify it with the "Update Code" button to update the element.
(you should re-scan the tree after that to account for any changes made)

Code: Select all

; Structure to store our tag info. We actually do not release any of the
; items pointers, but store them here so we can later change their data.
;
Structure TagInfo
  Element.IHTMLElement
  Name$
  Class$
  Code$
  ID$  
  Title$
EndStructure

NewList HTMLTags.TagInfo()

Enumeration
  #GADGET_Progress
  #GADGET_Tree
  #GADGET_Info
  #GADGET_Code
  #GADGET_Splitter1
  #GADGET_Splitter2
  #GADGET_SetCode
  #GADGET_Refresh
EndEnumeration

Global ProcessedTags, TotalTags ; to monitor the reading progress.

; This procedure recursively walks through the tree of elements.
; It adds their info to the LinkedList and fills the TreeGadget.
; It starts with one element, and recursively walks through all children.
;
; (this must not be on the whole document, it can also be used to enumerate
; all childs of any Html element. We just start with the body tag, so we get the
; whole thing.)
;
Procedure WalkDOMTree(CurrentElement.IHTMLElement)

  AddElement(HTMLTags())
  
  HTMLTags()\Element = CurrentElement ; save the interface pointer to this element

  ; reading the information about the element is not that difficult.
  ; its just a matter of using the right method.
  ;
  If CurrentElement\get_tagName(@bstr_name) = #S_OK
    HTMLTags()\Name$ = ReadBSTR(bstr_name)
    SysFreeString_(bstr_name)
  EndIf
  
  If CurrentElement\get_className(@bstr_class) = #S_OK
    HTMLTAgs()\Class$ = ReadBSTR(bstr_class)
    SysFreeString_(bstr_class)
  EndIf
  
  If CurrentElement\get_outerHTML(@bstr_code) = #S_OK
    HTMLTags()\Code$ = ReadBSTR(bstr_code)
    SysFreeString_(bstr_code)
  EndIf
  
  If CurrentElement\get_id(@bstr_id) = #S_OK
    HTMLTags()\ID$ = ReadBSTR(bstr_id)
    SysFreeString_(bstr_id)
  EndIf
  
  If CurrentElement\get_title(@bstr_title) = #S_OK
    HTMLTags()\Title$ = ReadBSTR(bstr_title)
    SysFreeString_(bstr_title)
  EndIf
  
  AddGadgetItem(#GADGET_Tree, -1, HTMLTags()\Name$+" (class="+Chr(34)+HTMLTags()\Class$+Chr(34)+")")  

  ; now we get all the direct child elements.
  ; get_children() returns an IDispatch, from which we can query for a IHTMLElementCollection,
  ; and enumerate the items like in the last code.
  ;
  If CurrentElement\get_children(@ChildDispatch.IDispatch) = #S_OK
  
    ; get the IHTMLElementCollection interface
    If ChildDispatch\QueryInterface(?IID_IHTMLElementCollection, @ChildCollection.IHTMLElementCollection_FIXED) = #S_OK
      
      ; get the count
      If ChildCollection\get_length(@childcount) = #S_OK
      
        OpenTreeGadgetNode(#GADGET_Tree)
        
        ; go through the elements like in the last code.
        For index = 0 To childcount-1
          varIndex.VARIANT_SPLIT\VAriant\vt = #VT_I4
          varIndex\Variant\lVal = index
          
          ElementDispatch.IDispatch = 0
          If ChildCollection\item(varIndex\split[0], varIndex\split[1], varIndex\split[2], varIndex\split[3], varIndex\split[0], varIndex\split[1], varIndex\split[2], varIndex\split[3], @ElementDispatch.IDispatch) = #S_OK
            If ElementDispatch ; must check this value according to the docs
            
              NewElement.IHTMLElement = 0
              If ElementDispatch\QueryInterface(?IID_IHTMLElement, @NewElement.IHTMLElement) = #S_OK              
                ;
                ; recursively call this function for each found element.
                ;
                WalkDOMTree(NewElement)
              EndIf
            
              ElementDispatch\Release() ; release the dispach interface, not the element itself though
            EndIf
          EndIf          
        
        Next index
      
        CloseTreeGadgetNode(#GADGET_Tree)    
      EndIf      
    
      ChildCollection\Release()
    EndIf
    
    ChildDispatch\Release()  
  EndIf  
  
  ; update the progress
  ProcessedTags + 1
  If TotalTags <> 0
    SetGadgetState(#GADGET_Progress, ProcessedTags*1000/TotalTags)
    While WindowEvent(): Wend
  EndIf
           
EndProcedure

; displays the information of a tag in the gadgets:
;
Procedure DisplayElement(index)
  SelectElement(HTMLTags(), index)
  
  Info$ = "Tag: "+HTMLTags()\Name$ + Chr(13)+Chr(10)
  Info$ + "Class: "+HTMLTags()\Class$ + Chr(13)+Chr(10)
  Info$ + "ID: "+HTMLTags()\ID$ + Chr(13)+Chr(10)
  Info$ + "Title: "+HTMLTags()\Title$ + Chr(13)+Chr(10)
  
  SetGadgetText(#GADGET_Info, Info$)
  SetGadgetText(#GADGET_Code, HTMLTags()\Code$)   
EndProcedure

; CoInitialize around the whole thing is important as otherwise it won't work.
CoInitialize_(0)

; find the IE server window:
;
hWnd.l = FindWindow_("IEFrame", 0)
EnumChildWindows_(hWnd, @EnumChildProc(), @ServerWindow)

; get the interface:
;
Document.IHTMLDocument2 = GetIHTMLDocument2(ServerWindow)
If Document = 0
  MessageRequester("Error", "Could not connect to IE interfaces!")
  End
EndIf

; first some GUI code for display:
;
If OpenWindow(0, 0, 0, 500, 400, #PB_Window_ScreenCentered|#PB_Window_SystemMenu|#PB_Window_SizeGadget|#PB_Window_MaximizeGadget|#PB_Window_MinimizeGadget, "IE DOM Viewer")
  If CreateGadgetList(WindowID())
  
    ProgressBarGadget(#GADGET_Progress, 5, 5, 490, 15, 0, 1000)    
  
    TreeGadget(#GADGET_Tree, 0, 0, 0, 0)
    TextGadget(#GADGET_Info, 0, 0, 0, 0, "", #PB_Text_Border)
    EditorGadget(#GADGET_Code, 0, 0, 0, 0)
    SplitterGadget(#GADGET_Splitter1, 0, 0, 0, 0, #GADGET_Info, #GADGET_Code)
    SplitterGadget(#GADGET_Splitter2, 5, 25, 490, 340, #GADGET_Tree, #GADGET_Splitter1, #PB_Splitter_Vertical)
    
    ButtonGadget(#GADGET_Refresh, 5, 370, 100, 25, "Refresh Tree")
    ButtonGadget(#GADGET_SetCode, 395, 370, 100, 25, "Update Code")
    
    ; Ok, now we scan the tree. We get the body element to start scanning there:
    ;    
    If Document\get_body(@Body.IHTMLElement) = #S_OK
    
      ; get_all() returns a list of all items tool, but not with child/parent relations.
      ; we still use it here though to get a count of all the items in the body of the document.
      ;
      TotalTags = 0
      If Body\get_all(@AllDispatch.IDispatch) = #S_OK
        If AllDispatch\QueryInterface(?IID_IHTMLElementCollection, @AllElements.IHTMLElementCollection) = #S_OK              
          AllElements\get_length(@TotalTags)
          AllElements\Release()
        EndIf
        AllDispatch\Release()
      EndIf
      ProcessedTags = 0
    
      WalkDOMTree(Body) ; start scanning the tree on the body element
    EndIf
  
    ; expand the whole treegadget
    lastitem = CountGadgetItems(#GADGET_Tree) - 1
    For i = 0 To lastitem
      SetGadgetItemState(#GADGET_TRee, i, #PB_Tree_Expanded)
    Next i  
    
    SetGadgetState(#GADGET_Tree, 0)    
    DisplayElement(0)
    
    ; Main loop
    ;
    Repeat
      Event = WaitWindowEvent()
      
      If Event = #PB_Event_Gadget
        EventGadgetID = EventGadgetID()
      
        If EventGadgetID = #GADGET_Tree
          index = GetGadgetState(#GADGET_Tree)
          If index <> -1
            DisplayElement(index)
          Else
            SetGadgetText(#GADGET_Info, "")
            SetGadgetText(#GADGET_Code, "")
          EndIf
          
        ElseIf EventGadgetID = #GADGET_SetCode
          index = GetGadgetState(#GADGET_Tree)
          If index <> -1
            SelectElement(HTMLTags(), index)
            ;
            ; Changing the html code of the element is no big deal.
            ; Its just about calling put_outerHTML() with the bstr string of the
            ; new code using the IHTMLElement pointer we stored in the LinkedList
            ;            
            HTMLTags()\Code$ = GetGadgetText(#GADGET_Code)
            bstr_code = MakeBSTR(HTMLTags()\Code$)            
            HTMLTags()\Element\put_outerHTML(bstr_code)            
            SysFreeString_(bstr_code)            
          EndIf
          
        ElseIf EventGadgetID = #GADGET_Refresh   
        
          ; re-scan the document tree.
          ; first make sure all element pointers are properly released to prevent memory leaks          
          ;
          ForEach HTMLTags()
            HTMLTags()\Element\Release()
          Next
          
          ClearList(HTMLTags())
          ClearGadgetItemList(#GADGET_Tree)
          SetGadgetState(#GADGET_Progress, 0)      
          
          ; Do all the scan stuff like above again
          ;
          If Document\get_body(@Body.IHTMLElement) = #S_OK
          
            TotalTags = 0
            If Body\get_all(@AllDispatch.IDispatch) = #S_OK
              If AllDispatch\QueryInterface(?IID_IHTMLElementCollection, @AllElements.IHTMLElementCollection) = #S_OK              
                AllElements\get_length(@TotalTags)
                AllElements\Release()
              EndIf
              AllDispatch\Release()
            EndIf
            ProcessedTags = 0          
          
            WalkDOMTree(Body)
          EndIf
        
          lastitem = CountGadgetItems(#GADGET_Tree) - 1
          For i = 0 To lastitem
            SetGadgetItemState(#GADGET_TRee, i, #PB_Tree_Expanded)
          Next i       
          SetGadgetState(#GADGET_Tree, 0)           
          DisplayElement(0)
        
        EndIf
      
      ElseIf Event = #PB_Event_SizeWindow
        ResizeGadget(#GADGET_Progress, 5, 5, WindowWidth()-10, 15)
        ResizeGadget(#GADGET_Splitter2, 5, 25, WindowWidth()-10, WindowHeight()-60)
        ResizeGadget(#GADGET_Refresh, 5, WindowHeight()-30, 100, 25)
        ResizeGadget(#GADGET_SetCode, WindowWidth()-105, WindowHeight()-30, 100, 25)
      
      EndIf
      
    Until Event = #PB_Event_CloseWindow
    
  EndIf
EndIf

; make sure all pointers are properly released at the end
;
ForEach HTMLTags()
  HTMLTags()\Element\Release()
Next
    
Document\Release()

CoUninitialize_()  
DevilDog: I will look on the event stuff later.

Cool!

Posted: Wed Sep 14, 2005 12:32 am
by DevilDog
Freak, this is amazing stuff.

There's a ton here to learn and I've seen so much of it up in MSDN site and wondered how to do it.

Now I can compare the MSDN documentation and how it describes the syntax for commands and then compare it to what you have in your examples and I should be able to understand all of this a lot better.

I can't wait to see if the event stuff is possible.

I hope everyone else can find this useful as well.

Thanks!