Stop WebGadget accessing the internet
Hi, I want to preview some HTML files in a WebGadget, but the HTML is trying to load data from the internet to display (naturally). Is there a way to prevent that? All I want to do is show the local HTML data of the file, without loading any external resources (which can take a lot of time). Thanks!
[Edit] Posted a pic further down below to show what I mean -> viewtopic.php?p=556007#p556007
Last edited by BarryG on Fri Jun 12, 2020 10:17 am, edited 1 time in total.
Re: Stop WebGadget accessing the internet
Hi Barry
If you look at the source of the HTML file, it's probably loading javascript files. You can download them, then modify the paths in the HTML. Similar thing for images etc.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: Stop WebGadget accessing the internet
I'm just trying to show a quick preview of the file without the time-consuming wait for it to load all external images, sounds, and so on. HTML is also very complicated, so I could never be 100% sure that I've removed all external links, especially with redirections that may be present. That's why I was hoping there'd be an easy way to not have it load them.
Maybe there's a way to cancel (with a callback?) the WebGadget once the local file is initially loaded into it? I don't know. I've tried with #PB_Web_Stop on the WebGadget() but that doesn't always help because some external links have already started to be loaded, with no way to stop them.
Unplugging my ethernet cable does the job perfectly, so ideally I'd like a non-admin programming way to do that. Anyone got any sure-fire ideas? Thanks.
Last edited by BarryG on Fri Jun 12, 2020 8:01 am, edited 1 time in total.
Re: Stop WebGadget accessing the internet
Set a proxy to localhost or any other wrong address.
(It doesn't matter if you don't have a proxy.)
But depending on the content of the page, there may be error messages.
Code: Select all
HTTPProxy("socks4://127.0.0.1")
Re: Stop WebGadget accessing the internet
Tried that, but HTTPProxy() doesn't work for a WebGadget()? It didn't make any difference here; the WebGadget() locked-up while loading a million external links from the local HTML file. I just need to quickly show the basic HTML, not the links.
Re: Stop WebGadget accessing the internet
You're right, I just reread the doc and it's only used for a few functions and not for the webgadget.
HTTPProxy()
Specify a proxy to use for the following HTTP commands: GetHTTPHeader(), ReceiveHTTPFile(), ReceiveHTTPMemory(), HTTPRequest() and HTTPRequestMemory().
Re: Stop WebGadget accessing the internet
Open the website in a browser and save it locally. A new folder is created that contains all the scripts and images.
This saved website is 100% offline, no access to the internet required
Re: Stop WebGadget accessing the internet
Hello, give this a try.
It's based on a very old example by hm.
It looks a little complex, but it seems to do the job for me. Unfortunately I currently don't have much time to test further.
Code: Select all
EnableExplicit
;------------------------------------------------------------------------------
;- * IDispatch implementation
;------------------------------------------------------------------------------
;- Constants
#DISPID_AMBIENT_DLCONTROL = -5512
#DLCTL_DLIMAGES = $00000010
#DLCTL_VIDEOS = $00000020
#DLCTL_BGSOUNDS = $00000040
#DLCTL_NO_SCRIPTS = $00000080
#DLCTL_NO_JAVA = $00000100
#DLCTL_NO_RUNACTIVEXCTLS = $00000200
#DLCTL_NO_DLACTIVEXCTLS = $00000400
#DLCTL_DOWNLOADONLY = $00000800
#DLCTL_NO_FRAMEDOWNLOAD = $00001000
#DLCTL_RESYNCHRONIZE = $00002000
#DLCTL_PRAGMA_NO_CACHE = $00004000
#DLCTL_NO_BEHAVIORS = $00008000
#DLCTL_NO_METACHARSET = $00010000
#DLCTL_URL_ENCODING_DISABLE_UTF8 = $00020000
#DLCTL_URL_ENCODING_ENABLE_UTF8 = $00040000
#DLCTL_FORCEOFFLINE = $10000000
#DLCTL_NO_CLIENTPULL = $20000000
#DLCTL_SILENT = $40000000
#DLCTL_OFFLINEIFNOTCONNECTED = $80000000
#DLCTL_OFFLINE = #DLCTL_OFFLINEIFNOTCONNECTED
;------------------------------------------------------------------------------
;- Structures
Structure IDispatch_Functions
  QueryInterface.i
  AddRef.i
  Release.i
  GetTypeInfoCount.i
  GetTypeInfo.i
  GetIDsOfNames.i
  Invoke.i
EndStructure
Structure IDispatch_Object
  *IDispatch.IDispatch
  RefCount.l
EndStructure
Global NewList g_IDispatch_Objects.IDispatch_Object()
;------------------------------------------------------------------------------
;- IUnknown methods
Procedure IDispatch_QueryInterface(*THIS.IDispatch_Object, *iid.IID, *Object.INTEGER)
  If *Object = 0
    ProcedureReturn #E_INVALIDARG
  ElseIf CompareMemory(*iid, ?IID_IUnknown, SizeOf(IID)) Or CompareMemory(*iid, ?IID_IDispatch, SizeOf(IID))
    *Object\i = *THIS
    *THIS\RefCount + 1
    ProcedureReturn #S_OK
  Else
    *Object\i = 0
    ProcedureReturn #E_NOINTERFACE
  EndIf
EndProcedure
Procedure IDispatch_AddRef(*THIS.IDispatch_Object)
  *THIS\RefCount + 1
  ProcedureReturn *THIS\RefCount
EndProcedure
Procedure IDispatch_Release(*THIS.IDispatch_Object)
  *THIS\RefCount - 1
  If *THIS\RefCount <= 0
    ChangeCurrentElement(g_IDispatch_Objects(), *THIS)
    DeleteElement(g_IDispatch_Objects())
    ProcedureReturn 0
  Else
    ProcedureReturn *THIS\RefCount
  EndIf
EndProcedure
;------------------------------------------------------------------------------
;- IDispatch methods
Procedure IDispatch_GetTypeInfoCount(*THIS.IDispatch_Object, *pctinfo.INTEGER)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IDispatch_GetTypeInfo(*THIS.IDispatch_Object, iTInfo.l, lcid.l, *pptInfo.INTEGER)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IDispatch_GetIDsOfNames(*THIS.IDispatch_Object, *riid.IID, rgszNames.i, cNames.l, lcid.l, *rgDispID.INTEGER)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IDispatch_Invoke(*THIS.IDispatch_Object, dispIdMember.l, *riid.IID, lcid.l, wFlags.w, *pDispParams.DISPPARAMS, *pVarResult.Variant, pExcpInfo.i, puArgErr.i)
  If dispIdMember = #DISPID_AMBIENT_DLCONTROL
    *pVarResult\vt = #VT_I4
    ; *pVarResult\lVal = #DLCTL_NO_JAVA | #DLCTL_NO_DLACTIVEXCTLS | #DLCTL_NO_RUNACTIVEXCTLS | #DLCTL_SILENT
    *pVarResult\lVal = #DLCTL_SILENT
    Debug "**** IDispatch::Invoke() #DISPID_AMBIENT_DLCONTROL"
    Debug *pVarResult\lVal
    Debug "****"
    ProcedureReturn #S_OK
  EndIf
  ProcedureReturn #DISP_E_MEMBERNOTFOUND
EndProcedure
;------------------------------------------------------------------------------
;- Data section
DataSection
  _IDispatch_Functions:
    Data.i @IDispatch_QueryInterface()
    Data.i @IDispatch_AddRef()
    Data.i @IDispatch_Release()
    Data.i @IDispatch_GetTypeInfoCount()
    Data.i @IDispatch_GetTypeInfo()
    Data.i @IDispatch_GetIDsOfNames()
    Data.i @IDispatch_Invoke()
  IID_IUnknown: ; {00000000-0000-0000-C000-000000000046}
    Data.l $00000000
    Data.w $0000, $0000
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
  IID_IDispatch: ; {00020400-0000-0000-C000-000000000046}
    Data.l $00020400
    Data.w $0000, $0000
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
EndDataSection
;------------------------------------------------------------------------------
;- * IOleClientSite implementation
;------------------------------------------------------------------------------
;- Structures
Structure IOleClientSite_Functions
  QueryInterface.i
  AddRef.i
  Release.i
  SaveObject.i
  GetMoniker.i
  GetContainer.i
  ShowObject.i
  OnShowWindow.i
  RequestNewObjectLayout.i
EndStructure
Structure IOleClientSite_Object
  *IOleClientSite.IOleClientSite
  RefCount.l
EndStructure
Global NewList g_IOleClientSite_Objects.IOleClientSite_Object()
;------------------------------------------------------------------------------
;- IUnknown methods
Procedure IOleClientSite_QueryInterface(*THIS.IOleClientSite_Object, *iid.IID, *Object.INTEGER)
  If *Object = 0
    ProcedureReturn #E_INVALIDARG
  ElseIf CompareMemory(*iid, ?IID_IUnknown, SizeOf(IID)) Or CompareMemory(*iid, ?IID_IOleClientSite, SizeOf(IID))
    *Object\i = *THIS
    *THIS\RefCount + 1
    ProcedureReturn #S_OK
  ; return pointer to IDispatch object (IDispatch is queried by the webbrowser control on its initialization)
  ElseIf CompareMemory(*iid, ?IID_IDispatch, SizeOf(IID))
    *Object\i = @g_IDispatch_Objects()
    ProcedureReturn #S_OK
  Else
    *Object\i = 0
    ProcedureReturn #E_NOINTERFACE
  EndIf
EndProcedure
Procedure IOleClientSite_AddRef(*THIS.IOleClientSite_Object)
  *THIS\RefCount + 1
  ProcedureReturn *THIS\RefCount
EndProcedure
Procedure IOleClientSite_Release(*THIS.IOleClientSite_Object)
  *THIS\RefCount - 1
  If *THIS\RefCount <= 0
    ChangeCurrentElement(g_IOleClientSite_Objects(), *THIS)
    DeleteElement(g_IOleClientSite_Objects())
    ProcedureReturn 0
  Else
    ProcedureReturn *THIS\RefCount
  EndIf
EndProcedure
;------------------------------------------------------------------------------
;- IOleClientSite methods
Procedure IOleClientSite_SaveObject(*THIS.IOleClientSite_Object)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IOleClientSite_GetMoniker(*THIS.IOleClientSite_Object, dwAssign.l, dwWhichMoniker.l, mk.i)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IOleClientSite_GetContainer(*THIS.IOleClientSite_Object, container.i)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IOleClientSite_ShowObject(*THIS.IOleClientSite_Object)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IOleClientSite_OnShowWindow(*THIS.IOleClientSite_Object, fShow.l)
  ProcedureReturn #E_NOTIMPL
EndProcedure
Procedure IOleClientSite_RequestNewObjectLayout(*THIS.IOleClientSite_Object)
  ProcedureReturn #E_NOTIMPL
EndProcedure
;------------------------------------------------------------------------------
;- Data section
DataSection
  _IOleClientSite_Functions:
    Data.i @IOleClientSite_QueryInterface()
    Data.i @IOleClientSite_AddRef()
    Data.i @IOleClientSite_Release()
    Data.i @IOleClientSite_SaveObject()
    Data.i @IOleClientSite_GetMoniker()
    Data.i @IOleClientSite_GetContainer()
    Data.i @IOleClientSite_ShowObject()
    Data.i @IOleClientSite_OnShowWindow()
    Data.i @IOleClientSite_RequestNewObjectLayout()
  ; IOleClientSite
  ; {00000118-0000-0000-C000-000000000046}
  IID_IOleClientSite:
    Data.l $00000118
    Data.w $0000, $0000
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
EndDataSection
Procedure ResizeWebgadget(gd.i, x.l, y.l, width.l, height.l, redraw.b = #True)
  Define.i hwGd
  Define.IWebBrowser2 wb
  Define.IOleObject oObj
  Define.IOleInPlaceObject ipObj
  Define.RECT rc
  hwGd = GadgetID(gd)
  wb = GetWindowLongPtr_(hwGd, #GWLP_USERDATA)
  If wb
    If wb\QueryInterface(?IID_IOleObject, @oObj) = #S_OK
      If oObj\QueryInterface(?IID_IOleInPlaceObject, @ipObj) = #S_OK
        ;Use MoveWindow() or ResizeGadget()
        MoveWindow_(hwGd, x, y, width, height, redraw)
        ;ResizeGadget(gd, x, y, width, height) maybe this calls SetObjectRects() again ?
        GetClientRect_(hwGd, @rc)
        ipObj\SetObjectRects(@rc, @rc)
        ipObj\Release()
      EndIf
      oObj\Release()
    EndIf
  EndIf
EndProcedure
DataSection
  ; IOleObject
  ; {00000112-0000-0000-C000-000000000046}
  IID_IOleObject:
    Data.l $00000112
    Data.w $0000, $0000
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
  ; IOleControl
  ; {B196B288-BAB4-101A-B69C-00AA00341D07}
  IID_IOleControl:
    Data.l $B196B288
    Data.w $BAB4, $101A
    Data.b $B6, $9C, $00, $AA, $00, $34, $1D, $07
  ;("00000113-0000-0000-C000-000000000046")
  IID_IOleInPlaceObject:
    Data.l $00000113
    Data.w $0000, $0000
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
EndDataSection
;------------------------------------------------------------------------------
;- * Main program
Enumeration
  #Window_Main
EndEnumeration
Enumeration
  #Button_Start
  #Web_0
EndEnumeration
Define.IOleObject oleObject
Define.IOleControl oleControl
Define.i Event, WindowID, GadgetID, EventType
If OpenWindow(#Window_Main, 302, 15, 800, 620, "pb-klicker", #PB_Window_SystemMenu | #PB_Window_SizeGadget | #PB_Window_TitleBar )
  ButtonGadget(#Button_Start, 20, 560, 110, 40, "Start")
  WebGadget(#Web_0, 20, 20, 760, 515, "about:blank")
  ; WebGadget initialization
  Global myBrowser.IWebBrowser2 = GetWindowLongPtr_(GadgetID(#Web_0), #GWLP_USERDATA)
  If myBrowser\QueryInterface(?IID_IOleObject, @oleObject) = #S_OK
    ; new IDispatch object
    AddElement(g_IDispatch_Objects())
    g_IDispatch_Objects()\IDispatch = ?_IDispatch_Functions
    ; new IOleClientSite object
    AddElement(g_IOleClientSite_Objects())
    g_IOleClientSite_Objects()\IOleClientSite = ?_IOleClientSite_Functions
    ; tell the webbrowser client about our IOleClientSite object
    If oleObject\SetClientSite(@g_IOleClientSite_Objects()) = #S_OK
      If myBrowser\QueryInterface(?IID_IOleControl, @oleControl) = #S_OK
        ; tell the webbrowser control that Ambient Properties have changed so it queries our IOleClientSite object for the IDispatch interface and calls IDispatch::Invoke with DISPID_AMBIENT_DLCONTROL
        oleControl\OnAmbientPropertyChange(#DISPID_AMBIENT_DLCONTROL)
        oleControl\Release()
      EndIf
    EndIf
    oleObject\Release()
  EndIf
EndIf
Repeat
  Event = WaitWindowEvent()
  WindowID = EventWindow()
  GadgetID = EventGadget()
  EventType = EventType()
  If Event = #PB_Event_Gadget
    If GadgetID = #Button_Start
      SetGadgetText(#Web_0, "https://www.bing.com/images/search")
    ElseIf GadgetID = #Web_0
    EndIf
  EndIf
Until Event = #PB_Event_CloseWindow ; End of the event loop
End
Re: Stop WebGadget accessing the internet
Derren wrote: Open the website in a browser and save it locally. A new folder is created that contains all the scripts and images.
This saved website is 100% offline, no access to the internet required
Derren, I can't do that. No internet access for the WebGadget() is a requirement of this request, plus my app might be run from a read-only location. It works perfectly if I pull out my ethernet cable, so that's what I'd like to achieve (without pulling it out).
I read about put_Offline() for the IWebBrowser2 interface for WebGadgets, but that didn't work either -> https://docs.microsoft.com/en-us/previo ... v%3Dvs.85)
Microsoft wrote: In offline mode, the browser is forced to read HTML pages from the local cache instead of reading from the source document online.
My code was:
Code: Select all
Browser.IWebBrowser2=GetWindowLongPtr_(GadgetID(#WebGadget),#GWL_USERDATA)
Browser\put_Offline(#VARIANT_TRUE)
Last edited by BarryG on Fri Jun 12, 2020 12:29 pm, edited 5 times in total.
Re: Stop WebGadget accessing the internet
One solution is to disable the network card while the file is loading, but it takes several seconds to re-establish the connection.
(netsh /? to see commands)
Another solution: use an application firewall to deny outbound access to this application, but if it's for distribution, the other PCs must have an application firewall too.
Breaking the connection can still cause error messages (e.g. Error 404). It depends on the page.
The only clean way is to filter the whole page and remove the links, but if the page contains, for example, JavaScript with dynamic links, that can be difficult. If it's only standard HTML, it's easy with a RegEx.
Finally, if the "useful" part of the page is known, we can proceed in reverse order: load the page (with ReceiveHTTPFile() or ReceiveHTTPMemory()), extract the "useful" part and rebuild a page.
This can be tedious, but it's easy if the page almost always contains the same data (e.g. reports).
Re: Stop WebGadget accessing the internet
firace, your code seems to be 99% okay when I use the #DLCTL_NO_SCRIPTS flag with it. I say 99% because it's still a tad slow to show the HTML data, but hey, two seconds to show it is better than over two minutes without it! I'll see how I can clean up your code to plug into my app.
Marc56us, maybe there's a programmatic way (with Windows) that I can add my exe to the Windows firewall temporarily? I'll look into that approach, too, but it seems to need admin rights with the "netsh" command that you mentioned. Damn.
Thanks guys. (And I'm still open to any other suggestions; I don't consider this "Solved" yet, haha).
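For reference, a minimal sketch of the flag change mentioned above. It assumes that #DLCTL_NO_SCRIPTS is simply OR-ed with the #DLCTL_SILENT value already returned by IDispatch_Invoke() in firace's listing; the exact combination used was not posted, and the variable name dlFlags is only illustrative.

```purebasic
EnableExplicit

; Values copied from the constant list in firace's code.
#DLCTL_NO_SCRIPTS = $00000080
#DLCTL_SILENT     = $40000000

; Assumption: block scripts in addition to the original #DLCTL_SILENT flag.
Define dlFlags.l = #DLCTL_NO_SCRIPTS | #DLCTL_SILENT

Debug "DLCONTROL flags: $" + Hex(dlFlags)

; In IDispatch_Invoke() the assignment would then read:
;   *pVarResult\lVal = #DLCTL_NO_SCRIPTS | #DLCTL_SILENT
```

As far as I understand the DLCTL flags, leaving out #DLCTL_DLIMAGES, #DLCTL_VIDEOS and #DLCTL_BGSOUNDS should also keep the control from downloading images, videos and background sounds, but that is worth verifying against the page being previewed.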
Re: Stop WebGadget accessing the internet
Are you not in control of the data you want to display in your webgadget??
Just download your website before shipping your program.
I don't see a way to run HTML code that contains images and CSS in the WebGadget from a "read-only" location. If you can't put your HTML file on the user's computer, then you need to read the code from memory, including turning external CSS into internal CSS and converting any and all images to Base64.
Last edited by Derren on Fri Jun 12, 2020 10:00 am, edited 1 time in total.
Re: Stop WebGadget accessing the internet
Derren wrote: Are you not in control of the data you want to display in your webgadget??
No, I'm not in control. That's exactly right. The user selects any pre-existing, locally-saved HTML file from their PC to view as plain HTML. I don't have access to that file, and thus have no way to know what that HTML data is. It needs to be shown as raw HTML (formatting etc) but can be missing the images and other such data. I need to stop this extra data being downloaded on demand.
Re: Stop WebGadget accessing the internet
Okay then, I have no idea what the purpose of this is. If the user wants to load a file that is riddled with external resources, that's his own choice, I would say.
Displaying any user-chosen HTML file but castrating it so that only the text is left seems weird to me.
But could this be a solution for you?
If pictures are not relevant either, I would just parse the HTML file and remove any tags that load external resources.
There are only 3 types: images, scripts and CSS. The first 2 are loaded in a similar way; CSS uses a completely different tag.
You can use RegEx to identify these.
https://regex101.com/r/AEkzQg/1
This RegEx catches all "src" attributes in any tag. Capture group 4 contains the actual URL. You can either replace it with "#" to create a link to nowhere, or replace the whole src="url" part with nothing.
On top of that you need to take care of external CSS, which is referenced with a link tag and an href attribute (don't delete all href attributes, because hyperlinks use them as well). https://regex101.com/r/U7SO6M/1
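A rough PureBasic sketch of the RegEx idea described above, for anyone who wants to try it. The procedure name StripExternalResources() and both patterns are my own assumptions (they are not the expressions behind the regex101 links), and they only cover src attributes and link tags whose URL starts with http://, https:// or //. Things like srcset, CSS url(...) and @import are not handled.

```purebasic
EnableExplicit

; Hypothetical helper, not from the thread: removes src attributes and
; <link> tags that point at http://, https:// or protocol-relative URLs.
Procedure.s StripExternalResources(html.s)
  Protected result.s = html
  Protected rx.i

  ; 1) Remove the whole src="..." attribute when the URL is external.
  ;    Relative (local) src values are left untouched.
  rx = CreateRegularExpression(#PB_Any, "\bsrc\s*=\s*[" + #DQUOTE$ + "'](?:https?:)?//[^" + #DQUOTE$ + "'>]*[" + #DQUOTE$ + "']", #PB_RegularExpression_NoCase)
  If rx
    result = ReplaceRegularExpression(rx, result, "")
    FreeRegularExpression(rx)
  EndIf

  ; 2) Remove whole <link ...> tags with an external href (stylesheets etc.).
  ;    <a href=...> hyperlinks do not start with <link, so they are kept.
  rx = CreateRegularExpression(#PB_Any, "<link\b[^>]*\bhref\s*=\s*[" + #DQUOTE$ + "'](?:https?:)?//[^>]*>", #PB_RegularExpression_NoCase)
  If rx
    result = ReplaceRegularExpression(rx, result, "")
    FreeRegularExpression(rx)
  EndIf

  ProcedureReturn result
EndProcedure

; Small demo with made-up HTML (example.com URLs are placeholders).
Define sample.s = "<img src='https://example.com/a.png' alt='x'>" +
                  "<img src='local.png'>" +
                  "<link rel='stylesheet' href='//cdn.example.com/s.css'>" +
                  "<a href='https://example.com/'>link</a>"

Debug StripExternalResources(sample)
```

The filtered string could then be handed to the WebGadget with SetGadgetItemText(#Gadget, #PB_Web_HtmlCode, html$), though relative paths may then no longer resolve against the original file's folder, so treat that part as untested.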
Re: Stop WebGadget accessing the internet
Derren wrote: I have no idea what the purpose of this is
Below is a picture to show what I mean. Assume the file is just the raw HTML web data of this actual forum post, and it's loaded into a WebGadget(). When the internet is connected, the WebGadget() will show it like the "Before" image below. I want it to show as the "After" image instead, as if the internet was offline (even though it's not). That is, it retains all formatting and such but won't be able to load the images and other non-local data.
I will look at parsing the HTML file, as you suggested. Hopefully that will do the job, but denying the WebGadget() access to the internet would be a lot simpler.

Last edited by BarryG on Fri Jun 12, 2020 11:01 am, edited 2 times in total.