text from browser window
hello,
I need to build a tool (Windows 7) to extract the html code from a browser's window.
So far I can locate the corresponding window by a keyword in the window's title.
But how do I find the html text inside? I got completely stuck.
And to my surprise I'm not able to find anything in the Coding Questions.
tomio
Re: text from browser window
If the browser is IE, you could remotely access its IHTMLDocument2 object.
The "difficult" thing is to obtain the address of that object.
See -> http://support.microsoft.com/kb/249232
After you have done that, you can use the object methods to do almost anything.
"Have you tried turning it off and on again ?"
A little PureBasic review
- netmaestro
- PureBasic Bullfrog
- Posts: 8451
- Joined: Wed Jul 06, 2005 5:42 am
- Location: Fort Nelson, BC, Canada
Re: text from browser window
If you use the webgadget (not mozilla) on Windows it's fairly simple:
Code: Select all
If OpenWindow(0, 0, 0, 600, 330, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
  ButtonGadget(1, 250, 300, 100, 20, "copy html code")
  WebGadget(0, 10, 10, 580, 280, "http://www.purebasic.com")
  ; Note: if you want to use a local file, change last parameter to "file://" + path + filename
  Repeat
    ev = WaitWindowEvent()
    Select ev
      Case #PB_Event_Gadget
        If EventGadget() = 1
          txt$ = GetGadgetItemText(0, #PB_Web_HtmlCode)
          OpenWindow(1, 0, 0, 640, 480, "HTML Code:", #PB_Window_SystemMenu | #PB_Window_ScreenCentered)
          EditorGadget(2, 0, 0, 640, 480, #PB_Editor_ReadOnly)
          SetGadgetText(2, txt$)
          Repeat : Until WaitWindowEvent() = #PB_Event_CloseWindow
          CloseWindow(1)
        EndIf
    EndSelect
  Until ev = #PB_Event_CloseWindow
EndIf
Alternatively you can pass #PB_Web_SelectedText to get just the current selection.
BERESHEIT
Re: text from browser window
But he said "from a browser window". Moreover, he said he "located the window", so I believe he meant an external browser.
From a webgadget it is obviously easy, as you said.
- IdeasVacuum
- Always Here
- Posts: 6426
- Joined: Fri Oct 23, 2009 2:33 am
- Location: Wales, UK
Re: text from browser window
It seems to me that both methods posted above need to know the web address. So if you can find that, having found the browser window, then the method of extracting the text comes down to whichever best meets your requirement.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: text from browser window
thanks.
I've read your answers and I have to think about it.
Yes, I can locate the window.
The problem is to read the html code.
Actually, it's not the code I'm interested in, but some short piece of the very large text-output.
tomio
Re: text from browser window
Tomio wrote: Actually, it's not the code I'm interested in, but some short piece of the very large text-output.
And that's different from the original question... since the output is not the html code but the final rendering of it.
So you probably don't need what was outlined in the previous answer (it can work too, but may be overkill) and you can pull it off in a simpler way.
You could try sending a #WM_COPY message, but I don't know if IE honors the request.
Alternatively you could try SendInput() or keybd_event() to make the browser save the output text to a file, or copy it to the clipboard with CTRL + C.
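The keybd_event() idea could look roughly like the sketch below. It is only an illustration, not a tested solution: the procedure name GrabTextByCopy and the delay values are made up for this example, hWnd is assumed to be the handle of the browser window (for instance one found by its title), and whether the browser reacts to CTRL + A / CTRL + C in its current state is an assumption.

```purebasic
; Sketch: copy the rendered text of a window by synthesizing CTRL+A, CTRL+C.
; GrabTextByCopy is a hypothetical helper name; the delays are arbitrary guesses.
Procedure.s GrabTextByCopy(hWnd)
  If hWnd = 0
    ProcedureReturn ""
  EndIf

  ; Synthesized keystrokes go to the foreground window, so bring it to the front.
  SetForegroundWindow_(hWnd)
  Delay(200)

  ; Hold CTRL, press and release A (select all), then C (copy), release CTRL.
  keybd_event_(#VK_CONTROL, 0, 0, 0)
  keybd_event_(#VK_A, 0, 0, 0)
  keybd_event_(#VK_A, 0, #KEYEVENTF_KEYUP, 0)
  keybd_event_(#VK_C, 0, 0, 0)
  keybd_event_(#VK_C, 0, #KEYEVENTF_KEYUP, 0)
  keybd_event_(#VK_CONTROL, 0, #KEYEVENTF_KEYUP, 0)

  ; Give the browser a moment to fill the clipboard, then read it as text.
  Delay(300)
  ProcedureReturn GetClipboardText()
EndProcedure
```

Note that this approach takes the focus away from whatever the user is doing and overwrites the clipboard, so it does not really run "in the background".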
Re: text from browser window
Sounds good.
The tool is to run as an office tool.
The user is a secretary (a lady).
If possible, the whole stuff should run in the background.
The tool is to be started once and check the available windows periodically for a special url.
If found it should grab the whole output (or part) somehow for further processing.
I have built several tools in the past concerning windows, like positioning+[no]header+etc.
Just design. But I have never sent a key command.
luis wrote: SendInput() or keybd_event() to make the browser save the output text to file
Question: How do I send a command to the window like the one mentioned, or whichever could help me?
tomio
Re: text from browser window
Based on what you are saying you don't need an external browser then.
1) If the thing must run in the background, it's probably better to create a self-contained solution using PB's webgadget. In that case you can find many examples on the forum on how to interact with it (search for IHTMLDocument2).
2) You could also simply download the data from the url, and then process the file to extract the data (maybe the simplest solution?).
3) If you want to use an external browser for some reason anyway, you can also find SendInput() examples, just do a search. Basically you execute the target program and, while it has the focus, you start to synthesize keyboard and mouse input. But in your case I wouldn't follow this road.
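Option 2 could be sketched like this. It is a minimal example under assumptions: the url, the file name "page.html" and the keyword are placeholders, and it only works if the page can be fetched without the login or session the browser may have.

```purebasic
; Sketch of option 2: download the page to a file, then scan it line by line.
; url$, file$ and keyword$ are placeholder values for illustration only.
; InitNetwork()   ; older PureBasic versions need this call before any network command

url$     = "http://www.purebasic.com"
file$    = GetTemporaryDirectory() + "page.html"
keyword$ = "A Keyword"

If ReceiveHTTPFile(url$, file$)
  If ReadFile(0, file$)
    While Eof(0) = 0
      line$ = ReadString(0)
      ; Keep only the lines that contain the keyword (case-sensitive search).
      If FindString(line$, keyword$, 1)
        Debug line$
      EndIf
    Wend
    CloseFile(0)
  EndIf
Else
  Debug "Download failed"
EndIf
```

The downloaded file is the html source, not the rendered text, so the tags still have to be stripped or skipped during processing.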
Re: text from browser window
perhaps I don't understand your reply.
The secretary has several windows open. None, one or more can be IE browser windows, others can be Word or whatever.
When she decides to check that particular url (from outside, another office department), my tool will notice shortly after and grab the text output. The secretary does not have to take care of it. So what do you mean by external browser?
And I can't use a gadget. I don't manipulate or create a window for her; I just want to read out the text from her browser whenever she clicks on that favorite url. In principle, if things work out, she doesn't even need to know about the tool.
Your solution 2) is what we will do, when there is no other solution.
This would mean she always has to do a copy/paste. Depending on the situation could be 1-10 times a day. We would prefer to avoid this.
By the way, the extracted text is not only to be saved but will be manipulated in some complex way. This is what I want to do in PB anyway. So extract + process is what I would like to do in PB in one go.
tomio
Re: text from browser window
Tomio wrote: When she decides to check that particular url my tool shortly after will notice ...
So you need to monitor browser windows and, when a specific url is visited, act upon that in some way.
And you cannot get any help from the user, for example by making her drag the url to your client.
Is that correct? If it is, then THAT is the main hurdle.
You must admit it wasn't really evident from your original post, anyway I started to reply so I'll give it one more shot.
You cannot reliably poll all the windows of the browsers to check for a specific url and hope to catch it at the right time.
If the PC is connected to the internet through a proxy, maybe you can inspect the log and, when a particular url is visited, take it from there.
You can also write your own simple proxy, run it on the secretary's PC and filter/examine all the traffic.
If you can retrieve that information from a similar source (some kind of log) probably it's the easiest way.
If not, you need some collaboration from the browser. Maybe a plugin or something similar.
For IE you can create a special DLL called Browser Helper Object (BHO). I'm not up to date with it so I don't know if it's still feasible.
See -> http://msdn.microsoft.com/en-us/library ... 85%29.aspx
Maybe someone else has some other ideas.
Re: text from browser window
Just my 2 cents.
luis wrote: If not, you need some collaboration from the browser. Maybe a plugin or something similar.
Have a look at http://crossrider.com/. Very useful framework for creating cross-browser extensions.
luis wrote: You can also write your own simple proxy, run it on the secretary's PC and filter/examine all the traffic
I haven't tried it in practice, but to make things simpler, what about setting up Small HTTP server to act as a middleman between the browser and the internet (to control/filter the links you need) and controlling it from PB via the command line?
Re: text from browser window
hm,
with this little code running in the background periodically, I can check for a keyword set by the called url with the <title> tag.
This works. So for me the problem is not to find the window but to extract the text output.
Code: Select all
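Procedure FindPartWin(part$)
  ; Walk the top-level windows and return the handle of the first one
  ; whose title contains part$ (case-sensitive), or 0 if none matches.
  Protected r, w, t$
  r = GetWindow_(GetDesktopWindow_(), #GW_CHILD)
  Repeat
    t$ = Space(999) : GetWindowText_(r, t$, 999)
    If FindString(t$, part$, 1) <> 0
      w = r                               ; match: remember this window
    Else
      r = GetWindow_(r, #GW_HWNDNEXT)     ; no match: try the next window
    EndIf
  Until r = 0 Or w <> 0
  ProcedureReturn w
EndProcedure

Debug FindPartWin("A Keyword")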
tomio
Re: text from browser window
If you want to get the html page from Internet Explorer, luis gave you the solution, which I also use.
See here: http://www.purebasic.fr/english/viewtopic.php?t=24570
From Mozilla Firefox, I use MozRepl extension.
For other browsers it will be difficult!
Re: text from browser window
As I said, I can locate the page in question.
And I can save the text I had selected (for further processing).
I'm still working on having the tool select the text itself.
But I'm confident I'll solve this soon.
Thanks for any help so far
tomio