#PB_Web_PlainText for GetGadgetItemText()
Posted: Thu May 26, 2011 12:58 pm
by MachineCode
Can we have a #PB_Web_PlainText flag for GetGadgetItemText() to return the plain text of the web page? We've already got #PB_Web_HtmlCode, so a plain text alternative would be great. And yes, I know we can use #PB_Web_SelectedText, but that's different and too cumbersome: all the text has to be highlighted first, which is annoying for the user.
Re: #PB_Web_PlainText for GetGadgetItemText()
Posted: Thu May 26, 2011 1:26 pm
by TomS
I know it's a feature request, but here's a possible workaround.
This example does no formatting at all (not even for <br>, which would be easy to add).
But when it comes to divs and tables, you're lost.
This code would also return any JS code in the body (unless it's wrapped in CDATA).
Code:
<body><script type="javascript">document.write('hello world<br>');</script>
How are you?</body>
Would indeed return "document.write('hello world'); How are you?"
Code:
<body><script type="javascript">
<![CDATA[
document.write('hello world<br>');
]]>
</script>
How are you?</body>
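Would return "How are you?", but only if the script itself contains no ">" character. In the example above, the ">" of the <br> ends the skipped part early, so the "');" and "]]" that follow it end up in the result as well.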
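One way around the script problem might be to cut the script blocks out of the HTML before the tags are stripped. This is only a rough sketch: RemoveScriptBlocks() is a made-up helper name, not a PureBasic function, and it assumes a FindString() that accepts the #PB_String_NoCase mode.

```purebasic
; Sketch only: RemoveScriptBlocks() is a hypothetical helper, not part of PureBasic.
; It cuts everything from each "<script" up to and including the next "</script>".
Procedure.s RemoveScriptBlocks(html.s)
  Protected result.s = html
  Protected closeTag.s = "</script>"
  Protected startPos.i, endPos.i

  startPos = FindString(result, "<script", 1, #PB_String_NoCase)
  While startPos > 0
    endPos = FindString(result, closeTag, startPos, #PB_String_NoCase)
    If endPos = 0
      ; No closing tag found: drop the rest of the string and stop.
      result = Left(result, startPos - 1)
      Break
    EndIf
    ; Keep the text before the block and the text after its closing tag.
    result = Left(result, startPos - 1) + Mid(result, endPos + Len(closeTag))
    ; Search again from the start, since the string has just been shortened.
    startPos = FindString(result, "<script", 1, #PB_String_NoCase)
  Wend

  ProcedureReturn result
EndProcedure

Debug RemoveScriptBlocks("<body><script>document.write('hello world<br>');</script>How are you?</body>")
```

Since the whole block is removed by position, a ">" inside the script no longer matters, and CDATA isn't needed for this case.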
Code:
Procedure.s HTML2PlainText(input.s)
  ; Start at the body tag so that all the styles and JS in the head are ignored.
  ; "<body" also matches a body tag without attributes ("<body>").
  Protected workString.s = Mid(input, FindString(input, "<body", 1))
  Protected *c.Character = @workString
  Protected outside.i = #True ; #True while the cursor is outside of a tag
  Protected lt.i = Asc("<")
  Protected gt.i = Asc(">")
  Protected result.s
  While *c\c <> 0
    Select *c\c
      Case lt
        outside = #False
      Case gt
        outside = #True
    EndSelect
    ; Copy the character unless it is inside a tag or is the closing ">" itself.
    If outside = #True
      If *c\c <> gt
        result + Chr(*c\c)
      EndIf
    EndIf
    *c + SizeOf(Character)
  Wend
  ProcedureReturn result
EndProcedure
Debug HTML2PlainText("<html><body background-color=#FFFFFF><h1>Welcome</h1> <p>To the jungle!</p> <p>of html Code</p></body></html>")
Debug HTML2PlainText("<h1>Welcome</h1> <p>To the jungle!</p> <p>of html Code</p>")
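The <br> handling mentioned earlier as easy to add could look roughly like this. It's only a sketch: HTMLBreaksToNewlines() is a made-up helper name, it covers just the three most common spellings of the tag, and it would have to run on the HTML before the tags are stripped.

```purebasic
; Sketch only: HTMLBreaksToNewlines() is a hypothetical helper, not part of PureBasic.
; It turns common spellings of the <br> tag into line breaks, so that a tag
; stripper run afterwards keeps the line structure.
Procedure.s HTMLBreaksToNewlines(html.s)
  Protected result.s = html
  ; #PB_String_NoCase makes the search match <BR>, <Br> and so on as well.
  result = ReplaceString(result, "<br>", #CRLF$, #PB_String_NoCase)
  result = ReplaceString(result, "<br/>", #CRLF$, #PB_String_NoCase)
  result = ReplaceString(result, "<br />", #CRLF$, #PB_String_NoCase)
  ProcedureReturn result
EndProcedure

Debug HTMLBreaksToNewlines("Line one<br>Line two<BR />Line three")
```

With a WebGadget this could be chained as HTML2PlainText(HTMLBreaksToNewlines(GetGadgetItemText(#WebGadget, #PB_Web_HtmlCode))), where #WebGadget stands for whatever gadget number is actually in use.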