This short routine strips all the HTML tags from a string, leaving only the content.
RESULT1$ only strips the tags, so words on either side of a tag may end up run together.
RESULT2$ replaces each tag with a single space, then collapses any double spaces and trims the ends.
Code:
; Match one tag: "<", then one or more characters that are not ">", then ">"
CreateRegularExpression(0, "<[^>]+>")
STRING$="hELLO<P ALIGN=left HEIGHT=22>Hello World</p> wORLD <Br>" ; Yes I know this is bullsh*t HTML
RESULT1$=ReplaceRegularExpression(0, STRING$,"")
; Space(2) is a double space; the two passes collapse runs of up to four spaces
RESULT2$=Trim(ReplaceString(ReplaceString(ReplaceRegularExpression(0, STRING$," "),Space(2)," "),Space(2)," "))
Debug RESULT1$
Debug RESULT2$
End
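
If the input can contain long runs of tags or spaces, a fixed number of ReplaceString passes may not be enough to get down to single spaces. Below is a minimal sketch of the same idea wrapped in a reusable procedure that loops until no double space is left. StripHTML is a hypothetical name (not part of PureBasic or the routine above), and the pattern "<[^>]+>" is an assumption about what counts as a tag; it will not cope with a ">" inside an attribute value.

```purebasic
; Hypothetical helper: replaces each tag with a space, then collapses
; runs of spaces of any length and trims the ends.
Procedure.s StripHTML(Text$)
  Protected Result$ = Text$
  ; #PB_Any lets PureBasic pick the ID, so it cannot clash with regex 0
  Protected Regex = CreateRegularExpression(#PB_Any, "<[^>]+>")
  If Regex
    Result$ = ReplaceRegularExpression(Regex, Text$, " ")
    FreeRegularExpression(Regex)
    ; Space(2) is a double space; repeat until none is left
    While FindString(Result$, Space(2), 1)
      Result$ = ReplaceString(Result$, Space(2), " ")
    Wend
  EndIf
  ; If the regex could not be created, the text is returned with only the ends trimmed
  ProcedureReturn Trim(Result$)
EndProcedure

Debug StripHTML("a<br><br><br>b   <i>c</i>")
```

Freeing the regex inside the procedure keeps it self-contained. If you call it in a tight loop, it would probably be better to create the regex once outside and pass its ID in.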