It is currently Sun Dec 16, 2018 6:59 pm

All times are UTC + 1 hour




 Post subject: HTML2TEXT
PostPosted: Wed Oct 31, 2018 7:22 pm 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
Hi!
I need some help.

Could someone experienced translate this program for me, from VB to PB (PureBasic)?
https://www.vbarchiv.net/tipps/tipp_200-html2text.html

Thank you very much in advance!


 Post subject: Re: HTML2TEXT
PostPosted: Wed Oct 31, 2018 9:19 pm 

Joined: Sun Nov 05, 2006 11:42 pm
Posts: 4349
Location: Lyon - France
Hello, perhaps these are a starting point:

http://www.purebasic.fr/english/viewtop ... 444#266444
https://www.purebasic.fr/german/viewtop ... aa#p112747

_________________
The happiness is a road...
Not a destination


 Post subject: Re: HTML2TEXT
PostPosted: Thu Nov 01, 2018 10:06 am 

Joined: Tue Oct 14, 2014 12:09 pm
Posts: 217
Here is a URL that explains one method:
https://www.computerhope.com/issues/ch001877.htm


 Post subject: Re: HTML2TEXT
PostPosted: Fri Nov 09, 2018 7:03 pm 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
Hello!
Thanks for the comments!
I've created some usable code. How can I build it as HTML2TEXT.dll, and how would I call it? Thank you!

Code:
DataSection
 
  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
  Data.l $332C4425
  Data.w $26CB, $11D0       
  Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19
 
  IID_IHTMLDocument3: ; {3050F485-98B5-11CF-BB82-00AA00BDCE0B}
  Data.l $3050F485
  Data.w $98B5, $11CF
  Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B
 
  IID_NULL: ; {00000000-0000-0000-0000-000000000000}
  Data.l $00000000
  Data.w $0000, $0000
  Data.b $00, $00, $00, $00, $00, $00, $00, $00
 
EndDataSection



;----------------
Procedure WebGadget_Document(Gadget, *IID)
  Document = 0
 
  Browser.IWebBrowser2 = GetWindowLong_(GadgetID(Gadget), #GWL_USERDATA)
  If Browser
    If Browser\get_Document(@DocumentDispatch.IDispatch) = #S_OK And DocumentDispatch
      DocumentDispatch\QueryInterface(*IID, @Document)
      DocumentDispatch\Release()
    EndIf
  EndIf     
 
  ProcedureReturn Document
EndProcedure
;------------------------------

Procedure.s WebGadget_PageText(Gadget)
  Result$ = ""
 
  Document.IHTMLDocument2 = WebGadget_Document(Gadget, ?IID_IHTMLDocument2)
  If Document
    If Document\get_body(@Body.IHTMLElement) = #S_OK   
      If Body\get_innerText(@bstr_text) = #S_OK And bstr_text
        Result$ = PeekS(bstr_text, -1, #PB_Unicode)
        SysFreeString_(bstr_text)
      EndIf         
     
      Body\Release()
    EndIf       
    Document\Release()
  EndIf         
 
  ProcedureReturn Result$
EndProcedure

Procedure.s HTML2TEXT (url.s, out.s="txt")
 
 
  html.s
 
  DeleteFile("oldal.html")
 
  InitNetwork()
 
  URL$=url
  ReceiveHTTPFile(URL$,"oldal.html")
  If ReadFile(0, "oldal.html")    ; continue only if the file could be opened
    While Eof(0) = 0              ; loop until the end of the file is reached
      html = html + ReadString(0) ; append each line to the HTML buffer
    Wend
    CloseFile(0)                  ; close the downloaded file
  EndIf
 
  DeleteFile("oldal.html")
 
  If out="csv"
    hRegex = CreateRegularExpression(#PB_Any, "<\/? ?td ?\/?>", #PB_RegularExpression_NoCase)
    html = ReplaceRegularExpression(hRegex, html, ";")
  EndIf
 
 
  If OpenFile(1, "oldal.html")    ; write the (possibly modified) HTML back to disk
    WriteString(1, html)
    CloseFile(1)
  EndIf
 
  If OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_Invisible)
   
    WebGadget(0, 10, 10, 580, 280, "file://"+GetCurrentDirectory() + "oldal.html")
    myBrowser.IWebBrowser2 = GetWindowLong_(GadgetID(0), #GWL_USERDATA)
    myBrowser\put_Silent(#True) 
    ; The page is loaded from the local file written above
    Repeat
      Event = WaitWindowEvent()
     
      If GetGadgetAttribute(0, #PB_Web_Busy) = 0  ; page has finished loading
        szoveg.s = WebGadget_PageText(0)
        Break
      EndIf
     
    Until Event = #PB_Event_CloseWindow
  EndIf
 
  ProcedureReturn szoveg
EndProcedure

Debug HTML2TEXT ("http://bestbet.site/show.php?show=one", "csv")


 Post subject: Re: HTML2TEXT
PostPosted: Fri Nov 09, 2018 8:58 pm 

Joined: Sun Nov 05, 2006 11:42 pm
Posts: 4349
Location: Lyon - France
1/ To create a DLL instead of an EXE, go to the compiler options and choose "Shared DLL" under "Executable format".

2/ To call the DLL, use the Library functions:
https://www.purebasic.com/documentation ... index.html

Either with CallFunction or CallFunctionFast:
https://www.purebasic.com/documentation ... ry.pb.html
or with prototypes:
https://www.purebasic.com/documentation ... types.html
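
As a rough illustration of the prototype route (a sketch, not code from this thread): kernel32.dll and GetTickCount are used only because they exist on every Windows system, and the names TickProto and GetTicks are made up for the example. The same pattern should carry over to your own DLL.

```purebasic
; Sketch: call an exported DLL function through a prototype.
; TickProto and GetTicks are illustrative names, not part of any library.
Prototype.l TickProto()              ; no parameters, returns a 32-bit value

If OpenLibrary(0, "kernel32.dll")
  *address = GetFunction(0, "GetTickCount")  ; 0 if the export is missing
  If *address
    GetTicks.TickProto = *address    ; bind the address to the prototype
    Debug GetTicks()                 ; milliseconds since system start
  Else
    Debug "GetTickCount not found"
  EndIf
  CloseLibrary(0)
Else
  Debug "Could not open kernel32.dll"
EndIf
```

Checking the result of OpenLibrary() and GetFunction() before calling makes it much easier to see whether the DLL or the exported name is the problem.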

 Post subject: Re: HTML2TEXT
PostPosted: Fri Nov 09, 2018 9:10 pm 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
I've created a DLL file, but it does not work.
I'm getting something wrong somewhere. I'm very much a beginner.

How do I rewrite this program so that it works?


 Post subject: Re: HTML2TEXT
PostPosted: Sat Nov 10, 2018 11:47 am 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
Code:
....
ProcedureDLL.s HTML2TEXT (url.s, out.s="txt")
 
 
  html.s
 
  DeleteFile("oldal.html")
 
  InitNetwork()
 
  URL$=url
  ReceiveHTTPFile(URL$,"oldal.html")
  ReadFile(0, "oldal.html")   ; open the downloaded file for reading
  While Eof(0) = 0            ; loop until the end of the file is reached
    html= html+ ReadString(0) ; append each line to the HTML buffer
   
  Wend
  CloseFile(0)               ; close the downloaded file
 
  DeleteFile("oldal.html")
 
  If out="csv"
    hRegex = CreateRegularExpression(#PB_Any, "<\/? ?td ?\/?>", #PB_RegularExpression_NoCase)
    html = ReplaceRegularExpression(hRegex, html, ";")
  EndIf
 
 
  OpenFile(1,"oldal.html")
  WriteString(1,html)
  CloseFile(1)
 
 
 
  If OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_Invisible)
   
    WebGadget(0, 10, 10, 580, 280, "file://"+GetCurrentDirectory() + "oldal.html")
    myBrowser.IWebBrowser2 = GetWindowLong_(GadgetID(0), #GWL_USERDATA)
    myBrowser\put_Silent(#True) 
    ; Note: if you want to use a local file, change last parameter to "file://" + path + filename
    Repeat
      Event = WaitWindowEvent()
     
     
     
      If  GetGadgetAttribute(0,#PB_Web_Busy)=0
       
        Global  szoveg.s=WebGadget_PageText(0)
        Break
      EndIf
     
      DeleteFile("oldal.html")
     
    Until Event = #PB_Event_CloseWindow
  EndIf
 
  ProcedureReturn szoveg
EndProcedure
...


This is the previous code, rewritten so that I can build a DLL file.

Code:
 OpenLibrary(0,"HTML2TEXT.dll")
   Prototype.s ProtoFunction(url.s, out.s="txt")
   HTML2TEXT.ProtoFunction=GetFunction(0, "HTML2TEXT")
   
   td.s = PeekS(HTML2TEXT("http://bestbet.site/show.php?show=one","csv"))

  CloseLibrary(0)
   
   Debug td.s


I keep trying to call it but never succeed. Where am I going wrong?


Last edited by incaroad on Sat Nov 10, 2018 4:41 pm, edited 1 time in total.

 Post subject: Re: HTML2TEXT
PostPosted: Sat Nov 10, 2018 1:32 pm 

Joined: Sun Sep 07, 2008 12:45 pm
Posts: 4051
Location: Germany
Without the full code it is very difficult to help.

A few questions...

Who closes your window inside the DLL?
Is there a PostEvent?

Is the variable szoveg Global?
(Read the help about DLLs.)

You don't need PeekS() if you use GetFunction() with a prototype that already returns a string.

...
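
To illustrate the point about the Global variable, here is a minimal sketch following the pattern described in the PureBasic DLL help. Greet and ReturnText are hypothetical names, not part of the code in this thread. The returned string is kept in a Global so that it is still valid after the procedure has returned.

```purebasic
; Sketch of the DLL side; compile with "Shared DLL" as the executable format.
; Greet and ReturnText are illustrative names only.
Global ReturnText.s                  ; Global, so the string outlives the call

ProcedureDLL.s Greet(name.s)
  ReturnText = "Hello, " + name      ; build the result in the Global variable
  ProcedureReturn ReturnText         ; the caller receives a pointer to it
EndProcedure
```

On the calling side, a prototype declared as Prototype.s should hand back the string directly, whereas with Prototype.i the result is a pointer that has to be read with PeekS().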


 Post subject: Re: HTML2TEXT
PostPosted: Sat Nov 10, 2018 2:38 pm 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
Thanks, Infratec!

The full code is in the 4th post.


I'm trying to fix it.


 Post subject: Re: HTML2TEXT
PostPosted: Sun Nov 11, 2018 3:02 pm 

Joined: Tue Mar 03, 2009 3:40 pm
Posts: 47
Location: france
Hello,

This works on Windows 10 x64 with PB 5.70 LTS B2.

For the DLL (html2text.dll):

Code:
EnableExplicit
InitNetwork()
;
DataSection
  IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
  Data.l $332C4425
  Data.w $26CB, $11D0       
  Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19
  ;
  IID_IHTMLDocument3: ; {3050F485-98B5-11CF-BB82-00AA00BDCE0B}
  Data.l $3050F485
  Data.w $98B5, $11CF
  Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B
  ;
  IID_NULL: ; {00000000-0000-0000-0000-000000000000}
  Data.l $00000000
  Data.w $0000, $0000
  Data.b $00, $00, $00, $00, $00, $00, $00, $00
EndDataSection
;
Global szoveg.s
;----------------
ProcedureDLL WebGadget_Document(Gadget, *IID)
  Protected Document, Browser.iwebbrowser2,DocumentDispatch.idispatch
  Browser = GetWindowLong_(GadgetID(Gadget), #GWL_USERDATA)
  If Browser
    If Browser\get_Document(@DocumentDispatch) = #S_OK And DocumentDispatch
      DocumentDispatch\QueryInterface(*IID, @Document)
      DocumentDispatch\Release()
    EndIf
  EndIf     
   ProcedureReturn Document
EndProcedure
;------------------------------

ProcedureDLL.s WebGadget_PageText(Gadget)
  Protected Document.ihtmldocument2, bstr_text, result$, body.ihtmlelement
  ;Result$ = ""
 
  Document.IHTMLDocument2 = WebGadget_Document(Gadget, ?IID_IHTMLDocument2)
  If Document
    If Document\get_body(@Body) = #S_OK   
      If Body\get_innerText(@bstr_text) = #S_OK And bstr_text
        Result$ = PeekS(bstr_text, -1, #PB_Unicode)
        SysFreeString_(bstr_text)
      EndIf         
     
      Body\Release()
    EndIf       
    Document\Release()
  EndIf         
 
  ProcedureReturn Result$
EndProcedure

ProcedureDLL.s HTML2TEXT (url.s, out.s="txt")
  Protected html.s, myBrowser.iwebbrowser2, event
  szoveg = ""                             ; reset the global result string

  If FileSize("oldal.html") > 0           ; remove a leftover temporary file
    DeleteFile("oldal.html")
  EndIf

  If ReceiveHTTPFile(url, "oldal.html")   ; download the page to a temporary file
    If out = "csv"
      If ReadFile(0, "oldal.html")        ; continue only if the file could be opened
        While Eof(0) = 0                  ; loop until the end of the file is reached
          html + ReadString(0)            ; append each line to the HTML buffer
        Wend
        CloseFile(0)
      EndIf
      DeleteFile("oldal.html")

      ; replace every <td> / </td> tag with a semicolon
      If CreateRegularExpression(0, "<\/? ?td ?\/?>", #PB_RegularExpression_NoCase)
        html = ReplaceRegularExpression(0, html, ";")
      EndIf

      If OpenFile(1, "oldal.html")        ; write the modified HTML back to disk
        WriteString(1, html)
        CloseFile(1)
      EndIf
    EndIf

    If OpenWindow(0, 0, 0, 600, 300, "WebGadget", #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_Invisible)
      ; load the local copy of the page into a hidden WebGadget
      WebGadget(0, 10, 10, 580, 280, "file://" + GetCurrentDirectory() + "oldal.html")
      myBrowser = GetWindowLong_(GadgetID(0), #GWL_USERDATA)
      myBrowser\put_Silent(#True)
      Repeat
        event = WaitWindowEvent()

        If GetGadgetAttribute(0, #PB_Web_Busy) = 0   ; page has finished loading
          szoveg = WebGadget_PageText(0)
          DeleteFile("oldal.html")
          CloseWindow(0)
          Break
        EndIf

      Until event = #PB_Event_CloseWindow
    EndIf
  EndIf
  ProcedureReturn szoveg
EndProcedure

;Debug HTML2TEXT ("http://bestbet.site/show.php?show=one", "csv")



For testing the DLL:

Code:
Prototype.i ProtoFunction(url.s, out.s="txt")
If OpenLibrary(0,"HTML2TEXT.dll")
   
   HTML2TEXT.ProtoFunction=GetFunction(0, "HTML2TEXT")
   
   td.s = PeekS(HTML2TEXT("http://bestbet.site/show.php?show=one","csv"))
   
  CloseLibrary(0)
EndIf

Debug td.s



 Post subject: Re: HTML2TEXT
PostPosted: Sun Nov 11, 2018 4:40 pm 

Joined: Sat Apr 20, 2013 2:58 pm
Posts: 24
Location: Hungary; Pilisvörösvár
Hello!

Thank you very much Drgolf!
You are very professional.
:)

