Well I've been beavering away on this and have come up with some code - although it is Windows-only I'm afraid.
There are two "modes". To make an MHT you need the page's HTML code. "Normal mode" will obtain it using GetGadgetItemText, which doesn't get the
into it. The other mode, "current", uses API functions to get
.
read into memory. This isn't a problem but it is a shortcoming. I haven't yet studied Thalius's
code in T&T but will hopefully implement it in the near future.
So the code needs some work. It'll be a while before it's finished but in the meantime I thought it'd be good to post this version in case there's anything in it people can use.
For one thing it includes cross-platform procedures to encode to the Quoted Printable format, which may be useful. (This particular task utilises several procedures and a structure of my own liking - doubtless it could all be condensed into one clean procedure.) However, there may be a problem with this code - see snag#5.
On a personal note, another reason I want to submit this is because for 18 months I've asked many questions on this forum and received a lot of kind assistance from the brethren. I'm glad to be able to give something back, even if it is preliminary code.
I apologise for the unwieldiness of the code. It uses procedures that I keep categorised in many separate PBI's - clearly for this the simplest thing was to lump them all together.
If you specify no filename, the MHT will be saved inside the ClufleyMHT folder with a filename derived from the webpage's title, as per IE.
Code: Select all
Procedure R(str.s)
MessageRequester("URL2MHT",str,0)
EndProcedure
Procedure RStatus(str.s,status.b)
b.s
If status
b = "True"
Else
b = "False"
EndIf
R(str+" = "+b)
EndProcedure
Global c9.s = Chr(9)
Global c10.s = Chr(10)
Global c13.s = Chr(13)
Global c32.s = Chr(32)
Global c34.s = Chr(34)
Global c39.s = Chr(39)
Global mainfolder.s = "C:\ClufleyMHT\"
Enumeration ; modes
#MHT_Normal
#MHT_Current
EndEnumeration
;-
;- MATH PROCEDURES
Procedure.d Beat(a.d,b.d)
If a>b
ProcedureReturn a
Else
ProcedureReturn b
EndIf
EndProcedure
Procedure.d Defeat(a.d,b.d)
If a<b
ProcedureReturn a
Else
ProcedureReturn b
EndIf
EndProcedure
;-
;- STRING PROCEDURES
Procedure.s EnsureStart(string.s,start.s)
If Left(string,Len(start)) <> start
string = start+string
EndIf
ProcedureReturn string
EndProcedure
Procedure.s EnsureEnd(string.s,endd.s)
If Right(string,Len(endd)) <> endd
string = string+endd
EndIf
ProcedureReturn string
EndProcedure
Procedure.s EnsureNotStart(string.s,start.s)
If Left(string,Len(start)) = start
string = Mid(string,Len(start)+1,Len(string))
EndIf
ProcedureReturn string
EndProcedure
Procedure.s EnsureNotEnd(string.s,endd.s)
snipped = Len(string)-Len(endd)
If Right(string,Len(endd)) = endd
string = Left(string,snipped)
EndIf
ProcedureReturn string
EndProcedure
Procedure.s GetBefore(string.s,split.s)
test.l = FindString(string,split,0)
If test
string = Left(string,test-1)
EndIf
ProcedureReturn string
EndProcedure
Procedure.s GetAfter(string.s,split.s)
test.l = FindString(string,split,0)
If test
ProcedureReturn Right(string,Len(string)-test-Len(split)+1)
EndIf
ProcedureReturn ""
EndProcedure
Procedure.s EnsureNoDuplicates(arr.s,delim.s)
items = CountString(arr,delim)
If Not items
ProcedureReturn arr
EndIf
clean.s=delim
For a = 1 To items
code.s = StringField(arr,a,delim)
If Not FindString(clean,delim+code+delim,0)
clean+code+delim
EndIf
Next a
ProcedureReturn EnsureNotStart(clean,delim)
EndProcedure
Procedure.s RecursiveReplaceString(str.s,a.s,b.s)
While FindString(str,a,0)
str=ReplaceString(str,a,b)
Wend
ProcedureReturn str
EndProcedure
Procedure.s BSTRToString(Address.i)
Protected String.s, Length.l
Length = WideCharToMultiByte_(#CP_ACP, 0, Address, -1, 0, 0, #Null, #Null)
String = Space(Length)
WideCharToMultiByte_(#CP_ACP, 0, Address, -1, @String, Length, #Null, #Null)
SysFreeString_(String)
ProcedureReturn String
EndProcedure
Procedure.s FieldAccumulate(str.s,level.l,delim.s)
fields = CountString(str,delim)
accum.s
For a = 1 To Defeat(level,fields)
accum+StringField(str,a,delim)+delim
Next a
ProcedureReturn accum
EndProcedure
Procedure.s RemoveFieldsFromEnd(arr.s,delim.s,postremove.l)
fields=CountString(arr,delim)
take=fields-postremove
If take>0 And take<fields
r.s
For f = 1 To take
r+StringField(arr,f,delim)+delim
Next f
ProcedureReturn r
Else
ProcedureReturn arr
EndIf
EndProcedure
;-
;- WEBGADGET/MS PROCEDURES
DataSection
IID_IHTMLDocument: ; {626FC520-A41E-11CF-A731-00A0C9082637}
Data.l $626FC520
Data.w $A41E, $11CF
Data.b $A7, $31, $00, $A0, $C9, $08, $26, $37
IID_IHTMLDocument2: ; {332C4425-26CB-11D0-B483-00C04FD90119}
Data.l $332C4425
Data.w $26CB, $11D0
Data.b $B4, $83, $00, $C0, $4F, $D9, $01, $19
IID_IHTMLDocument3: ; {3050F485-98B5-11CF-BB82-00AA00BDCE0B}
Data.l $3050F485
Data.w $98B5,$11CF
Data.b $BB,$82,$00,$AA,$00,$BD,$CE,$0B
IID_IHTMLDocument4: ; {3050F69A-98B5-11CF-BB82-00AA00BDCE0B}
Data.l $3050F69A
Data.w $98B5, $11CF
Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B
IID_IHTMLDocument5: ; {3050F80C-98B5-11CF-BB82-00AA00BDCE0B}
Data.l $3050F80C
Data.w $98B5, $11CF
Data.b $BB, $82, $00, $AA, $00, $BD, $CE, $0B
EndDataSection
Procedure GetBrowser(g.i) ; By Zapman Inspired by Fr34k
If IsGadget(g)
If GetGadgetText(g) = ""
SetGadgetText(g, "about:blank") ; to avoid error when using Browser
While WindowEvent():Wend
EndIf
Browser.IWebBrowser2 = GetWindowLong_(GadgetID(g), #GWL_USERDATA)
If Browser
Ready = 0
ct = 0
While Ready < 4 And ct<200
WindowEvent()
State = 0
If Browser\get_ReadyState(@BrowserState.i) = #S_OK
If BrowserState = 4
Ready + 1
EndIf
EndIf
If Ready = 0 : Delay(5) : EndIf
ct + 1
Wend
EndIf
ProcedureReturn Browser
EndIf
EndProcedure
Procedure GetDocumentDispatch(g.i) ; By Zapman Inspired by Fr34k
; Example: DocumentDispatch.IDispatch = GetDocumentDispatch(WebGadget)
; Do not forget to release DocumentDispatch when finished to use it
Browser.IWebBrowser2 = GetBrowser(g)
If Browser
If Browser\get_Document(@DocumentDispatch.IDispatch) = #S_OK
ProcedureReturn DocumentDispatch
EndIf
EndIf
EndProcedure
Procedure.s GetWebgadgetHTML(g.i)
; Retrieve all HTML content of the document which is in the Webgadget
DocumentDispatch.IDispatch = GetDocumentDispatch(g)
If DocumentDispatch
If DocumentDispatch\QueryInterface(?IID_IHTMLDocument2, @Document.IHTMLDocument2) = #S_OK And Document
Document\get_body(@Element.IHTMLElement); Get the <BODY> Element
If Element
If Element\get_parentElement(@Parent.IHTMLElement) = #S_OK And Parent; Get the <HTML> Element
Parent\get_outerHTML(@bstr)
Parent\Release()
EndIf
Element\Release()
EndIf
Document\Release()
EndIf
DocumentDispatch\Release()
EndIf
HTML.s
If bstr
HTML = PeekS(bstr, -1, #PB_Unicode) ; get the whole text of the document
SysFreeString_(bstr)
EndIf
; now we need to get the doctype.
; I haven't worked out how to use IHTMLDOCUMENT5, so for now I'm having
; to use PB's GetGadgetItemText. In most cases, perhaps all, it will be correct!
doctype.s = GetGadgetItemText(g,#PB_Web_HtmlCode)
detect = FindString(LCase(doctype),"<html>",0)
doctype = Left(doctype,detect-1)
If FindString(LCase(doctype),"doctype",0)
ProcedureReturn doctype+c13+HTML
Else
ProcedureReturn HTML
EndIf
EndProcedure
Procedure.s WBGetTitle(g.i)
If Not IsGadget(g) : ProcedureReturn : EndIf
Browser.IWebBrowser2 = GetBrowser(g)
title.s
If Browser
Browser\get_LocationName(@bstr)
If bstr
title = BSTRToString(bstr)
EndIf
EndIf
ProcedureReturn title
EndProcedure
;-
;- FILE/FOLDER PROCEDURES
Procedure.b FolderExists(folder.s)
If FileSize(folder) = -2
ProcedureReturn #True
EndIf
EndProcedure
Procedure.b EnsureFolder(folder.s)
If FolderExists(folder)
ProcedureReturn #True
Else
ProcedureReturn CreateDirectory(folder)
EndIf
EndProcedure
Procedure.b IsDrivePath(path.s)
If Right(path,2)=":\"
ProcedureReturn #True
EndIf
EndProcedure
Procedure.b EnsureFolderPath(path.s)
path=EnsureEnd(path,"\")
shortpath.s
levels = CountString(path,"\")
If levels>1
For f = 1 To levels
shortpath = FieldAccumulate(path,f,"\")
If IsDrivePath(shortpath)
Continue
EndIf
EnsureFolder(shortpath)
Next f
EndIf
EndProcedure
Procedure.b StringToFile(filename.s,string.s)
If Not filename
ProcedureReturn #False
EndIf
string = RemoveString(string,c10)
If FindString(string,c13,0)
string = EnsureNotEnd(string,c13) ; this removes final linebreak
string = ReplaceString(string,c13,c13+c10)
EndIf
file = CreateFile(#PB_Any,filename)
If IsFile(file)
WriteString(file,string)
CloseFile(file)
ProcedureReturn #True
EndIf
EndProcedure
;-
;- URL/FILENAME PROCEDURES
Procedure.b FileIsOnline(url.s)
online.b
test.s = Left(url,3)
Select test
Case "www"
online = #True
Case "htt"
online = #True
Case "ftp"
online = #True
Default
online = #False
EndSelect
ProcedureReturn online
EndProcedure
Procedure.b WASTEURLIsRelative(url.s)
If Not FindString(url,"\",0) And Not FileIsOnline(url)
ProcedureReturn #True
EndIf
EndProcedure
Procedure.s GetExt(rawfilepath.s)
rawfilepath = LCase(rawfilepath)
rawfilepath = RemoveString(rawfilepath,"*")
file.s = GetBefore(rawfilepath,"#")
file = ReplaceString(file,"/","\")
file = EnsureEnd(file,"\")
file = StringField(file,CountString(file,"\"),"\")
; ? and & can appear in URL addresses, but are legal for filenames,
; so we need everything up until the last instance, but not including it
If FileIsOnline(rawfilepath)
file = RemoveFieldsFromEnd(file+"&","&",1)
file = EnsureNotEnd(file,"&")
file = RemoveFieldsFromEnd(file+"?","?",1)
file = EnsureNotEnd(file,"?")
EndIf
; a filename can contain multiple dots
file = EnsureEnd(file,".")
ext.s = StringField(file,CountString(file,"."),".")
ext = EnsureNotEnd(ext,".")
ext = RemoveString(ext,"\")
;R("EXT: "+ext)
ProcedureReturn ext
EndProcedure
Procedure.s ValidateOnlineFilename(url.s)
url = GetBefore(url,"#")
url = GetBefore(url,"?")
url = GetBefore(url,c32)
test.s = Left(url,3)
Select test
Case "www"
url="http://"+url
EndSelect
url=EnsureNotEnd(url,"/")
ProcedureReturn url
EndProcedure
Procedure.s ValidateFilename(rawfile.s,replacechar.s)
If FileIsOnline(rawfile)
;R("File is online!")
ProcedureReturn ValidateOnlineFilename(rawfile)
EndIf
rawfile = EnsureNotStart(rawfile,"file:///")
path.s = GetPathPart(rawfile)
path = EnsureEnd(path,"\")
path = ReplaceString(path,"/","\") ; maybe not necessary
file.s = GetFilePart(rawfile)
file = GetBefore(file,"#")
If FindString(file,".",0)
file = RemoveFieldsFromEnd(file+".",".",1)
file = EnsureNotEnd(file,".")
EndIf
illegalchars.s = "/-\-:-*-?-|-<->-"+c34+"-"
For i = 1 To CountString(illegalchars,"-")
file = ReplaceString(file,StringField(illegalchars,i,"-"),replacechar)
Next i
file = Trim(file)
ext.s = GetExt(rawfile)
ProcedureReturn path+file+"."+ext
EndProcedure
Procedure.s URL_GetPath(url.s)
If FileIsOnline(url)
url = ValidateOnlineFilename(url)
folder.s = url
L=Len(folder)
If Right(folder,1) = "/"
L-1
folder=Left(folder,L)
EndIf
For a = 1 To L
test$=Right(folder,a)
test$=Left(test$,1)
If test$="/" Or test$="\"
b=L-a+1
folder=Left(folder,b)
Break
EndIf
Next
Else
folder=GetPathPart(url)
EndIf
ProcedureReturn folder
EndProcedure
;-
;- "LUMP" STUFF
; "lumps are a structure I use to store multi-line strings. Hopefully it isn't too esoteric!
#LumpMax=20000
#MiniLumpMax=500
Structure Lump
lines.w
line.s[#LumpMax]
EndStructure
Structure MiniLump
lines.w
line.s[#MiniLumpMax]
EndStructure
Procedure.l LumpLinesFromString(str.s)
ProcedureReturn Defeat(#LumpMax,CountString(str,c13))
EndProcedure
Procedure.s LumpToString(*source.Lump)
str.s
For a = 1 To *source\lines
str + *source\line[a] + c13
Next
ProcedureReturn str
EndProcedure
Procedure.b LumpFromString(str.s,*dest.Lump)
str=RemoveString(str,c10)
str=EnsureEnd(str,c13)
*dest\lines = LumpLinesFromString(str)
For a = 1 To *dest\lines
*dest\line[a] = StringField(str,a,c13)
Next a
ProcedureReturn #True
EndProcedure
Procedure.b RebreakLump(*l.Lump)
If *l\lines
str.s = LumpToString(*l)
LumpFromString(str,*l)
EndIf
EndProcedure
Procedure.b WrapLumpToLength(*lp.Lump,wl.i)
If Not *lp\lines
ProcedureReturn #False
EndIf
cong.s
For a = 1 To *lp\lines
start = 1
cong = ""
linelength = Len(*lp\line[a])
If linelength>wl
Repeat
cong+Mid(*lp\line[a],start,wl)+c13
start+wl
Until start>linelength
*lp\line[a] = EnsureNotEnd(cong,c13)
EndIf
Next a
RebreakLump(*lp)
ProcedureReturn #True
EndProcedure
;-
;- DATE PROCEDURES
Structure dtsplit
second.i
minute.i
hour.i
day.i
month.i
year.i
dayofweek.s
EndStructure
Procedure.s DayName(datelong.i)
d=DayOfWeek(datelong)
n.s
Select d
Case 0
n="Sunday"
Case 1
n="Monday"
Case 2
n="Tuesday"
Case 3
n="Wednesday"
Case 4
n="Thursday"
Case 5
n="Friday"
Case 6
n="Saturday"
EndSelect
ProcedureReturn n
EndProcedure
Procedure.s MonthName(num.i)
Select num
Case 1
ProcedureReturn "January"
Case 2
ProcedureReturn "February"
Case 3
ProcedureReturn "March"
Case 4
ProcedureReturn "April"
Case 5
ProcedureReturn "May"
Case 6
ProcedureReturn "June"
Case 7
ProcedureReturn "July"
Case 8
ProcedureReturn "August"
Case 9
ProcedureReturn "September"
Case 10
ProcedureReturn "October"
Case 11
ProcedureReturn "November"
Case 12
ProcedureReturn "December"
EndSelect
EndProcedure
Procedure.s ExtractifyDate(datelong.i,*dt.dtsplit)
*dt\second = Val(FormatDate("%ss",datelong))
*dt\minute = Val(FormatDate("%ii",datelong))
*dt\hour = Val(FormatDate("%hh",datelong))
*dt\day = Val(FormatDate("%dd",datelong))
*dt\month = Val(FormatDate("%mm",datelong))
*dt\year = Val(FormatDate("%yyyy",datelong))
*dt\dayofweek = DayName(datelong)
EndProcedure
Procedure.s MHTDateTime(datelong.i)
ExtractifyDate(datelong,@dt.dtsplit)
monthname.s = MonthName(dt\month)
GMT.s = "-0000"
md.s =
Left(dt\dayofweek,3)+","+c32+Str(dt\day)+c32+Left(monthname,3)+c32+Str(dt\year)+c32+Str(dt\hour)+":"+Str(dt\minute)+":"+Str(dt\second)+c32+G
MT
ProcedureReturn md
EndProcedure
;-
;- MHT-MAKING PROCEDURES
#QPLL=76
Procedure.s MIMEType(file.s)
ext.s = GetExt(file)
ext = EnsureStart(ext,".")
mime.s
Select ext
Case ".jpg"
mime = "image/jpeg"
Case ".gif"
mime = "image/gif"
Case ".png"
mime = "image/png"
Case ".bmp"
mime = "image/bmp"
Case ".css"
mime = "text/css"
Case ".html"
mime = "text/html"
Case ".js"
mime = "application/octet-stream"
EndSelect
If mime
ProcedureReturn mime
EndIf
hKey.l = 0
datasize.l = 255
KeyValue.s = Space(datasize)
If RegOpenKeyEx_(#HKEY_CLASSES_ROOT, ext, 0, #KEY_READ, @hKey)
KeyValue = "application/octet-stream"
Else
If RegQueryValueEx_(hKey, "Content Type", 0, 0, @KeyValue, @datasize)
KeyValue = "application/octet-stream"
Else
KeyValue = Left(KeyValue, datasize-1)
EndIf
RegCloseKey_(hKey)
EndIf
ProcedureReturn KeyValue
EndProcedure
Procedure WaitForWebgadget2(gad.l,maxmilliseconds.l)
Browser.IWebBrowser2=GetWindowLong_(GadgetID(gad),#GWL_USERDATA)
milliwait=125
Repeat
While WindowEvent() : Wend : Browser\get_Busy(@busy.l)
count+milliwait
If busy=#VARIANT_TRUE
Sleep_(milliwait)
EndIf
If maxmilliseconds>0
If count=>maxmilliseconds
Break
EndIf
EndIf
Until busy=#VARIANT_FALSE
EndProcedure
Procedure.s FromPosToNext(str.s,pos.i,nxt.s)
ne=FindString(str,nxt,pos)
If Not ne
ne=Len(str)
EndIf
extr.s = Mid(str,pos,ne)
ProcedureReturn extr
EndProcedure
Procedure.s GetLinkedResourceNames(html.s)
resourcelist.s
newfile.s
extnames.s = "jpg|png|bmp|gif|js|php|css|"
exts = CountString(extnames,"|")
src.s = "src="
sources = CountString(html,src)
If sources
;R("INSTANCES OF SRC = "+Str(sources))
For s = 1 To sources
detect=FindString(html,src,detect+1)
If detect
newfile = FromPosToNext(html,detect,src)
newfile = StringField(newfile,1,c13)
newfile = StringField(newfile,1,">")
newfile = RemoveString(newfile,src)
newfile = RemoveString(newfile,c34)
newfile = StringField(newfile,1,c32)
newfile = GetBefore(newfile,"?")
resourcelist+newfile+"|"
EndIf
Next s
EndIf
detect=0
bgimurl.s = "background-image:url"
html = ReplaceString(html,"background-image: url",bgimurl)
bgimurls = CountString(html,bgimurl)
If bgimurls
;R("INSTANCES OF "+bgimurl+" = "+Str(bgimurls))
For a = 1 To bgimurls
detect=FindString(html,bgimurl,detect+1)
If detect
newfile = FromPosToNext(html,detect,bgimurl)
newfile = RemoveString(newfile,bgimurl)
newfile = StringField(newfile,1,")")
newfile = EnsureNotStart(newfile,"(")
;R(newfile)
resourcelist+newfile+"|"
EndIf
Next a
EndIf
; and now a special search for any linked stylesheets
If FindString(LCase(html),"stylesheet",0)
html+c13
For a = 1 To CountString(html,c13)
line.s = LCase(StringField(html,a,c13))
If FindString(line,"<link rel",0)
If FindString(line,"stylesheet",0)
line = GetAfter(line,"hre")
line = GetAfter(line,"=")
line = RemoveString(line,"/>")
line = RemoveString(line,">")
line = RemoveString(line,c34)
line = Trim(line)
resourcelist+line+"|"
EndIf
EndIf
Next a
EndIf
resourcelist = EnsureNoDuplicates(resourcelist,"|")
ProcedureReturn resourcelist
EndProcedure
Procedure.s RandomCode(l.i)
c.s
For a = 1 To l
c+Str(Random(9))
Next a
ProcedureReturn c
EndProcedure
Procedure.b WrapLumpToLengthWithEndchar(*lp.Lump,wl.i,endchar.s)
If Not *lp\lines
ProcedureReturn #False
EndIf
If Not endchar
endchar=c13
EndIf
cong.s
For a = 1 To *lp\lines
start = 0
cong = ""
linelength = Len(*lp\line[a])
If linelength>wl
Repeat
cong+Mid(*lp\line[a],start+1,wl)+endchar
start+wl
Until start>linelength
*lp\line[a] = EnsureNotEnd(cong,c13)
EndIf
Next a
RebreakLump(*lp)
ProcedureReturn #True
EndProcedure
Procedure.s Ascii2QP(ascii.i)
h.s = Hex(ascii)
If Len(h)=1
h="0"+h
EndIf
ProcedureReturn "="+h
EndProcedure
Procedure.s QuotedPrintable(raw.s)
LumpFromString(raw,@lp.Lump)
c.s
qp.s
For l = 1 To lp\lines
linelength = Len(lp\line[l])
For a = 1 To linelength
c=Mid(lp\line[l],a,1)
asc=Asc(c)
Select asc
Case 10
Case 13
qp+c
Case 32
If a=linelength
qp+Ascii2QP(asc)
Else
qp+c32
EndIf
Case 33 To 60
qp+c
Case 62 To 126
qp+c
Default
qp+Ascii2QP(asc)
EndSelect
Next a
qp+c13
Next l
LumpFromString(qp,@lp.Lump)
WrapLumpToLengthWithEndchar(@lp,#QPLL-2,"="+c13)
flattened.s = LumpToString(@lp)
flattened = RecursiveReplaceString(flattened,c13+c13,c13)
ProcedureReturn flattened
EndProcedure
Procedure.s EncodeToBase64(*rawdata,rawloaf.i)
enclength = 1.5*rawloaf
*enc = AllocateMemory(enclength)
encodedlength = Base64Encoder(*rawdata,rawloaf,*enc,enclength)
encodedstring.s = PeekS(*enc,encodedlength)
FreeMemory(*enc)
ProcedureReturn encodedstring
EndProcedure
Procedure.s EncodeToQuotedPrintable(*rawdata,rawloaf.i)
rawstring.s = PeekS(*rawdata,rawloaf)
encodedstring.s = QuotedPrintable(rawstring)
ProcedureReturn encodedstring
EndProcedure
Procedure.s ContentLocation(urlfile.s)
If FileIsOnline(urlfile)
Else
urlfile = "file:///"+urlfile
EndIf
ProcedureReturn "Content-Location: "+urlfile
EndProcedure
Procedure.s Alter(str.s) ; for testing
; this alters the filename of a resource within the MHT, so that we can tell
; whether the MHT is still accessing an online resource instead of an internal one.
ext$ = GetExt(str)
str = EnsureNotEnd(str,"."+ext$)
ProcedureReturn str+"ALTER."+ext$
EndProcedure
Procedure.s FormPart(urlfile.s,folder.s,boundary.s,network.b)
urlfile = ValidateFileName(urlfile,"")
ext.s = GetExt(urlfile)
encodemethod.s
Select ext
Case "html", "css", "js", "php"
encodemethod="quoted-printable"
Default
encodemethod="base64"
EndSelect
header.s = "--"+boundary +c13
header + "Content-Type: "+MIMEType(urlfile) +c13
header + "Content-Transfer-Encoding: "+encodemethod +c13
header + ContentLocation(Alter(urlfile)) +c13+c13
localfile.s = folder+RandomCode(8)+c32+StringField(GetFilePart(urlfile),1,".")+"."+ext
If FileIsOnline(urlfile)
;R("File is online")
If network
status = ReceiveHTTPFile(urlfile,localfile)
Else
R("Can't download because network=0.")
EndIf
Else
;R("File is offline (local)")
status = CopyFile(urlfile,localfile)
EndIf
file = OpenFile(#PB_Any,localfile)
If Not file
ProcedureReturn ""
EndIf
loaf = Lof(file)
If loaf<1
CloseFile(file)
ProcedureReturn ""
EndIf
*filedata = AllocateMemory(loaf)
ReadData(file,*filedata,loaf)
CloseFile(file)
encodedstring.s
Select encodemethod
Case "base64"
;R("Encode method: Base64")
encodedstring = EncodeToBase64(*filedata,loaf)
Case "quoted-printable"
;R("Encode method: Quoted-Printable")
encodedstring = EncodeToQuotedPrintable(*filedata,loaf)
EndSelect
FreeMemory(*filedata)
LumpFromString(encodedstring,@lm.Lump)
WrapLumpToLength(@lm,#QPLL)
encodedstring = LumpToString(@lm)
ProcedureReturn header+encodedstring+c13+c13+c13
EndProcedure
Procedure.s GetWebgadgetURL(wg.i)
If IsGadget(wg)
WebObject.IWebBrowser2 = GetWindowLong_(GadgetID(wg), #GWL_USERDATA)
WebObject\Get_LocationURL(@BSTR_Url.l)
If BSTR_Url
ProcedureReturn BSTRToString(BSTR_Url)
EndIf
EndIf
EndProcedure
Procedure.s GetWebpageBaseDirectory(wg.i)
realurl.s = GetWebgadgetURL(wg)
If realurl
baseurl.s = URL_GetPath(realurl)
EndIf
html.s = LCase(GetGadgetItemText(wg,#PB_Web_HtmlCode))
If FindString(html,"<base",0)
html = GetAfter(html,"<base")
html = GetBefore(html,">")
html = GetAfter(html,"href=")
html = Trim(html)
If html
baseurl=html
EndIf
EndIf
ProcedureReturn EnsureEnd(baseurl,"/")
EndProcedure
Procedure.s FormBoundary()
ProcedureReturn "----=_NextPart_000_001F_01C96422.C9363B50"
EndProcedure
Procedure.s FormHTMLPart(html.s,url.s,boundary.s)
header.s = "--"+boundary +c13
header + "Content-Type: text/html;" +c13
header + " charset="+c34+"iso-8859-1"+c34 +c13
header + "Content-Transfer-Encoding: quoted-printable" +c13
header + ContentLocation(url)
header+c13+c13
encoded.s = QuotedPrintable(html)
ProcedureReturn header+encoded+c13+c13+c13
EndProcedure
Procedure.s FormHeaderData(title.s,boundary.s)
header.s = "From: <Saved by ClufleyMHT>" +c13
header + "Subject: "+title +c13
header + "Date: "+MHTDateTime(Date()) +c13
header + "MIME-Version: 1.0" +c13
header + "Content-Type: multipart/related;" +c13
header + " type="+c34+"text/html"+c34+";" +c13
header + " boundary="+c34+boundary+c34 +c13
header + "X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3350" +c13+c13
header + "This is a multi-part message in MIME format." +c13+c13
ProcedureReturn header
EndProcedure
;-
;- THE MHT PROCEDURE
Procedure.b URL2MHT(url.s,mht.s,mode.b)
If Not url
ProcedureReturn #False
EndIf
folder.s = mainfolder+RandomCode(8)+"\"
EnsureFolderPath(folder)
;R("URL: "+url+Chr(13)+"MHT: "+mht)
ww=800
wh=800
win = OpenWindow(#PB_Any, 0, 0, ww, wh, "MHT displayer", #PB_Window_Invisible)
wg = WebGadget(#PB_Any, 5, 5, ww-10, wh-10, url)
WaitForWebgadget2(wg,10000)
baseurl.s = GetWebpageBaseDirectory(wg)
title.s
html.s
If mode=#MHT_Current
title = WBGetTitle(wg)
html = GetWebgadgetHTML(wg)
Else
title = GetGadgetItemText(wg,#PB_Web_PageTitle)
html = GetGadgetItemText(wg,#PB_Web_HtmlCode)
EndIf
CloseWindow(win)
If Not mht
mht = mainfolder+ReplaceString(title,c32+c32,c32)
EndIf
mht = EnsureEnd(mht,".mht")
mht = ValidateFilename(mht,c32)
boundary.s = FormBoundary()
headerdata.s = FormHeaderData(title,boundary)
filelist.s = GetLinkedResourceNames(html)
files = CountString(filelist,"|")
Dim filedata.s(files+1)
network.b = InitNetwork()
;RStatus("network",network)
resourcepointer.s
resourcepath.s
If files
For f = 1 To files
resourcepointer = StringField(filelist,f,"|")
resourcepointer = RemoveString(resourcepointer,c34)
If Left(resourcepointer,1)="/"
resourcepath = EnsureNotEnd(RemoveFieldsFromEnd(baseurl,"/",1),"/") +resourcepointer
Else
If FileIsOnline(resourcepointer)
resourcepath = resourcepointer
Else
resourcepath = baseurl+resourcepointer
EndIf
EndIf
filedata(f) = FormPart(resourcepath,folder,boundary,network)
; now update html with "internal" reference to resource
html = ReplaceString(html,resourcepointer,Alter(resourcepath))
Next f
EndIf
; encode html to Quoted Printable format
qphtml.s = FormHTMLPart(html,url,boundary)
total.s = headerdata+qphtml
If files
For f = 1 To files
total+filedata(f)
Next f
EndIf
DeleteDirectory(folder,"*.*",#PB_FileSystem_Recursive|#PB_FileSystem_Force)
EnsureFolderPath(GetPathPart(mht))
status = StringToFile(mht,total)
ProcedureReturn status
EndProcedure
;- AT LAST, PROGRAM START!
url.s
destfile.s
;url = InputRequester("URL TO MAKE MHT FROM","Please enter URL below:","")
;url = "http://www.purebasic.fr/english/index.php" ; main forum index
url = "http://www.purebasic.fr/english/viewforum.php?f=13" ; coding forum
;url = "http://www.adobeforums.com/webx/.3bc459a3/"
;url = "http://timesonline.typepad.com/timesarchive/2009/01/10-freaky-finds.html"
;url = "http://en.wikipedia.org/wiki/Ralf_Hutter"
status = URL2MHT(url,destfile,#MHT_Current)
R("STATUS: "+Str(status))
End
If you want to help with this "project", please do.
You can help by testing out this code with different URLs and reporting errors. The URLs would be helpful.
If you can think of options for the finished function, please post them. ("Express mode" etc.)
Seymour.