Convert txt file to Acrobat pdf

Posted: Thu May 28, 2009 1:40 pm
by doctorized
Here is another piece of code I've written. It converts txt files to Adobe PDF.

Attention: some languages, such as Greek and Chinese, are not supported, because the font is declared with WinAnsiEncoding.
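A quick way to check a text file before converting it is to test whether every character fits a Western single-byte code page. The sketch below is in Python rather than PureBasic, purely for illustration, and it assumes that Windows code page 1252 is a close enough stand-in for the WinAnsiEncoding named in the code. The function name unsupported_chars is made up for this example.

```python
def unsupported_chars(text, codepage="cp1252"):
    """Return the distinct characters of text that the code page cannot encode.

    Assumption: cp1252 approximates the PDF WinAnsiEncoding used by the converter.
    """
    bad = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            # Keep each offending character once, in order of first appearance.
            if ch not in bad:
                bad.append(ch)
    return bad
```

An empty list means the text should come through unchanged. Greek or Chinese input returns the characters that would be lost.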

Code:

; Author: Wicker Man (doctorized in PB)
; Date: May 22 2009
; OS: Windows
; Demo: No
; More codes at: www.geocities.com/kc2000labs/pb/pb.htm


;The following code creates pdf files version 1.2 (Acrobat 3.0).

Global Position.l, pageNo.l, lineNo.l
Global Dim location.l(5000)
Global Dim pageObj.l(5000)
Global lines.l, obj.l, Tpages.l, encoding.l, resources.l, pages.l, pointSize.q
Global vertSpace.d, info.l, root.l, npagex.d, npagey.l, linelen.l, cache.s
Global FileTXT.s, FilePDF.s
Global AppName.s, Author.s, Creator.s, Keywords.s, Subject.s, Title.s, BaseFont.s, rotate.l, pageWidth.d, pageHeight.d

Declare.l StartPage()
Declare.s endpage(streamstart.l)

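; Appends one line to the output cache and tracks the byte offset used by the xref table.
Procedure writepdf(stre.s, flush.l=0)
Position + Len(stre) + 1 ; + 1 for the line-end byte that follows every line
cache + stre + Chr(10) ; LF, because a lone CR is not a valid line end after "stream"
If Len(cache) > 32000 Or flush > 0
	If OpenFile(0, FilePDF)
		FileSeek(0, Lof(0))
		WriteString(0, cache) ; not WriteStringN: an extra newline would shift every later offset
		CloseFile(0)
	EndIf
	cache = ""
EndIf
EndProcedure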
  
Procedure WriteStart()
  writepdf ("%PDF-1.2")
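  writepdf ("%" + Chr(226) + Chr(227) + Chr(207) + Chr(211)) ; binary marker; displays as "%βγΟΣ" in the Greek Windows code page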
EndProcedure

Procedure WriteHead()
CreationDate.s = "D:" + FormatDate( "%YYYY%MM%DD%HH%II%SS",Date())
obj + 1
location(obj) = Position
info = obj

writepdf (Str(obj) + " 0 obj")
writepdf ("<<")
writepdf ("/Author (" + Author + ")")
writepdf ("/CreationDate (" + CreationDate + ")")
writepdf ("/Creator (" + Creator + ")")
writepdf ("/Producer (" + AppName + ")")
writepdf ("/Title (" + Title + ")")
writepdf ("/Subject (" + Subject + ")")
writepdf ("/Keywords (" + Keywords + ")")
writepdf (">>")
writepdf ("endobj")

obj + 1
root = obj
obj + 1
Tpages = obj
encoding = obj + 2
resources = obj + 3

obj + 1
location(obj) = Position
writepdf (Str(obj) + " 0 obj")
writepdf ("<<")
writepdf ("/Type /Font")
writepdf ("/Subtype /Type1")
writepdf ("/Name /F1")
writepdf ("/Encoding " + Str(encoding) + " 0 R")
writepdf ("/BaseFont /" + BaseFont)
writepdf (">>")
writepdf ("endobj")

obj + 1
location(obj) = Position
writepdf (Str(obj) + " 0 obj")
writepdf ("<<")
writepdf ("/Type /Encoding")
writepdf ("/BaseEncoding /WinAnsiEncoding")
writepdf (">>")
writepdf ("endobj")

obj + 1
location(obj) = Position
writepdf (Str(obj) + " 0 obj")
writepdf ("<<")
writepdf ("  /Font << /F1 " + Str(obj - 2) + " 0 R >>")
writepdf ("  /ProcSet [ /PDF /Text ]")
writepdf (">>")
writepdf ("endobj")
EndProcedure
  
Procedure WritePages()
line.s: tmpline.s: beginstream.l
If ReadFile(1, FileTXT)
	beginstream = StartPage()
	lineNo = -1
	While Not Eof(1)
		line = ReadString(1)
      lineNo + 1
        
        ;page Break
        If lineNo >= lines Or FindString(line, Chr(12),1) > 0
          writepdf("1 0 0 1 " + Str(npagex) + " " + Str(npagey) + " Tm")
          writepdf("(" + Str(pageNo) + ") Tj")
          writepdf("/F1 " + Str(pointSize) + " Tf")
          endpage (beginstream)
          beginstream = StartPage()
        EndIf
        
        line = ReplaceString(ReplaceString(ReplaceString(line, "\", "\\"), "(", "\("), ")", "\)")
        
        If Len(line) > linelen
          
          ;word wrap
          While Len(line) > linelen
            tmpline = Left(line, linelen)
            For i = Len(tmpline) To (Len(tmpline) / 2) Step -1
              If FindString("*+^%$#,. ;<=>[])}!" + Chr(34), Mid(tmpline, i, 1),1)
                tmpline = Left(tmpline, i)
                Break
              EndIf
            Next
            
            line = Mid(line, Len(tmpline) + 1)
            writepdf("T* (" + tmpline + Chr(13) + Chr(10) + ") Tj")
            lineNo = lineNo + 1
            
            ;page Break
            If lineNo >= lines Or FindString(line, Chr(12),1) > 0
              writepdf("1 0 0 1 " + Str(npagex) + " " + Str(npagey) + " Tm")
              writepdf("(" + Str(pageNo) + ") Tj")
              writepdf("/F1 " + Str(pointSize) + " Tf")
              endpage (beginstream)
              beginstream = StartPage()
            EndIf
          Wend
          lineNo + 1
          writepdf("T* (" + line + Chr(13) + Chr(10) + ") Tj")
        Else
          writepdf("T* (" + line + Chr(13) + Chr(10) + ") Tj")
        EndIf
	Wend
	CloseFile(1)
EndIf
writepdf("1 0 0 1 " + Str(npagex) + " " + Str(npagey) + " Tm")
writepdf("(" + Str(pageNo) + ") Tj")
writepdf("/F1 " + Str(pointSize) + " Tf")
endpage (beginstream)
EndProcedure

Procedure.l StartPage()
  strmpos.l
  obj + 1
  location(obj) = Position
  pageNo + 1
  pageObj(pageNo) = obj
  
  writepdf(Str(obj) + " 0 obj")
  writepdf("<<")
  writepdf("/Type /Page")
  writepdf("/Parent " + Str(Tpages) + " 0 R")
  writepdf("/Resources " + Str(resources) + " 0 R")
  obj + 1
  writepdf("/Contents " + Str(obj) + " 0 R")
  writepdf("/Rotate " + Str(rotate))
  writepdf(">>")
  writepdf("endobj")
  
  location(obj) = Position
  writepdf(Str(obj) + " 0 obj")
  writepdf("<<")
  writepdf("/Length " + Str(obj + 1) + " 0 R")
  writepdf(">>")
  writepdf("stream")
  strmpos = Position
  writepdf("BT")
  writepdf("/F1 " + Str(pointSize) + " Tf")
  writepdf("1 0 0 1 50 " + Str(pageHeight - 40) + " Tm")
  writepdf(ReplaceString(StrD(vertSpace), ",", ".") + " TL")
  
  ProcedureReturn strmpos
EndProcedure

Procedure.s endpage(streamstart.l)
streamEnd.l
writepdf("ET")
streamEnd = Position
writepdf("endstream")
writepdf("endobj")
obj + 1
location(obj) = Position
writepdf(Str(obj) + " 0 obj")
writepdf(Str(streamEnd - streamstart))
writepdf ("endobj")
lineNo = 0
EndProcedure

Procedure endpdf()
ty.s: xreF.l
location(root) = Position
writepdf(Str(root) + " 0 obj")
writepdf("<<")
writepdf("/Type /Catalog")
writepdf("/Pages " + Str(Tpages) + " 0 R")
writepdf(">>")
writepdf("endobj")
location(Tpages) = Position
writepdf(Str(Tpages) + " 0 obj")
writepdf("<<")
writepdf("/Type /Pages")
writepdf("/Count " + Str(pageNo))
writepdf("/MediaBox [ 0 0 " + Str(pageWidth) + " " + Str(pageHeight) + " ]")
ty = ("/Kids [ ")
For i = 1 To pageNo
ty + Str(pageObj(i)) + " 0 R "
Next
ty + "]"
writepdf(ty)
writepdf(">>")
writepdf("endobj")
xreF = Position
writepdf("xref") ; the cross-reference section must begin with this keyword
writepdf("0 " + Str(obj + 1))
writepdf("0000000000 65535 f ")
For i = 1 To obj
writepdf(RSet(Str(location(i)), 10, "0") + " 00000 n ")
Next
writepdf("trailer")
writepdf("<<")
writepdf("/Size " + Str(obj + 1))
writepdf("/Root " + Str(root) + " 0 R")
writepdf("/Info " + Str(info) + " 0 R")
writepdf(">>")
writepdf("startxref")
writepdf(Str(xreF))
writepdf("%%EOF", 1)
EndProcedure

Procedure ConvertToPDF(sFileTXT.s, sFilePDF.s, sAppName.s="", sAuthor.s="", sCreator.s="", sKeywords.s="", sSubject.s="", sTitle.s="", sBaseFont.s="Courier", lpointSize.l = 12, lrotate.l=0, dpageWidth.d = 8.5, lpageHeight.l = 11)
If ReadFile(0,sFileTXT) = 0
	MessageRequester("Error","File " + Chr(34) + sFileTXT + Chr(34) + " not found.",#MB_ICONERROR)
	ProcedureReturn
EndIf
CloseFile(0)

;initialize
FileTXT = sFileTXT
FilePDF= sFilePDF
AppName = sAppName
Author = sAuthor
Creator = sCreator
Keywords = sKeywords
Subject = sSubject
Title = sTitle
BaseFont = sBaseFont
pointsize = lpointsize
rotate = lrotate
pageHeight = lpageHeight * 72
pageWidth = dpageWidth * 72
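obj=0
pageNo = 0 ; restart page numbering if the procedure is called again
Position = 0
cache = ""
If CreateFile(0, FilePDF) ; start from an empty file, because writepdf() only appends
	CloseFile(0)
EndIf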
vertSpace = pointsize * 1.2 ; Vertical spacing
lines = (pageHeight - 72) / vertSpace ; no of lines on one page
If FindString(LCase(BaseFont), "courier",1) ; for Courier font
	linelen = 1.5 * pageWidth / pointSize
ElseIf FindString(LCase(BaseFont), "arial",1) ; for Arial font
	linelen = 2 * pageWidth / pointSize
ElseIf FindString(LCase(BaseFont), "times-roman",1) ; for Times New Roman font
	linelen = 2.2 * pageWidth / pointSize
Else
	linelen = 2.2 * pageWidth / pointSize ; any other font
EndIf

npagex = pageWidth / 2
npagey = 25

WriteStart()
WriteHead()
WritePages()
endpdf()
EndProcedure

If OpenWindow(0, 300, 300, 350, 80,"PDF Converter", #PB_Window_SystemMenu | #PB_Window_MinimizeGadget | #PB_Window_ScreenCentered)

 If CreateGadgetList(WindowID(0))
 StringGadget(0,10,10,330,20,GetPathPart(ProgramFilename()) + "testTXT.txt"); use your own txt file.
 StringGadget(1,10,30,330,20,GetPathPart(ProgramFilename()) + "testPDF.pdf"); set the path you want to save the file.
 ButtonGadget(2,10,50,60,25,"Convert")
 EndIf
 
 Repeat
    EventID = WaitWindowEvent()

    If EventID = #PB_Event_CloseWindow  ; If the user has pressed on the close button
      Quit = 1
    ElseIf EventID = #PB_Event_Gadget
    	If EventGadget() = 2
    		ConvertToPDF(GetGadgetText(0),GetGadgetText(1))
    	EndIf
    EndIf
  Until Quit = 1
 
EndIf

End 

Posted: Thu May 28, 2009 1:46 pm
by Kiffi
doctorized wrote: Here is another piece of code I've written. It converts txt files to Adobe PDF.
Cool! Thanks for sharing!

Greetings ... Kiffi

Posted: Thu May 28, 2009 1:48 pm
by doctorized
Kiffi wrote: Cool! Thanks for sharing!
I am working on code that creates PDF version 1.3 files, with image import, rotated text and more. I only need to find a way to import fonts and the project will be complete. Stay tuned!

Posted: Thu May 28, 2009 5:15 pm
by srod
Interesting.

Will you be supporting unicode fonts?

Posted: Thu May 28, 2009 7:15 pm
by doctorized
srod wrote:Interesting.

Will you be supporting unicode fonts?

I want to fully support all languages, and I hope that all types of fonts will be supported. I am working on it.

Posted: Fri May 29, 2009 1:10 am
by PB
Pretty good so far! Is there a way to change the font size? I need it smaller
for my app. I don't know anything about PDFs though, so not sure.

Also, does this infringe on any of Adobe's patents or anything? I wouldn't
want to get sued if the source code is (c) to Adobe or something. Is the
code 100% your own work?

Posted: Fri May 29, 2009 1:25 pm
by doctorized
PB wrote:Pretty good so far! Is there a way to change the font size? I need it smaller
for my app.
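The ConvertToPDF procedure takes many optional parameters, not only the txt and pdf file names. The tenth parameter is the font size (default 12). The ninth is the font name, so pass it as well rather than an empty string, which would replace the default "Courier". If you write: ConvertToPDF(GetGadgetText(0),GetGadgetText(1),"","","","","","","Courier",6) you set the font size to 6. Use any number you want here.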
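To see what a smaller font size does to the page, the layout formulas from ConvertToPDF can be recomputed on their own. This is a rough sketch in Python rather than PureBasic, for illustration only. The name page_layout and its parameter names are made up for the example. The formulas and the defaults (8.5 x 11 inch page, factor 1.5 for Courier) are taken from the posted code. The results are left unrounded, whereas the posted code stores them in integer variables.

```python
from fractions import Fraction

def page_layout(point_size, width_in=Fraction(17, 2), height_in=11,
                width_factor=Fraction(3, 2)):
    """Recompute the per-page layout values for a given font size.

    width_factor is 1.5 for Courier in the posted code (2 for Arial, 2.2 otherwise).
    Fractions are used so that the arithmetic is exact.
    """
    page_width = width_in * 72               # page size in points
    page_height = height_in * 72
    vert_space = point_size * Fraction(6, 5)  # line spacing = 1.2 x font size
    lines = (page_height - 72) / vert_space   # text lines per page
    line_len = width_factor * page_width / point_size  # wrap column
    return {
        "vert_space": float(vert_space),
        "lines": float(lines),
        "line_len": float(line_len),
    }
```

With these formulas, size 12 gives 50 lines per page and a wrap column of 76.5, while size 6 gives 100 lines and a wrap column of 153.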
PB wrote: Is the code 100% your own work?
I found the code some years ago somewhere, written in Visual Basic 6, and there was no information about the author. What I mean is that the code does not come from Adobe or any other company holding any kind of copyright. If anyone knows who wrote it, let us know the name.

Posted: Fri May 29, 2009 1:44 pm
by PB
> The default size is 12. Use here any number you want.

Oops, I didn't see that at first due to the long line (another reason why a
line continuation char would be nice for the IDE). Thanks for telling me! :)

Posted: Fri May 29, 2009 1:50 pm
by Seymour Clufley
This is great work, Doctorized. Thanks!

Posted: Fri Aug 14, 2009 3:17 pm
by PB
Any more progress on this? :)

Posted: Wed Aug 26, 2009 3:05 pm
by doctorized
PB wrote:Any more progress on this? :)
I am stuck on Unicode in general, not only the fonts.

Is there a way to translate "StrConv(ImgColor, vbUnicode)" from VB? ImgColor is a byte array.

Posted: Wed Aug 26, 2009 10:20 pm
by WilliamL
I am using a Mac and trying to see if your program will run. I ran into a problem with "%βγΟΣ", which I could write as "%"+chr(?)+chr(?)+chr(?). I think it might run on a Mac.

What I really need is a pdf to text code. Any ideas?

Posted: Thu Aug 27, 2009 3:09 pm
by doctorized
WilliamL wrote: I am using a Mac and trying to see if your program will run. I ran into a problem with "%βγΟΣ", which I could write as "%"+chr(?)+chr(?)+chr(?). I think it might run on a Mac.
Use: "%" + Chr(226) + Chr(227) + Chr(207) + Chr(211)
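For anyone wondering where those four numbers come from: the second line of a PDF is conventionally a comment holding at least four bytes with values of 128 or more, so that file-transfer tools treat the file as binary. The letters βγΟΣ are simply how those bytes look in the Greek Windows code page, which is presumably why the literal string does not survive on a Mac. A small sketch in Python, for illustration only (pdf_header and marker_as_text are made-up names):

```python
# The four marker bytes given in the post above (they follow the "%" character).
MARKER = bytes([226, 227, 207, 211])

def pdf_header(version="1.2"):
    """Return the two header lines as raw bytes, each ended by a line feed."""
    return b"%PDF-" + version.encode("ascii") + b"\n%" + MARKER + b"\n"

def marker_as_text(codepage):
    """Show how the same four bytes look when read in a given code page."""
    return MARKER.decode(codepage)
```

Reading the marker as "cp1253" (Greek) gives βγΟΣ, and as "cp1252" (Western) gives âãÏÓ. The bytes in the file are the same either way, so writing them by number is the portable choice.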
WilliamL wrote: What I really need is a pdf to text code. Any ideas?
Which version of PDF do you need to convert to text? If we are talking about the simple version 1.2 format that my code creates, it is very easy to get the text out of it. All you need to do in that case is open the PDF file in Notepad to see what is inside, and then write code to extract the text. If the file is not that simple, for example if it has embedded fonts or images, things are more complicated.

Posted: Thu Aug 27, 2009 4:46 pm
by talisman
As a side note many e-books and the like are scans of the actual works, in which case you need a text recognition application... that means... complication!

Posted: Thu Aug 27, 2009 5:28 pm
by WilliamL
I'm going to try the CHR() commands when I get a chance and get back to you. [later] Yes, it appears to work (on a Mac)! I used OpenFileRequester(), which made it easier to get the file name.
doctorized wrote: open the pdf file with notepad to see what is inside to be able to write a code to get the text
I did that and it worked, but it seemed clumsy. I just thought there might be some logic to the layout of the text that would make it easier to find where the text starts. I wondered if there was some character sequence that always precedes the text? The main thing is that my effort worked for my needs.
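On the question of a marker sequence: in files written by the code in the first post, every body line is stored as "T* (" followed by the text and then ") Tj", between a BT line and an ET line, with backslashes and parentheses escaped by a backslash. Page numbers are written as "(n) Tj" without the "T*" prefix. A minimal sketch of an extractor built on that pattern, in Python for illustration only (extract_text is a made-up name). It assumes an uncompressed file of exactly this layout and Western (cp1252) text, and it will not cope with PDFs from other tools.

```python
import re

# One body line as written by the converter: T* (escaped text) Tj
TEXT_OP = re.compile(rb"T\* \(((?:\\.|[^\\()])*)\) Tj", re.DOTALL)
# A backslash followed by the character it protects.
ESCAPED = re.compile(rb"\\(.)", re.DOTALL)

def extract_text(pdf_bytes):
    """Return the text lines found in a PDF produced by the simple converter."""
    lines = []
    for match in TEXT_OP.finditer(pdf_bytes):
        raw = ESCAPED.sub(rb"\1", match.group(1))  # undo \\, \( and \)
        raw = raw.rstrip(b"\r\n")                  # drop the CR/LF stored inside the string
        lines.append(raw.decode("cp1252", errors="replace"))
    return lines
```

Typical use would be extract_text(open("testPDF.pdf", "rb").read()), where the file name is only an example.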