Replace text in .pdf file?

camille
User
Posts: 71
Joined: Tue Nov 19, 2019 12:52 pm

Replace text in .pdf file?

Post by camille »

Hello!

Is it possible with PB to search and replace text in a .pdf file and save the modified .pdf afterwards?

The pdf has a graphic at the top left; all other text is editable, but of course it is formatted, e.g. "this text part is blue", "this text has a different font size than other lines", etc.

I would like to prefill this document with placeholder words to search for, such as:

Code: Select all

recipient_adress_name
recipient_adress_street
recipient_adress_city
...
and read the corresponding values, e.g. from an .ini file:

Code: Select all

[Recipient]
recipient_adress_name=Walter Stanfield
recipient_adress_street=Sesamestreet 1-8
recipient_adress_city=Dallas
and afterwards search for each word and replace it with its corresponding value, without breaking the design of the pdf in the end...
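The lookup-and-replace part on its own (independent of the PDF question) could be sketched like this in Python. It is only a minimal sketch: it assumes the second key in the .ini file is meant to be recipient_adress_street, and the function names load_values and fill_placeholders are made up for illustration. It works on plain text, not on a .pdf.

```python
import configparser


def load_values(ini_text, section="Recipient"):
    """Parse INI text and return the key/value pairs of one section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return dict(parser[section])


def fill_placeholders(text, values):
    """Replace every placeholder word with its value.

    Longer keys are replaced first so that a key which is a prefix of
    another key cannot clobber it.
    """
    for key in sorted(values, key=len, reverse=True):
        text = text.replace(key, values[key])
    return text


# Sample data taken from the post (second key assumed to be ..._street)
ini_text = """
[Recipient]
recipient_adress_name=Walter Stanfield
recipient_adress_street=Sesamestreet 1-8
recipient_adress_city=Dallas
"""

template = "recipient_adress_name\nrecipient_adress_street\nrecipient_adress_city"
filled = fill_placeholders(template, load_values(ini_text))
print(filled)
```

The hard part stays the same: the text has to be available in a format where the placeholders can be found as plain strings.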

Thanks a lot!
Axolotl
Addict
Posts: 802
Joined: Wed Dec 31, 2008 3:36 pm

Re: Replace text in .pdf file?

Post by Axolotl »

Hello back!
When it comes to software, I would never say never, or that something can't be done.
It's not easy and (as far as I know) there is no PB library or built-in PB function for this.
Suggestion based on my application experience:
Take a file in an editable format (e.g. .rtf) and generate a PDF file from it after you have made the desired changes.
I use placeholders such as $ProgramName$ or similar in the text.
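A rough sketch of the placeholder idea, written in Python only to keep it short. It assumes each $Name$ placeholder is stored as one contiguous run in the RTF source (word processors sometimes split a word across formatting groups). The names rtf_escape and fill_rtf are invented for this example.

```python
import re


def rtf_escape(value):
    """Escape a value so it can be pasted into RTF source text."""
    out = []
    for ch in value:
        if ch in "\\{}":
            # Backslash and braces are RTF control characters
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: write each UTF-16 code unit as a signed 16-bit
            # unicode control word followed by a "?" fallback character
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


PLACEHOLDER = re.compile(r"\$(\w+)\$")


def fill_rtf(rtf, values):
    """Replace $Name$ placeholders; unknown names are left untouched."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)
        return rtf_escape(values[name])
    return PLACEHOLDER.sub(substitute, rtf)


rtf = r"{\rtf1\ansi\uc1 Welcome to $ProgramName$ by $Author$\par}"
print(fill_rtf(rtf, {"ProgramName": "Invoice {Tool}", "Author": "Müller"}))
```

Escaping matters because a value containing a brace or a backslash would otherwise corrupt the RTF structure.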

Anyway, one PDF library I know of is MuPDF; there is no PB interface for it right now (I think).
Just because it worked doesn't mean it works.
PureBasic 6.04 (x86) and <latest stable version and current alpha/beta> (x64) on Windows 11 Home. Now started with Linux (VM: Ubuntu 22.04).

Re: Replace text in .pdf file?

Post by Axolotl »

A quick search for PDF libraries (C or C++ based) turned up the following list:

Projects on GitHub
* PDF-Writer
* PoDoFo
* VersyPDF
* libharu-pdf

To be honest, I have never tried any of them, but maybe one of the pros here already has something.
highend
Enthusiast
Posts: 162
Joined: Tue Jun 17, 2014 4:49 pm

Re: Replace text in .pdf file?

Post by highend »

Thank you, Axolotl!

I've tried to modify it via Python and PyMuPDF and in general, it works...

The problem: the replacement text ends up in a different font than the one used on the line where the replacement happens, even though that font is embedded in the .pdf file and also installed on the computer where I ran the Python script...

I guess I can't use .rtf ;(
E.g. I have a graphic (a logo) in the top left corner and I need to change text in a block on the right side of it.

When I open the .rtf with Wordpad and add the logo, I can't have a multiline text block on the right hand side of it.

I guess everything could be automated perfectly (by first searching and replacing text in a human-readable text file and then creating a .pdf out of it), e.g. by using LaTeX, but that would require installing a gigabyte of software and enough time to dive into LaTeX...

What I really do not want is to install an office suite (MS Office, LibreOffice or anything like that).
Too much bloatware...

:mrgreen:

Re: Replace text in .pdf file?

Post by Axolotl »

Understandable.

I guess you can manage the .rtf way by using a table like this:
The table has two columns and one row, and its separator lines are invisible.
Insert the picture (logo) in the first column and add the text in the second.

Code: Select all

{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1031{\fonttbl{\f0\froman\fcharset0 Times New Roman;}}
{\*\generator Riched20 10.0.22621}\viewkind4\uc1\trowd\trgaph108\trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
\cellx4820\cellx9640 
\pard\intbl\nowidctlpar\hyphpar0\charscalex100\kerning1\f0\fs20 <Pict>\cell Name\par
City\par
something\cell\row 
\pard\nowidctlpar\hyphpar0\par
}
My way of doing this:
  1. Start my word processor, enter the content I want to show and save it as Doc1.rtf
  2. Open Doc1.rtf in Wordpad, check the result and save it as Doc1-1.rtf
  3. Open Doc1-1.rtf in Wordpad and do everything I want with placeholders and so on
The second step is important because my word processor is very verbose (meaning: the rtf contains far too many details that are not needed).
Bear in mind that EditorGadget() is not able to deal with tables. But you can make the changes with any text editor as well.
morosh
Enthusiast
Posts: 329
Joined: Wed Aug 03, 2011 4:52 am
Location: Beirut, Lebanon

Re: Replace text in .pdf file?

Post by morosh »

Maybe this isn't exactly what you are looking for, but you can fill in a pdf programmatically:
using cpdf (https://community.coherentpdf.com/), which is free for personal use, you can add text to a pdf programmatically.
With the help of infratec, I tried the following and it works:

Code: Select all

CompilerIf #PB_Compiler_IsMainFile
  EnableExplicit
CompilerEndIf

Structure cpdf_position 
  cpdf_anchor.i;    /* Position anchor */
  cpdf_coord1.d;    /* Parameter one */
  cpdf_coord2.d;    /* Parameter two */
EndStructure
Global pos.cpdf_position
#kk = 28.346457 ; points per centimetre (72 pt per inch / 2.54 cm per inch)

Enumeration cpdf_anchor 
  #cpdf_posCentre;      /* Absolute centre */
  #cpdf_posLeft  ;       /* Absolute left */
  #cpdf_posRight ;       /* Absolute right */
  #cpdf_top      ;            /* Top centre of the page */
  #cpdf_topLeft  ;        /* The top left of the page */
  #cpdf_topRight ;       /* The top right of the page */
  #cpdf_left     ;           /* The left hand side of the page, halfway down */
  #cpdf_bottomLeft;     /* The bottom left of the page */
  #cpdf_bottom    ;         /* The bottom middle of the page */
  #cpdf_bottomRight;    /* The bottom right of the page */
  #cpdf_right      ;          /* The right hand side of the page, halfway down */
  #cpdf_diagonal   ;       /* Diagonal, bottom left To top right */
  #cpdf_reverseDiagonal; /* Diagonal, top left To bottom right */
EndEnumeration

Enumeration cpdf_font 
  #cpdf_timesRoman   ;           /* Times Roman */
  #cpdf_timesBold    ;            /* Times Bold */
  #cpdf_timesItalic  ;          /* Times Italic */
  #cpdf_timesBoldItalic  ;      /* Times Bold Italic */
  #cpdf_helvetica        ;            /* Helvetica */
  #cpdf_helveticaBold    ;        /* Helvetica Bold */
  #cpdf_helveticaOblique ;     /* Helvetica Oblique */
  #cpdf_helveticaBoldOblique ; /* Helvetica Bold Oblique */
  #cpdf_courier              ;              /* Courier */
  #cpdf_courierBold          ;          /* Courier Bold */
  #cpdf_courierOblique       ;       /* Courier Oblique */
  #cpdf_courierBoldOblique   ;    /* Courier Bold Oblique */
EndEnumeration

Enumeration cpdf_justification
  #cpdf_leftJustify    ;   /* Left justify */
  #cpdf_CentreJustify  ; /* Centre justify */
  #cpdf_RightJustify   ;   /* Right justify */
EndEnumeration

PrototypeC.i prototype_cpdf_version()
PrototypeC prototype_cpdf_startup(*argv)
PrototypeC.i prototype_cpdf_fromFile(filename.p-utf8, userpw.p-utf8)
PrototypeC prototype_cpdf_clearError()
PrototypeC.i prototype_cpdf_mergeSimple(*pdfs, length.l)
PrototypeC prototype_cpdf_toFile(pdf.l, filename.p-utf8, linearize.l, make_id.l)
PrototypeC prototype_cpdf_pages(pdf.l)
PrototypeC.l prototype_cpdf_range(n1.l, n2.l)
PrototypeC prototype_cpdf_addText(flag_add.l, pdf.l, page.l, txt.p-utf8, *position.cpdf_position, linespacing.d,
                                  stbates.l, font.i, size.d, red.d, green.d, blue.d, flagunder.l, flagcrop.l, flagoutline.l, opacity.d,
                                  justification.i, flag_midline.l, flag_topline.l, fname.p-utf8, linewidth.d, embed.l)
PrototypeC prototype_cpdf_addTextSimple(pdf.l, page.l, txt.p-utf8, *position.cpdf_position, font.i, size.d)

Global cpdf_library.i
Global cpdf_version_.prototype_cpdf_version
Global cpdf_startup.prototype_cpdf_startup
Global cpdf_fromFile.prototype_cpdf_fromFile
Global cpdf_clearError.prototype_cpdf_clearError
Global cpdf_mergeSimple.prototype_cpdf_mergeSimple
Global cpdf_toFile.prototype_cpdf_toFile
Global cpdf_pages.prototype_cpdf_pages
Global cpdf_range.prototype_cpdf_range
Global cpdf_addText.prototype_cpdf_addText
Global cpdf_addTextSimple.prototype_cpdf_addTextSimple

Global.i cpdf_lastError_
Global.i cpdf_lastErrorString_

Macro cpdf_version()
  PeekS(cpdf_version_(), -1, #PB_UTF8)
EndMacro

Macro cpdf_lastError
  PeekI(cpdf_lastError_)
EndMacro

Macro cpdf_lastErrorString
  PeekS(cpdf_lastErrorString_, -1, #PB_UTF8)
EndMacro


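; Convenience wrapper: posx/posy are given in cm from the top-left corner of an A4 page
; (29.7 cm high) and converted to points, with y flipped so it is measured from the bottom.
Procedure cpdf_addTextSimple2(pdf.l, page.l, txt.s, posx.f, posy.f, font.i, size.d)
  pos\cpdf_coord1 = #kk * posx            ; x in points
  pos\cpdf_coord2 = #kk * (29.7 - posy)   ; y in points, counted from the bottom edge
  cpdf_addTextSimple(pdf, page, txt, @pos, font, size)
EndProcedure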


Procedure.i OpenCPDF()
  
  If Not IsLibrary(cpdf_library)
    CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
      cpdf_library = OpenLibrary(#PB_Any, "E:\install\dvd2\PDF utilities\sdk\cpdf\win32\libcpdf.dll")
    CompilerElseIf #PB_Compiler_Processor = #PB_Processor_x64
      cpdf_library = OpenLibrary(#PB_Any, "E:\install\dvd2\PDF utilities\sdk\cpdf\win64\libcpdf.dll")
    CompilerEndIf
    
    If cpdf_library
      cpdf_version_         = GetFunction(cpdf_library, "cpdf_version")
      cpdf_startup          = GetFunction(cpdf_library, "cpdf_startup")
      cpdf_fromFile         = GetFunction(cpdf_library, "cpdf_fromFile")
      cpdf_clearError       = GetFunction(cpdf_library, "cpdf_clearError")
      cpdf_mergeSimple      = GetFunction(cpdf_library, "cpdf_mergeSimple")
      cpdf_toFile           = GetFunction(cpdf_library, "cpdf_toFile")
      cpdf_pages            = GetFunction(cpdf_library, "cpdf_pages")
      cpdf_range            = GetFunction(cpdf_library, "cpdf_range")
      cpdf_addText          = GetFunction(cpdf_library, "cpdf_addText")
      cpdf_addTextSimple    = GetFunction(cpdf_library, "cpdf_addTextSimple")
      
      cpdf_lastError_       = GetFunction(cpdf_library, "cpdf_lastError")
      cpdf_lastErrorString_ = GetFunction(cpdf_library, "cpdf_lastErrorString")
    Else 
      MessageRequester("Error","no libcpdf.dll present")
    EndIf
  EndIf
  
  ProcedureReturn cpdf_library
  
EndProcedure

Procedure CloseCPDF()
  If IsLibrary(cpdf_library)
    CloseLibrary(cpdf_library)
    cpdf_library = #Null
  EndIf
EndProcedure

CompilerIf #PB_Compiler_IsMainFile
  
  Define *argv
  Define.i orig_pdf, output, i
  Dim pdfs.l(2)
  Define range.l
  ;1pt = 1/72 inch = 0.0352777 cm    1cm=28.346457  
  pos\cpdf_anchor=#cpdf_posLeft     ;    /* Position anchor */
  pos\cpdf_coord1=#kk*6            ; /* Parameter one */
  pos\cpdf_coord2=#kk*10          ; /* Parameter two */
  
  If OpenCPDF()
    
    ; Initialise cpdf
    cpdf_startup(@*argv)
    
    ;Debug cpdf_version()
    
    ; Load the input file toto.pdf
    orig_pdf = cpdf_fromFile("toto.pdf", "")
    
    ; Check the error state
    If cpdf_lastError = 1
      Debug cpdf_lastErrorString
      End 1
    EndIf
    
    ; Clear the error state
    cpdf_clearError()
    
    ; The array of PDFs to merge
    pdfs(0) = orig_pdf
    pdfs(1) = orig_pdf
    pdfs(2) = orig_pdf
    
    ; Merge them; the length argument is 1, so only pdfs(0) ends up in the output
    output = cpdf_mergeSimple(pdfs(), 1)
    
    ; Check the error state
    If cpdf_lastError = 1
      Debug cpdf_lastErrorString
      End 1
    EndIf
    
    cpdf_clearError()
    
    range=cpdf_range(1, 1)
    cpdf_addText(#False, output, range, "hahaha!!!", @pos, 1, 0, #cpdf_timesBold, 14, 1, 0.0, 0.0, #False, #False,
                      #False, 1, #cpdf_CentreJustify, #False, #False, "xxx", 5, #False)
    For i=1 To 5
      cpdf_addTextSimple2(output, range, "Hello!!!"+Str(i), 5, i, #cpdf_timesBold, 14)
    Next  
    ; Write output
    cpdf_toFile(output, "output.pdf", #False, #False)
    Debug cpdf_pages(output)
    ; Check the error state
    If cpdf_lastError = 1
      Debug cpdf_lastErrorString
      End 1
    EndIf
    
    CloseCPDF()
  EndIf
CompilerEndIf
RunProgram("output.pdf")
Maybe this is an alternative.
HTH
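The coordinate arithmetic in cpdf_addTextSimple2 may be easier to follow in isolation. Here is a small Python sketch of the same conversion (cm measured from the top-left corner of an A4 page to points measured from the bottom-left corner). The function name is made up, and the A4 height is hard-coded as in the PB code.

```python
PT_PER_CM = 72 / 2.54      # 1 inch = 72 pt = 2.54 cm, about 28.346457 pt per cm
A4_HEIGHT_CM = 29.7        # same page height the PB helper assumes


def top_left_cm_to_pdf_pt(x_cm, y_cm, page_height_cm=A4_HEIGHT_CM):
    """Convert (x, y) in cm from the top-left corner to points from the bottom-left."""
    x_pt = x_cm * PT_PER_CM
    y_pt = (page_height_cm - y_cm) * PT_PER_CM   # flip the vertical axis
    return x_pt, y_pt


# Same call as the first loop iteration in the PB example: 5 cm right, 1 cm down
x_pt, y_pt = top_left_cm_to_pdf_pt(5, 1)
print(round(x_pt, 2), round(y_pt, 2))
```

For other page sizes the height would have to be passed in instead of relying on the A4 default.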
PureBasic: Surprisingly simple, diabolically powerful

Re: Replace text in .pdf file?

Post by camille »

Great, really, thanks a lot!

rtf is really a mess to read, but at least I can do text replacements with minimal effort and it doesn't require huge apps / installations.

I didn't want to have a hard time converting my pdf into rtf so I looked around and stumbled upon "AbleWord" (http://www.ableword.net/).

It can read pdf and save it as rtf. The layout was reproduced 100%; I just had to make some font adjustments. Cool thing, and afaik it's free to use...

The bad thing: Wordpad can of course open that rtf file, but it messes up the content really badly, so I now need to rely on AbleWord to make changes.
Maybe it's because of the text boxes that AbleWord used to replicate the layout...

Re: Replace text in .pdf file?

Post by camille »

@morosh

Thank you!

cpdf does not seem to be capable of finding and replacing text, but it could be used to add text to a pdf file that is blank apart from already present, fixed elements.

I'm not that familiar with API access. It seems it's not possible to specify a font for added text other than the three default families (Times Roman, Helvetica and Courier)?

Re: Replace text in .pdf file?

Post by Axolotl »

@camille
It's great that you found a way.
Thanks for the hint. I had never heard of AbleWord.
tua
User
Posts: 68
Joined: Sun Jul 23, 2023 8:49 pm
Location: BC, Canada

Re: Replace text in .pdf file?

Post by tua »

Your exact use case is not entirely clear from your post...

Do you control the PDF? Or do you merely want to fill, say, a name & address into a fillable PDF that someone else (often a government agency) provides, and then do some processing once it is filled, such as saving, emailing etc.?

If so, then there's nothing easier than using Adobe's FDF mechanism (unless you want to spring for hundreds of $$$ for some sophisticated library).
FDF just requires writing a simple text file with the .fdf extension.

Re: Replace text in .pdf file?

Post by camille »

Hi!

Sorry, I've missed that there was a new posting here :oops:

It's my own pdf that I want to create invoices with.
I've created it / them with PDF-XChange Editor in the first place.

I've stumbled upon https://github.com/pdfcpu/pdfcpu
but it doesn't create a 100% pixel-perfect repro of the original form when you use non-standard fonts (in my case e.g. "PB Sans").
Neither did any other CLI / Python tool that doesn't cost a fortune...

I've used LaTeX but found it quite cumbersome for this task and I don't like that a standard installation of e.g. TeXLive wants to use a whopping few gigabytes^^

But I then found: https://typst.app/home
It's a <40 MB standalone CLI binary which is able to produce 100% pixel-perfect layouts, allows scripting inside the .typ document (crazy^^), has external package support for e.g. creating SEPA QR-Codes in the document, etc.
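For anyone who wants to drive the Typst CLI from a script, a sketch in Python could look like the following. It assumes a Typst version whose compile command accepts --input key=value pairs (readable in the .typ file through sys.inputs), and the file names are placeholders. build_typst_command and run_typst are invented helper names.

```python
import shutil
import subprocess


def build_typst_command(template, output, values):
    """Build the argument list for 'typst compile' with one --input per value."""
    cmd = ["typst", "compile"]
    for key, value in values.items():
        cmd += ["--input", "%s=%s" % (key, value)]
    cmd += [template, output]
    return cmd


def run_typst(cmd):
    """Run the command if the typst binary is on PATH; return True on success."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


cmd = build_typst_command(
    "invoice.typ",            # hypothetical template file
    "invoice.pdf",            # hypothetical output file
    {"recipient_adress_name": "Walter Stanfield"},
)
print(cmd)
# run_typst(cmd) would then start the actual compilation
```

Passing the arguments as a list (instead of one shell string) avoids quoting problems with values that contain spaces.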

It wasn't overly complicated to recreate my invoice template file, and I can now create perfect (for my use case) .pdfs with it in the blink of an eye :mrgreen:

Regards,
Camille
infratec
Always Here
Posts: 7577
Joined: Sun Sep 07, 2008 12:45 pm
Location: Germany

Re: Replace text in .pdf file?

Post by infratec »

Have you ever looked at Scribus?
I don't know if it can do what you want, but it is 100% free.

www.scribus.net

https://wiki.scribus.net/canvas/Get_Sta ... _Scribus:1

Re: Replace text in .pdf file?

Post by highend »

@infratec It seems she's not looking for (another) desktop app to create / fill a form, but for a tool that lets her automate the whole process via a command line interface (and still produces 100% reproducible output even with non-standard fonts)?
Last edited by highend on Sun Jul 06, 2025 12:25 pm, edited 2 times in total.

Re: Replace text in .pdf file?

Post by infratec »

You can fully control Scribus by script from the CLI.

Re: Replace text in .pdf file?

Post by camille »

Hi again!

@infratec: I'll take a look at Scribus, thank you!
@highend: It's true, I'd like to have a CLI-only solution, but I guess I should at least try every single suggestion :P