PurePDF Version 2.0

IdeasVacuum
Always Here
Posts: 6426
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Re: PurePDF Version 2.0

Post by IdeasVacuum »

Using PB5.21 LTS Beta 3, the following Compiler tests fail:

PurePDF

Code: Select all

CompilerIf Defined(PurePDF_Include,#PB_Constant)
  XIncludeFile "D:\PROJECTS\PURE BASIC\CommonCode\PurePDF_res.pbi"
CompilerEndIf

PurePDF_res

Code: Select all

CompilerIf Defined(MEM_DataStructure,#PB_Structure)=0
Structure MEM_DataStructure
  pData.i       ; Pointer to memory
  lMaxSize.i    ; Max. reserved memory in bytes
  lCurSize.i    ; Current last position of data in bytes
EndStructure
CompilerEndIf
IdeasVacuum
If it sounds simple, you have not grasped the complexity.

Post by IdeasVacuum »

Hi normeus

Just noticed: photolist()

What is actually stored in the 'real' version of that list? Could that be a memory hog? It very probably isn't of course.
normeus
Enthusiast
Posts: 470
Joined: Fri Apr 20, 2012 8:09 pm

Post by normeus »

IdeasVacuum,
Thanks for sticking with this.
I tried your memory cleaning code. Every 500 images I run it and see my commit memory drop to 46k, but after a few seconds it jumps right back up to a high amount (600 meg, then gradually up to 800 until I hit another 500 photos, when it drops back to 46k). My working set keeps growing at a steady pace until it reaches 125 meg, which should be OK since that is the final size of the PDF. Then it looks like it is done, but the final PDF viewed in a hex editor has this at the end:

Code: Select all

14124 0 obj
<</Type /XObject
/Subtype /Image
/Width 544
/Height 408
/ColorSpace /DeviceRGB
/BitsPerComponent 8
/Filter /DCTDecode
/Length 29688>>
stream

endstream
endobj
14125 0 obj
<</Type /XObject
/Subtype /Image
/Width 544
/Height 408
/ColorSpace /DeviceRGB
/BitsPerComponent 8
/Filter /DCTDecode
/Length 53871>>
stream

endstream
endobj
14126 0 obj
/Width 544

Notice the missing picture data: both streams are empty.

Also, the file simply gives up in the last object and ends at "/Width 544".

It is missing:

Code: Select all

2 0 obj
<<
/ProcSet [/PDF /Text /ImageB /ImageC /ImageI]
/Font <<
/F1 10403 0 R
/F2 10404 0 R
>>
/XObject <<
/I1 10405 0 R
ETC......

At this point I will just keep a running total of the memory used by my loaded images, save the PDF when it gets too big, and create a new one starting where the first one ended.
I tried to use pdf_Image() instead to load the images, but the program gave me an invalid memory access (read error at address 8) located in the footer() procedure.

Thank you.
google Translate;Makes my jokes fall flat- Fait mes blagues tombent à plat- Machte meine Witze verpuffen- Eh cumpari ci vo sunari

Post by IdeasVacuum »

pdf_Image() is making a structured LinkedList of the image details - looks as though it is designed to handle a lot of images.

I wonder if calling EmptyWorkingSet() more frequently, say every 100 images, would make a difference.

It does, however, seem to be the case that PurePDF is falling at the last hurdle. I'm sure ABBKlaus would know where to look.

Post by normeus »

IdeasVacuum,

I tried EmptyWorkingSet() every 100 pictures but got the same result. It works if I resize to smaller images, so I am pretty sure it is not the number of pictures but only the size of the pictures that matters.

All the resized pictures together come to about 94 meg.

There is no error when running the program (loading the images from memory), but here are the last lines of the PDF in a hex editor:

Code: Select all

15535 0 obj
<</Type /XObject
/Subtype /Image
/Width 544
/Height 408
/ColorSpace /DeviceRGB
/BitsPerComponent 8
/Filter /DCTDecode
/Length 24657>>
stream
@:kz.......etc..... encoded image
endstream
endobj
15536 0 obj

It looks as if it simply exits and saves the file once it hits a preset 'PDF size'.
The size of the (non-working) PDF created in my different retries:
127,168,000 bytes (this is the latest attempt)
126,912,000 bytes
126,912,000 bytes
127,104,000 bytes
So for now I will keep an eye on that PDF limit and just save the PDF and start a new one (I do hate doing this, but it works).
Thank you,
Norm.
ABBKlaus
Addict
Posts: 1143
Joined: Sat Apr 10, 2004 1:20 pm
Location: Germany

Post by ABBKlaus »

I can't reproduce your problem here! I tried creating 4000 images with your sample code. The resulting PDF has 198.275.333 bytes and displays correctly in Adobe Reader.
You might want to check the results of ResizeImage() and pdf_GetErrorCode() ;-)

BR Klaus

PS : The Image has 1920 x 1200 Pixel and was resized to 544 x 408 Pixel.
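
A minimal sketch of the ResizeImage() check suggested above. The file name is a hypothetical placeholder, and the PurePDF calls are left as comments because their exact parameters are not shown in this thread; only the built-in image functions are actually executed.

```purebasic
UseJPEGImageDecoder()

Define file$ = "photo.jpg" ; hypothetical file name, replace with a real JPEG
Define img = LoadImage(#PB_Any, file$)

If img
  ; ResizeImage() returns nonzero on success and zero on failure
  If ResizeImage(img, 544, 408)
    Debug "Resized to " + Str(ImageWidth(img)) + " x " + Str(ImageHeight(img))
    ; The PurePDF image call would go here, followed by:
    ; Debug pdf_GetErrorCode()
    ; Debug pdf_GetErrorMessage()
  Else
    Debug "ResizeImage() failed for " + file$
  EndIf
  FreeImage(img)
Else
  Debug "LoadImage() failed for " + file$
EndIf
```

Logging the failing file name next to the error code makes it easier to spot whether only certain images (odd sizes, for example) trigger the problem.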

Post by IdeasVacuum »

Might be how the code handles the image width. The originals that fail are 672 x 384 pixels.

So normeus, debug the final size of the images (and put your feet up for a while), and also add pdf_GetErrorMessage() and pdf_GetErrorCode() after pdf_ImageMem().

Another 'thing' - are you compiling the app as Unicode or ASCII?

Post by normeus »

Compiling as ASCII and using 5.20 LTS (x86).

I will try pdf_GetErrorMessage() and pdf_GetErrorCode() for my 4000 image problem (it is more like 5000 images now).

I had trouble with images before; that is why I started resizing them.

Thank you!

Norm.

Post by normeus »

After days of testing, these are my inconclusive results:

I have 5200 images that should go into a PDF.
I load the file names from a CSV file which also contains dates and locations.
An SQLite database is created in memory so that I can sort by location or
by other selections (like this location, not that location). After the CSV file is loaded
into the database, my memory usage is 63KB.

I used:

Code: Select all

 pdf_ImageMem(fname,m,z,l,t,180,135)
Debug pdf_GetErrorMessage()
Debug pdf_GetErrorCode()
Debug fname

and I got no errors, but when I tried pdf_Image()
I got error "14" on certain JPG files. I looked at the files and they were an odd size
(like 3500x1000).
I got rid of the whole resize function and I am just doing a straight

Code: Select all

If ResizeImage(pdf_1_img,544,408) ; original size has to be 1632x1224
  ;etc..
  pdf_ImageMem(fname,m,z,l,t,180,135)
  ;etc..
EndIf

I realize that, because I am using UseSQLiteDatabase(), I am using extra memory.

I added EmptyWorkingSet() just before pdf_Save("large.pdf") and that gave me the biggest gain in PDF size. (Thanks IdeasVacuum, this piece of code really comes in handy.)

Code: Select all

Declare EmptyWorkingSet() ; needed because the procedure is defined after its first call

EmptyWorkingSet()
pdf_Save("large.pdf")
;
End


Procedure EmptyWorkingSet()
  ;--------------------------
  ;Reduce memory use of current process (Windows only, uses Psapi.dll)
  Protected *EmptyWorkingSet
  Protected hDll.i = OpenLibrary(#PB_Any, "Psapi.dll")
  If hDll
    *EmptyWorkingSet = GetFunction(hDll, "EmptyWorkingSet")
    If *EmptyWorkingSet
      CallFunctionFast(*EmptyWorkingSet, GetCurrentProcess_())
    EndIf
    CloseLibrary(hDll)
  EndIf
EndProcedure

So after resizing the images I keep a running total of the memory used
until I hit 134,300,069 bytes
(a number I came up with by deleting a few pictures at a time until it worked.. Hours of JOY!),
then save the PDF and start a new one.
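
The running-total idea can be sketched roughly as follows. The image sizes are made-up sample values and the PurePDF calls are only indicated in comments, so this shows the split logic and nothing more; the limit is the trial-and-error figure quoted above.

```purebasic
#PartLimit = 134300069 ; bytes, the trial-and-error figure from this post

NewList imageBytes.q() ; hypothetical sizes of the resized images
AddElement(imageBytes()) : imageBytes() = 60000000
AddElement(imageBytes()) : imageBytes() = 60000000
AddElement(imageBytes()) : imageBytes() = 60000000

Define total.q = 0
Define part.i = 1

ForEach imageBytes()
  ; Close the current part before the next image would push it over the limit
  If total > 0 And total + imageBytes() > #PartLimit
    ; The real program would save the current PDF here and create a new one
    Debug "Part " + Str(part) + " closed at " + Str(total) + " bytes"
    part + 1
    total = 0
  EndIf
  total + imageBytes()
  ; ... add the page and the image to the current PDF here ...
Next

Debug "Part " + Str(part) + " closed at " + Str(total) + " bytes"
```

Checking the limit before adding each image, rather than after, keeps every part at or below the chosen size.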

This gave me a PDF with a final size of 141 meg. That is a decent size, and I figured that because of the
extra memory I am using I would never get to the 198 meg that ABBKlaus got. (Thank you so much ABBKlaus for testing my program and maintaining PurePDF.)
Once I got the program running, I copied it to a faster computer with more memory (8 gig).
I only wanted to see just how fast the program could run.
Well, the PDF it created was bad! While saving, it died at around 117 meg,
so I changed my picture memory limit to 114,300,069 bytes and now it works on the faster machine.

The PDF created on the faster machine is 122 meg.
Same data, same pictures on my slow laptop: 120 meg.

That is it; the program works with this 122 meg limit.
I might do some more testing (it takes a LONG... time to do a single test).

Thank you.
Norm.

Post by IdeasVacuum »

normeus wrote:The PDF created on the faster machine 122meg.
Same data, same pictures on my slow laptop 120 meg.

That seems weird. However, it seems as though the PDF size is well within the means of the RAM on both machines. There must be something else going on, probably very obvious once you have found it.

Post by normeus »

I forgot to mention that each image is going into its own pdf page.
I do:

Code: Select all

pdf_AddPage()
;add image

So I commented out all of the code except for pdf_AddPage(), and that is where all my memory was going.
Use this sample code to see 2 gig of memory get used up by 16000 PDF pages. Then a 16 meg PDF is created.
It is just a few lines of text with a bookmark per page:

Code: Select all

#PurePDF_Include=1

Procedure Header()
  pdf_Ln(5)
EndProcedure

Procedure Footer()
  pdf_SetY(-15)
  pdf_SetFont("Arial","",9)
  pdf_Line(17,260,198,260)
  pdf_Cell(0,2,"This document is ready     pg."+Str(pdf_GetPageNo()),0,0,#PDF_ALIGN_CENTER)
EndProcedure

pdf_Create("P","mm",#PDF_PAGE_FORMAT_LETTER)
pdf_SetLeftMargin(12.5)
pdf_SetProcFooter(@Footer())
pdf_SetProcHeader(@Header()) 
For i = 1 To 15408 ; 16000 pages will not run at all, 15408 will save the file but it will have an error
  pdf_AddPage()
  pdf_SetFont("Arial","",18)
  pdf_Write(6,"Title for chapter "+Str(i))
  pdf_BookMark(Str(i))
  pdf_Ln(5)
  pdf_SetFont("","",12)
  For x = 1 To 39
    pdf_Write(6,Str(i)+" "+Str(x))
    pdf_Ln()
  Next
Next

Debug "PDF is in memory ready to be saved"

pdf_Save("c:\manypages.pdf")  ; change this to a drive with write access

Debug "PDF was saved"

It looks like the problem is a combination of the amount of data and the number of pages.

Norm.

Post by IdeasVacuum »

I had realised it was one-page-per-image; I do not know why I questioned only the image handling and not the pages. So, you have looked at the problem outside-of-the-box and it looks like you have potentially discovered the root of the issue. Currently, the lib builds the entire PDF file in memory, then writes it to disc. Generally that is going to work just fine. To make huge files, perhaps we need an option whereby the lib makes two or more temporary files, saved to disc, and then finally combines them into one.

Your PDF file is big, normeus, but not that big, and you do seem to have enough RAM available. So, maybe there is something outside of your process that is causing the problem: another app trampling the memory being used by your app (I remember that being an issue with animated cursors on Windows at some time). Perhaps the process is being interrupted by your Anti-Virus app. That is a common issue (AV treating large file writing as suspicious). I used to have that problem when writing large CAD files, so it definitely is a possibility.

Post by normeus »

in the structure "MEM_DataStructure"
"lCurSize" is of integer type with a max of 2147483647 ( 2gig )
could it be, since I see an error when the program gets to 2gig of memory usage that the pointer just wraps to 0?

could lCurSize be changed to a Quad? ( at least as far as being a pointer for all the data in a pdf )

Thanks,
Norm.

Post by ABBKlaus »

normeus wrote:in the structure "MEM_DataStructure"
"lCurSize" is of integer type with a max of 2147483647 ( 2gig )
could it be, since I see an error when the program gets to 2gig of memory usage that the pointer just wraps to 0?

could lCurSize be changed to a Quad? ( at least as far as being a pointer for all the data in a pdf )

Thanks,
Norm.

That's exactly the limit a 32-bit process can handle.
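
A small sketch of the size limit being discussed, using only built-in PureBasic types. It does not touch PurePDF itself; it only shows that the .i type follows the compiler target and that a signed 32-bit counter pushed past 2147483647 wraps to a negative value rather than to 0.

```purebasic
; .i (Integer) is 4 bytes when compiled for x86 and 8 bytes for x64
Debug "SizeOf(Integer) = " + Str(SizeOf(Integer))

; .l (Long) is always a signed 32-bit value
Define size.l = 2147483647 ; largest signed 32-bit value
size + 1                   ; no overflow check, the value wraps around
Debug size                 ; shows a negative number, not 0
```

On an x86 build a field declared with .i behaves like the Long in this example, which matches the 2 gig ceiling mentioned in the quoted post.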

Post by IdeasVacuum »

Ah, so that limit needs to have a switch: if your app is 64-bit on a 64-bit OS, more RAM is available for a single process...
...but also, it would be good to use a temporary files method so that huge PDF files can be made on 32-bit.
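
One way such a switch could look, as a rough sketch only. The constant #MaxPdfBytes, the 8 GB figure and the helper FitsInBuffer() are hypothetical names and values chosen for illustration; nothing here is taken from the PurePDF source.

```purebasic
; Pick the size ceiling at compile time, depending on the target
CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
  #MaxPdfBytes = 8589934592 ; hypothetical 8 GB cap for 64-bit builds
CompilerElse
  #MaxPdfBytes = 2147483647 ; signed 32-bit ceiling discussed above
CompilerEndIf

; Hypothetical helper: would the buffer still fit after adding 'extra' bytes?
Procedure.i FitsInBuffer(curSize.q, extra.q)
  ProcedureReturn Bool(curSize + extra <= #MaxPdfBytes)
EndProcedure

; 2.2 billion bytes in total: over the 32-bit ceiling, under the 64-bit one
Debug FitsInBuffer(2000000000, 200000000)
```

Using quads for the size arithmetic keeps the comparison itself from overflowing, even on a 32-bit build.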