Make a lib (from C-Code) and get Esgrid!

Marco2007
Enthusiast
Posts: 648
Joined: Tue Jun 12, 2007 10:30 am
Location: not there...

Post by Marco2007 »

Hello to everyone,

I need some help from C users. I'm very busy at work right now, so I can't try it myself... and my C isn't good enough anyway :wink: .

I need a lib and a DLL for extracting text from PDF files,
with a procedure like (pdf.s, outputtxt.s) or something similar.

Here's the code: http://www.codeproject.com/KB/cpp/ExtractPDFText.aspx

The first person who does this for me will get EsGrid (I will buy a new licence for them -> srod will send them the key).

Thanks,
Marco

Here's the source: http://www.free-space.at/elke/ExtractPDFText_src.zip
PureBasic for Windows
SFSxOI
Addict
Posts: 2970
Joined: Sat Dec 31, 2005 5:24 pm
Location: Where ya would never look.....

Post by SFSxOI »

http://www.rentacoder.com/RentACoder/Do ... fault.aspx

Just kidding :)

Anyway, I was just looking at doing something like this... maybe, if I could do it quickly enough. I have around 3000 .pdf documents that need the text extracted and archived. I think what the boss wants to end up doing is have Adobe do it in some way. If you come up with something, please let the rest of us know.

You know why they call it Adobe Acrobat? Because you have to be an acrobat to use it. Ughhhh...I hate .pdf to begin with.
Marco2007
Enthusiast
Posts: 648
Joined: Tue Jun 12, 2007 10:30 am
Location: not there...

Post by Marco2007 »

The exe on that site works really well.
If someone could make a lib (which of course must work), it should be available to everyone.
PureBasic for Windows
srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Post by srod »

I'd do it, but I already own a copy of EsGRID! :wink:
I may look like a mule, but I'm not a complete ass.
Marco2007
Enthusiast
Posts: 648
Joined: Tue Jun 12, 2007 10:30 am
Location: not there...

Post by Marco2007 »

@srod: I'd like it if you'd do it! Whatcha want?
PureBasic for Windows
srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Post by srod »

Sorry mate - haven't the time right now. :)

From what I know about the PDF format, though, I really don't think it would be difficult to code such a routine from scratch. Something I'd be interested in looking at when I get time.
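To give a rough idea of what "from scratch" could look like, here is a minimal Python sketch. It is not the CodeProject code and not a full PDF parser: it assumes that page content sits in streams that are either uncompressed or Flate (zlib) compressed, that text is shown as literal strings inside BT ... ET blocks, and that the bytes can be read as Latin-1. The name `extract_text` is made up for illustration. Nested parentheses, octal escapes, hex strings, custom font encodings and encrypted files are all ignored.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each object.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# Text objects are bracketed by the BT ... ET operators.
TEXT_BLOCK_RE = re.compile(rb"\bBT\b(.*?)\bET\b", re.DOTALL)
# Literal strings "( ... )", allowing backslash escapes such as \( and \).
STRING_RE = re.compile(rb"\((?:\\.|[^\\()])*\)")


def _unescape(raw: bytes) -> bytes:
    """Undo only the \\( \\) and \\\\ escapes (a deliberate simplification)."""
    return re.sub(rb"\\([()\\])", rb"\1", raw)


def extract_text(pdf_bytes: bytes) -> str:
    """Hypothetical helper: return one line of text per BT ... ET block."""
    lines = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj() tolerates the end-of-line bytes after the data.
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data: assume the stream is stored uncompressed.
            content = raw
        for block in TEXT_BLOCK_RE.finditer(content):
            parts = [_unescape(s[1:-1]) for s in STRING_RE.findall(block.group(1))]
            if parts:
                lines.append(b"".join(parts).decode("latin-1"))
    return "\n".join(lines)
```

Calling `extract_text(open("some.pdf", "rb").read())` and writing the result to a text file would be the whole job under these assumptions. Files that use other filters or encodings would come out empty or garbled.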
I may look like a mule, but I'm not a complete ass.
Marco2007
Enthusiast
Posts: 648
Joined: Tue Jun 12, 2007 10:30 am
Location: not there...

Post by Marco2007 »

:(

Anyone else?
PureBasic for Windows
Marco2007
Enthusiast
Posts: 648
Joined: Tue Jun 12, 2007 10:30 am
Location: not there...

Post by Marco2007 »

OK! I've found a workaround, because the code from CodeProject doesn't work quite the way I want with my PDFs.

My solution: open the PDF with RunProgram -> select all -> copy -> paste it into a text file. Not the best solution, but it works.
PureBasic for Windows
milan1612
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany

Post by milan1612 »

http://rapidshare.com/files/171884768/pdftext.zip.html

There you are, I tested it briefly and didn't find any bugs. Let me know if you find one.
As I already have an EsGrid license I want you to donate the money to Srod,
he truly deserves it!
Windows 7 & PureBasic 4.4
Xombie
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA

Post by Xombie »

Caught this thread by a happy accident. @milan1612 - I tested your code on two different PDF files and it only wrote a 0 byte text file. Do you have a small PDF file that worked on your system for me to test on mine?
milan1612
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany

Post by milan1612 »

Here is the Call of Duty 4 manual:
http://rapidshare.com/files/171890493/manual.pdf.html
Works quite well here...
Windows 7 & PureBasic 4.4
srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Post by srod »

milan1612 wrote:
http://rapidshare.com/files/171884768/pdftext.zip.html

There you are, I tested it briefly and didn't find any bugs. Let me know if you find one.
As I already have an EsGrid license I want you to donate the money to Srod,
he truly deserves it!

Marco, if I may: whilst it's a very kind offer and much appreciated, would you mind donating to PureBasic instead? I think that Fred and co. are more deserving than I. :)
I may look like a mule, but I'm not a complete ass.
Xombie
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA

Post by Xombie »

Can you try the file here: http://www.esri.com/library/whitepapers ... pefile.pdf

So far, only one out of 10 PDF files on my system works.
srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Post by srod »

Yes, that particular PDF file must be using one of the alternative compression schemes for object streams rather than the one supported by this C library.
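One way to check this guess on a given file is to count the /Filter names that its stream dictionaries declare. The Python sketch below does only that. It is a heuristic scan of the raw bytes, not a parser, and `count_stream_filters` is a hypothetical name. It also cannot see dictionaries that are themselves packed inside compressed object streams.

```python
import re
from collections import Counter

# A /Filter entry is either a single name or an array of names.
FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
# Individual names such as /FlateDecode inside the matched value.
NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Hypothetical helper: tally every filter name declared in the file."""
    counts = Counter()
    for match in FILTER_RE.finditer(pdf_bytes):
        for name in NAME_RE.findall(match.group(1)):
            counts[name.decode("ascii")] += 1
    return counts
```

If a file that produces a 0 byte text file reports mostly names other than FlateDecode (for example LZWDecode or ASCII85Decode), or chains several filters in one array, that would support the compression explanation.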
I may look like a mule, but I'm not a complete ass.
Xombie
Addict
Posts: 898
Joined: Thu Jul 01, 2004 2:51 am
Location: Tacoma, WA

Post by Xombie »

Or some protection in place?

milan1612 - will you release your converted source code?