Strip and save attachments from raw EML files

Posted: Sat Oct 06, 2007 4:51 am
by Fangbeast
As the subject says. This is part of a routine in my GetMail program that strips attachments from a raw EML file, saves them to disk and then rewrites the raw EML file without them.

It wasn't working properly until the mighty wizard, sir s of rod, fixed it, and I am obnoxiously grateful. Hope somebody has a use for this. It still needs some more work and you can massage it to suit yourself.

Code: Select all

;============================================================================================================================
; Myprototypes.pb                              ; Prototype function declarations
;===========================================================================================================================

;===========================================================================================================================
; Constants.pb                                 ; Visual designer created constants
;===========================================================================================================================

;===========================================================================================================================
; Windows.pb                                   ; Visual designer created window code
;===========================================================================================================================

;===========================================================================================================================
; Mydeclarations.pb                            ; All procedural declarations
;===========================================================================================================================

Declare DumpBody(BodyText.s)
Declare DumpAttachments()

;===========================================================================================================================
; Myconstants.pb                                ; All my personal constants
;===========================================================================================================================

Global EmlFileId.l, AttachmentId.l, NewEmlFileId.l, BoundaryId.s

;============================================================================================================================
; All procedures
;============================================================================================================================

Procedure DumpBody(BodyText.s)

; --26470415-POCO-75665348                                                              The boundary id from the header
; Content-Type: text/plain; charset="iso-8859-1"                                        Content type definition
; Content-Transfer-Encoding: quoted-printable                                           Type of encoding used on the text

  WriteStringN(NewEmlFileId.l, BodyText.s)                                              ; Dump the identifier to disk, we need it

  While Eof(EmlFileId.l) = 0                                                           ; Continue till the end of the file

    TestString.s = ReadString(EmlFileId.l)   ; Find the header for the body section     ; Get a string to test

    If FindString(TestString.s, BoundaryId.s, 1) <> 0                                   ; Find the multiple parts

      Break                                                                             ; Found the boundary id

    EndIf                                                                               ; End the conditional test

    WriteStringN(NewEmlFileId.l, TestString.s)                                          ; Write the line to disk

  Wend                                                                                  ; Continue the loop

EndProcedure

Procedure DumpAttachments()

; --26470415-POCO-75665348                                                                This is the boundary id
; Content-Type: image/png; name="digital-camera_aj_ashton_01.png"
; Content-Transfer-Encoding: Base64
; Content-Disposition: attachment; filename="digital-camera_aj_ashton_01.png"

  While Eof(EmlFileId.l) = 0                                                            ; Read till the end of the file

    TestString.s = ReadString(EmlFileId.l)                                               ; Read in a string

    If FindString(TestString.s, "Content-Disposition: attachment; filename=", 1) <> 0    ; Did we find a binary filename?

      AttachmentFileName.s = Mid(TestString.s, 44, Len(TestString.s) - 44)               ; Extract the filename

      WriteStringN(NewEmlFileId.l, "file:///D:\" + AttachmentFileName.s)                 ; Write html reference string

      AttachmentId.l = CreateFile(#PB_Any, "D:\" + AttachmentFileName.s)                 ; Create the attachment file

      If AttachmentId.l                                                                  ; Did the attachment file get created?

        BlankString.s = ReadString(EmlFileId.l)                                          ; Throw the blank line away

        While Eof(EmlFileId.l) = 0                                                      ; Read till the end of the file

          InputString.s = ReadString(EmlFileId.l)                                        ; Read in the line to be decoded

          If InputString.s                                                               ; If we actually got a string

            InputSize.l = Len(InputString.s)                                             ; Check the length of the string

            *OutputBuffer = AllocateMemory(InputSize.l + 64)                             ; Buffer for the decoded bytes, with headroom for short lines

            ActualBytes.l = Base64Decoder(@InputString.s, InputSize.l, *OutputBuffer, InputSize.l + 64)  ; Decode the line, returns the byte count

            WriteData(AttachmentId.l, *OutputBuffer, ActualBytes.l)                      ; Write the decoded bytes to disk

            FreeMemory(*OutputBuffer)                                                    ; Free the decoder memory

            InputString.s = ""                                                           ; Clear the input string

          Else                                                                           ; Otherwise do the below

            Break                                                                        ; Return from here on blank line

          EndIf                                                                          ; End the conditional test

        Wend                                                                             ; Continue till the end of the file

        CloseFile(AttachmentId.l)                                                         ; Close the output file

      EndIf                                                                              ; No more tests

    EndIf                                                                                ; End the conditional test

  Wend                                                                                   ; Iterate through the data

EndProcedure

;============================================================================================================================
; Main program body
;============================================================================================================================

InFile.s = OpenFileRequester("File to decode", "D:\", "Raw EML (*.eml)|*.eml", 0)                 ; Get a file to read

NewEmlFileId.l = CreateFile(#PB_Any, "D:\NewEMLFile.txt")                                         ; Create the new EML file from the old

If InFile.s                                                                                       ; Did we get a file?

  EmlFileId.l  = ReadFile(#PB_Any, InFile.s)                                                       ; Try to open it

  If EmlFileId.l                                                                                   ; Did we get a handle?

    ;----------------------------------------------------------------------------------------------
    ; Get and save the header information and test for multipart boundary
    ;----------------------------------------------------------------------------------------------

    While Eof(EmlFileId.l) = 0                                                                    ; Continue till the end of the file

      TestString.s = ReadString(EmlFileId.l)                                                       ; Get a string to test

      If FindString(TestString.s, "Content-Type: multipart/mixed; boundary=", 1) <> 0             ; Has attachments

        AttachmentFlag.l  = 1                                                                     ; Set attachments flag

        BoundaryId.s      = Mid(TestString.s, 42, Len(TestString.s) - 42)                         ; Keep boundary ID for separating items

      EndIf                                                                                       ; End the test condition

      If TestString.s = ""               ; A header section is always ended with a blankline      ; If the test string is empty

        WriteStringN(NewEmlFileId.l, "") ; Write the blank line to disk                           ; Show the header lines

        Break                           ; No more header to write, on to the body                 ; Return to the previous routine

      Else                              ; We haven't found the end of the header area yet         ; Otherwise do the below

        WriteStringN(NewEmlFileId.l, TestString.s)  ; So keep writing to the new raw EML file      ; Show the header lines

      EndIf                                                                                       ; End the test condition

    Wend                                                                                          ; Continue till the end

    ;----------------------------------------------------------------------------------------------
    ; Dump the text portion to disk and exit if the attachments flag isn't set (just a text/html message)
    ;----------------------------------------------------------------------------------------------

    If AttachmentFlag.l = 0                                                                       ; No attachments, write body text

      While Eof(EmlFileId.l) = 0                                                                  ; Continue till the end of the file

        TestString.s = ReadString(EmlFileId.l)   ; Find the header for the body section            ; Get a string to test

        If FindString(TestString.s, BoundaryId.s, 1) <> 0   ; We found the attachment boundary    ; Find the multiple parts

          Break                                             ; identifier so we should break now   ; We found the boundary id, next attachment

        EndIf                                                                                     ; End the conditional test   

        WriteStringN(NewEmlFileId.l, TestString.s)                                                 ; Write the line to disk

      Wend                                                                                        ; Continue the loop

      CloseFile(EmlFileId.l)                                                                      ; Close the source file now, not needed any more

      CloseFile(NewEmlFileId.l)                                                                   ; Close the new file now, not needed any more

      End                                                                                         ; End the program, not multipart

    EndIf                                                                                         ; No more tests

    ;---------------------------------------------------------------------------------------------- 
    ; Find the boundary id and dump the text/html portions to disk (for file with attachments) 
    ;---------------------------------------------------------------------------------------------- 
  
    ; --26470415-POCO-75665348                                                                     The boundary id from the header 
  
    While Eof(EmlFileId.l) = 0                         ; Deal with text and attachments here      ; Continue till the end of the file 
  
      TestString.s = ReadString(EmlFileId.l)            ; Find the header for the body section    ; Get a string to test 
  
      If FindString(TestString.s, BoundaryId.s, 1) <> 0 ; Did we find the boundary id yet?        ; Find the multiple parts 
  
        If FindString(TestString.s, "Content-Type: text/", 1) <> 0                                ; Is this text information? (Content-Type normally sits on the line AFTER the boundary, so this rarely matches)
  
          DumpBody(TestString.s)                                                                  ; Dump the body text 
  
        Else                                                                                      ; Otherwise do the below
  
          DumpAttachments()                                                                       ; Dump the attachments to disk 
  
        EndIf                                                                                     ; No more tests 
  
      EndIf                                                                                       ; No more tests 
  
    Wend                                                                                          ; Continue the loop 
  
    ;---------------------------------------------------------------------------------------------- 
      
    CloseFile(EmlFileId.l)                                                                         ; Close the file now 
  
  EndIf                                                                                           ; End the conditional test 
  
EndIf                                                                                             ; End the conditional test 
  
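If NewEmlFileId.l                                                                                  ; Was the new EML file created?

  CloseFile(NewEmlFileId.l)                                                                        ; Close it before exiting

EndIf                                                                                              ; End the conditional test

End                                                                                                ; End the program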
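
One fragile spot in the listing above is the pair of fixed offsets (42 for the boundary, 44 for the filename). They only work when the header starts in column 1, uses exactly that spacing and wraps the value in double quotes. A minimal sketch of a more forgiving way to pull the value out is below. QuotedValue is a hypothetical helper name, not part of GetMail, and it still assumes the value is quoted and that the key is written in the same case as on the header line.

```purebasic
; Hypothetical helper: return the text between the first pair of double
; quotes that follows Key.s on a header line, or "" if Key.s is not found.
Procedure.s QuotedValue(Line.s, Key.s)

  Position.l = FindString(Line.s, Key.s, 1)               ; Where does the key start?

  If Position.l = 0                                       ; Key is not on this line

    ProcedureReturn ""

  EndIf

  Rest.s = Mid(Line.s, Position.l + Len(Key.s), Len(Line.s))  ; Everything after the key

  ProcedureReturn StringField(Rest.s, 2, Chr(34))         ; Field 2 is the text between the first two quotes

EndProcedure

; Usage with the sample headers from the comments in the listing
Debug QuotedValue("Content-Disposition: attachment; filename=" + Chr(34) + "digital-camera_aj_ashton_01.png" + Chr(34), "filename=")
Debug QuotedValue("Content-Type: multipart/mixed; boundary=" + Chr(34) + "26470415-POCO-75665348" + Chr(34), "boundary=")
```

An unquoted value comes back as an empty string here, so the caller would still need a fallback for mail clients that do not quote the boundary.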

Posted: Sat Oct 06, 2007 12:53 pm
by Inf0Byt3
I have a good use for this! Thank you.

Posted: Sat Oct 06, 2007 1:27 pm
by Fangbeast
Thank srod (LOL). If he hadn't fixed my stuffups, I would have a lot wrong. Amazed I got it partly working, I must say. Was missing the very first attachment and did some strange things with memory once in a while in a routine that had nothing to do with it!!

Thanks srod, you wizard you (NO!! I won't sell you my daughter!)

Got to extend this into my main mail retriever now.

Posted: Sat Oct 06, 2007 1:40 pm
by srod
Aw shucks... :oops:

I only fixed it to shut you up! :wink:

Posted: Sat Oct 06, 2007 1:45 pm
by Fangbeast
Didn't shut me up though (BLAH, BLAH, BLAH, BLAH!!!) (sings like a lamb at basting time)

And there is a slight problem (evil grin)

Posted: Sat Oct 06, 2007 1:50 pm
by srod
You can fix it this time, I'm still recovering from the beating you gave me last time!

Who'd have thought that being hit with a rancid kipper would hurt so much?

:wink:

Posted: Sat Oct 06, 2007 2:02 pm
by Fangbeast
Me? Fix code?? You have to be kidding!!!! You must have been told by lots of people that I can't code my way out of a rancid kipper (the one you stole).

By the way, that was Bericko's rancid, steaming, fetid, decaying g-string collection that he stole from Rings, used up and then gave to me to beat you with.

Help wanted

Posted: Sun Oct 07, 2007 9:31 am
by Fangbeast
If anyone has ever done this, I'd appreciate it if they would produce better code than this and teach me how to do it properly.

There are many MIME types, and only the image/binary parts need stripping and saving to disk, with the raw EML email file then rewritten to disk.

My code is very basic and hard to debug, with too many loops and too much file seeking (in the current version) going on.

It's very messy.
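
For the "which parts need stripping" question, one possible starting point is to make that decision in a single small procedure instead of spreading it across the read loops. This is only a sketch under assumptions: ShouldStripPart is a hypothetical name, the caller is assumed to have already read the part's Content-Type and Content-Disposition values, and the rule used (strip anything marked as an attachment, keep text/ and multipart/ parts, strip every other type) is a guess at what is wanted rather than a complete MIME policy.

```purebasic
; Hypothetical helper: returns 1 if a MIME part should be saved to disk and
; removed from the rewritten EML file, 0 if it should be left in place.
Procedure.l ShouldStripPart(ContentType.s, Disposition.s)

  ContentType.s = LCase(Trim(ContentType.s))              ; Header values are case-insensitive
  Disposition.s = LCase(Trim(Disposition.s))

  If FindString(Disposition.s, "attachment", 1) <> 0      ; Explicitly marked as an attachment

    ProcedureReturn 1

  EndIf

  If ContentType.s = ""                                   ; No type given, treat it as plain text

    ProcedureReturn 0

  EndIf

  If Left(ContentType.s, 5) = "text/" Or Left(ContentType.s, 10) = "multipart/"

    ProcedureReturn 0                                     ; Body text or a container part, keep it

  EndIf

  ProcedureReturn 1                                       ; image/, application/, audio/ and so on

EndProcedure

; Usage
Debug ShouldStripPart("image/png; name=" + Chr(34) + "a.png" + Chr(34), "")
Debug ShouldStripPart("text/plain; charset=" + Chr(34) + "iso-8859-1" + Chr(34), "")
```

Keeping the rule in one place means a new MIME type only needs a change here, not in every loop that walks the file.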