Reading TIFF tags
Posted: Sun Mar 04, 2007 7:30 am
I recently needed to get some info from a large number of TIFF images, but didn't want to have to load the pixel data just to get it (color depth, width, height, etc.). The obvious solution was to read the tag data directly, so I dug out my copy of the TIFF 6 filespec and got comfy...
Anyway, I wanted to share what I learned, so I wrote and commented a function to read all tags in a TIFF file and return them as formatted text. It reads the tags for all pages of multi-page TIFFs, and handles endian conversion. I could not fully test endian conversion, though, so I'd be happy to hear about any unusual results on platforms other than Windows.
NOTE: This code is free for any use at all, and is not released under any license in particular. I don't care if you copy it verbatim and pretend that you wrote it.

Code:
#LITTLE_ENDIAN = $4949
#BIG_ENDIAN = $4D4D
Macro SwapW(val)
(((val >> 8) & $00ff) | ((val << 8) & $ff00))
EndMacro
Macro SwapL(val)
(((val>>24) & $000000ff)|((val>>8) & $0000ff00)|((val<<8) & $00ff0000)|((val<<24) & $ff000000))
EndMacro
Procedure.s GetTIFFTags(filename$)
Protected result$, tiffile.i, ifd.l, tag.w, type.w ;#PB_Any handles need an integer (.i)
Protected nval.l, offset.l, fpos.l, byteorder.l
tiffile = ReadFile(#PB_Any, filename$)
If tiffile = 0 : ProcedureReturn "" : EndIf ;could not open the file
byteorder = ReadWord(tiffile) ;"II" ($4949) or "MM" ($4D4D); both bytes match, so it reads the same on any host
If byteorder <> #LITTLE_ENDIAN And byteorder <> #BIG_ENDIAN
CloseFile(tiffile)
ProcedureReturn "" ;not a TIFF file
EndIf
;the next word is the magic number 42, stored in the file's byte order.
;if the host reads it as $2A00, the file's byte order is the reverse of the
;host's and every multi-byte value must be swapped. (choosing by OS is not
;reliable: Linux and MacOS on x86 are little-endian, just like Windows.)
SwapByteOrder.b = #False
If (ReadWord(tiffile) & $FFFF) = $2A00 : SwapByteOrder = #True : EndIf
;position of 1st IFD is stored at offset 4 bytes
FileSeek(tiffile, 4)
ifd = ReadLong(tiffile)
If SwapByteOrder : ifd = SwapL(ifd) : EndIf
;we will support multi-page TIFF by looping through each IFD found in the file
While ifd
result$ = result$ + "TIFF directory at offset " + Str(ifd) + ":" + #CRLF$
;go to IFD
FileSeek(tiffile, ifd)
;1st 2 bytes of IFD indicate the # of tags stored
numtags = ReadWord(tiffile)
If SwapByteOrder : numtags = SwapW(numtags) : EndIf
;Enumerating the tags. The tag structure is simple:
; tag number: 2 bytes; unique identifier for each TIFF tag
; tag type: 2 bytes; indicates the type of value stored in the tag
; num values: 4 bytes; how many values of type 'tag type' are stored
; data offset: 4 bytes; offset (starting from 0) where the tag data is located.
; If the data is 4 bytes or less, the value is stored
; directly here instead of elsewhere to save time/space.
For t = 1 To numtags
tag = ReadWord(tiffile)
If SwapByteOrder : tag = SwapW(tag) : EndIf
result$ = result$ + " TAG " + Str(tag & $FFFF) + #TAB$ ;tag numbers are unsigned 16-bit values
type = ReadWord(tiffile)
If SwapByteOrder : type = SwapW(type) : EndIf
Select type
Case 1 : result$ = result$ + "BYTE" + #TAB$
Case 2 : result$ = result$ + "CHAR" + #TAB$
Case 3 : result$ = result$ + "WORD" + #TAB$
Case 4 : result$ = result$ + "LONG" + #TAB$
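Case 5 : result$ = result$ + "RATIONAL" + #TAB$ ;TIFF type 5 is RATIONAL (two longs), not a double
Default : result$ = result$ + "TYPE " + Str(type) + #TAB$ ;other types are listed but not decoded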
EndSelect
nval = ReadLong(tiffile)
If SwapByteOrder : nval = SwapL(nval) : EndIf
offset = ReadLong(tiffile) ;do not swap yet if different endianness - we will do it later
result$ = result$ + "value: "
fpos = Loc(tiffile) ;we will be visiting other offsets - we must store the current
;location so we can return here before moving on to the next tag.
Select type
Case 1 ;tag type is byte
If nval <= 4
;value takes 4 bytes or less, so it is stored directly in offset.
;the bytes sit in file order, so offset must not be swapped here.
;we must unpack each used byte as binary
For n = 0 To nval-1
result$ = result$ + Str(PeekB(@offset+n) & $FF) + " "
Next
Else
;value consumes more than 4 bytes; it is stored elsewhere
If SwapByteOrder : offset = SwapL(offset) : EndIf
FileSeek(tiffile, offset)
For n = 0 To nval-1
result$ = result$ + Str(ReadByte(tiffile) & $FF) + " "
Next
FileSeek(tiffile, fpos)
EndIf
Case 2 ;type is char (1 byte)
If nval <= 4
;value is stored in offset, in file order; do not swap it.
;unpack each used byte as char
For n = 0 To nval-1
result$ = result$ + Chr(PeekB(@offset+n) & $FF)
Next
Else
;value consumes more than 4 bytes; it is stored elsewhere
If SwapByteOrder : offset = SwapL(offset) : EndIf
FileSeek(tiffile, offset)
For n = 0 To nval-1
;read single bytes: ReadCharacter() may read 2 bytes in a unicode build
result$ = result$ + Chr(ReadByte(tiffile) & $FF)
Next
FileSeek(tiffile, fpos)
EndIf
Case 3 ;type is word (2 bytes)
If nval <= 2
;value is stored in offset; unpack each used word
For n = 0 To nval-1
valw.w = PeekW(@offset+n*2) ;nval counts words, so step 2 bytes per value
If SwapByteOrder : valw = SwapW(valw) : EndIf
result$ = result$ + Str(valw & $FFFF) + " "
Next
Else
If SwapByteOrder : offset = SwapL(offset) : EndIf
FileSeek(tiffile, offset)
For n = 0 To nval-1
valw.w = ReadWord(tiffile)
If SwapByteOrder : valw = SwapW(valw) : EndIf
result$ = result$ + Str(valw & $FFFF) + " "
Next
FileSeek(tiffile, fpos)
EndIf
Case 4 ;type is long (4 bytes)
If SwapByteOrder : offset = SwapL(offset) : EndIf
If nval = 1
;value can be taken directly from offset
result$ = result$ + Str(offset)
Else
FileSeek(tiffile, offset)
For n = 0 To nval-1
vall.l = ReadLong(tiffile)
If SwapByteOrder : vall = SwapL(vall) : EndIf
result$ = result$ + Str(vall) + " "
Next
FileSeek(tiffile, fpos)
EndIf
Case 5 ;type is rational (8 bytes: two longs)
;it is always more than 4 bytes, so seek to offset & read
If SwapByteOrder : offset = SwapL(offset) : EndIf
FileSeek(tiffile, offset)
For n = 0 To nval-1
;we will read the value in as two longs - the first is the
;numerator and the second is the denominator. the final result
;is the division of the numerator by the denominator; a float.
numer = ReadLong(tiffile)
If SwapByteOrder : numer = SwapL(numer) : EndIf
denom = ReadLong(tiffile)
If SwapByteOrder : denom = SwapL(denom) : EndIf
If denom
result$ = result$ + StrF(numer / denom, 2) + " "
Else
result$ = result$ + Str(numer) + "/0 " ;avoid dividing by zero
EndIf
Next
FileSeek(tiffile, fpos)
EndSelect
result$ + #CRLF$
Next
result$ + #CRLF$ ;add some whitespace between IFDs for beauty;)
ifd = ReadLong(tiffile) ;0 if there are no more IFDs, else offset to next IFD
If SwapByteOrder : ifd = SwapL(ifd) : EndIf
Wend
CloseFile(tiffile)
ProcedureReturn result$
EndProcedure
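
For anyone who wants to try the procedure without hunting for a sample image, here is a rough sketch that writes the smallest TIFF-like file I could think of (header plus one IFD holding a single ImageWidth tag) and runs it through GetTIFFTags(). It assumes the listing above is in the same source file and that you are on a little-endian machine such as x86, since WriteWord()/WriteLong() write in the host's byte order. The file name "tagtest.tif" is just a made-up scratch name, and the file contains no pixel data, so do not expect an image viewer to open it.

```purebasic
; Assumption: GetTIFFTags() from the listing above is included in this source.
; Assumption: little-endian host, so the "II" marker matches what is written.
; "tagtest.tif" is a hypothetical scratch file in the current directory.
file.i = CreateFile(#PB_Any, "tagtest.tif")
If file
  WriteWord(file, $4949) ; "II": little-endian byte order marker
  WriteWord(file, 42)    ; TIFF magic number
  WriteLong(file, 8)     ; offset of the first IFD
  WriteWord(file, 1)     ; this IFD holds one tag
  WriteWord(file, 256)   ; tag 256 = ImageWidth
  WriteWord(file, 4)     ; type 4 = LONG
  WriteLong(file, 1)     ; one value
  WriteLong(file, 640)   ; the value fits in 4 bytes, so it is stored inline
  WriteLong(file, 0)     ; 0 = no further IFDs
  CloseFile(file)
  Debug GetTIFFTags("tagtest.tif")
EndIf
```

Run with the debugger enabled, this should report one directory at offset 8 containing tag 256 of type LONG with the value 640.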