
List:       jakarta-commons-user
Subject:    Re: Creating EXIF tags (TiffOutputField) the right way
From:       Benedikt Ritter <britter () apache ! org>
Date:       2016-06-02 20:55:32
Message-ID: CAB917RKcZW6O=n3SFOMZdX2p+LiQmvtfwT2k-kvC6NY46Kx0Og () mail ! gmail ! com


Hello Joakim,

Joakim Knudsen <joakim.grahl@gmail.com> wrote on Wed, 1 June 2016 at 15:10:

> Sure! That would also give even more scrutiny to the code. I'm not 100%
> sure this is totally correct, but I got wonderful help from Phil Harvey
> (ExifTool) to get the charset/encoding correct.
> So I'm pretty confident. How do I contribute?
> 

Looking at the Commons Imaging website [1], I realised that we currently do
not have a user guide :o) So the best idea would probably be to add it to
the Sample Usage page [2]. The website is built from source in SVN [3]. You
would have to check that out, modify the documentation and then create an
SVN patch file, using

svn diff > mypatch.diff

The mypatch.diff file would then have to be attached to a Jira issue [4].
More information can be found in [5].



> Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
> seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
> enough for the comment to appear as a caption in image viewer software
> (like Picasa etc). I was wondering (hoping) Sanselan could write the
> following tags:
> 
> IPTC:Caption-Abstract
> and
> XMP:Description
> 
> 
To be honest, I don't know much about how Sanselan/Imaging works. I have
worked on the code for a while, but I don't use it in my current projects.
So the only thing I can do is look through the code for you and try to
find an answer to your questions :-)

Benedikt

[1] http://commons.apache.org/proper/commons-imaging/index.html
[2] http://commons.apache.org/proper/commons-imaging/sampleusage.html
[3] http://svn.apache.org/repos/asf/commons/proper/imaging/trunk
[4] http://issues.apache.org/jira/browse/IMAGING
[5] http://commons.apache.org/patches.html


> 
> Joakim
> 
> On 1 June 2016 at 14:55, Benedikt Ritter <britter@apache.org> wrote:
> 
> > Hello Joakim,
> > 
> > glad you found out what to do. This would make for a good addition to the
> > user guide. Would you like to contribute your findings?
> > 
> > Benedikt
> > 
> > Joakim Knudsen <joakim.grahl@gmail.com> wrote on Tue, 31 May 2016 at 19:21:
> > 
> > > Btw, ENCODING_UTF16 is just a String = "UTF-16LE" (Little Endian)
> > > 
> > > On 31 May 2016 at 19:20, Joakim Knudsen <joakim.grahl@gmail.com> wrote:
> > > 
> > > > Following a post on the User-Commons-Apache log (from 2012), I ended up
> > > > with the following code, which seems to work.
> > > > It writes proper Unicode, which I can read back successfully using
> > > > ExifTool. I also see the comment nicely in Windows Explorer, and under
> > > > File > Properties.
> > > > Note I changed the field type from ASCII to FIELD_TYPE_UNDEFINED;
> > > > otherwise (with ASCII) it did not work. At least Windows couldn't make
> > > > sense of the EXIF data.
> > > > 
> > > > // http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
> > > > byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };
> > > > // OR UTF-16BE if the file is big-endian!
> > > > byte[] comment = textToSet.getBytes(ENCODING_UTF16);
> > > > byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
> > > > System.arraycopy(unicodeMarker, 0, bytesComment, 0, unicodeMarker.length);
> > > > System.arraycopy(comment, 0, bytesComment, unicodeMarker.length, comment.length);
> > > > 
> > > > TiffOutputField exif_comment = new TiffOutputField(
> > > >     TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >     TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >     bytesComment.length, bytesComment);
> > > > 
> > > > 
> > > > I can now write UserComment: "æøå" without problems :)
> > > > 
> > > > 
> > > > 
> > > > - Joakim
> > > > 
> > > > 
> > > > On 31 May 2016 at 17:39, Benedikt Ritter <britter@apache.org> wrote:
> > > > 
> > > > > Hello Joakim,
> > > > > 
> > > > > Joakim Knudsen <joakim.grahl@gmail.com> wrote on Sat, 28 May 2016 at 21:10:
> > > > > 
> > > > > > Hi Benedikt, and thanks for replying!
> > > > > > 
> > > > > > So, if FieldType is unused, maybe the alternative, simpler constructor
> > > > > > is more appropriate/correct to use?
> > > > > > 
> > > > > > // try using the approach given in the example (modified from the GPS tag):
> > > > > > TiffOutputField exif_comment = TiffOutputField.create(
> > > > > >     TiffConstants.EXIF_TAG_USER_COMMENT,
> > > > > >     outputSet.byteOrder, textToSet);
> > > > > > 
> > > > > > However, now Sanselan throws an ImageWriteException:
> > > > > > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> > > > > > 
> > > > > > So are you 100% sure field type should not be set (to ASCII)?
> > > > > > 
> > > > > 
> > > > > No, I'm just saying that it uses a hard coded encoding anyway :-)
> > > > > 
> > > > > 
> > > > > > 
> > > > > > Next, you're saying the string to set (textToSet) is converted internally
> > > > > > to a byte array, using US-ASCII encoding.
> > > > > > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the
> > > > > > JPEG out and check Properties in Windows Explorer.
> > > > > > If I write only ASCII characters, e.g. "Test", then that comes through
> > > > > > just fine.
> > > > > > 
> > > > > > In summary, here is the code that works for me (except non-ASCII
> > > > > > characters):
> > > > > > 
> > > > > > 
> > > > > > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=KWuZ4gsZFUOBvZWUN0LUfA@mail.gmail.com%3E
> > > > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > > > >     TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > > > >     textToSet, outputSet.byteOrder);
> > > > > > 
> > > > > > // constructor arguments: taginfo tag fieldtype count bytes
> > > > > > TiffOutputField exif_comment = new
> > > > > > TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > > > > TiffConstants.EXIF_TAG_USER_COMMENT,
> > > > > > TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > > > > b.length, b);
> > > > > > 
> > > > > 
> > > > > The provided links indicate to me that it is possible to write non-ASCII
> > > > > characters. Are you sure your code looks like what Damjan suggested?
> > > > > 
> > > > > Benedikt
> > > > > 
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Joakim
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > On 22 May 2016 at 15:29, Benedikt Ritter <britter@apache.org> wrote:
> > > > > > 
> > > > > > > Hello Joakim
> > > > > > > 
> > > > > > > Joakim Knudsen <joakim.grahl@gmail.com> wrote on Sat, 21 May 2016 at 19:29:
> > > > > > > 
> > > > > > > > Hi List!
> > > > > > > > 
> > > > > > > > I'm working on an Android app, where I want to read and write "EXIF tags"
> > > > > > > > to JPEG files on the device. Sanselan 0.97 seems to work perfectly,
> > > > > > > > although it's a bit complicated to work with EXIF tags/directories.
> > > > > > > > 
> > > > > > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> > > > > > > > EXIF_TAG_IMAGE_DESCRIPTION.
> > > > > > > > According to the documentation I could find, UserComment is of field type
> > > > > > > > "undefined", whereas ImageDescription is of field type ASCII.
> > > > > > > > 
> > > > > > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > > > > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> > > > > > > > 
> > > > > > > > What's the proper way of creating those tags, wrt. charset etc.? I want
> > > > > > > > as wide as possible character support (æøå etc.).
> > > > > > > > 
> > > > > > > > I find different discussions online, with different advice. It seems two
> > > > > > > > constructors are going around, where the simpler one does not deal with
> > > > > > > > charset/encoding at all. This one uses the .create method:
> > > > > > > > 
> > > > > > > > String textToSet = "Some Text æøå";
> > > > > > > > 
> > > > > > > > TiffOutputField exif_comment = TiffOutputField.create(
> > > > > > > > TiffConstants.EXIF_TAG_USER_COMMENT,
> > > > > > > > outputSet.byteOrder, textToSet);
> > > > > > > > 
> > > > > > > > 
> > > > > > > > while this one uses the standard constructor:
> > > > > > > > 
> > > > > > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > > > > > > TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > > > > > > textToSet, outputSet.byteOrder
> > > > > > > > );
> > > > > > > > 
> > > > > > > > // constructor arguments: taginfo tag fieldtype count bytes
> > > > > > > > TiffOutputField exif_comment2 = new
> > > > > > > > TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > > > > > > TiffConstants.EXIF_TAG_USER_COMMENT,
> > > > > > > > TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > > > > > > b.length, b);
> > > > > > > > 
> > > > > > > > In this last one, the string to set has been converted to a byte array
> > > > > > > > first. But can/should I set the encoding anywhere?
> > > > > > > > 
> > > > > > > > Is the field type even ASCII? This information seems to indicate it's
> > > > > > > > not ASCII...
> > > > > > > > 
> > > > > > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Need some help here, as you can see, to get this right. The second
> > > > > > > > approach above does seem to work in my app, but I'd like to be sure
> > > > > > > > I'm not somehow messing up the JPEGs on the device.
> > > > > > > > 
> > > > > > > 
> > > > > > > I've looked at the code of
> > > > > > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> > > > > > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of TagInfoGpsText).
> > > > > > > Here are my observations:
> > > > > > > 
> > > > > > > - The FieldType parameter, which you have set to
> > > > > > >   TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the
> > > > > > >   implementation of encodeValue(FieldType, Object, ByteOrder)
> > > > > > > - When converting the input String to a byte array,
> > > > > > >   String.getBytes(String charsetName) is used
> > > > > > > - For charsetName, "US-ASCII" is always used (it cannot be configured by
> > > > > > >   the user)
> > > > > > > 
> > > > > > > So my guess is that the code will not handle characters outside the
> > > > > > > US-ASCII charset correctly.
> > > > > > > 
> > > > > > > Benedikt
> > > > > > > 
> > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Joakim
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > > 
> > > 
> > 
> 
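
Note on the US-ASCII observation quoted in this thread: the following is a
minimal, JDK-only sketch of what encoding a String as US-ASCII does to
characters such as æøå. It involves no Sanselan/Imaging classes; the class and
method names are made up for illustration, and it uses the getBytes(Charset)
overload rather than the getBytes(String charsetName) call mentioned above. It
only illustrates the JDK behaviour and does not claim to reproduce what
Sanselan 0.97 actually writes to the file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Sanselan or Commons Imaging.
class AsciiLossDemo {

    // Encodes text as US-ASCII; characters outside US-ASCII are replaced
    // by the charset's replacement byte, which is '?' (0x3F).
    static byte[] toAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Returns true only if the text is unchanged after an ASCII round trip.
    static boolean survivesAscii(String text) {
        String roundTripped = new String(toAscii(text), StandardCharsets.US_ASCII);
        return roundTripped.equals(text);
    }

    public static void main(String[] args) {
        // "\u00e6\u00f8\u00e5" is "æøå", written with escapes so the source
        // file encoding does not matter.
        System.out.println("Test survives: " + survivesAscii("Test"));
        System.out.println("aeoeaa survives: " + survivesAscii("\u00e6\u00f8\u00e5"));
    }
}
```

Plain ASCII text such as "Test" round-trips unchanged, while each of æ, ø and
å is replaced by a single '?' byte, so the original text cannot be recovered.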


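Note on the UserComment payload discussed in this thread: the byte layout
Joakim describes (the 8-byte "UNICODE\0" marker followed by the text in
UTF-16) can be sketched with the JDK alone. The class and method names below
are hypothetical, and the bigEndian flag is an assumption standing in for the
"UTF-16BE if the file is big-endian" remark; passing the resulting array to
TiffOutputField is left out, since that part depends on the Sanselan/Imaging
version in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Sanselan or Commons Imaging.
class UserCommentBytes {

    // The bytes of "UNICODE" followed by a NUL, as used in the thread's code.
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    // Builds marker + UTF-16 text. UTF_16LE/UTF_16BE do not emit a BOM.
    static byte[] build(String text, boolean bigEndian) {
        Charset charset = bigEndian ? StandardCharsets.UTF_16BE : StandardCharsets.UTF_16LE;
        byte[] comment = text.getBytes(charset);
        byte[] payload = new byte[UNICODE_MARKER.length + comment.length];
        System.arraycopy(UNICODE_MARKER, 0, payload, 0, UNICODE_MARKER.length);
        System.arraycopy(comment, 0, payload, UNICODE_MARKER.length, comment.length);
        return payload;
    }

    public static void main(String[] args) {
        // "\u00e6\u00f8\u00e5" is "æøå": 8 marker bytes + 3 chars * 2 bytes.
        byte[] payload = build("\u00e6\u00f8\u00e5", false);
        System.out.println("payload length: " + payload.length);
    }
}
```

For the three-character string "æøå" the payload is 14 bytes long: 8 marker
bytes plus two bytes per character.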
