
List:       jakarta-commons-dev
Subject:    [jira] [Created] (SANSELAN-78) Improve speed of random-access-file handling for TIFF format, potenti
From:       "Gary Lucas (JIRA)" <jira () apache ! org>
Date:       2012-04-30 12:54:49
Message-ID: 907090045.9268.1335790489474.JavaMail.tomcat () hel ! zones ! apache ! org

Gary Lucas created SANSELAN-78:
----------------------------------

             Summary: Improve speed of random-access-file handling for TIFF format, \
potentially others
                 Key: SANSELAN-78
                 URL: https://issues.apache.org/jira/browse/SANSELAN-78
             Project: Commons Sanselan
          Issue Type: Improvement
          Components: Format: TIFF
            Reporter: Gary Lucas



Large TIFF files can be organized into chunks (either strips or tiles) so that the \
image can be read a piece at a time.  In the Apache Imaging implementation, each time \
one of these pieces is read, the TiffReader uses the getBlock() method of the \
ByteSourceFile class.  This method opens the file using the Java RandomAccessFile \
class, seeks to the position of the data in the file, reads its content, and closes \
the file.  Although this operation is performed once per chunk and thus entails a \
lot of redundant file opens and reads, the file cache performance on modern computers \
is truly amazing, and for files of less than 5 megabytes it often doesn't make a \
difference.  On larger files, however, the overhead can be significant.
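
For reference, the open-seek-read-close pattern described above can be sketched \
as follows.  This is a minimal illustration, not the library's actual source: the \
class name PerCallBlockReader and the exact getBlock(long, int) signature are \
assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical stand-in for the per-call access pattern; not the real ByteSourceFile. */
final class PerCallBlockReader {
    private final File file;

    PerCallBlockReader(File file) {
        this.file = file;
    }

    /** Opens the file, seeks, reads {@code length} bytes, and closes it again on every call. */
    byte[] getBlock(long start, int length) throws IOException {
        // try-with-resources closes the RandomAccessFile before the method returns
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(start);
            byte[] block = new byte[length];
            raf.readFully(block); // throws EOFException if the file is too short
            return block;
        }
    }
}
```

Reading an image with N strips or tiles through this method costs N opens and N \
closes, which is the overhead the proposal targets.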

This tracker item proposes to modify the ByteSourceFile class so that an access \
routine can optionally hold the file open between getBlock() method calls.  It will \
accomplish this by adding a new method called setPersistent(boolean).  By default, \
persistence will be set to false and the ByteSourceFile class will continue to work \
just as it always has (existing code will not be affected).  If persistence is set to \
true, the RandomAccessFile will be held open.
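
One way the proposed behavior could look is sketched below.  Only the method name \
setPersistent(boolean) and its default of false come from this proposal; the class \
name PersistentBlockReader, the isFileOpen() helper, the synchronization, and the \
choice to let setPersistent throw IOException are assumptions made for illustration.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical sketch of a byte source that can hold its file open between reads. */
final class PersistentBlockReader implements Closeable {
    private final File file;
    private boolean persistent;     // false by default, so existing behavior is unchanged
    private RandomAccessFile held;  // non-null only while a persistent handle is open

    PersistentBlockReader(File file) {
        this.file = file;
    }

    /** Turning persistence off releases any handle that is currently held. */
    synchronized void setPersistent(boolean persistent) throws IOException {
        this.persistent = persistent;
        if (!persistent) {
            releaseHeld();
        }
    }

    synchronized byte[] getBlock(long start, int length) throws IOException {
        if (!persistent) {
            // Original behavior: open, seek, read, close on every call.
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                return read(raf, start, length);
            }
        }
        if (held == null) {
            held = new RandomAccessFile(file, "r"); // opened lazily on first use
        }
        return read(held, start, length);
    }

    /** Illustrative helper: reports whether a persistent handle is currently open. */
    synchronized boolean isFileOpen() {
        return held != null;
    }

    @Override
    public synchronized void close() throws IOException {
        persistent = false;
        releaseHeld();
    }

    private static byte[] read(RandomAccessFile raf, long start, int length) throws IOException {
        raf.seek(start);
        byte[] block = new byte[length];
        raf.readFully(block);
        return block;
    }

    private void releaseHeld() throws IOException {
        if (held != null) {
            held.close();
            held = null;
        }
    }
}
```

In this sketch the handle is opened lazily on the first getBlock() call after \
persistence is enabled, so enabling persistence on a source that is never read \
costs nothing.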

To get some sense of the performance difference, I ran several tests.  For the sample \
"ron and andy.tif" file provided with the Apache Imaging package, which is under 5 \
megabytes, the change made little difference.  However, when I tested with larger \
files, such as the Apache Imaging sample 2560-by-1920 pixel PICT2833.TIF file (a \
blurry picture of a pretty girl), and a 2500-by-2500 pixel file I downloaded from the \
US Geological Survey (USGS), I saw notable differences.

I also tested on a fast local disk (my PC) and on a network disk.  Not surprisingly, \
the network disk showed the biggest change (in order to keep the test environment \
clean, I ran the network test early in the morning when the network was lightly \
used).

As the test results below show, the savings on the local disk are modest even for \
the largest file.  However, when dealing with a network file system, the change \
becomes significant.

{code}
ron and andy.tif   1500-by-1125   4.8 MB       
    local  original:     25.9 ms.   
    local  modified:     24.8 ms.
    network original:   122.7 ms.
    network modified:   117.6 ms.

PICT2833.TIF   2560-by-1920  14.1 MB
    local  original:     77.7 ms.   
    local  modified:     61.7 ms.
    network original:   774.1 ms.
    network modified:   463.8 ms.

USGS1   2500-by-2500   18.8 MB
    local  original:    192.3 ms.   
    local  modified:     94.5 ms.
    network original:  3992.8 ms.
    network modified:  1807.1 ms.

USGS2  10000-by-10000  286 MB
    local  original:   1930.5 ms.   
    local  modified:   1344.5 ms.
    network original: 26627.6 ms.
    network modified: 13402.1 ms.

{code}
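
To compare rows of the table above on a common scale, the timings can be converted \
into a percentage saved or a speedup ratio.  The helper below is only a convenience \
sketch; the class and method names are made up for this example and are not part of \
Apache Imaging.

```java
/** Hypothetical helper for comparing "original" and "modified" timings. */
final class TimingComparison {
    private TimingComparison() {
    }

    /** Percentage of the original time that the modified code saves. */
    static double percentSaved(double originalMs, double modifiedMs) {
        if (originalMs <= 0) {
            throw new IllegalArgumentException("originalMs must be positive");
        }
        return 100.0 * (originalMs - modifiedMs) / originalMs;
    }

    /** How many times faster the modified code is than the original. */
    static double speedup(double originalMs, double modifiedMs) {
        if (modifiedMs <= 0) {
            throw new IllegalArgumentException("modifiedMs must be positive");
        }
        return originalMs / modifiedMs;
    }
}
```

For example, TimingComparison.percentSaved(26627.6, 13402.1) for the USGS2 network \
row comes out at just under 50 percent.
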
One consequence of this change is that if persistence is set to true, the file will \
be held open until the ByteSourceFile goes out of scope and is garbage collected.  So \
this change will also make sure that the TiffReader sets the persistence back to \
false when it is done reading the file, in order to expedite the release of file \
resources.
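
The reset step described above is naturally expressed with try/finally, so that \
persistence is turned off even if a read fails partway through.  The sketch below \
assumes a small interface, PersistableSource, standing in for the modified \
ByteSourceFile; that interface, the ChunkReads class, and the readChunks method are \
all hypothetical names used only for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a byte source that supports setPersistent(boolean). */
interface PersistableSource {
    void setPersistent(boolean persistent) throws IOException;

    byte[] getBlock(long start, int length) throws IOException;
}

final class ChunkReads {
    private ChunkReads() {
    }

    /** Reads every chunk with the file held open, then always releases it. */
    static List<byte[]> readChunks(PersistableSource source, long[] offsets, int[] lengths)
            throws IOException {
        if (offsets.length != lengths.length) {
            throw new IllegalArgumentException("offsets and lengths must have the same size");
        }
        List<byte[]> chunks = new ArrayList<>(offsets.length);
        source.setPersistent(true);
        try {
            for (int i = 0; i < offsets.length; i++) {
                chunks.add(source.getBlock(offsets[i], lengths[i]));
            }
        } finally {
            // Runs on success and on failure, so the file handle is not left open.
            source.setPersistent(false);
        }
        return chunks;
    }
}
```

With this shape, a caller never has to wait for garbage collection to release the \
file, which matches the intent stated above.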


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: \
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more \
information on JIRA, see: http://www.atlassian.com/software/jira

        

