
List:       amanda-users
Subject:    Re: all estimate timed out
From:       Nathan Stratton Treadway <nathanst () ontko ! com>
Date:       2013-04-12 23:20:17
Message-ID: 20130412232016.GZ11464 () shire ! ontko ! com

On Fri, Apr 12, 2013 at 17:09:11 -0400, Chris Hoogendyk wrote:
> The "Total bytes written:" was identical with and without the
> --sparse option (right down to the last byte ;-) ). It was the time
> taken to arrive at that estimate that was so very different:
> 
> Total bytes written: 2086440960 (2.0GiB, 11MiB/s)
> real    3m14.91s
> 
> Total bytes written: 2086440960 (2.0GiB, 17GiB/s)
> real    0m0.57s
> 
> 
> However, if I do an `ls -sl` on the directory and multiply the first
> column by 512, that does not quite match the length in bytes column.
> It is the same order of magnitude, but they are slightly different.
> I'm not sure what causes that, but I don't think the tif files are
> really sparse in the usual sense of that. Any imaginable gain in
> efficiency with regard to space would be minimal, and the cost in
> time is ridiculous.
> 
> Here is an example of one directory:
> 
> marlin:/export/herbarium/mellon/Masstypes_Scans_Server/ACANTHACEAE# ls -sl
> 
> total 4072318
> 410608 -rw-rw----   1 ariehtal herbarum 210246048 Dec 10 11:04 AC00312847.tif
> 402936 -rw-rw----   1 ariehtal herbarum 206423224 Dec  5 16:09 AC00312848.tif

Well, unless the length of the file is an exact multiple of the block
size, you'll normally find that the two figures are slightly
different... but for non-sparse files the allocated space is always
the larger of the two.

In your case, though, the allocated space is slightly smaller than the
length -- so tar treats the files as sparse, which is why you are
having this problem....

410608 * 512 = 210231296, 14752 less than 210246048
402936 * 512 = 206303232, 119992 less than 206423224
etc.
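
The same comparison can be scripted. Below is a minimal sketch for a
POSIX-style shell (ksh, bash); the function names are made up for
illustration, and it assumes the first column of "ls -s" is in 512-byte
units, as on Solaris (GNU ls reports 1024-byte units by default, so the
multiplier would have to change there):

```shell
# alloc_shortfall BLOCKS LENGTH   (hypothetical helper)
# Prints LENGTH minus the allocated space, where BLOCKS is the first
# column of "ls -s" in 512-byte units. A positive result means less
# space is allocated than the file is long.
alloc_shortfall() {
    echo $(( $2 - $1 * 512 ))
}

# looks_sparse BLOCKS LENGTH   (hypothetical helper)
# Succeeds when the allocated space is smaller than the file length.
looks_sparse() {
    [ "$(alloc_shortfall "$1" "$2")" -gt 0 ]
}

alloc_shortfall 410608 210246048    # AC00312847.tif, prints 14752
alloc_shortfall 402936 206423224    # AC00312848.tif, prints 119992
```

A fully allocated file gives a zero or negative result, so looks_sparse
fails for it.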


However, when tar puts the files into the archive, it has its own
blocking factor, and it would seem that the space savings from the
sparseness in your files is so small that it's lost within that blocking
factor.  So yes, you are definitely in a lots-of-pain-and-no-gain
situation :(

Do you know how these TIF files are getting written onto your system?
You could avoid this problem if you were able to get that process
altered so that it didn't create sparse files...

If the files are static, you could consider doing a pass through to
"un-sparsify" them somehow.  For example, doing a simple "cp" seems to
produce normal files:

$ uname -a
SunOS myhost 5.9 Generic_122300-66 sun4u sparc SUNW,Netra-210
$ which cp
/usr/bin/cp
$ mkdir test1
$ echo "hi" | dd of=test1/t.t seek=10000
0+1 records in
0+1 records out
$ cp -Rp test1 test2 
$ ls -ls test1 test2                    
test1:
total 48
   48 -rw-r-----   1 x474712  other    5120003 Apr 12 18:05 t.t

test2:
total 10032
10032 -rw-r-----   1 x474712  other    5120003 Apr 12 18:05 t.t


(Note that the copy of the file found in test2/ is fully allocated.)
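
If you wanted to do that in place, one file at a time, something along
these lines might work. This is only a sketch: "unsparsify" is a
made-up helper name, and it uses "cat" for the data copy because GNU cp
(unlike the Solaris cp shown above) tries to keep sparse files sparse
by default. It also assumes nothing else is writing to the file in the
meantime, and a filesystem that stores runs of zeros specially could
still leave the result less than fully allocated.

```shell
# unsparsify FILE   (hypothetical helper, not part of Amanda or tar)
# Rewrites FILE through a temporary copy so that every byte, including
# the runs of zeros that used to be holes, is written out as real data.
unsparsify() {
    src=$1
    tmp="$src.unsparse.$$"
    # Create the temporary file with the original's mode (and owner and
    # times, where permitted).
    cp -p "$src" "$tmp" || return 1
    # Truncate it and rewrite the contents; cat reads holes back as
    # zeros and writes them as ordinary data.
    cat "$src" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Replace the original only if the copy is byte-for-byte identical.
    if cmp -s "$src" "$tmp"; then
        touch -r "$src" "$tmp"    # restore the modification time
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example use:  unsparsify AC00312847.tif
```

Running "ls -ls" on the file before and after shows whether the
allocated size actually went up.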



However, it sounds like in your particular situation the workaround of
using "amgtar" with --sparse turned off might be good enough (given
that it's actually okay for the backup to ignore the fact the original
files are sparse).
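
In amanda.conf that might look roughly like the fragment below. This
assumes an Amanda version that ships the amgtar application with its
SPARSE property; the names "app_amgtar_nosparse" and "gtar-nosparse"
are placeholders, and your real dumptype will have other settings that
need to carry over.

```
# Hypothetical names; adapt to your existing configuration.
define application-tool app_amgtar_nosparse {
    plugin "amgtar"
    # Do not pass --sparse to GNU tar.
    property "SPARSE" "NO"
}

define dumptype gtar-nosparse {
    program "APPLICATION"
    application "app_amgtar_nosparse"
}
```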



						Nathan



----------------------------------------------------------------------------
Nathan Stratton Treadway  -  nathanst@ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239