[prev in list] [next in list] [prev in thread] [next in thread]
List: tar-bug
Subject: Re: Disk read error interpreted as "File shrank" + not saying it is "Padding with zeros", but padds
From: Ondrej Dubaj <odubaj () redhat ! com>
Date: 2021-04-12 5:23:02
Message-ID: CAE6EeJNfgCn9HzSzDsMj=Vapicgia-qtP72vfBSPH1YjD-FjZg () mail ! gmail ! com
[Download RAW message or body]
Hello, any updates here?
Is this a real issue or a misunderstanding?
Thank you. Regards,
Ondrej
On Tue, Mar 30, 2021 at 4:09 PM Ondrej Dubaj <odubaj@redhat.com> wrote:
> gentle ping
>
> On Fri, Mar 19, 2021 at 12:22 PM Ondrej Dubaj <odubaj@redhat.com> wrote:
>
> > Hello,
> > any update here?
> >
> > Thanks.
> > Ondrej
> >
> > On Mon, Mar 1, 2021 at 11:05 AM Ondrej Dubaj <odubaj@redhat.com> wrote:
> >
> > > Ping, any updates here?
> > >
> > > Thanks.
> > >
> > > On Mon, Feb 15, 2021 at 5:07 PM Ondrej Dubaj <odubaj@redhat.com> wrote:
> > >
> > > > Gentle ping
> > > >
> > > > On Mon, Jan 18, 2021 at 12:02 PM Ondrej Dubaj <odubaj@redhat.com>
> > > > wrote:
> > > >
> > > > > One of the customer faced I/O errors while archiving a huge file 11 TB and \
> > > > > observed that after Tar had hit read I/O error due to xfs filesystem, it \
> > > > > still continue writing 0's to the file using strace. However there was no \
> > > > > indication for tar that it was writing 0's when the error occurred.
> > > > > Later it was found that it is expected behavior to write 0's as the file \
> > > > > header is already written. Hence, it need to be padded with 0's.
> > > > > Using the reproducing steps provided by customer, we can see this behavior.
> > > > >
> > > > > Padding 0's is expected behavior however it does so silently (for Read \
> > > > > error at byte...), it should say it is Padding with zeros similar to how it \
> > > > > reports "File Shrank , padding with zeroes"
> > > > > During the reproducer steps provided by customer we see that sometimes tar \
> > > > > report "Read I/O errors" as "File shrank, padding with 0" , we see in the \
> > > > > step(2) provided.
> > > > > Reproducer available here:
> > > > >
> > > > > #!/bin/bash
> > > > > # Reproducer "tardust"
> > > > > #
> > > > > # When "tar create" reads a file there are several shortcomings when it \
> > > > > hits read error #
> > > > > # 1) When read() returns 0 bytes due to read error, then this happens
> > > > > # read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # read(4, 0x563adef7b000, 3584) = -1 EIO (Input/output error)
> > > > > # write(2, "tar: ", 5tar: ) = 5
> > > > > # write(2, "/mntx/testfile: Read error at by"..., 70/mntx/testfile: Read \
> > > > > error at byte 260653056, while reading 3584 bytes) = 70 # write(2, ": \
> > > > > Input/output error", 20: Input/output error) = 20 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # Actual behaviour: it prints a message about "Read error", \
> > > > > but it conceals the information it will pad the output with zeros # \
> > > > > Expected behaviour: it should also print the information "padding with \
> > > > > zero" # 2) There is a 2nd shortcoming about tar not differentiate between \
> > > > > "read error" and "file shrinkage" # That means when it sees a short read \
> > > > > due to read error, it does not report read error. # It looks like this:
> > > > > # read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # read(4, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 2560 <<< HERE # write(2, "tar: ", 5tar: ) = 5
> > > > > # write(2, "/mntx/testfile: File shrank by 5"..., 65/mntx/testfile: File \
> > > > > shrank by 53927936 bytes; padding with zeros) = 65 # write(2, "\n", 1
> > > > > # ) = 1
> > > > > # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # Summary: A read error is not reported here. At least it now \
> > > > > says "padding with zeros" # Expected behaviour: it should report a read \
> > > > > error, so the user knows what it going on. #
> > > > > # 3) Side-Note:
> > > > > # The blocking factor is applied to the output. When reading a file, all \
> > > > > reads are misaligned by 512 bytes. # This is because it writes a 512 header \
> > > > > for every archived file. # That means the first read from file is 512bytes \
> > > > > too short: # Running with tar-blocking-factor=7
> > > > > # fstat(1, {st_mode=S_IFREG|0644, st_size=17827, ...}) = 0
> > > > > # write(1, "/mntx/testfile\n", 15/mntx/testfile
> > > > > # ) = 15
> > > > > # read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3072) = 3072 #1st read 512bytes too short # write(3, \
> > > > > "mntx/testfile\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584 # \
> > > > > read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 # write(3, \
> > > > > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
> > > > > 3584) = 3584 #
> > > > > # 4) Reproducer overview:
> > > > > # - Create a 500MB testimage, then create a testfile1 in the image
> > > > > # - Use losetup/dmsetup with the "dust" target type
> > > > > # - you can inject IO errors at specified block number in "dust"
> > > > > # - You must hit a 4K boundary to see EIO, so use tar-blocking-factor=7 and
> > > > > # - vary the bad blocknumber to find the case (1)
> > > > > echo Step 1 Create disk image
> > > > > dd if=/dev/zero of=/tmp/testimage bs=1M count=500 || exit
> > > > > echo Step 2 Create XFS in image
> > > > > mkfs.xfs /tmp/testimage || exit
> > > > > echo Step 3 Use losetup so the file can be used a block device
> > > > > losetup /dev/loop1 /tmp/testimage || exit
> > > > > losetup
> > > > > echo Step 6 Now create the testfile, this will have read error injected \
> > > > > later mkdir /mntx
> > > > > mount /dev/loop1 /mntx || exit
> > > > > dd if=/dev/zero of=/mntx/testfile bs=1M count=300 || exit
> > > > > umount /mntx
> > > > > echo Step7 Now iterating through bad blocks
> > > > > echo As result, there are strace output file a1000 ... a1040
> > > > > for i in `seq 1000 1 1040`
> > > > > do
> > > > > echo
> > > > > echo Badblock $i
> > > > > let ERR=i
> > > > > let ERR1=i+1
> > > > > let NUMSECTOR2=1024000-ERR1
> > > > > #echo ERR1 is $ERR1
> > > > > #echo NUMSECTOR2 is $NUMSECTOR2
> > > > > dmsetup create tardust <<EOF
> > > > > 0 $ERR linear /dev/loop1 0
> > > > > $ERR 1 error
> > > > > $ERR1 $NUMSECTOR2 linear /dev/loop1 $ERR1
> > > > > EOF
> > > > > #dmsetup ls
> > > > > #dmsetup status
> > > > > #dmsetup table
> > > > > mount /dev/mapper/tardust /mntx || exit
> > > > > strace tar cvbf 7 /tmp/tardust.tar /mntx/testfile >&/tmp/a$i
> > > > > umount /mntx
> > > > > dmsetup remove tardust
> > > > > grep -e error -e shrank /tmp/a$i
> > > > > done
> > > > > echo "Done: inspect the strace output file for error behaviour (grep error \
> > > > > ; Look at last read()-call )" losetup -d /dev/loop1
> > > > >
> > > > > =================
> > > > >
> > > > > Actual results:
> > > > > - When tar hits a disk read error when reading file from disk and creating \
> > > > > an archive, then it prints "file shrank"
> > > > > - then it writes zeros (aka padding) according to initial file size (but \
> > > > > does not print that message)
> > > > > - This happens in most cases (due to tar-block-size / disk-block-size / \
> > > > > read-shift-by-512-bytes interaction)
> > > > > - I provided a reproducer which shows under which circumstances it \
> > > > > correctly prints "Read error at byte…"
> > > > > Expected results:
> > > > > - When there is a read error, THEN tar shall report a read error
> > > > > - When there is a read error, THEN tar shall NOT report a "file shrank"
> > > > > - In addition it SHALL print "Padding with zeros". This is missing \
> > > > > currently.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Ondrej Dubaj
> > > > >
> > > > >
[Attachment #3 (text/html)]
<div dir="ltr"><div>Hello, any updates here?</div><div><br></div><div>Is this a real \
issue or a misunderstanding?</div><div><br></div><div>Thank you. \
Regards,</div><div><br></div><div>Ondrej<br></div></div><br><div \
class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Mar 30, 2021 at 4:09 PM \
Ondrej Dubaj <<a href="mailto:odubaj@redhat.com">odubaj@redhat.com</a>> \
wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px \
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">gentle \
ping<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, \
Mar 19, 2021 at 12:22 PM Ondrej Dubaj <<a href="mailto:odubaj@redhat.com" \
target="_blank">odubaj@redhat.com</a>> wrote:<br></div><blockquote \
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid \
rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hello,</div><div>any update \
here?</div><div><br></div><div>Thanks.</div><div>Ondrej<br></div></div><br><div \
class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 1, 2021 at 11:05 AM \
Ondrej Dubaj <<a href="mailto:odubaj@redhat.com" \
target="_blank">odubaj@redhat.com</a>> wrote:<br></div><blockquote \
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid \
rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Ping, any updates \
here?</div><div><br></div><div>Thanks.<br></div></div><br><div \
class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 15, 2021 at 5:07 PM \
Ondrej Dubaj <<a href="mailto:odubaj@redhat.com" \
target="_blank">odubaj@redhat.com</a>> wrote:<br></div><blockquote \
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid \
rgb(204,204,204);padding-left:1ex"><div dir="ltr">Gentle ping<br></div><br><div \
class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jan 18, 2021 at 12:02 \
PM Ondrej Dubaj <<a href="mailto:odubaj@redhat.com" \
target="_blank">odubaj@redhat.com</a>> wrote:<br></div><blockquote \
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid \
rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><pre \
id="gmail-m_6552418225091646724gmail-m_4667421641547113038gmail-m_-4947163951477357887 \
gmail-m_-6729866695184718752gmail-m_-7846508353099645584gmail-comment_text_0">One of \
the customer faced I/O errors while archiving a huge file 11 TB and observed that \
after Tar had hit read I/O error due to xfs filesystem, it still continue writing \
0's to the file using strace. However there was no indication for tar that it was \
writing 0's when the error occurred.
Later it was found that it is expected behavior to write 0's as the file header \
is already written. Hence, it need to be padded with 0's.
Using the reproducing steps provided by customer, we can see this behavior.
Padding 0's is expected behavior however it does so silently (for Read error at \
byte...), it should say it is Padding with zeros similar to how it reports "File \
Shrank , padding with zeroes"<br><br>During the reproducer steps provided by \
customer we see that sometimes tar report "Read I/O errors" as "File \
shrank, padding with 0" , we see in the step(2) provided.<br><br></pre><pre \
id="gmail-m_6552418225091646724gmail-m_4667421641547113038gmail-m_-4947163951477357887 \
gmail-m_-6729866695184718752gmail-m_-7846508353099645584gmail-comment_text_0">Reproducer \
available here:<br><br>#!/bin/bash # Reproducer "tardust"
#
# When "tar create" reads a file there are several shortcomings when it \
hits read error #
# 1) When read() returns 0 bytes due to read error, then this happens
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # read(4, 0x563adef7b000, 3584) = -1 EIO (Input/output error)
# write(2, "tar: ", 5tar: ) = 5
# write(2, "/mntx/testfile: Read error at by"..., 70/mntx/testfile: Read \
error at byte 260653056, while reading 3584 bytes) = 70 # write(2, ": \
Input/output error", 20: Input/output error) = 20 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # Actual behaviour: it prints a message about "Read error", \
but it conceals the information it will pad the output with zeros # Expected \
behaviour: it should also print the information "padding with zero" # 2) \
There is a 2nd shortcoming about tar not differentiate between "read error" \
and "file shrinkage" # That means when it sees a short read due to read \
error, it does not report read error. # It looks like this:
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # read(4, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 2560 <<< HERE # write(2, "tar: ", 5tar: ) = 5
# write(2, "/mntx/testfile: File shrank by 5"..., 65/mntx/testfile: File \
shrank by 53927936 bytes; padding with zeros) = 65 # write(2, "\n", 1
# ) = 1
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # Summary: A read error is not reported here. At least it now says \
"padding with zeros" # Expected behaviour: it should report a read error, \
so the user knows what it going on. #
# 3) Side-Note:
# The blocking factor is applied to the output. When reading a file, all reads are \
misaligned by 512 bytes. # This is because it writes a 512 header for every archived \
file. # That means the first read from file is 512bytes too short:
# Running with tar-blocking-factor=7
# fstat(1, {st_mode=S_IFREG|0644, st_size=17827, ...}) = 0
# write(1, "/mntx/testfile\n", 15/mntx/testfile
# ) = 15
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3072) = 3072 #1st read 512bytes too short # write(3, \
"mntx/testfile\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584 # \
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 # write(3, \
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., \
3584) = 3584 #
# 4) Reproducer overview:
# - Create a 500MB testimage, then create a testfile1 in the image
# - Use losetup/dmsetup with the "dust" target type
# - you can inject IO errors at specified block number in "dust"
# - You must hit a 4K boundary to see EIO, so use tar-blocking-factor=7 and
# - vary the bad blocknumber to find the case (1)
echo Step 1 Create disk image
dd if=/dev/zero of=/tmp/testimage bs=1M count=500 || exit
echo Step 2 Create XFS in image
mkfs.xfs /tmp/testimage || exit
echo Step 3 Use losetup so the file can be used a block device
losetup /dev/loop1 /tmp/testimage || exit
losetup
echo Step 6 Now create the testfile, this will have read error injected later
mkdir /mntx
mount /dev/loop1 /mntx || exit
dd if=/dev/zero of=/mntx/testfile bs=1M count=300 || exit
umount /mntx
echo Step7 Now iterating through bad blocks
echo As result, there are strace output file a1000 ... a1040
for i in `seq 1000 1 1040`
do
echo
echo Badblock $i
let ERR=i
let ERR1=i+1
let NUMSECTOR2=1024000-ERR1
#echo ERR1 is $ERR1
#echo NUMSECTOR2 is $NUMSECTOR2
dmsetup create tardust <<EOF
0 $ERR linear /dev/loop1 0
$ERR 1 error
$ERR1 $NUMSECTOR2 linear /dev/loop1 $ERR1
EOF
#dmsetup ls
#dmsetup status
#dmsetup table
mount /dev/mapper/tardust /mntx || exit
strace tar cvbf 7 /tmp/tardust.tar /mntx/testfile >&/tmp/a$i
umount /mntx
dmsetup remove tardust
grep -e error -e shrank /tmp/a$i
done
echo "Done: inspect the strace output file for error behaviour (grep error ; \
Look at last read()-call )" losetup -d /dev/loop1
=================
<br><br>Actual results:
- When tar hits a disk read error when reading file from disk and creating an \
archive, then it prints "file shrank"<br>- then it writes zeros (aka \
padding) according to initial file size (but does not print that \
message)
- This happens in most cases (due to tar-block-size / disk-block-size / \
read-shift-by-512-bytes interaction)
- I provided a reproducer which shows under which circumstances it correctly prints \
"Read error at byte…"
Expected results:
- When there is a read error, THEN tar shall report a read error
- When there is a read error, THEN tar shall NOT report a "file \
shrank"<br>- In addition it SHALL print "Padding with zeros". This is \
missing currently.<br><br><br></pre><pre \
id="gmail-m_6552418225091646724gmail-m_4667421641547113038gmail-m_-4947163951477357887 \
gmail-m_-6729866695184718752gmail-m_-7846508353099645584gmail-comment_text_0">Regards,<br><br></pre><pre \
id="gmail-m_6552418225091646724gmail-m_4667421641547113038gmail-m_-4947163951477357887 \
gmail-m_-6729866695184718752gmail-m_-7846508353099645584gmail-comment_text_0">Ondrej \
Dubaj<br></pre></div></div> </blockquote></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic