List:       busybox
Subject:    Re: tar excludes files too late to stop hardlink detection
From:       Harald van Dijk <harald () gigawatt ! nl>
Date:       2021-06-27 14:15:18
Message-ID: 69000789-f322-70e0-18f6-aa0ea34d0212 () gigawatt ! nl

On 26/06/2021 00:36, Harald van Dijk wrote:
> Hi,
> 
> tar --exclude results in bad archives when hardlinks are used. Consider 
> the following:
> 
>    $ mkdir tartest
>    $ echo hello > tartest/a
>    $ ln tartest/a tartest/b
>    $ busybox tar cf - tartest | tar tvf -
>    drwxr-xr-x harald/harald     0 2021-06-26 00:25 tartest/
>    -rw-r--r-- harald/harald     6 2021-06-26 00:25 tartest/b
>    hrw-r--r-- harald/harald     0 2021-06-26 00:25 tartest/a link to tartest/b
> 
> This is okay. tar may either pick up a first and then detect b as a 
> hardlink to a, or pick up b first and then detect a as a hardlink to b. 
> On my system, it picks up b first. You can adjust the below accordingly 
> if on your system a is picked up first. Now, exclude b:
> 
>    $ busybox tar cf - --exclude=b tartest | tar tvf -
>    drwxr-xr-x harald/harald     0 2021-06-26 00:25 tartest/
>    hrw-r--r-- harald/harald     0 2021-06-26 00:25 tartest/a link to tartest/b
> 
> This results in an archive in which the contents of tartest/a are missing.
> Extracting the archive results in an attempt to hardlink to tartest/b,
> which may or may not exist in the target directory. GNU tar does not do
> this; it stores the contents of the file instead, which seems like a
> better idea to me. Can busybox be modified to do that as well?
> 
> Tested with busybox 1.33.1.

The fix seems to be trivial: the exclude check just has to run before the
hardlink info list is updated. Please see the attached patch.
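
To illustrate why the order of the two checks matters, here is a minimal,
self-contained model. It is only a sketch and not the busybox code: the
struct, the fixed-size "seen" table and the functions lookup_or_record()
and archive() are hypothetical names made up for this example, and the
exclude match is a plain string comparison rather than a real pattern match.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model; NOT the actual busybox implementation. */
struct entry {
	const char *name;
	int inode;
};

#define MAX_SEEN 16
static int seen_inode[MAX_SEEN];
static const char *seen_name[MAX_SEEN];
static int seen_count;

/* Return the name first recorded for this inode, or NULL after recording it. */
static const char *lookup_or_record(const struct entry *e)
{
	for (int i = 0; i < seen_count; i++)
		if (seen_inode[i] == e->inode)
			return seen_name[i];
	if (seen_count < MAX_SEEN) {
		seen_inode[seen_count] = e->inode;
		seen_name[seen_count] = e->name;
		seen_count++;
	}
	return NULL;
}

/*
 * "Archive" the entries in order, skipping the one named `excluded`.
 * exclude_first == 0 models the old order (record hardlink info, then
 * exclude); exclude_first != 0 models the patched order.
 * Returns the number of hardlink entries whose target was excluded.
 */
static int archive(const struct entry *files, int n,
		const char *excluded, int exclude_first)
{
	int dangling = 0;

	seen_count = 0;
	for (int i = 0; i < n; i++) {
		const struct entry *e = &files[i];

		if (exclude_first && strcmp(e->name, excluded) == 0)
			continue;

		const char *target = lookup_or_record(e);

		if (!exclude_first && strcmp(e->name, excluded) == 0)
			continue; /* too late: the inode is already recorded */

		if (target) {
			printf("%s: hardlink to %s\n", e->name, target);
			if (strcmp(target, excluded) == 0)
				dangling++;
		} else {
			printf("%s: regular file with contents\n", e->name);
		}
	}
	return dangling;
}
```

With the entries "b" and "a" sharing one inode and "b" excluded, calling
archive() with exclude_first set to 0 stores "a" as a link to the missing
"b", while exclude_first set to 1 stores "a" as a regular file.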

Cheers,
Harald van Dijk
["0001-tar-exclude-files-before-updating-hardlink-info-list.patch" (text/x-patch)]

From 5d0451656ace0c21454baf1ef65bed51c647df90 Mon Sep 17 00:00:00 2001
From: Harald van Dijk <harald@gigawatt.nl>
Date: Sun, 27 Jun 2021 15:11:57 +0100
Subject: [PATCH] tar: exclude files before updating hardlink info list

When excluding one file, and including another file that is a hardlink
of the excluded file, it should be stored as an ordinary file.
---
 archival/tar.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/archival/tar.c b/archival/tar.c
index 4a540b77a..1f257958f 100644
--- a/archival/tar.c
+++ b/archival/tar.c
@@ -507,6 +507,9 @@ static int FAST_FUNC writeFileToTarball(struct recursive_state *state,
 	if (header_name[0] == '\0')
 		return TRUE;
 
+	if (exclude_file(tbInfo->excludeList, header_name))
+		return SKIP;
+
 	/* It is against the rules to archive a socket */
 	if (S_ISSOCK(statbuf->st_mode)) {
 		bb_error_msg("%s: socket ignored", fileName);
@@ -540,9 +543,6 @@ static int FAST_FUNC writeFileToTarball(struct recursive_state *state,
 		return TRUE;
 	}
 
-	if (exclude_file(tbInfo->excludeList, header_name))
-		return SKIP;
-
 # if !ENABLE_FEATURE_TAR_GNU_EXTENSIONS
 	if (strlen(header_name) >= NAME_SIZE) {
 		bb_simple_error_msg("names longer than "NAME_SIZE_STR" chars not supported");
-- 
2.31.1



_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
