[prev in list] [next in list] [prev in thread] [next in thread]
List: coreutils-bug
Subject: bug#17505: bug#22277: 'dd' - stats are not what expected
From: Pádraig Brady <P () draigBrady ! com>
Date: 2015-12-31 18:23:16
Message-ID: 56857294.5020000 () draigBrady ! com
[Download RAW message or body]
On 31/12/15 10:18, Pádraig Brady wrote:
> unarchive 17505
> forcemerge 17505 22277
> stop
>
> On 31/12/15 01:11, Mike Fiedler wrote:
> >
> > Hi,
> >
> > I ran one of my favorite utilities 'dd' again this evening, this time with bs=1G \
> > ( IEC ) - I usually do 1M but this time I dealt with more data to be copied... I \
> > had to copy about 215 GiB of data from one to another drive ( offset 215 GiB was \
> > about the end of the last partition ). So I did:
> >
> > $ dd if=/dev/sdb of=/dev/sda bs=*1G* count=*222*
> > 222+0 records in
> > 222+0 records out
> > 238370684928 bytes (*238 GB*) copied, 1275.03 s, 187 MB/s
> >
> > When it finished, I got a bit confused, and I asked myself a question if the data \
> > I requested did really get copied.. of course it did, but I was not expecting \
> > 238 GB to be shown. To make sure I calculated the 512 byte sector end number out \
> > of the 238370684928 bytes 'dd' result and compared it with the output of fdisk \
> > showing the last sector of the last partition... I was fine.
> > I think, and many others have a same opinion, 1kB = 1000B, etc, should be banned \
> > from use in the IT world, and banned from use by the sales people.
> > The point is, as you probably noticed, if dd is told to use IEC, let's stick to \
> > IEC and not get the results in whatever artificial decimal crap.... It can not \
> > only confuse, but utility like 'dd' should be 100% specific about handling the \
> > units, and there should be not a bit of doubt when it spits out the results. If I \
> > would use 1K in this case, I would not notice the difference - my brain is simply \
> > too simple, and small, but 1G should at least result in displaying 222 GiB and \
> > for sure not GB.
>
> I have to agree, and this has come up a few times now.
>
> The number in brackets is not exact and informational for human consumption,
> so we should make an effort to be less confusing.
> There was a proposed patch at:
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17505#37
> which auto determines the appropriate base from the amount output,
> to output the number with the least amount of info loss.
>
> There were some issues noted with that,
> but IMHO they were lesser than the current issue.
>
> We will have to be careful to not corrupt output
> when switching with status=progress (due to possibly shorter status line).
>
> I'll have another look.
The attached auto sets the units.
For status=progress this is done based on output block size,
for the final transfer stats, it's done based on the transferred byte count.
cheers,
Pádraig.
["dd-stats-units.patch" (text/x-patch)]
From 5885398ed2252af1ce433232e00933f15e0a318f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pádraig Brady?= <P@draigBrady.com>
Date: Fri, 16 May 2014 10:32:43 +0100
Subject: [PATCH] dd: output base 1024 transfer counts in IEC units
* src/dd.c (human_size): Add "base" and "from/to block size" params
so we can reuse this function for all output size conversions.
(print_xfer_stats): Use IEC units for status=progress output
when the obs is a multiple of 2 and not a multiple of 10.
Also use IEC units for the final status summary when the
number of output bytes is a power of 1024, and not a power of 1000.
* tests/dd/stats.sh: Update to test status units.
* NEWS: Mention the change in behavior.
Fixes http://bugs.gnu.org/17505
---
NEWS | 2 ++
src/dd.c | 44 +++++++++++++++++++++++++++++---------------
tests/dd/stats.sh | 10 +++++++---
3 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/NEWS b/NEWS
index 5080f0f..f7d219e 100644
--- a/NEWS
+++ b/NEWS
@@ -45,6 +45,8 @@ GNU coreutils NEWS -*- outline -*-
date --iso-8601 now uses +00:00 timezone format rather than +0000.
The standard states to use this "extended" format throughout a timestamp.
+ dd now uses IEC units if appropriate for the human readable byte count status.
+
df now prefers sources towards the root of a device when
eliding duplicate bind mounted entries.
diff --git a/src/dd.c b/src/dd.c
index a5557a8..870e938 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -674,13 +674,23 @@ Options are:\n\
}
static char *
-human_size (size_t n)
+human_size (size_t n, size_t base, uintmax_t from_bs, uintmax_t to_bs)
{
static char hbuf[LONGEST_HUMAN_READABLE + 1];
+
int human_opts - (human_autoscale | human_round_to_nearest | human_base_1024
+ (human_autoscale | human_round_to_nearest
| human_space_before_unit | human_SI | human_B);
- return human_readable (n, hbuf, human_opts, 1, 1);
+
+ if (! base)
+ {
+ if ((n % 1000) && ! (n % 1024))
+ human_opts |= human_base_1024;
+ }
+ else if (base == 1024)
+ human_opts |= human_base_1024;
+
+ return human_readable (n, hbuf, human_opts, from_bs, to_bs);
}
/* Ensure input buffer IBUF is allocated. */
@@ -695,7 +705,8 @@ alloc_ibuf (void)
if (!real_buf)
error (EXIT_FAILURE, 0,
_("memory exhausted by input buffer of size %"PRIuMAX" bytes (%s)"),
- (uintmax_t) input_blocksize, human_size (input_blocksize));
+ (uintmax_t) input_blocksize,
+ human_size (input_blocksize, 1024, 1, 1));
real_buf += SWAB_ALIGN_OFFSET; /* allow space for swab */
@@ -718,7 +729,8 @@ alloc_obuf (void)
error (EXIT_FAILURE, 0,
_("memory exhausted by output buffer of size %"PRIuMAX
" bytes (%s)"),
- (uintmax_t) output_blocksize, human_size (output_blocksize));
+ (uintmax_t) output_blocksize,
+ human_size (output_blocksize, 1024, 1, 1));
obuf = ptr_align (real_obuf, page_size);
}
else
@@ -749,16 +761,19 @@ multiple_bits_set (int i)
/* Print transfer statistics. */
static void
-print_xfer_stats (xtime_t progress_time) {
- char hbuf[LONGEST_HUMAN_READABLE + 1];
- int human_opts - (human_autoscale | human_round_to_nearest
- | human_space_before_unit | human_SI | human_B);
+print_xfer_stats (xtime_t progress_time)
+{
double delta_s;
char const *bytes_per_second;
+ size_t w_bytes_base = 0; /* auto for final stats. */
if (progress_time)
- fputc ('\r', stderr);
+ {
+ fputc ('\r', stderr);
+ w_bytes_base = 1000;
+ if ((output_blocksize % 10) && ! (output_blocksize % 2))
+ w_bytes_base = 1024;
+ }
/* Use integer arithmetic to compute the transfer rate,
since that makes it easy to use SI abbreviations. */
@@ -767,8 +782,7 @@ print_xfer_stats (xtime_t progress_time) {
ngettext ("%"PRIuMAX" byte (%s) copied",
"%"PRIuMAX" bytes (%s) copied",
select_plural (w_bytes)),
- w_bytes,
- human_readable (w_bytes, hbuf, human_opts, 1, 1));
+ w_bytes, human_size (w_bytes, w_bytes_base, 1, 1));
xtime_t now = progress_time ? progress_time : gethrxtime ();
@@ -778,8 +792,8 @@ print_xfer_stats (xtime_t progress_time) {
uintmax_t delta_xtime = now;
delta_xtime -= start_time;
delta_s = delta_xtime / XTIME_PRECISIONe0;
- bytes_per_second = human_readable (w_bytes, hbuf, human_opts,
- XTIME_PRECISION, delta_xtime);
+ bytes_per_second = human_size (w_bytes, 1000, XTIME_PRECISION,
+ delta_xtime);
}
else
{
diff --git a/tests/dd/stats.sh b/tests/dd/stats.sh
index da2c2d2..7b7c856 100755
--- a/tests/dd/stats.sh
+++ b/tests/dd/stats.sh
@@ -63,11 +63,15 @@ for open in '' '1'; do
grep '250000000 bytes .* copied' err || { cat err; fail=1; }
done
+# Check progress is output, and check that the correct
+# IEC/SI units are used both for the progress and final status output.
progress_output()
{
- { sleep "$1"; echo 1; } | dd bs=1 status=progress of=/dev/null 2>err
- # Progress output should be for "byte ... copied", while final is "bytes ..."
- grep 'byte .* copied' err
+ { sleep "$1"; dd status=none bs24 count=1 if=/dev/zero; } |
+ dd bs00 of=/dev/null status=progress 2>err
+
+ grep -F '1000 bytes (1.0 kB) copied' err && # Progress line
+ grep -F '1024 bytes (1.0 KiB) copied' err # Final status line
}
retry_delay_ progress_output 1 4 || { cat err; fail=1; }
--
2.5.0
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic