[prev in list] [next in list] [prev in thread] [next in thread]
List: busybox
Subject: Re: [PATCH 1/1] xargs: support -P <N> (parallel execution)
From: Denys Vlasenko <vda.linux () googlemail ! com>
Date: 2017-08-28 11:39:50
Message-ID: CAK1hOcND-4BtVmJb57jLvYKPpiWHLwsF27VGRBwJJ4RyMGfRUw () mail ! gmail ! com
[Download RAW message or body]
On Fri, Aug 25, 2017 at 1:56 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>> {
>> int status;
>>
>> +#if !ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
>> status = spawn_and_wait(G.args);
>> +#else
>> + if (G.max_procs == 1) {
>> + status = spawn_and_wait(G.args);
>> + } else {
>> + pid_t pid;
>> + int wstat;
>> + again:
>> + if (G.running_procs >= G.max_procs)
>> + pid = safe_waitpid(-1, &wstat, 0);
>> + else
>> + pid = wait_any_nohang(&wstat);
>
> What if running_procs == max_procs == 0? In that case, we should avoid
> calling either safe_waitpid() or wait_any_nohang(). Maybe instead
> something like
>
> + if (G.running_procs < G.max_procs)
> + pid = wait_any_nohang(&wstat);
> + else if (G.running_procs > 0)
> + pid = safe_waitpid(-1, &wstat, 0);
>
> But then, the wait_any_nohang() call is still unnecessary if running_procs
> == 0. But you want to save space, not necessarily make the thing more
> performant, so...
Exactly. If G.running_procs == 0, waitting can be skipped.
But it's harmless.
>> + if (pid > 0) {
>> + /* We may have children we don't know about:
>> + * sh -c 'sleep 1 & exec xargs ...'
>> + * Do not make G.running_procs go negative.
>> + */
>
> This hints very eloquently at the problem I am concerned about. But I
> think it is worse than just running_procs going negative. You could easily
> mistake child processes for having exited when they have not, including
> possibly failing children (and hence the exit status can be all wrong).
Having "unexpected children" is not typical.
Therefore we don't need to play nice with them: we only need to
avoid doing something catastrophic (such as crashing or looping forever).
>> + if (G.running_procs != 0)
>> + G.running_procs--;
>> + status = WIFSIGNALED(wstat)
>> + ? 0x180 + WTERMSIG(wstat)
>> + : WEXITSTATUS(wstat);
>> + if (status > 0 && status < 255) {
>> + /* See below why 123 does not abort */
>> + G.xargs_exitcode = 123;
>> + status = 0;
>> + }
>> + if (status == 0)
>> + goto again; /* maybe we have more children? */
>
> I think this logic still needs some love:
>
> - if running_procs is 0, there is no need to go at it again
Well, we are here only if pid > 0. This means, there *was* a child.
> - if running_procs is greater than 0, we still have to wait for the
> remaining children to exit, even if one child exited due to a signal.
Well, actually no, we don't have to wait for all children.
If one exited due to a signal, xargs can immediately exit with 125.
All remaining children will be reaped by init.
Anyway, code does wait for all children, at the end of main:
G.max_procs = 0;
xargs_exec(); /* final waitpid() loop */
>> + /* Manpage:
>> + * """xargs exits with the following status:
>> + * 0 if it succeeds
>> + * 123 if any invocation of the command exited with status 1-125
>> + * 124 if the command exited with status 255
>> + * ("""If any invocation of the command exits with a status of 255,
>> + * xargs will stop immediately without reading any further input.
>> + * An error message is issued on stderr when this happens.""")
>> + * 125 if the command is killed by a signal
>> + * 126 if the command cannot be run
>> + * 127 if the command is not found
>> + * 1 if some other error occurred."""
>> + */
>> if (status < 0) {
>> bb_simple_perror_msg(G.args[0]);
>> - return errno == ENOENT ? 127 : 126;
>> - }
>> - if (status == 255) {
>> - bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
>> - return 124;
>> + status = (errno == ENOENT) ? 127 : 126;
>> }
>> - if (status >= 0x180) {
>> + else if (status >= 0x180) {
>> bb_error_msg("'%s' terminated by signal %d",
>> G.args[0], status - 0x180);
>> - return 125;
>> + status = 125;
>> }
>> - if (status)
>> - return 123;
>> - return 0;
>> + else if (status != 0) {
>> + if (status == 255) {
>> + bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
>> + return 124;
>> + }
>> + /* "123 if any invocation of the command exited with status 1-125"
>> + * This implies that nonzero exit code is remembered,
>> + * but does not cause xargs to stop: we return 0.
>> + */
>> + G.xargs_exitcode = 123;
>> + status = 0;
>> + }
>> +
>> + if (status != 0)
>> + G.xargs_exitcode = status;
>
> Is it always safe to override the exitcode, even if it has already been
> set to something non-zero?
It can only be 123 if we continue xargs'ing. All other exitcodes make xargs
stop immediately.
It's not defined what to do if more than one stopping condition was seen,
I think it's okay to just use the latest status.
>> +#if ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
>> + if (G.max_procs <= 0) /* -P0 means "run lots of them" */
>> + G.max_procs = 100; /* let's not go crazy high */
>
> That is quite a bit arbitrary. I understand that you want to abuse
> max_procs == 0 to imply the final wrap-up, but 100 is such a non-digital
> number. If you must impose an artificial limit, why not 64, or 256, or
> 1024?
Yes, it's arbitrary. 100 would not fork bomb even a smallish machine,
yet gives a large speed boost on large SMP. 1024 might fork bomb.
63 might be a tiny bit better choice. "Store constant" insn on fixed width
insn arches usually has a restriction how wide constant can be.
63 is a maximum fitting in 6 bits.
I think the best value is `nproc`. Unfortunately, it's a bit more code.
Whoever cares about that, should use -P`nproc` in the script, I guess.
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic