[prev in list] [next in list] [prev in thread] [next in thread] 

List:       busybox
Subject:    busybox ash segfault on invalid substitutions
From:       Harald van Dijk <harald () gigawatt ! nl>
Date:       2022-11-23 23:25:09
Message-ID: 9be97a87-ad69-9ca7-f48b-9178e8ae2617 () gigawatt ! nl
[Download RAW message or body]

Hi,

Over on the dash mailing list, Christoph Anton Mitterer reported a 
segfault in dash when dealing with invalid substitutions: 
<https://lore.kernel.org/dash/fe9be702c676b43bba8e0136a94583f919a05a66.camel@scientia.org/T/#t>

busybox ash being based on dash, despite the segmentation fault not 
triggering there with the original script, it can be triggered in 
busybox ash with a different script as well. I am reporting this here as 
requested in that thread. The below is with a build with 'make 
defconfig', except CONFIG_DEBUG=y and CONFIG_DEBUG_PESSIMIZE=y.

$ gdb --args ./busybox_unstripped sh -c 'f() { echo ${PWD-${PWD!}}; }; f'
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
     <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./busybox_unstripped...
(gdb) run
Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ 
\{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00348eab in argstr (p=0x1 <error: Cannot access memory at address 
0x1>, flag=1089) at shell/ash.c:6819
6819			if (*p == '~')
(gdb)

This happens because invalid substitutions (${PWD!}) are encoded using a 
null byte, but function definitions treat node text as C-style strings 
terminated by the first null byte, so we end up accessing the duplicated 
node text past the end of the buffer:

(gdb) b shell/ash.c:9148
Breakpoint 1 at 0x34ca03: file shell/ash.c, line 9148.
(gdb) run
Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ 
\{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, copynode (n=0x3923ec) at shell/ash.c:9148
9148			new->narg.text = nodeckstrdup(n->narg.text);
(gdb) cont
Continuing.

Breakpoint 1, copynode (n=0x39240c) at shell/ash.c:9148
9148			new->narg.text = nodeckstrdup(n->narg.text);
(gdb) p n->narg.text
$1 = 0x3923fc "\202\002PWD=\202"
(gdb) step
nodeckstrdup (s=0xffffd1c8 "") at shell/ash.c:9059
9059	{
(gdb) next
9060		funcstring_end -= SHELL_ALIGN(strlen(s) + 1);
(gdb)
9061		return strcpy(funcstring_end, s);
(gdb)
9062	}

Here, \202 is CTLVAR. We can see that n->narg.text ends in CTLVAR 
followed by a null byte, and it is copied using strlen() and strcpy(), 
so any bytes after that null byte will be left out.

There are two possible ways of fixing this, depending on the intended 
behaviour. Nothing has yet been said on the list to definitively know 
what the dash intended behaviour here is, but regardless, busybox may 
choose to act now.

1: If the intended behaviour is to raise an error:

--- a/shell/ash.c
+++ b/shell/ash.c
@@ -7465,9 +7465,6 @@ varvalue(char *name, int varflags, int flags, int 
quoted)
         int discard = (subtype == VSPLUS || subtype == VSLENGTH) | 
(flags & EXP_DISCARD);

         if (!subtype) {
-               if (discard)
-                       return -1;
-
                 ifsfree();
                 raise_error_syntax("bad substitution");
         }

This preserves the copynode() behaviour of cutting off the word, but it 
is okay as now a null byte is guaranteed to terminate the expansion.

2: If the intended behaviour is to ignore the invalid substitution as 
long as it is skipped:

--- a/shell/ash.c
+++ b/shell/ash.c
@@ -12981,6 +12981,8 @@ parsesub: {
                         synstack->dblquote = newsyn != BASESYNTAX;
                 }

+               if (subtype == 0)
+                       subtype = VSNUL;
                 ((unsigned char *)stackblock())[typeloc] = subtype;
                 if (subtype != VSNORMAL) {
                         synstack->varnest++;

This encodes invalid substitutions using VSNUL, which when masked with 
VSTYPE will result in 0 like before, but does not result in all bits 
zero, so does not terminate the string.

I know my mail client will have mangled the formatting. Apologies for that.

I am not expecting either of these patches to be applied as is anyway, 
at this point I am more interested in getting the busybox views on 
whether either fix is wanted now already (before dash acts), and if so, 
which one. Especially the second one will likely have opportunities to 
clean up and reduce code size by making sure subtype is already set to 
VSNUL at this point, rather than 0, meaning it does not need to be 
patched up here.

Cheers,
Harald van Dijk
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic