List: busybox
Subject: Re: series of ctrl-c makes ssh session hang
From: Denys Vlasenko <vda.linux () googlemail ! com>
Date: 2017-02-03 11:58:51
Message-ID: CAK1hOcN2gZ6gJiRsNfdDkO2pRsXbKoNouEpORBfJa5CWvKbLeg () mail ! gmail ! com
On Thu, Feb 2, 2017 at 3:25 PM, Ronny Meeus <ronny.meeus@gmail.com> wrote:
> > > When pressing enter in the ssh session I see for dropbear:
> > > # strace -p 2066
> > > strace: Process 2066 attached
> > > _newselect(8, [3 5 7], [], NULL, {3516, 826061}) = 1 (in [5], left
> > > {3512, 787390})
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 955324187}) = 0
> > > read(5, ";\235\21\332\365\210T\200X}\230\"\306.\363\221", 16) = 16
> > > read(5, "\2\345\252\274\24Y\253\21\316>}\266\fU\20259\324\254Tu\3534\0238bMXzV\274\270",
> > > 32) = 32
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 955324187}) = 0
> > > writev(7, [{iov_base="\r", iov_len=1}], 1) = 1
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
> > > _newselect(8, [3 5 7], [], NULL, {3600, 0}) = 1 (in [7], left {3599, 999987})
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
> > > read(7, "\r\n", 16375) = 2
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 956324197}) = 0
> > > writev(5, [{iov_base="\231\310\271\315\354\243\342\271\22,\325Tj\n\356\345\"t\332d\205\317.\213\376\200\274h\201\347$\324"...,
> > > iov_len=48}], 1) = 48
> > > clock_gettime(0x6 /* CLOCK_??? */, {327, 957324207}) = 0
> > > _newselect(8, [3 5 7], [], NULL, {3600, 0}^Cstrace: Process 2066 detached
> > >
> > > While the sh process is not printing any additional traces. So this
> > > process is completely blocked:
> > > /isam/slot_default/run # strace -p 2078
> > > strace: Process 2078 attached
> > > futex(0xffed598, FUTEX_WAIT_PRIVATE, 2, NULL
> > >
> > >
> > > Connecting a debugger to the system (sh pid 2078) shows that the only
> > > thread the process has is blocked
> > > on a mutex in the C library.
> > >
> > > (gdb) info threads
> > > Id Target Id Frame
> > > * 1 Thread 2078 0x1003d0ec in putprompt (s=<optimized out>)
> > > at shell/ash.c:2455
> > > (gdb) bt
> > > #0 0x0ff5c708 in __lll_lock_wait_private (futex=0xffed598
> > > <main_arena>) at ../nptl/sysdeps/unix/sysv/linux/lowlevellock.c:31
> > > #1 0x0fef07a8 in *__GI___libc_free (mem=<optimized out>) at malloc.c:3714
> > > #2 0x1003d0ec in putprompt (s=<optimized out>) at shell/ash.c:2455
> > > #3 setprompt_if (do_set=<optimized out>, whichprompt=<optimized out>)
> > > at shell/ash.c:2501
> > > #4 0x1003d448 in parsecmd (interact=<optimized out>) at shell/ash.c:12074
> > > #5 0x1004100c in cmdloop (top=<optimized out>) at shell/ash.c:12215
> > > #6 0x10042730 in ash_main (argc=<optimized out>, argv=<optimized
> > > out>) at shell/ash.c:13350
> >
> > Looks like signal interrupted malloc or free, then
> > signal handler longjmped (ash by design does that)
> > without returning to the malloc or free.
> > malloc state is now corrupted, and free()
> > in putprompt() deadlocks.
> >
> > INT_OFF/INT_ON pair guarding code which must not be
> > interrupted like this is missing somewhere.
>
> Interesting info, thanks.
>
> How do we continue to identify the place in the code?
I guess by code review and experiments. For example,
try adding "INT_OFF;" and "INT_ON;" around this
code block:
# if ENABLE_FEATURE_TAB_COMPLETION
        line_input_state->path_lookup = pathval();
# endif
        reinit_unicode_for_ash();
        nr = read_line_input(line_input_state, cmdedit_prompt,
                        buf, IBUFSIZ, timeout);
> Does this not mean that before all library calls we need to make sure
> signals are disabled?
Not all library calls, only some. For example, read() or strlen()
can be interrupted and longjmp'ed away with no ill effects.
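
To illustrate the idea behind INT_OFF/INT_ON, here is a minimal standalone
sketch. It is not ash's actual code: the names int_off(), int_on(),
suppress_depth, pending_sigint and demo_deferred_interrupt() are made up for
this example, and the real macros in shell/ash.c may differ in detail. The
assumption is simply that the handler records the signal while a guarded
section is active, and the longjmp happens only after the section ends, so
malloc()/free() are never abandoned halfway.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
static volatile sig_atomic_t suppress_depth; /* > 0: SIGINT is deferred */
static volatile sig_atomic_t pending_sigint; /* a deferred SIGINT is waiting */
static sigjmp_buf toplevel;                  /* where an interrupt unwinds to */

static void on_sigint(int sig)
{
    (void)sig;
    if (suppress_depth > 0) {
        pending_sigint = 1;  /* inside a guarded section: only record it */
        return;
    }
    siglongjmp(toplevel, 1); /* unguarded: abandon the interrupted code */
}

static void int_off(void)
{
    suppress_depth++;
}

static void int_on(void)
{
    assert(suppress_depth > 0);
    if (--suppress_depth == 0 && pending_sigint) {
        pending_sigint = 0;
        siglongjmp(toplevel, 1); /* deliver the deferred interrupt now */
    }
}

/* Returns 1 if free() completed before the interrupt was acted upon. */
int demo_deferred_interrupt(void)
{
    volatile int freed = 0;
    struct sigaction sa;
    char *p;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);

    if (sigsetjmp(toplevel, 1) != 0)
        return freed;        /* we arrive here via siglongjmp */

    int_off();
    p = malloc(64);
    raise(SIGINT);           /* simulate Ctrl-C inside the guarded section */
    free(p);                 /* runs to completion, heap lock is released */
    freed = 1;
    int_on();                /* the deferred interrupt fires here */
    return -1;               /* not reached when an interrupt was pending */
}
```

Calling demo_deferred_interrupt() from main() should return 1: the signal
arrives between malloc() and free(), but the jump is postponed until
int_on(). Without the int_off()/int_on() pair the handler would jump away
immediately, which is the pattern suspected above when it interrupts
malloc() or free() while the arena lock is held.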