List: gdb
Subject: Same thread is reported as [New thread xyz] endlessly
From: Raphael Zulliger <zulliger () indel ! ch>
Date: 2014-12-04 7:27:58
Message-ID: 54800CFE.8070407 () indel ! ch
Hi
With my slightly patched GDB, debugging an extended-remote target, I
encountered a kind of "phantom thread": a thread that is repeatedly
reported as newly created (GDB prints [New Thread xyz], and xyz is always
the same). This happens although the remote target definitely does not
send any notifications related to that "phantom thread". It renders GDB
unusable, as I can't, for example, expand the call stack in Eclipse/CDT
anymore.
I think I found the bug, or at least a way to work around the issue. I
thought I'd send it to this list, although the information I can give you
is quite limited. My hope is that the cause is obvious to one of you and
that you can come up with a fix, even with this little information.
The following happens in the case of the "phantom thread":
'struct thread_info *add_thread_silent (ptid_t ptid)'
is called with the ptid of the phantom thread. Then,
'find_thread_ptid (ptid);'
returns a non-NULL pointer. Because
ptid_equal (inferior_ptid, ptid)
is false we call
'delete_thread(ptid)'
The problem with that seems to be that
'tp->refcount > 0'
is true - and therefore
'static void delete_thread_1 (ptid_t ptid, int silent)'
does not actually delete the thread right away, but only marks it as
'tp->state = THREAD_EXITED'.
After that, we call
'tp = new_thread (ptid);'
even though the *thread has not been deleted yet*. This creates what
amounts to a second thread with the same ptid.
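The sequence above can be reproduced in a small stand-alone model. This is
only a sketch under simplifying assumptions: the toy_* names and types are
hypothetical stand-ins for GDB's thread list, ptid_t is reduced to an int,
and the inferior_ptid condition is left out because the case described
here has ptid != inferior_ptid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: simplified stand-ins, not GDB's real definitions.  */
enum toy_state { TOY_STOPPED, TOY_EXITED };

struct toy_thread
{
  int ptid;                  /* stand-in for ptid_t */
  int refcount;              /* outstanding references to this entry */
  enum toy_state state;
  struct toy_thread *next;
};

static struct toy_thread *toy_list = NULL;

/* Return the first entry with this ptid, in the role of find_thread_ptid.  */
static struct toy_thread *
toy_find (int ptid)
{
  for (struct toy_thread *tp = toy_list; tp != NULL; tp = tp->next)
    if (tp->ptid == ptid)
      return tp;
  return NULL;
}

/* Always allocate a fresh entry, in the role of new_thread.  */
static struct toy_thread *
toy_new (int ptid)
{
  struct toy_thread *tp = calloc (1, sizeof *tp);
  if (tp == NULL)
    abort ();
  tp->ptid = ptid;
  tp->state = TOY_STOPPED;
  tp->next = toy_list;
  toy_list = tp;
  return tp;
}

/* In the role of delete_thread_1: an entry that is still referenced is
   only marked as exited and stays in the list.  */
static void
toy_delete (int ptid)
{
  struct toy_thread *prev = NULL, *tp = toy_list;
  while (tp != NULL && tp->ptid != ptid)
    {
      prev = tp;
      tp = tp->next;
    }
  if (tp == NULL)
    return;
  if (tp->refcount > 0)
    {
      tp->state = TOY_EXITED;
      return;
    }
  if (prev != NULL)
    prev->next = tp->next;
  else
    toy_list = tp->next;
  free (tp);
}

/* In the role of the non-current-thread path of add_thread_silent:
   delete any old entry, then unconditionally create a new one.  */
static struct toy_thread *
toy_add (int ptid)
{
  if (toy_find (ptid) != NULL)
    toy_delete (ptid);
  return toy_new (ptid);
}

/* Count the list entries that carry this ptid.  */
static int
toy_count (int ptid)
{
  int n = 0;
  for (struct toy_thread *tp = toy_list; tp != NULL; tp = tp->next)
    if (tp->ptid == ptid)
      n++;
  return n;
}
```

Seeding the list with one entry whose refcount is 1 and then calling
toy_add with the same ptid leaves two entries for that ptid in this model:
the old one, marked exited, and a fresh one.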
I think there's a flaw in that mechanism:
'struct thread_info *add_thread_silent (ptid_t ptid)'
only checks for
'ptid_equal (inferior_ptid, ptid)'
while
'static void delete_thread_1 (ptid_t ptid, int silent)'
checks for
'tp->refcount > 0 || ptid_equal (tp->ptid, inferior_ptid)'
Shouldn't those checks be in sync?
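One way to keep the two checks in sync would be to route both through a
single predicate. The sketch below is an assumption, not existing GDB
code: toy_thread, toy_add_side_check and toy_delete_is_deferred are
hypothetical names, and ptids are plain ints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct thread_info.  */
struct toy_thread
{
  int ptid;      /* stand-in for ptid_t */
  int refcount;  /* number of outstanding references */
};

/* The condition add_thread_silent tests, per the description above:
   only "is this the current thread?".  */
static bool
toy_add_side_check (const struct toy_thread *tp, int current_ptid)
{
  assert (tp != NULL);
  return tp->ptid == current_ptid;
}

/* The condition under which delete_thread_1 defers the deletion.  A
   shared helper like this could be called from both places.  */
static bool
toy_delete_is_deferred (const struct toy_thread *tp, int current_ptid)
{
  assert (tp != NULL);
  return tp->refcount > 0 || tp->ptid == current_ptid;
}
```

For a thread that is not the current one but still has refcount > 0, the
first check says "safe to delete and recreate" while the second says
"deletion will be deferred". That disagreement is the gap described above.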
What seems to help is the following: in 'add_thread_silent', add an
additional 'else if' branch, as shown here:
struct thread_info *
add_thread_silent (ptid_t ptid)
{
  struct thread_info *tp;

  tp = find_thread_ptid (ptid);
  if (tp)
    /* Found an old thread with the same id.  It has to be dead,
       otherwise we wouldn't be adding a new thread with the same id.
       The OS is reusing this id --- delete it, and recreate a new
       one.  */
    {
      /* In addition to deleting the thread, if this is the current
         thread, then we need to take care that delete_thread doesn't
         really delete the thread if it is inferior_ptid.  Create a
         new template thread in the list with an invalid ptid, switch
         to it, delete the original thread, reset the new thread's
         ptid, and switch to it.  */
      if (ptid_equal (inferior_ptid, ptid))
        {
          ...
        }
      else if (tp->refcount > 0)
        {
          /* The thread is still referenced, so delete_thread would
             only mark it THREAD_EXITED instead of removing it.  Reuse
             the existing entry rather than creating a second one.  */
          tp->state = THREAD_STOPPED;
          observer_notify_new_thread (tp);

          /* All done.  */
          return tp;
        }
      else
        /* Just go ahead and delete it.  */
        delete_thread (ptid);
    }
...
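The effect of the added branch can be checked in isolation with a small
model. Again this is a sketch, not GDB code: toy_thread,
toy_reuse_if_referenced and the event counter are hypothetical stand-ins
(the counter plays the role of observer_notify_new_thread).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GDB's thread state.  */
enum toy_state { TOY_STOPPED, TOY_EXITED };

struct toy_thread
{
  int ptid;              /* stand-in for ptid_t */
  int refcount;          /* outstanding references to this entry */
  enum toy_state state;
};

/* Counts "new thread" notifications; stands in for
   observer_notify_new_thread.  */
static int toy_new_thread_events = 0;

/* Model of the added 'else if' branch.  If the existing entry is still
   referenced, revive it and return it.  Otherwise return NULL to signal
   that the caller should delete the entry and create a new one.  */
static struct toy_thread *
toy_reuse_if_referenced (struct toy_thread *tp)
{
  assert (tp != NULL);
  if (tp->refcount > 0)
    {
      tp->state = TOY_STOPPED;
      toy_new_thread_events++;
      return tp;
    }
  return NULL;
}
```

With this branch, a referenced entry is handed back as-is, so no second
entry with the same ptid is created in the model.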
Btw: the call stack at the time GDB entered that newly added 'else if'
branch looked like this:
Thread [1] 18196 [core: 3] (Suspended : Signal : 0:Signal 0)
add_thread_silent() at thread.c:261 0x5dd485
add_thread_with_info() at thread.c:292 0x5dd5e8
add_thread() at thread.c:306 0x5dd67d
remote_add_thread() at remote.c:1,524 0x4a94d2
remote_notice_new_inferior() at remote.c:1,547 0x4a959d
process_stop_reply() at remote.c:5,837 0x4b2168
remote_wait_ns() at remote.c:5,887 0x4b22dc
remote_wait() at remote.c:6,062 0x4b281a
target_wait() at target.c:2,660 0x611963
fetch_inferior_event() at infrun.c:2,821 0x5cac1c
fetch_inferior_event_wrapper() at inf-loop.c:149 0x5ee1f6
catch_errors() at exceptions.c:546 0x5e1504
inferior_event_handler() at inf-loop.c:53 0x5edf4b
remote_async_inferior_event_handler() at remote.c:11,737 0x4bd768
invoke_async_event_handler() at event-loop.c:1,073 0x5ec774
process_event() at event-loop.c:342 0x5eb291
gdb_do_one_event() at event-loop.c:394 0x5eb333
start_event_loop() at event-loop.c:431 0x5eb3a8
mi_command_loop() at mi-interp.c:354 0x4eb99e
mi2_command_loop() at mi-interp.c:334 0x4eb94f
current_interp_command_loop() at interps.c:326 0x5e306c
captured_command_loop() at main.c:260 0x5e42ac
catch_errors() at exceptions.c:546 0x5e1504
captured_main() at main.c:1,055 0x5e572a
catch_errors() at exceptions.c:546 0x5e1504
gdb_main() at main.c:1,064 0x5e5760
main() at gdb.c:34 0x457a2b
Note that this issue is strongly timing-dependent. Right now I have a
situation in which it reproduces quite reliably, but usually it does not.
GDB (as said: slightly patched):
GNU gdb (GDB) 7.6.50.20130604-cvs
...
This GDB was configured as "--host=x86_64-unknown-linux-gnu
--target=powerpc-indel-eabi".
Note that I can't easily check this issue against 'master', as debugging
my remote target doesn't work out of the box with it.
Raphael