'[non-stop] 10/10 split user/internal threads'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       gdb-patches
Subject:    [non-stop] 10/10 split user/internal threads
From:       pedro () codesourcery ! com (Pedro Alves)
Date:       2008-06-30 12:21:00
Message-ID: 200806300108.49392.pedro () codesourcery ! com
[Download RAW message or body]

A Thursday 26 June 2008 14:32:51, Daniel Jacobowitz wrote:
> On Sun, Jun 15, 2008 at 10:09:44PM +0100, Pedro Alves wrote:
> > I've added three points where GDB makes the internal (inferior_ptid)
> > thread switch to the user selected thread (user_selected_ptid):
> >
> > ?1 - execute_command, before executing a command
> > ?2 - execute_command, after executing a command,
> > ?3 - fetch_inferior_event, after handling the event.
> >
> > With these in place, the user never notices when GDB switches
> > between threads internally while handing events.
>
> The _internal name is a warning sign to me: it means we have two
> functions that conceptually do the same thing, but one of them is only
> "safe for internal use".  We call both of them from internal to GDB,
> so let's try to find a better name.  How about
> execute_command_in_current_context?
>

Ack.  internal refered to "internal selected thread" vs "user/external
selected thread".  Please keep reading, though.

> Is -interpreter-exec ever going to get a --thread argument?  We have
> to be careful because it currently restores the globally selected
> thread via cli_interpreter_exec, and even if it didn't it may call
> a user defined command.
>
> There are 22 calls to execute_command in GDB (including two from
> Insight).  This patch changes seven of them to
> execute_command_internal: thread.c, mi-main.c, mi-cmd-env.c.
> I checked all the others, just in case - they mostly look correct.
>
> The only cases I'm worried about are breakpoint commands and
> conditions.  For instance bpstat_do_actions to execute_control_command
> to execute_command will mess with inferior_ptid.  Should it?

Ack.  Keep reading.

> What about bpstat_check_breakpoint_conditions to breakpoint_cond_eval
> to evaluate_expression?
>
> Also, hook- and hookpost- commands will still be run after
> execute_command_internal - but every call to execute_command_internal
> manually sets the appropriate thread, so this looks fine.
>

Yeah, this can get twisty.  I've been computing these issue
in the background for a while, as these issues have been
worrying me.  I'd been through several completelly different versions
of this patch, as it's tricky to get all issues covered correctly.  I'd
prefer that the next revision goes in the right direction.  :-)

Let me start with a bit of some goals and problem description,
then work my way into a proposal, and then there's a patch
attached that applies on top of this one you've reviewed,
so the incremental changes can be seen better.

Here are some thoughts I keep bumping myself into:

- Ideally, at least the target interfaces and register frame
  and stack management components of GDB shouldn't
  know about a "current thread" (inferior_ptid) at all.
  I count over 750 references to it, so that is very far
  from being true.

- In the long run, it should be possible to support more
  than one context.  Say, the IDE can display two or
  even more frame info windows, each with its own
  thread selected.  Something similar to what described
  here (3.4.1):

http://www.sourceware.org/gdb/papers/multi-arch/real-multi-arch/

  ... should be supportable.

  MI's --thread and --frame new switches should enable
  implementing that on the frontend side.

- MI is gaining "stateless" --thread and --frame commands.
  Commands that take those arguments should not change the
  currently user selected thread and frame.

- Non-stop shouldn't change the current user selected thread
  and frame.  At least if the current thread is stopped.

- In non-stop mode, the current thread may exit while the
  user has it selected.  Switching current thread, and restoring
  it with a cleanup is problematic in non-stop mode.

  1) The original thread may exit before we re-select it

  - current thread 1 is running
  - switch to thread 2
  - <thread 1 exits>
  - trying to restore thread 1 is messy
     * ideally, thread 1 is no longer in the thread list at this point.
      The main reason the current patch could keep exited threads listed
      was due to context switching needing inferior_ptid to
      be listed...)
     * restoring by ptid is ambiguous is the app/OS quickly reuses
       thread 1's ptid (remember it had exited).

  2) The selected thread may change between creating the cleanup, and
     running it.  E.g.,

    To handle an event, we need to switch to the thread that got it
    --- Frame parsing/handling only being aplicable to inferior_ptid,
    and context switching are the main reasons behind this, off the top
    of my head.  There's a bunch of cleanup work and core rework to do
    to be able to remove this requirement.


    Picture these examples:

   - User has thread 1 selected
   - thread 2 has an event
   - GDB switches to thread 2 to handle it
   - It was an internal breakpoint, thread 2 is immediatelly resumed,
     so the user should not notice the thread switch.
   - GDB has finished handling the event, and reverts
     to thread 2 (and also the frame the user had selected in it),
     (the user didn't notice the thread switching, unless the original
      current thread or frame were destroyed by the just handled event.)

     In this case, restoring with a cleanup might seem appropriate.

     Now this example:

   - User has thread 1 selected.
   - thread 2 has an event
   - GDB switches to thread 2 to handle it
   - It was a breakpoint, and has breakpoint commands attached to it.
   - The breakpoint has several commands attached actually:
         thread 3
         print value1
         up
         print value2
         up
         print value3
         end

   - GDB has finished handling the event, so should restore to the
     user selected thread and frame.  But, which is it now?
     thread 1, or thread 3?  In all-stop mode?  In non-stop mode?


  After posting this patch I realized that the _internal variant
  indeed isn't good enough in situations like:

(gdb) define mycomm
>up
>print a
>end

(gdb) thread 1
(gdb) thread apply 2 mycomm

  mycomm is executed with execute_command_internal (on thread 2),
  but "up" and "print a" are ran with execute_command
  (hence on thread 1)...


  In non stop mode, we don't want stop events to switch
  the current thread, but, breakpoint commands should apply to
  the context of the thread that just hit the breakpoint, e.g.:

(gdb) b foo.c:100
Breakpoint 1 ...
(gdb) commands
>thread
>print func_arg
>c&
>end
(gdb) thread 1
(gdb) c -a&
  (thread 2 hits breakpoint 1)
    (GDB switches to it to handle the event)
    (GDB calls bpstat_do_actions, which apply in
    the context of thread 1) Boo!  Breakpoint command
    became useless...

  Now another example.  Imagine in non-stop mode, this case:

(gdb) b foo.c:100
Breakpoint 1 ...
(gdb) commands
>thread 3
>c&
>end
(gdb) interrupt -a
   (all threads stopped)
(gdb) thread 2
(gdb) c&
  (thread 2 resumes)
(gdb) thread 1
(gdb) print foo_c
(gdb) print foo_b
(gdb) pri
   (thread 2 hits breakpoint 1, and runs its commands,
    which included switching to thread 3)
nt foo_a</enter>

  (note that the breakpoint was hit while the user was
  inspecting thread 1, typing a print command)

  To which thread should the last "print foo_a" apply to?
  My opinion is that is should apply to thread 1.  Does
  anyone disagree?

  We still need something else to make GDB behave
  this way.

---------------------------------------------------------------

With all the above in mind, I thought of:

- calling the { selected thread and frame } thing a "context".
- having a context stack
  (The top level context being what the user/frontend
   considers current.)

Instead of temporarilly switching to another thread and frame;
execute commands in it, and restoring the selected thread and
frame, we do:

- push a new context on the context stack
- switch the selected thread and frame in this
  new now top level context
- execute command(s)
    which can switch thread and frame at will
- pop context

- Whenever a thread exit is detected, we go through all
contexts and if it was the selected thread in it, invalidate
it.


Note that execute_command_internal is then gone.  No need
for it anymore.


E.g. 1 - thread apply, pushing context to execute commands:

(top level context selected)
(gdb) thread 1
(gdb) thread apply 2 thread 3
 (push new context)
 (selected thread 2)
 (execute "thread 3")
 (pop context)
(gdb) thread
[Current thread is 1 ...]

>From the user point of view, this is the current HEAD's behaviour
actually.

E.g. 2 - thread apply frame restoring broke with execute_command_internal,
if command applied to the already selected thread, with contexts, it doesn't:

(top level context selected)
(gdb) thread 1
(gdb) frame
#0 ...
(gdb) thread apply 1 up
 (push new context)
 (selected thread 1)
 (execute "up")
 (pop context)
(gdb) thread
[Current thread is 1 ...]
(gdb) frame
#0 ...


E.g. 2:

  (top level context selected)
(gdb) b foo.c:100
Breakpoint 1 ...
(gdb) commands
>print func_arg
>c&
>end

(gdb) thread 1
        (thread 1 is running)
(gdb) c -a&
  (thread 2 hits breakpoint 1)
    (GDB switches to it to handle the event)
    (GDB pushes new context)
    (GDB switches to the event thread (thread 2))
    (calls bpstat_do_actions)
    (pops context)
    (restores current context, which is thread 1)


E.g. 3 - thread exits:

(gdb) b foo.c:100
Breakpoint 1 ...
(gdb) commands
>print inferior_func ()
>c&
>end

  (top level context selected)
(gdb) thread 1
(gdb) c -a&
  (all threads running now)
  (thread 2 hits breakpoint 1)
    (GDB switches to it to handle the event)
    (GDB pushes new context)
    (GDB switches to the event thread (thread 2))
    (calls bpstat_do_actions)
      (inferior function call, inferior is set running for a bit
        (thread 1 exits
           (all contexts with thread 1 selected are invalidated)
        )
      )
    (pops context)
    (restores current context, which was thread 1, but is not invalidaded.    
     i.e., there is now no selected thread in the current context.)
(gdb) interrupt
Cannot execute this command without a selected thread.  See `help thread'.

E.g. 4:

(gdb) b foo.c:100
Breakpoint 1 ...
(gdb) commands
>thread 3
>c&
>end

(gdb) interrupt -a
   (all threads stopped)
(gdb) thread 2
(gdb) c&
  (thread 2 resumes)
(gdb) thread 1
(gdb) print foo_c
(gdb) print foo_b
(gdb) pri
   (thread 2 hits breakpoint 1
     (a new context is pushed)
     (breakpoint commands are ran
       (thread 3; c&)
     )
     (context is poped, now back in thread 1)
nt foo_a
  (foo_a is evaluated in thread 1)


Hope I've made some sense.


Attached is a first draft at implementing this, needs a bit
of cleanup, but is does work.  What do you guys think?  Is this
the right direction?  It does regtest regression free on
x86_64-unknown-linux-gnu/all-stop.

Vladimir, I've added a basic --thread switch to MI just for
testing, based on older MI non-stop patches I had
here, just so you could see what would change in your
perspective.  As you see, only a couple of lines should change.
I'll remove it from any final patch.


> > +enum thread_state
> > +{
> > +  THREAD_STOPPED,
> > +  THREAD_RUNNING,
> > +  THREAD_EXITED,
> > +};
> > +
> > +static enum thread_state main_thread_state = 0;
>
> THREAD_STOPPED instead of 0?

Done.

>
> > -  /* Backup current thread and selected frame.  */
> > -  if (!is_running (inferior_ptid))
> > -    saved_frame_id = get_frame_id (get_selected_frame (NULL));
> > -  else
> > -    saved_frame_id = null_frame_id;
> > -
> > -  old_chain = make_cleanup_restore_current_thread (inferior_ptid,
> > saved_frame_id);
>
> Is this sort of simplification really a good idea?  Functions which
> leave inferior_ptid with an essentially unspecified value, while it's
> still global state used by most of GDB.
>

The idea is that execute_command restores the user selected
thread and frame anyway (and MI's equivalent does as well), so
this isn't needed.
Let's discuss the other general issues first, as if the direction
I'm proposing is better, this function will still be restoring
the thread and frame, when the context is poped.

> > +/* Only safe to use this cleanup on a stopped thread, and
> > +   inferior_ptid isn't ran before running the cleanup.  */
> > +
> >  struct current_thread_cleanup
>
> Not sure what you mean by this comment.
>

I've changed the comment to this:

/* It is only safe to use this cleanup iff inferior_ptid is alive and
   stopped, and, by if by the time it is ran, inferior_ptid represents
   the same thread, it is alive and it is stopped.  */

Does it make it clearer?

> > +  /* In non-stop mode, we don't want GDB to switch threads on the
> > +     users back, to avoid races where the user is typing a command to
>
> user's

Fixed, thanks.

-- 
Pedro Alves
-------------- next part --------------
A non-text attachment was scrubbed...
Name: context_thread.diff
Type: text/x-diff
Size: 29249 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/gdb-patches/attachments/20080630/2e11ce4b/attachment.bin>

[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic