List:       rhq-devel
Subject:    Re: AS is killed if the agent is killed (provided it was started/restarted by the agent)
From:       John Mazzitelli <mazz () redhat ! com>
Date:       2013-02-26 18:41:40
Message-ID: 718963107.9261993.1361904100593.JavaMail.root () redhat ! com

I put some comments on that wiki page at the bottom in yellow boxes.

----- Original Message -----
> Here is my update. I've created a wiki page [1] for it and a prototype
> implementation [2].
> 
> Please send me some feedback on whether you like the idea or not.
> 
> 
> [1]
> https://docs.jboss.org/author/display/RHQ/Improvements+to+Process+Execution
> [2]
> https://github.com/Jiri-Kremser/RHQ/commits/jkremser/separateFromParent
> 
> jk
> 
> ----- Original Message -----
> > From: "Jiri Kremser" <jkremser@redhat.com>
> > To: rhq-devel@lists.fedorahosted.org
> > Sent: Thursday, February 21, 2013 4:40:14 PM
> > Subject: Re: AS is killed if the agent is killed (provided it was
> > 	started/restarted by the agent)
> > 
> > > Basically, it isn't that it doesn't work with commands like ... it
> > > really is that it doesn't work with anything that is not a shell?
> > Yes, unfortunately.
> > 
> > > Which does make me wonder about the 'java -jar' scenario that was
> > > mentioned. In that case, did the " & wait $!" get passed as
> > > parameters to the main method of the invoked class?
> > Yes, I did an experiment. I created a simple program (dumb.jar) that
> > prints its arguments, and when I run it with:
> > 
> > Runtime.getRuntime().exec(new String[] { "java", "-jar",
> > "/home/jkremser/dumb.jar", " & wait $!" });
> > 
> > It just writes the " & wait $!" on stdout.
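
A minimal sketch of why the suffix ends up as a plain argument, assuming a POSIX /bin/sh is available: Runtime.exec and ProcessBuilder pass the array straight to the program without involving a shell, so "&" and "$!" only mean something when the whole command line is handed to a shell via "-c". The class and method names below are hypothetical and are not taken from the RHQ code base.

```java
import java.io.IOException;
import java.util.Arrays;

public class ShellWrapSketch {

    /** Hands the whole command line to /bin/sh, so "&" and "$!" are interpreted as shell syntax. */
    static String[] viaShell(String commandLine) {
        return new String[] { "/bin/sh", "-c", commandLine + " & wait $!" };
    }

    /** Mirrors the experiment above: the suffix becomes one more literal argument of the program. */
    static String[] direct(String... command) {
        String[] argv = Arrays.copyOf(command, command.length + 1);
        argv[command.length] = " & wait $!";
        return argv;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // The shell backgrounds "echo hello", then "wait $!" returns the exit status of that job.
        Process p = new ProcessBuilder(viaShell("echo hello")).inheritIO().start();
        System.out.println("exit code: " + p.waitFor());
    }
}
```

With direct("java", "-jar", "dumb.jar") the fourth array element is the literal string " & wait $!", which matches what dumb.jar printed.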
> > 
> > 
> > Now I am working on the shell script, here is the latest version:
> > http://pastebin.test.redhat.com/129137
> > It uses the RHQ_CONTROL_WAIT property.
> > * If it is set to some number N, then the script will wait for N
> >   seconds, send signal 0 to the process (checking whether it is
> >   running) and exit successfully if the process is running.
> >   N = 0 means "don't care" and returns 0.
> > * If it is set to "true", it will wait for the process to finish its
> >   work and return its exit code.
> > There should be some check that the env. prop is set and that it has
> > an allowed value.
> > I've slightly modified the Hyperic HQ script for this purpose.
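
The RHQ_CONTROL_WAIT rules above, including the missing validity check, could be sketched roughly as follows. This is an illustration in Java rather than the shell script itself; the class, enum and method names are hypothetical, Process.isAlive() stands in for sending signal 0, and returning 1 when the process is no longer running is an assumption, since the thread only describes the successful case.

```java
import java.util.concurrent.TimeUnit;

public class ControlWaitSketch {

    enum Mode { DONT_CARE, CHECK_AFTER_DELAY, WAIT_FOR_EXIT }

    static final class Setting {
        final Mode mode;
        final int seconds;
        Setting(Mode mode, int seconds) { this.mode = mode; this.seconds = seconds; }
    }

    /** Validates the raw RHQ_CONTROL_WAIT value; rejects unset or unknown values. */
    static Setting parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("RHQ_CONTROL_WAIT is not set");
        }
        String value = raw.trim();
        if (value.equals("true")) {
            return new Setting(Mode.WAIT_FOR_EXIT, 0);
        }
        int n;
        try {
            n = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a number or \"true\": " + value);
        }
        if (n < 0) {
            throw new IllegalArgumentException("negative wait: " + n);
        }
        return new Setting(n == 0 ? Mode.DONT_CARE : Mode.CHECK_AFTER_DELAY, n);
    }

    /** Returns the exit code a wrapper would report for an already started process. */
    static int apply(Process process, Setting s) throws InterruptedException {
        switch (s.mode) {
            case WAIT_FOR_EXIT:
                return process.waitFor();          // propagate the real exit code
            case CHECK_AFTER_DELAY:
                TimeUnit.SECONDS.sleep(s.seconds);
                return process.isAlive() ? 0 : 1;  // assumption: 1 if it has already died
            default:
                return 0;                          // N = 0: don't care
        }
    }
}
```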
> > 
> > There were only a few changes to the code:
> > http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=1f1b77f0c
> > Right now the path to the script is hard-coded in
> > ProcessExecutor.RUN_IN_BACKGROUND_PREFIX. I am looking into how to
> > propagate this information from AgentMain.getAgentHomeDirectory() to
> > the rhq-core-util module (I was trying to use the resource context,
> > but the paths there are the paths to the plugin data dirs).
> > 
> > jk
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Larry O'Leary" <loleary@redhat.com>
> > > To: rhq-devel@lists.fedorahosted.org
> > > Sent: Thursday, February 21, 2013 4:38:36 AM
> > > Subject: Re: AS is killed if the agent is killed (provided it was
> > > 	started/restarted by the agent)
> > > 
> > > On Tue, 2013-02-19 at 11:18 -0500, Jiri Kremser wrote:
> > > > here is the magic :)
> > > > http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=626bd1762&h=jkremser%2FseparateFromParent
> > > > 
> > > > Unfortunately, it doesn't work for commands like gedit, gvim or
> > > > firefox, that interpret the " & wait $!" suffix as their
> > > > parameter. Shell scripts fork well.
> > > 
> > > > Basically, it isn't that it doesn't work with commands like ... it
> > > > really is that it doesn't work with anything that is not a shell?
> > > > Which does make me wonder about the 'java -jar' scenario that was
> > > > mentioned. In that case, did the " & wait $!" get passed as
> > > > parameters to the main method of the invoked class?
> > > 
> > > 
> > > > > I was trying the solution Thomas proposed yesterday (using the
> > > > > same approach as the rhq-wrapper script does), but afaik there is
> > > > > no way to handle user input and relay it to the process in the
> > > > > background and vice versa.
> > > 
> > > > Yeah. I think the only way something like that would work is if the
> > > > agent command-prompt/shell was separated from the agent itself.
> > > > Essentially two separate JVMs, which would require quite a bit more
> > > > work.
> > > 
> > > 
> > > > Another thought is if the process exec itself perhaps launched a JVM
> > > > which launched the target, or perhaps we used a shell wrapper (I
> > > > think someone suggested this before) which did the same. In other
> > > > words, a shell wrapper that was embedded in one of the agent's
> > > > libraries and simply executed the target command?
> > > 
> > > 
> > > > 
> > > > ----- Original Message -----
> > > > > From: "Charles Crouch" <ccrouch@redhat.com>
> > > > > To: rhq-devel@lists.fedorahosted.org
> > > > > Sent: Monday, February 18, 2013 4:49:56 PM
> > > > > Subject: Re: AS is killed if the agent is killed (provided it
> > > > > was
> > > > > 	started/restarted by the agent)
> > > > > 
> > > > > So what is the magic change?
> > > > > 
> > > > > ----- Original Message -----
> > > > > > > I've implemented something that solves the issue. I've added a
> > > > > > > flag called "separatedFromParent" on the ProcessExecution class,
> > > > > > > and if the user sets it to true it, surprisingly, separates the
> > > > > > > process from the parent process without losing the exit code
> > > > > > > and/or streams.
> > > > > > 
> > > > > > I want to make a hangout on G+ about it today at 17:00.
> > > > > > 
> > > > > > 
> > > > > > jk
> > > > > > 
> > > > > > ----- Original Message -----
> > > > > > > From: "Larry O'Leary" <loleary@redhat.com>
> > > > > > > To: rhq-devel@lists.fedorahosted.org
> > > > > > > Sent: Thursday, February 14, 2013 11:02:03 PM
> > > > > > > Subject: Re: AS is killed if the agent is killed
> > > > > > > (provided
> > > > > > > it
> > > > > > > was
> > > > > > > 	started/restarted by the agent)
> > > > > > > 
> > > > > > > On Thu, 2013-02-14 at 13:39 -0500, Jiri Kremser wrote:
> > > > > > > > > > If this happens when you run the agent using the recommended
> > > > > > > > > > mechanism (rhq-agent-wrapper.sh) then I agree this is a
> > > > > > > > > > problem. If this is true, we need to know this and fix it.
> > > > > > > > > > So, if someone starts the agent via rhq-agent-wrapper.sh,
> > > > > > > > > > that agent starts an AS7, the user then stops that agent,
> > > > > > > > > > does the AS7 die? If so, that's a problem.
> > > > > > > > 
> > > > > > > > > Neither sending SIGINT to the agent started by
> > > > > > > > > rhq-agent-wrapper.sh nor sending SIGKILL stops the AS7,
> > > > > > > > > because the rhq-wrapper is separated from the agent.
> > > > > > > 
> > > > > > > > Okay. If this is the case then perhaps this does reduce the
> > > > > > > > impact of this issue. However, I do not think it reduces the
> > > > > > > > severity of the issue or means that this is not a bug.
> > > > > > > 
> > > > > > > > > > That is not the same thing. You are executing the code within
> > > > > > > > > > the shell. You are not forking a process. Only creating a
> > > > > > > > > > child shell within the running shell. If that is what we are
> > > > > > > > > > doing, then that seems to be the flaw.
> > > > > > > > 
> > > > > > > > > I think it is the same as what we do in our environment. As far
> > > > > > > > > as I know, the only way of forking a process in bash is the &;
> > > > > > > > > we don't have any C code doing fork().
> > > > > > > 
> > > > > > > > Agreed. I am not saying it is not the same as what we are doing.
> > > > > > > > I am saying that what you described is not the same as what I
> > > > > > > > was referring to. I can start processes or have init start
> > > > > > > > processes without the concern or risk of killing my terminal or
> > > > > > > > the init service resulting in everything being stopped.
> > > > > > > 
> > > > > > > > What you are referring to is executing processes within a thread
> > > > > > > > in the foreground. Perhaps that is a limitation of Java, but it
> > > > > > > > shouldn't change the fact that such a limitation introduces a
> > > > > > > > critical flaw in how plug-ins control the life-cycle of a
> > > > > > > > resource they manage. By using such functionality, you are now
> > > > > > > > exposing all your services to the potential of ultimate
> > > > > > > > termination by the agent in an uncontrolled manner. In other
> > > > > > > > words, the only option I have is to use RHQ or not to use RHQ.
> > > > > > > > There is no happy medium other than just telling users not to do
> > > > > > > > something and hoping that everyone remembers that forever and no
> > > > > > > > new administrators get added to the team. Essentially, I am
> > > > > > > > saying that it is human nature and an accepted standard that
> > > > > > > > unrelated processes don't share the same life-cycle.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > > All processes that have been forked from their parent will not
> > > > > > > > > > receive signals from their parent.
> > > > > > > > > 
> > > > > > > > > Agreed, or if the signals are intercepted by traps or signal
> > > > > > > > > handlers (for the JVM).
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > There is a JVM option, -Xnosigchain, for disabling JVM signal
> > > > > > > > > handler chaining on the IBM JVM; on Oracle's JVM there is the
> > > > > > > > > -Xrs option for that. I wouldn't do that. It would disable our
> > > > > > > > > ShutdownHookMechanism.
> > > > > > > > > 
> > > > > > > > > Another option is to implement our own SignalHandler
> > > > > > > > > (http://pastebin.test.redhat.com/127934). I am playing with it
> > > > > > > > > right now.
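
For context on the shutdown hook concern, here is a small sketch using only the standard Runtime.addShutdownHook API; the class and method names are hypothetical and this is not the RHQ agent's actual hook. On Oracle's JVM the documented effect of -Xrs is that the JVM no longer installs its own handlers for signals such as SIGINT and SIGTERM, so a hook like this one would not be expected to run when the process is terminated by one of them.

```java
public class ShutdownHookSketch {

    /** Registers a hook that runs on normal exit, or when the JVM itself handles SIGINT/SIGTERM. */
    static Thread registerCleanup(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        registerCleanup(() -> System.out.println("shutdown hook ran"));
        System.out.println("Press Ctrl-C, then run again with 'java -Xrs' to compare.");
        Thread.sleep(60_000); // keep the JVM alive long enough to send it a signal
    }
}
```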
> > > > > > > 
> > > > > > > > > 
> > > > > > > > > Notes: the default signal handler can be replaced. -Xnosigchain
> > > > > > > > > is IBM Java specific; for Oracle there is -Xrs. Not a good idea.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "John Mazzitelli" <mazz@redhat.com>
> > > > > > > > > To: rhq-devel@lists.fedorahosted.org
> > > > > > > > > Sent: Thursday, February 14, 2013 4:32:27 PM
> > > > > > > > > Subject: Re: AS is killed if the agent is killed
> > > > > > > > > (provided it
> > > > > > > > > was
> > > > > > > > > 	started/restarted by the agent)
> > > > > > > > > 
> > > > > > > > > > > There are other ways this could happen. For example, platform
> > > > > > > > > > > level monitoring systems may decide to restart this agent by
> > > > > > > > > > > sending it the SIGINT signal. From the sounds of it, this too
> > > > > > > > > > > would cause the issue whether the agent is in the foreground
> > > > > > > > > > > or background?
> > > > > > > > > 
> > > > > > > > > > If this happens when you run the agent using the recommended
> > > > > > > > > > mechanism (rhq-agent-wrapper.sh) then I agree this is a
> > > > > > > > > > problem. If this is true, we need to know this and fix it.
> > > > > > > > > > 
> > > > > > > > > > So, if someone starts the agent via rhq-agent-wrapper.sh,
> > > > > > > > > > that agent starts an AS7, the user then stops that agent,
> > > > > > > > > > does the AS7 die? If so, that's a problem.
> > > > > > > > > _______________________________________________
> > > > > > > > > rhq-devel mailing list
> > > > > > > > > rhq-devel@lists.fedorahosted.org
> > > > > > > > > https://lists.fedorahosted.org/mailman/listinfo/rhq-devel
> > > > > > > > > 
