
List:       freenx-knx
Subject:    Re: [FreeNX-kNX] nxagent session gets lost,
From:       Freerk Kalsbeek <f.kalsbeek () mindswitch ! nl>
Date:       2009-02-16 19:50:54
Message-ID: e9542db00902161150r96c3f70q85ffb8af047df0a5 () mail ! gmail ! com


I've applied this patch on one of our servers. Let's see what happens
over the next few days.
I haven't had time to analyse our issues in more detail. Hopefully this
fixes it.

Regards,
Freerk

On Sun, Jan 25, 2009 at 4:32 AM, Mario Becroft <mb@gem.win.co.nz> wrote:

> I still don't fully understand this problem, but I have a solution.
>
> I am not very sure about Marcelo's patch because as far as I can see,
> NODE_SUSPEND_STATUS is never set to "Suspending". What is this patch
> meant to do exactly?
>
> I found that with slave mode disabled, everything is much easier to
> understand, and it does not appear to make it any slower. It did not
> exactly fix the problem though, just modified the symptoms.
>
> The key problem is that when the client nxssh is killed, nxserver hangs
> in the echo inside server_nxnode_echo(). It attempts to handle this
> situation by installing a SIGPIPE handler that sets
> SERVER_CHANNEL=0. Unfortunately, SIGPIPE is never received in this
> situation; instead the echo hangs forever. This is what causes it never
> to process any more commands from nxnode.
>
> It is not entirely clear why it happens in this way.
>
> Anyway, the workaround is to change echo to /bin/echo. /bin/echo returns
> immediately if the client is disconnected. It should probably also check
> the exit status and set SERVER_CHANNEL=0 if /bin/echo failed. However, I
> have not bothered to do this. It does not seem to matter a great deal.
>
> This solves the problem both with and without slave mode. I think there
> may still be some sort of timing-related problem here, but I am not
> sure; it is all rather complicated.
>
> I have also noticed another problem that I thought might be related, but
> is probably different. If you unplug the network from the currently
> logged in client, it takes about 30 seconds before nxagent notices that
> the client is gone and suspends the session. If, in this 30-second
> window, you log in from another client, everything works, but the session
> status incorrectly remains in the suspended state. I guess this is because
> when the second client logs in, it must suspend the session before
> restoring it on the new client. Somehow the suspended state of the
> session is set after the resumed state. I am out of time and this
> problem is not so serious, so I am ignoring it for now. Maybe someone
> else has time to look into this one.
>
> Anyway, for anyone else who has the present problem, please try the
> following patch and report back.
>
> See the patch below (the line numbers might be a bit off since my file
> has lots of extra instrumentation):
>
> --8<---------------cut here---------------start------------->8---
> --- nxserver.foo        2009-01-25 16:07:46.590977440 +1300
> +++ nxserver    2009-01-25 16:07:54.498952944 +1300
> @@ -967,8 +967,8 @@
>  server_nxnode_echo()
>  {
>        log 6 "server_nxnode_echo: $@"
> -       [ "$SERVER_CHANNEL" = "1" ] && echo "$@"
> -       [ "$SERVER_CHANNEL" = "2" ] && echo "$@" >&2
> +       [ "$SERVER_CHANNEL" = "1" ] && /bin/echo "$@"
> +       [ "$SERVER_CHANNEL" = "2" ] && /bin/echo "$@" >&2
>  }
>
>  server_nxnode_exit_func()
> --8<---------------cut here---------------end--------------->8---
>
> --
> Mario Becroft <mb@gem.win.co.nz>
> ________________________________________________________________
>     Were you helped on this list with your FreeNX problem?
>    Then please write up the solution in the FreeNX Wiki/FAQ:
>
> http://openfacts2.berlios.de/wikien/index.php/BerliosProject:FreeNX_-_FAQ
>
>         Don't forget to check the NX Knowledge Base:
>                 http://www.nomachine.com/kb/
>
> ________________________________________________________________
>       FreeNX-kNX mailing list --- FreeNX-kNX@kde.org
>      https://mail.kde.org/mailman/listinfo/freenx-knx
> ________________________________________________________________
>
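
For anyone who wants the follow-up Mario mentions (checking whether
/bin/echo failed and setting SERVER_CHANNEL=0), here is a minimal sketch.
It is not part of the posted patch and has not been tested against a real
nxserver. The function and variable names come from the diff above; the
case statement and the exit-status handling are assumptions, and the
"log 6" call is left out so the snippet runs on its own.

```shell
#!/bin/bash
# Sketch only: extends the quoted workaround with an exit-status check.
# SERVER_CHANNEL: 1 = stdout, 2 = stderr, 0 = client gone (as in the post).
SERVER_CHANNEL=${SERVER_CHANNEL:-1}

server_nxnode_echo()
{
	# The real nxserver logs the message here via its own log function;
	# omitted so that this sketch is self-contained.
	case "$SERVER_CHANNEL" in
		# /bin/echo is an external process; if the write fails (or the
		# process is killed by SIGPIPE) its exit status is non-zero,
		# and we stop writing to the channel from then on.
		1) /bin/echo "$@" || SERVER_CHANNEL=0 ;;
		2) /bin/echo "$@" >&2 || SERVER_CHANNEL=0 ;;
	esac
}
```

Note that the assignment only sticks if the function runs in the main
shell. If it is called inside a pipeline or a subshell, the changed
SERVER_CHANNEL is lost when that subshell exits.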



