List: freenx-knx
Subject: Re: [FreeNX-kNX] nxagent session gets lost,
From: Marcelo Boveto Shima <marceloshima () gmail ! com>
Date: 2009-03-01 0:55:38
Message-ID: 7d3bf3160902281655o52664ec4me98d9e4fefee4398 () mail ! gmail ! com
The following patch fixes this bug.
http://bazaar.launchpad.net/~freenx-team/freenx-server/teambzr/revision/91
This line seems to trigger the problem.
echo "NX> 596 Error: Session $1 failed. Reason was: $line"
Running it only when the node failed to restore the session solves the bug.
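As a rough illustration of that idea (not the actual code from revision 91), the error line can be wrapped in a check so it is printed only when the restore failed. The function name report_restore_failure and the restore_ok flag are hypothetical and exist only for this sketch:

```shell
# Hypothetical helper: print the 596 error only when the node failed to
# restore the session. restore_ok is an assumed flag (1 = restored, 0 = failed).
report_restore_failure() {
    session_id=$1
    restore_ok=$2
    line=$3
    if [ "$restore_ok" != "1" ]; then
        echo "NX> 596 Error: Session $session_id failed. Reason was: $line"
    fi
}
```

Calling report_restore_failure with restore_ok set to 1 prints nothing, so the client never sees a 596 error for a session that was restored.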
Regards.
Shima
On Mon, Feb 16, 2009 at 4:50 PM, Freerk Kalsbeek
<f.kalsbeek@mindswitch.nl> wrote:
> I've implemented this patch on one of our servers. Let's see what happens
> over the next few days.
> I haven't had the time to analyse our issues in more detail. Hopefully this
> fixes it.
>
> Regards,
> Freerk
>
>
> On Sun, Jan 25, 2009 at 4:32 AM, Mario Becroft <mb@gem.win.co.nz> wrote:
>
>> I still don't fully understand this problem, but I have a solution.
>>
>> I am not very sure about Marcelo's patch because as far as I can see,
>> NODE_SUSPEND_STATUS is never set to "Suspending". What is this patch
>> meant to do exactly?
>>
>> I found that with slave mode disabled, everything is much easier to
>> understand, and it does not appear to make it any slower. It did not
>> exactly fix the problem though, just modified the symptoms.
>>
>> The key problem is that when the client nxssh is killed, nxserver hangs
>> in the echo inside server_nxnode_echo(). It attempts to handle this
>> situation by installing a SIGPIPE handler that sets
>> SERVER_CHANNEL=0. Unfortunately, SIGPIPE is never received in this
>> situation; instead the echo hangs forever. This is what causes it never
>> to process any more commands from nxnode.
>>
>> It is not entirely clear why it happens in this way.
>>
>> Anyway, the workaround is to change echo to /bin/echo. /bin/echo returns
>> immediately if the client is disconnected. It should probably also check
>> the exit status and set SERVER_CHANNEL=0 if /bin/echo failed. However, I
>> have not bothered to do this. It does not seem to matter a great deal.
>>
>> This solves the problem both with and without slave mode. I think there
>> may still be a timing-related problem here, but I am not sure; it is all
>> rather complicated.
>>
>> I have also noticed another problem that I thought might be related, but
>> is probably different. If you unplug the network from the currently
>> logged-in client, it takes about 30 seconds before nxagent notices that
>> the client is gone and suspends the session. If, in this 30-second
>> window, you log in from another client, everything works, but the session
>> status incorrectly remains in suspended state. I guess this is because
>> when the second client logs in, it must suspend the session before
>> restoring it on the new client. Somehow the suspended state of the
>> session is set after the resumed state. I am out of time and this
>> problem is not so serious, so I am ignoring it for now. Maybe someone
>> else has time to look into this one.
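If the guess above is right and a late "suspended" update overwrites a newer "resumed" one, one generic way to guard against it is to tag each status change with an increasing sequence number and ignore stale updates. This is purely illustrative: the names set_session_status, SESSION_STATUS and SESSION_STATUS_SEQ are hypothetical, and FreeNX does not necessarily store session state this way.

```shell
SESSION_STATUS=""
SESSION_STATUS_SEQ=0

# set_session_status SEQ STATUS
# Apply STATUS only if SEQ is newer than the last update that was applied,
# so an update that arrives late cannot overwrite a more recent one.
set_session_status() {
    new_seq=$1
    new_status=$2
    if [ "$new_seq" -gt "$SESSION_STATUS_SEQ" ]; then
        SESSION_STATUS_SEQ=$new_seq
        SESSION_STATUS=$new_status
    fi
}
```

For example, if the resume is recorded with sequence 2 and the earlier suspend (sequence 1) is processed afterwards, the status stays at the resumed value.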
>>
>> Anyway, for anyone else who has the present problem, please try the
>> following patch and report back.
>>
>> See the patch below (the line numbers might be a bit off since my file
>> has lots of extra instrumentation):
>>
>> --8<---------------cut here---------------start------------->8---
>> --- nxserver.foo 2009-01-25 16:07:46.590977440 +1300
>> +++ nxserver 2009-01-25 16:07:54.498952944 +1300
>> @@ -967,8 +967,8 @@
>>  server_nxnode_echo()
>>  {
>>         log 6 "server_nxnode_echo: $@"
>> -       [ "$SERVER_CHANNEL" = "1" ] && echo "$@"
>> -       [ "$SERVER_CHANNEL" = "2" ] && echo "$@" >&2
>> +       [ "$SERVER_CHANNEL" = "1" ] && /bin/echo "$@"
>> +       [ "$SERVER_CHANNEL" = "2" ] && /bin/echo "$@" >&2
>>  }
>>
>>  server_nxnode_exit_func()
>> --8<---------------cut here---------------end--------------->8---
>>
>> --
>> Mario Becroft <mb@gem.win.co.nz>
>> ________________________________________________________________
>> Were you helped on this list with your FreeNX problem?
>> Then please write up the solution in the FreeNX Wiki/FAQ:
>>
>> http://openfacts2.berlios.de/wikien/index.php/BerliosProject:FreeNX_-_FAQ
>>
>> Don't forget to check the NX Knowledge Base:
>> http://www.nomachine.com/kb/
>>
>> ________________________________________________________________
>> FreeNX-kNX mailing list --- FreeNX-kNX@kde.org
>> https://mail.kde.org/mailman/listinfo/freenx-knx
>> ________________________________________________________________
>>
>