'Re: [Netatalk-admins] Disconnections from Mac'


List:       netatalk
Subject:    Re: [Netatalk-admins] Disconnections from Mac
From:       didier <dgautheron () magic ! fr>
Date:       2009-08-14 10:37:48
Message-ID: 1250246268.7779.108.camel () server

Hi,

On Thursday, 13 August 2009 at 12:22 -0700, manishmotwani@yahoo.com wrote:
> I updated my Mac client to 10.5.8 and Linux server to Netatalk 2.0.4
> and I thought at first that the disconnect problem has gone away. I
> also switched from "cdb" to "dbd" and I don't see database corruption
> anymore.  
> But the disconnection issue is still there. 
> Here's what I'm getting on the server logs while transferring big
> files: 
> 
> 06 afpd[23981]: ASIP session:548(5) from 192.168.5.147:57686(7) 
> 06 afpd[32077]: dsi_stream_send: Broken pipe 
> 06 afpd[32077]: dsi_wrtreply: Broken pipe
> 

A network error (most likely a TCP reset) hit the server while it was
trying to write to the client.
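
To show what "Broken pipe" means at the socket level, here is a minimal
Python sketch. Assumptions: it uses a local socketpair rather than a real
AFP/TCP connection, and write_after_peer_close() is a hypothetical helper
name, not netatalk code. On a real TCP connection the error usually only
appears on a write that follows the peer's reset.

```python
import errno
import socket


def write_after_peer_close():
    """Return the errno raised when writing to a peer that has gone away."""
    server, client = socket.socketpair()
    client.close()  # the "client" end disappears
    try:
        server.send(b"reply data")
    except OSError as exc:
        return exc.errno
    finally:
        server.close()
    return None


if __name__ == "__main__":
    code = write_after_peer_close()
    # On Linux this is typically EPIPE, the same "Broken pipe" afpd logs.
    print(errno.errorcode.get(code, code))
```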


> 06 afpd[32077]: dsi_stream_write: Broken pipe 
> 06 afpd[32077]: 2314371013.08KB read, 2286473429.02KB written 
> 06 afpd[32077]: dsi_stream_write: Broken pipe 
> 06 afpd[32077]: Connection terminated

The server process exits, which is normal behavior after the previous error.


> 
> 06 afpd[23981]: login noauth 
> 06 afpd[23981]: login nobody (uid 99, gid 99) AFP3.1 
> 06 afpd[31757]: server_child[1] 32077 exited 1 
> 06 afpd[23981]: ipc_write: command: 1, pid: 23981, msglen: 4 
> 06 afpd[31757]: ipc_read: command: 1, pid: 23981, len: 4 
> 06 afpd[31757]: child 32077 user 99 disconnected 
> 
The client reconnects and tries to kill the old session; again, nothing unusual.

> 
> 07 afpd[23981]: ipc_write: command: 2, pid: 23981, msglen: 24 
> 07 afpd[31757]: ipc_read: command: 2, pid: 23981, len: 24 
> 07 afpd[31757]: Setting clientid (len 16) for 23981, boottime 4A7B0EB6
> 07 afpd[31757]: ipc_get_session: len: 24, idlen 16, time 4a7b0eb6 
> 
A new session starts.

> 
> 
> 
> My afpd.conf looks like this:
> - -tcp -noddp -uamlist uams_guest.so
> 
> And the AppleVolumes.default:
> 
> /shares/myshare "share" options:mswindows cnidscheme:dbd dbpath:/etc/atalk/afpdb
> 
> Am I doing something wrong here in the setup? It takes a long time to
> reproduce but I keep getting the same disconnection error eventually.
> This time it took over 4 days of constant reading and writing large
> files (10-20gb files) for the disconnection to occur. I have a nearly
> perfect local network with 2-3ms latency at worst. If it's not the
> network, what could cause the broken pipe and connection termination?
> 
It could be:
1) - a bug in the server: we send junk to the client, it dislikes it and
disconnects.
2) - a problem with your setup: the server sleeps for a long time while
trying to write data to its disks, and the client disconnects.
3) - a bug in the client.
4) - a deadlock: it seems our workaround isn't applied to write replies.

What deadlock?
First, typical traffic when a client copies between two files (reading
one and writing the other):

client                  server
request read1 128kB     
                        send read1 (write 128kB to the network)
request read2 128kB
request read3 128kB
                        end read1
                        send read2
write1 128kB (write 128kB to the network)
end write1
request read4 128kB     

                        end read2
                        send read3
write2 128kB
end write2
request read5 128kB     

                        end read3
write3 128kB
end write3
request read6 128kB     
                        read write1 (read 128kB from the network)
                        read write2
                        read write3
                        send read4
			

and so on.
---------------------------

But sometimes, because of small transient network errors or whatever, we get:
request read1 128kB     
                        send read1 (write 128kB to the network)
request read2 128kB
request read3 128kB
                        end read1
                        send read2
write1 128kB
(the client's socket buffer is full; the client sleeps in write and
*doesn't* empty its socket read buffer...)
                        end read2
                        send read3
(as there's no reader for read2 and read3, the server's write buffer
fills up and it sleeps in write too...)

Deadlock: both sides sleep in write with no reader.
There's a workaround in netatalk for this case, but it's only a
workaround and there may be corner cases where it doesn't work.
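
To make the deadlock concrete, here is a minimal Python sketch. Assumptions:
a local socketpair with non-blocking sockets stands in for the AFP TCP
connection, and fill(), drain() and demo() are hypothetical helper names.
It is not the netatalk workaround, only an illustration of the state both
sides end up in. Where the sketch gets BlockingIOError, a blocking write
would simply sleep.

```python
import socket

CHUNK = b"\0" * 65536


def fill(sock):
    """Write until the kernel refuses more data; return bytes queued."""
    total = 0
    while True:
        try:
            total += sock.send(CHUNK)
        except BlockingIOError:
            return total  # a blocking socket would sleep in write here


def drain(sock):
    """Read everything currently queued for this socket; return byte count."""
    total = 0
    while True:
        try:
            data = sock.recv(len(CHUNK))
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


def demo():
    client, server = socket.socketpair()
    client.setblocking(False)
    server.setblocking(False)
    try:
        fill(client)                   # client keeps writing, never reads
        sent_by_server = fill(server)  # server keeps sending read replies
        # Both directions are full: neither side can write any more.
        stuck = fill(client) == 0 and fill(server) == 0
        # Only a reader breaks the cycle: the client empties its read buffer.
        drained_all = drain(client) == sent_by_server
        server_resumed = fill(server) > 0
        return stuck, drained_all, server_resumed
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the order of operations: once both sides are
stuck in write, nothing moves until one of them reads.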

You can try to capture the traffic, but since the problem takes days to
show up you'll need Wireshark's dumpcap with the
'-b  <capture ring buffer option>' option; you only need to capture the
TCP headers. This deadlock shows up as a TCP zero window on *both* sides.
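
As a rough sketch of such a capture, the Python helper below only builds a
dumpcap command line; it does not run it. Assumptions: the helper name
dumpcap_ring_args(), the interface "eth0", the output path, the 96-byte
snapshot length and the ring sizes are all illustrative values to adapt.
Port 548 is the AFP port seen in the log above.

```python
import shlex


def dumpcap_ring_args(interface, out_file, afp_port=548,
                      snaplen=96, file_kb=100_000, files=10):
    """Build a dumpcap argument list for a headers-only ring-buffer capture."""
    return [
        "dumpcap",
        "-i", interface,                # capture interface
        "-f", f"tcp port {afp_port}",   # capture filter: AFP over TCP only
        "-s", str(snaplen),             # truncate packets, keeping the headers
        "-b", f"filesize:{file_kb}",    # switch to the next file at this size (kB)
        "-b", f"files:{files}",         # keep at most this many files
        "-w", out_file,                 # base name of the capture files
    ]


if __name__ == "__main__":
    # Hypothetical interface and path; print the command instead of running it.
    print(shlex.join(dumpcap_ring_args("eth0", "/tmp/afp.pcapng")))
```

In the resulting files, Wireshark's display filter tcp.window_size == 0
should help find the zero-window segments.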

Didier


