
List:       netatalk
Subject:    Re: [Netatalk-admins] CNID Backend for busy network homes
From:       Josh Beard <josh () signalboxes ! net>
Date:       2012-03-01 2:53:50
Message-ID: 4F4EE4BE.5020005 () signalboxes ! net


On 02/23/2012 04:17 PM, Josh Beard wrote:
> Hello,
>
> I'm using a self-built Netatalk 2.2.2 on an otherwise vanilla Ubuntu
> 10.04 server on an EXT4 filesystem with ACLs (with a boatload of
> ACL-related messages in the logs).
>
> Normal traffic for this server is 150-200 simultaneous clients with
> "live" network homes, using mostly Snow Leopard systems.
>
> I've been experimenting with various combinations here, but haven't
> found a "sweet spot" yet.  I started with the default dbd backend, but
> switched to cdb after reading that it may be better for network homes.
> Unfortunately, cdb would produce the ugly CNID warning messages upon
> login every day or two, so I switched back to dbd.
>
> Now that I'm on dbd, I've had at least two afpd crashes (Signal 11),
> where the symptoms are *extremely* slow logins (5 minutes) and very poor
> throughput (~100 KBps).  This has occurred with two different shares.
>
> For trial, I've set the dbpath for one of the shares to store on a
> different filesystem than the homes are on.  It's still too soon to see
> if that's worked.
>
> When switching between backends, I've just been stopping netatalk and
> recursively removing any .AppleD* data in the shares' paths.
>
> When the crash occurs, it seems like sending a SIGHUP to cnid_metad gets
> things moving again.  A complete restart of afpd is hardly an option
> when there are 200 users connected across campus.
>
> With the last crash, I couldn't test too much.  It almost seemed like
> the performance issue was limited to a specific share, but I don't have
> conclusive results on that.
>
> Log snips are below.  My questions are:
> Does anyone have any insight on this?
> Is the cnid db corruption causing the afpd crash?
> Is afpd likely crashing for another reason?
> Is moving the dbpath to a different filesystem even practical?
> Can the CNID warning messages be silenced for the end users?
>
> Thanks!
>
> Leading up to the crash, the following message starts appearing in the logs:
> Feb 23 14:38:40.353861 afpd[7835] {cnid_dbd.c:425} (E:CNID): transmit:
> Request to dbd daemon (db_dir /media/store/homes/ms/students) timed out.
>
> Here's a snip from the log:
>
<snip>
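
To check whether the dbd timeouts really cluster on one share, the
"timed out" messages can be tallied per db_dir.  This is only a rough
sketch: the pattern is built from the single log line quoted above, and
the function name and sample text are made up for illustration.  Other
Netatalk versions may word the message differently.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this thread (assumption:
# the message format is the same for every timeout entry).
TIMEOUT_RE = re.compile(
    r"Request to dbd daemon \(db_dir (?P<db_dir>[^)]+)\) timed out"
)


def count_dbd_timeouts(log_text):
    """Return a Counter mapping db_dir -> number of timeout messages."""
    # The message is wrapped across two lines in the mail, so collapse
    # all whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    return Counter(m.group("db_dir") for m in TIMEOUT_RE.finditer(flat))


# Sample input: the log line quoted above, including its line wrap.
sample = (
    "Feb 23 14:38:40.353861 afpd[7835] {cnid_dbd.c:425} (E:CNID): transmit:\n"
    "Request to dbd daemon (db_dir /media/store/homes/ms/students) timed out.\n"
)

if __name__ == "__main__":
    for db_dir, hits in count_dbd_timeouts(sample).most_common():
        print(hits, db_dir)
```

Feeding it the whole afpd log instead of the sample gives a per-share
count, which would help confirm or rule out the "one bad share" theory.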

Just for kicks, this was from earlier today, using dbd and 150 connected
users (Xeon E5506 @ 2.13GHz 8c):

load average: 81.67, 66.21, 38.74

Brutal, to say the least.

:D
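
To put those numbers in proportion, the load averages can be divided by
the core count.  A quick sketch, assuming "8c" above means 8 cores (the
helper name is made up):

```python
def load_per_core(load_averages, cores):
    """Divide each load average by the number of cores."""
    return [round(load / cores, 2) for load in load_averages]


# 1-, 5- and 15-minute load averages quoted above.
# cores=8 is an assumption based on reading "8c" as 8 cores.
quoted_load = [81.67, 66.21, 38.74]

if __name__ == "__main__":
    print(load_per_core(quoted_load, 8))
```

On Linux the load average counts tasks that are runnable or blocked in
uninterruptible I/O, so under that assumption the 1-minute figure works
out to roughly ten such tasks per core.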






