List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] [Pacemaker] [PATCH 0/8 ] ha_logd and cl_log
From:       Bernd Schubert <bs_lists () aakef ! fastmail ! fm>
Date:       2010-09-17 15:42:06
Message-ID: 201009171742.06514.bs_lists () aakef ! fastmail ! fm

Hello Dejan,

On Thursday, September 16, 2010, Dejan Muhamedagic wrote:
> Hi Bernd,
> 
> On Thu, Sep 16, 2010 at 12:09:11AM +0200, Bernd Schubert wrote:
> > Hi all,
> > 
> > the following patches are to better handle bug 2470 and have some generic
> > improvements. I'm not sure if I shall attach it to the bugzilla or if the
> > mailing list is preferred.
> 
> Well, ML is the right place if you expect to have a discussion
> about the patches, i.e. that the patches are actually a proposal.
> Otherwise, bugzilla is better. Also, these are about cl_log/logd,
> so the right place would be linux-ha-dev (and let's move the
> discussion there).

OK, great. Ever since heartbeat was split up, I have been a bit confused about
which list is responsible for which component. I removed the pacemaker list.

> 
> Didn't have time to take a closer look at the patches, but
> judging by the subject lines, they are all mainly cleanups,
> right? What is your motivation, i.e. do they target any
> particular functionality?

All patches have not only a subject line, but also a commit message :) It is
in plain English (at least I hope so; when it gets past midnight, my writing
gets bad in any language...).

> 
> I was also confused by the problems you ran into while logging
> through syslog, because I've never encountered that nor heard
> anybody complaining about it, iirc.

So to debug the stop problem we had, when not all RA environment parameters
had been set, it took me 10 min to reproduce it and about 9 hours to get the
logs. I was close to pulling my hair out. Most of that time went into just
figuring out what was going on...
As I said in the bugzilla, probably not many people use HA software on top
of NFS + unionfs-fuse. However, that is our test cluster, where I often switch
kernels, distributions, etc., so a diskless NFS environment simplifies
management a lot. I also probably would not have bothered to look further into
logd if I weren't working on a new DDN storage product that uses virtual
machines. Now it might sound a bit contradictory that I try to optimize the
IO pattern of log messages for a storage product that is supposed to deliver
high IO anyway. Well, all the high-speed devices are used for real services,
such as Lustre, GPFS, etc. The system disk, on the other hand, is fully
virtualized and therefore rather slow. Anything that does such a horrible
open/seek/write/flush/close pattern as ha_logd will take some CPU and other
bandwidth, which might in the end reduce the performance of the real storage
product (CPU usage is really an issue for us). In the case of ha_logd, that
pattern is not really required; it is just due to a very simple workaround
for log rotation.
While some of the patches may have gotten a bit large, I think the code looks
better now.
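
To illustrate the IO pattern issue, here is a minimal sketch in C. It is not
the actual ha_logd code and not taken from the patches; the names
log_per_message, struct logfile, logfile_open and logfile_write are
hypothetical, and O_APPEND stands in for the explicit seek. The first function
pays open/write/flush/close for every message. The second keeps the descriptor
open and only reopens the file when the path no longer refers to the same
inode, which is one common way to survive log rotation.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Pattern A (hypothetical): open, append, flush and close per message. */
int log_per_message(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, msg, strlen(msg));
    fsync(fd);                      /* forces the data out on every call */
    close(fd);
    return n < 0 ? -1 : 0;
}

/* Pattern B (hypothetical): keep the file open between messages. */
struct logfile {
    const char *path;
    int fd;                         /* -1 while the file is not open */
    dev_t dev;                      /* device of the file we opened */
    ino_t ino;                      /* inode of the file we opened */
};

int logfile_open(struct logfile *lf)
{
    struct stat st;

    lf->fd = open(lf->path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (lf->fd < 0)
        return -1;
    if (fstat(lf->fd, &st) < 0) {
        close(lf->fd);
        lf->fd = -1;
        return -1;
    }
    lf->dev = st.st_dev;
    lf->ino = st.st_ino;
    return 0;
}

int logfile_write(struct logfile *lf, const char *msg)
{
    struct stat st;

    assert(lf != NULL && msg != NULL);
    /* Rotation check: the path is gone or now names a different file. */
    if (lf->fd >= 0 &&
        (stat(lf->path, &st) < 0 ||
         st.st_dev != lf->dev || st.st_ino != lf->ino)) {
        close(lf->fd);
        lf->fd = -1;
    }
    if (lf->fd < 0 && logfile_open(lf) < 0)
        return -1;
    return write(lf->fd, msg, strlen(msg)) < 0 ? -1 : 0;
}
```

The second variant still issues one stat() per message. Reopening only on a
signal sent by the rotation tool would avoid that as well, at the cost of
requiring the rotation setup to send the signal.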

> 
> Sorry, won't have more specific comments until the end of
> next week.

No problem, I can entirely understand that. Time is critical for me too.


Thanks,
Bernd

-- 
Bernd Schubert
DataDirect Networks
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/