List:       aix-l
Subject:    Re: Settling a food fight?
From:       Peter Bell <pbell () LACLINICA ! ORG>
Date:       2009-08-25 6:12:26
Message-ID: 20090825061232.99B6A4A23EC () m3 ! laclinica ! org

Hi --

These are very small files, which are getting passed from one system
to another.  They're evanescent;  they exist on the NFS host for at most
15 minutes before the second process picks them up.  Because the lifespan
is variable (the writes are triggered by specific database transactions, and
we can't predict those ahead of time) capturing all of their names and sizes
with a cron job isn't a lock, either.
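For illustration only, here is a minimal sketch of a poller that runs tighter than cron's one-minute granularity and logs when files appear in or vanish from a directory.  It assumes Python is available on whichever host mounts the share; the function names, the log format, and any directory or log path passed in are hypothetical, not anything from our setup.  It is still only a poll: a file written and picked up between two passes would never be seen, so this is not a substitute for a real audit record.

```python
import os
import stat
import time


def snapshot(directory):
    """Return {filename: (size_bytes, mtime)} for regular files in directory."""
    seen = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        try:
            st = os.stat(path)
        except OSError:
            # The file was picked up between listdir() and stat(); skip it.
            continue
        if stat.S_ISREG(st.st_mode):
            seen[name] = (st.st_size, st.st_mtime)
    return seen


def diff_snapshots(old, new):
    """Return (appeared, vanished) as sorted lists of file names."""
    appeared = sorted(set(new) - set(old))
    vanished = sorted(set(old) - set(new))
    return appeared, vanished


def watch(directory, log_path, interval=1.0, rounds=None):
    """Poll directory every `interval` seconds and append changes to log_path.

    rounds=None polls forever; an integer stops after that many passes.
    """
    previous = snapshot(directory)
    count = 0
    with open(log_path, "a") as log:
        while rounds is None or count < rounds:
            time.sleep(interval)
            current = snapshot(directory)
            appeared, vanished = diff_snapshots(previous, current)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for name in appeared:
                size, _mtime = current[name]
                log.write("%s APPEARED %s size=%d\n" % (stamp, name, size))
            for name in vanished:
                log.write("%s VANISHED %s\n" % (stamp, name))
            log.flush()
            previous = current
            count += 1
```

Something like watch("/path/to/interface/folder", "/tmp/interface_watch.log") would be the entry point, with both paths being placeholders.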

The weird thing about this one is that during the time when the receiver
did not pick up any files, it was picking up files coming off a second
interface (two accounts in the mv.E database, writing to two folders, being
fed into two MS SQL databases.)

I haven't yet had time to go through last night's mail in enough detail to
verify that we are doing everything you recommend.  I'm not the PICK
programmer by a long shot, and I've got a couple of vendors telling
me what each saw, and customers seeing data missing from the second
database.

This is why I had hoped to be able to do an audit outside of the database,
either from within AIX or from the NFS file system.  Frankly, our PICK
programmer is very good and has implemented a lot of auditing;  in
the past, his auditing has always been accurate.  I tend to believe him
more than the SQL vendor, but I can't hang my hat on that.

Tripwire (well, free Tripwire) won't work for this.  Audit on the filer is
one way to go, although the release I'm working with now doesn't seem
able to 'listen' to directories; it only listens to named files.

The folks we call when the AIX side needs serious attention are telling
me they think that, in an NFS implementation where the filesystem has
sync set, AIX should be fairly bulletproof.  This is what I'm hearing
from them:

"the file operation call only return when the server has completed 
all work for this operation. In case of a write request, the server 
will physically write the data to disk and if necessary, update any 
directory structure before returning a response to the client. This 
ensures file integrity."

They're also telling me they don't know of an NFS logging implementation
in AIX, and what I'm thinking is that I really do need to do it on the NFS
system.  I'd hoped the write was something that engaged enough of AIX
that it was a loggable event, but as you explained last night it's almost
certainly not.

I just hope that if this does happen again, we're able to point to an 
audit record.

We may ultimately pay for Tripwire to get this visibility.  I'm going to have
them set us up with a demo and see if it will let me do what I need to
do.  I also need to rtfm some more on good old audit, which I think
actually is up to the job (although it means I need to move the NFS
system to a newer platform, which is on the roadmap in any case.)

-Peter

At 09:55 PM 8/24/2009, you wrote:
>Peter, how large is the data file that is being written and is it by
>any chance an EOM (End-of-Month) reporting issue or something where
>the file is a different size at different times of the month?  If
>your data file is larger than the area where the target is being
>written, then if the process fails (more likely errors out unnoticed
>due to improper application error logging) there will be no closing
>write on the file and it will appear never to have been written.  No
>time stamp or any other indication of activity.
>
>As a contract admin, I have seen that condition definitely cause a 
>really good and loud food fight.
>
>Jerry
>
>On Sun, 8/23/09, Peter Bell <pbell@LACLINICA.ORG> wrote:
>
>From: Peter Bell <pbell@LACLINICA.ORG>
>Subject: Settling a food fight?
>To: aix-l@Princeton.EDU
>Date: Sunday, August 23, 2009, 12:20 PM
>
>Hi --
>
>A multivalue database living in our AIX 5.3 install writes files out to
>an NFS system using PUTX.  The files are then picked up by an interface to
>another system.
>
>I am wondering what I can do to log those writes on the AIX side.  There
>is an intermittent file reception or creation issue we need to
>troubleshoot.
>
>The options on the NFS box itself aren't looking too appetizing (just
>yet) and I'm wondering:  do I already have this info on logs on the AIX
>side of the system?  Or would it be (somewhat) simple to enable it?
>
>Thanks much,
>
>Peter Bell
>pbell@laclinica.org
