'Re: Some comments on interrim recommendations'


List:       syslog-sec
Subject:    Re: Some comments on interrim recommendations
From:       John Kelsey <kelsey.j () ix ! netcom ! com>
Date:       2000-08-03 8:33:39

-----BEGIN PGP SIGNED MESSAGE-----

At 09:06 PM 8/2/00 -0400, Alex Brown wrote:
...
>John Kelsey wrote:

>And I'm definitely an amateur as a cryptographer.   There are a few
>ugly facts of life with syslog that are tangled up with UNIX and
>TCP/IP (actually unreliable UDP datagrams) but it's  basically just
>the worst case log transport, with no guarantees on anything,
>including tampering and falsification.  If you're interested, Chris
>Lonvick <clonvick@cisco.com> has a draft describing the basic
>operation of the "protocol" at
>http://www.employees.org/~lonvick/draft3.txt.  

Thanks, I've pulled it down, and I'll read it sometime soon.  (I'm
about a week from a cross-country move, so it may be a while.)

>> We ought to discuss how to do this intelligently.  For now,
>> suppose Z is the shared secret.  If we start a new log, we
>> generate a unique log ID, L.  We then include L in every
>> log message.  If the source machine (logging client) has
>> some nonvolatile storage, it can just remember the previous
>> L and increment it.  If the source machine doesn't have
>> nonvolatile memory, but can generate random numbers, L can
>> be a 128-bit random number.
>>
>
>There is no concept of "session" in typical syslog usage, so there's
>nothing obvious other than another sequence number to use as log ID.
>Some small devices have a "boot count" in nonvolatile storage that
>serves as an "era" identifier and this could serve as a log ID
>value. But many do not, and most do not have a time of day clock.  
>At any rate, a log ID should have this sense of "session" or "era"
>whatever the mechanism for generating it.  

Okay.  The only ``session'' that's important here is on the source
machine.  Basically, if the source machine crashes and reboots, that
will be reflected somehow in the nonce.  It's reasonable to randomly
generate the log ID if there's no other way to get it to be unique. 
If we can assume that a device won't reboot more than 2^{32} times,
then a 96-bit nonce has less than one chance in 8 billion of ever
repeating for this device.  (This can be tuned to an arbitrarily low
probability.)  But there isn't really a session between the source
and sink.  
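
The arithmetic behind that figure can be checked with a short
calculation.  This is only a sketch: it assumes every log ID is drawn
uniformly at random, and the function name is made up for
illustration.

```python
from fractions import Fraction


def collision_bound(num_ids: int, nonce_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_ids uniformly
    random nonce_bits-bit values are equal (the birthday bound:
    number of pairs divided by number of possible values)."""
    pairs = num_ids * (num_ids - 1) // 2
    return Fraction(pairs, 2 ** nonce_bits)


# 2^32 reboots, each picking a fresh 96-bit log ID.
bound = collision_bound(2 ** 32, 96)
print("at most one chance in", int(1 / bound))
```

With 2^32 IDs there are just under 2^63 pairs, so the bound is just
under 2^-33, which is a little better than one in 8 billion.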

One qualm is that many small devices won't have a time-of-day clock,
any nonvolatile memory, or any reasonable way to generate a random
number.  I'm not sure what to do for them.  

>>   Eventually, I think it would make sense to derive a new key
>>   for each log created using shared secret Z.  Among other
>>   things, this would simplify adding encryption to the interim
>>   format.
>>
>
>Encryption was never in the picture that I was painting.  This is
>really just a band-aid for the most serious well-known
>vulnerabilities of syslog;  secure transport of log data should
>definitely use something more robust.   But it would certainly be
>valuable to generate a new Z for each log "session" at both source
>and sink.   Would the presence of a new value of L in the stream
>indicate that Z had been recomputed?  

Yes, exactly.  It would be simple enough to do something like

K = current key for this log
L = log ID
Z = long-term shared secret

K = HMAC(Z,L)

So if the sink machine ever sees an L it hasn't seen before, it just
recomputes a new log key.  (And maybe it checks a list of previously
seen log IDs to catch replays.)  
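
A minimal sketch of the sink side of this, in Python.  The choice of
SHA-256 inside HMAC, the 16-byte log ID, and the class and function
names are all assumptions made for the example; nothing above fixes
them.

```python
import hashlib
import hmac


def derive_log_key(z: bytes, log_id: bytes) -> bytes:
    """K = HMAC(Z, L): per-log key from the long-term secret Z."""
    return hmac.new(z, log_id, hashlib.sha256).digest()


class Sink:
    """Keeps one derived key per log ID seen so far (hypothetical)."""

    def __init__(self, z: bytes):
        self.z = z
        self.keys = {}  # log ID -> derived key; also the "seen" list

    def key_for(self, log_id: bytes):
        """Return (key, is_new).  is_new is True the first time a
        log ID shows up, which is when the key gets computed."""
        is_new = log_id not in self.keys
        if is_new:
            self.keys[log_id] = derive_log_key(self.z, log_id)
        return self.keys[log_id], is_new


sink = Sink(b"long-term shared secret Z")
key, is_new = sink.key_for(b"\x00" * 16)
```

A real sink would have to bound or persist the table of seen log IDs;
the sketch just keeps them in memory.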

Anyway, encryption makes it harder for someone to selectively drop
log message packets.  The attacker no longer knows exactly which
packets to drop, and so has to drop a larger number to be sure of
keeping the event he wants out of the log.

...
>> Why not specify this as a sequence number?  That way, your
>> nonce size can be kept reasonable, and you can always detect
>> missing messages, and correctly reorder your messages on the
>> logging server side.  When combined with a unique ID per
>> log, you get something like this:
>>
>> Z = long term shared secret
>> L = unique log ID
>> N = log entry sequence number
>> M = raw log message
>>
>> LogMessage = L,N,M,HMAC(Z,(L,N,M))
>>
>> That is, the full log message is: logID, sequence number,
>> message, and MAC.  (You can think of (L,N) as the nonce, if
>> you like.)
>
>Well, Z would now be a new secret Z(L)  following your suggestion. 
>But yes, this is reasonable.  I think the only question is whether
>the notion of "session" really applies to ordinary syslog usage --
>what event causes N to be reset and L to be advanced?  In small
>network devices, the only reasonable events are a network management
>control operation (perhaps a soft reset) or a power cycle (hard
>reboot).  

This was my thought.  A new log is started whenever the machine
reboots and loses its previous log context.  
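
For concreteness, the LogMessage = L,N,M,HMAC(key,(L,N,M)) layout
could be built and checked roughly as follows.  This is a sketch under
assumptions that are not part of the proposal: a fixed 16-byte L, an
8-byte big-endian N, HMAC-SHA-256 as the MAC, and made-up function
names.  The fixed-width fields are there so that (L,N,M) has only one
possible encoding.

```python
import hashlib
import hmac
import struct

LOG_ID_LEN = 16   # assumed size of L in bytes
SEQ_LEN = 8       # assumed size of N in bytes
MAC_LEN = 32      # HMAC-SHA-256 output size


def make_log_message(key: bytes, log_id: bytes, seq: int, msg: bytes) -> bytes:
    """Return L || N || M || HMAC(key, L || N || M)."""
    if len(log_id) != LOG_ID_LEN:
        raise ValueError("log ID must be 16 bytes")
    body = log_id + struct.pack(">Q", seq) + msg
    return body + hmac.new(key, body, hashlib.sha256).digest()


def verify_log_message(key: bytes, wire: bytes):
    """Return (L, N, M) if the MAC checks out, otherwise None."""
    if len(wire) < LOG_ID_LEN + SEQ_LEN + MAC_LEN:
        return None
    body, tag = wire[:-MAC_LEN], wire[-MAC_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    log_id = body[:LOG_ID_LEN]
    (seq,) = struct.unpack(">Q", body[LOG_ID_LEN:LOG_ID_LEN + SEQ_LEN])
    return log_id, seq, body[LOG_ID_LEN + SEQ_LEN:]


wire = make_log_message(b"key", b"\x00" * 16, 7, b"session opened")
```

The key passed in can be either Z itself or a per-log key derived
from Z and L.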

>> >Chained authentication
>> >
>> >The nonce may be replaced with the last MAC sent from the
>> >log client, making it possible to detect an insertion or gap
>> >in the syslog stream from a client (same secret):
>> >
>> >...
>> >Nov  5 14:14:54 zorilla PAM_pwdb[509]: (login) session opened \
>> >  for user abrown by (uid=0) chain=227c40a6cde84f49bfad43c412490110 \
>> >  md5=a6739e57964c9dec7613d663f049c0f7
>> >Nov  5 14:14:55 zorilla PAM_pwdb[509]: (login) session closed \
>> >  for user abrown chain=a6739e57964c9dec7613d663f049c0f7 \
>> >  md5=cbce1c7ced9cfdc1fb86ba8ef365d8eb
>> >...
>>
>> I don't see what this buys you.  If you use a sequence
>> number and MAC all your log messages, you'll be able to
>> detect an insertion or gap, and you'll be able to
>> authenticate your messages.  But you save the extra hash
>> operation.
>>
>
>The MAC value in the "chain=" field is saved from the previous
>message, not recomputed.  I actually don't think there is an added
>computational burden --it's just like any other nonce value.  The
>main benefit, as I've said, is that there is no sense of session in
>syslog usage that would lead to a natural log ID value L to
>accompany a sequence number N, and this approach doesn't impose one.
> 

You're right, there's almost no computational hit.  (The only impact
is that the message is lengthened by one MAC value: 160 bits with a
SHA-1 based MAC, or 128 bits for the MD5 values in your example.)  

>> >Each individual record can still be authenticated
>> >independently from all others, but also can be sequenced by
>> >verifying the "chain" field. The log host syslogd or filter
>> >must retain the previous MAC value for each log source. The
>> >first record from each source will be unsequenceable,
>> >because no prior MAC value is available. A missing or
>> >corrupted record or a gap can be identified but not
>> >recovered.
>>
>> Using a sequence number here is much better.  We used the
>> hash chain in the secure audit log design to make it easy to
>> verify the logfile's accuracy remotely, over a very low
>> bandwidth channel.  (Verify that the hash chain works, then
>> verify the first and last records' MACs.)  This application
>> doesn't need it.
>>
>> To defend against replays, we keep track of the most
>> recently received sequence number, and the current log ID,
>> and have some limit for how long a message can reasonably be
>> delayed.  (This can be huge, e.g. 1000 messages.)  To detect
>> insertions or deletions, we look for sequence numbers.  To
>> deal with out-of-order log messages, we sort them by
>> sequence number either when they arrive, or when they're
>> reviewed.
>>
>
>All easier to do with sequence numbering, it's true.

The real killer here is that we can't tell anything about how many
log messages have been dropped with the hash chain method.  Also,
on-the-fly replay detection (to avoid having your sink machine filled
up with replayed messages) is much harder with hash chains than with
log IDs + sequence numbers.  
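
As a rough illustration of the sink-side bookkeeping with sequence
numbers, here is a sketch of a replay window.  It assumes one tracker
per log ID, sequence numbers starting at 0, and that the MAC has
already been verified before accept() is called; the class name and
the default window of 1000 are illustrative only.

```python
class ReplayWindow:
    """Tracks sequence numbers for one log ID (hypothetical helper)."""

    def __init__(self, window: int = 1000):
        self.window = window   # how far behind a message may arrive
        self.highest = -1      # highest sequence number accepted
        self.seen = set()      # accepted numbers inside the window

    def accept(self, seq: int) -> bool:
        """True if seq is fresh, False if replayed or too old."""
        if seq <= self.highest - self.window:
            return False       # delayed beyond the allowed limit
        if seq in self.seen:
            return False       # already received: a replay
        self.seen.add(seq)
        if seq > self.highest:
            self.highest = seq
            floor = self.highest - self.window
            # Forget entries that have fallen out of the window.
            self.seen = {s for s in self.seen if s > floor}
        return True

    def missing(self):
        """Sequence numbers inside the window not yet received."""
        floor = max(0, self.highest - self.window + 1)
        return [s for s in range(floor, self.highest + 1)
                if s not in self.seen]


w = ReplayWindow()
for n in (0, 1, 3):
    w.accept(n)
print(w.missing())   # the gap left by message 2
```

Unlike the hash chain, the gap list says exactly how many messages
are missing, and late arrivals can still be slotted in.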

...
>Encryption may
>also be needed for secure storage, but this would be based on the
>nature of the log data and perhaps local site policy.  For many uses
>tamper resistance through chained authentication in the logfile
>would be sufficient.  

Right.  I wasn't thinking of encrypting it for secure storage, but
rather to avoid having an attacker be able to see exactly which log
messages he should kill on the network, to keep them out of the log. 

>>
>> c.  UDP (Can we set up a format for sending these messages
>> so that an implausibly large number of packets have to be
>> dropped to lose messages?  I have some ideas on this, which
>> I'll write up in a separate note.)
>
>Yes,  some form of error correcting encoding might be possible too,
>but syslog transport is probably not worth the effort.  

I was thinking of something pretty simple: using an error-correcting
code to basically split (say) each batch of 8 messages into 12
shares, with any 8 able to reconstruct the whole batch.  If these
shares are also encrypted, an attacker can only prevent a given
message from making it into the log by clobbering a bunch of these
UDP packets all at once.  I'm assuming that, for some set of
parameters, we can say that losing a whole batch should basically
never happen.  (Though it's an open question whether that set of
parameters costs too much in message expansion terms.)  

The goal is to destroy the ability of an attacker to intercept and
selectively delete a few log messages on the fly.  He either does
nothing, or does something obvious.  But my ignorance of networking
issues may be showing here; I don't know what set of parameters, if
any, would work in general.  
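
One way to get a feel for the parameter question is to compute how
often a whole batch would be lost by accident.  The sketch below
assumes packets are dropped independently with a fixed probability,
which real networks (with bursty loss) may not satisfy, and the
function names are invented for the example.  It says nothing about a
deliberate attacker beyond the number of packets he has to kill.

```python
from math import comb


def batch_loss_probability(n_shares: int, k_needed: int, p_drop: float) -> float:
    """Chance that fewer than k_needed of n_shares packets arrive,
    if each packet is dropped independently with probability p_drop."""
    max_tolerable = n_shares - k_needed
    return sum(
        comb(n_shares, d) * p_drop ** d * (1 - p_drop) ** (n_shares - d)
        for d in range(max_tolerable + 1, n_shares + 1)
    )


def expansion(n_shares: int, k_needed: int) -> float:
    """Message expansion factor paid for the redundancy."""
    return n_shares / k_needed


def packets_attacker_must_drop(n_shares: int, k_needed: int) -> int:
    """Fewest packets of one batch that must vanish to lose it."""
    return n_shares - k_needed + 1


for p in (0.001, 0.01, 0.1):
    print(p, batch_loss_probability(12, 8, p))
print(expansion(12, 8), packets_attacker_must_drop(12, 8))
```

For the 8-of-12 example the expansion is 1.5x, and an attacker has to
remove at least 5 of the 12 packets in a batch to suppress anything
in it.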

>Alex Brown  http://www.msg.com/~abrown +1 617 504 8761

- --John Kelsey, kelsey@counterpane.com

-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOYkuUyZv+/Ry/LrBAQHhhQP7BHxlx5vk+Hb7uR5fw6xPCqdG4rW9wx/P
aVGl+Pv+PPqOURhXovEoI1JiJI5QGuiSvPVReQCVTrb+kBgcbjAeWju2NhQvq6IA
RWUN0/b/FXfRTE5VHIcGYzKMi1LDrpJOZVT3ZJFXOVjI2SIoilPHwJSCJwjg9uhE
4039cYpuqAA=
=0e22
-----END PGP SIGNATURE-----
