
List:       openldap-devel
Subject:    Strange issue with contextCSN
From:       Pierangelo Masarati <ando () sys-net ! it>
Date:       2008-11-22 16:30:52
Message-ID: 492833BC.2020102 () sys-net ! it

I'm running concurrency tests of MMR, and I see some strange issues:

1) loss of sync after multi-concurrent load (multiple concurrent ops on 
each server, and modifications to the same data subset on all servers, 
in order to trigger conflicts).  I'm still trying to see if there is any 
pattern or clue about what failed (like finding some explanation in the 
logs).  This happens once in a while after many operations.  I don't 
expect this to be necessarily a bug; it might be the consequence of 
conflicts.  Of course, it would be nice if slapd made it possible to 
clearly identify where the conflict occurred, to support manual resolution.

2) loss of sync after single-concurrent load (multiple concurrent ops on 
a single server).  This is really inexplicable (to me), as there should 
be no conflict.  The only possible explanation I see (but need to 
investigate further) is that an entry is added on a server, sync'd to 
another one and, in the meanwhile, deleted on the first one before its 
own sync gets back.  This happens very seldom.

3) what puzzled me a bit is that when I load a single server, I'd expect 
to end up with a single contextCSN containing the SID of that server. 
This is correct for the server I load, but the others, even when they 
get correctly sync'd, contain a contextCSN for each server in the MMR 
pool, and the contextCSNs with the other SIDs don't get propagated to 
the server that was loaded.  It's not clear why those CSNs are generated, 
and how they get into the loop and propagate between servers that do not 
receive direct modifications.
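
A quick way to check for this is to list which SIDs show up in each 
server's contextCSN values.  The following is only a minimal Python 
sketch, not part of the test suite: the function name and the server 
labels are hypothetical, and the sample values are illustrative (reused 
from the diff in point 4), not output captured from the single-server 
run.  It assumes the CSN layout timestamp#count#SID#mod.

```python
def sids_by_server(csns_by_server):
    """Map each server label to the sorted SIDs found in its contextCSN values."""
    return {
        # Field 2 of a CSN (timestamp#count#SID#mod) is the SID.
        server: sorted({csn.split("#")[2] for csn in csns})
        for server, csns in csns_by_server.items()
    }


# Illustrative data only: "server1" stands for the loaded server,
# "server2" for one of the servers that only received replicated changes.
sample = {
    "server1": ["20081122161708.935242Z#000000#001#000000"],
    "server2": [
        "20081122161708.935242Z#000000#001#000000",
        "20081122161658.195350Z#000000#002#000000",
        "20081122161658.193983Z#000000#003#000000",
    ],
}

for server, sids in sorted(sids_by_server(sample).items()):
    print(server, sids)
```

Any SID other than that of the loaded server is one of the unexpected 
contextCSN values described above.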

4) another thing that puzzled me a bit is that in some cases, when all 
servers are loaded, and the contextCSNs are one for each SID and the 
same in all of the servers, the values are returned in an arbitrary 
order that differs from server to server; for example:

bash-3.2$ diff -u testrun/server2.out testrun/server3.out
--- testrun/server2.out	2008-11-22 17:17:28.000000000 +0100
+++ testrun/server3.out	2008-11-22 17:17:28.000000000 +0100
@@ -2497,8 +2497,8 @@
  associatedDomain: example.com
  entryCSN: 20081122161630.753152Z#000000#001#000000
  contextCSN: 20081122161708.935242Z#000000#001#000000
-contextCSN: 20081122161658.195350Z#000000#002#000000
  contextCSN: 20081122161658.193983Z#000000#003#000000
+contextCSN: 20081122161658.195350Z#000000#002#000000

Not a big deal (except for the need to sort values to compare them), but 
I'd expect them to be exactly in the same order...
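
For the comparison itself, something along these lines would do.  It is 
only a minimal Python sketch, not part of the test suite: the function 
names are hypothetical, the sample text reuses the values from the diff 
above, and it assumes plain (not base64-encoded) "contextCSN: value" 
lines with the CSN layout timestamp#count#SID#mod.

```python
def context_csns(ldif_text):
    """Return the contextCSN values found in LDIF-like text, sorted by SID."""
    values = []
    for line in ldif_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name.lower() == "contextcsn":
            values.append(value.strip())
    # Field 2 of a CSN is the SID; fall back to the full CSN to break ties.
    return sorted(values, key=lambda csn: (csn.split("#")[2], csn))


def same_context_csns(a, b):
    """True if both texts carry the same contextCSN values, in any order."""
    return context_csns(a) == context_csns(b)


server2 = """\
contextCSN: 20081122161708.935242Z#000000#001#000000
contextCSN: 20081122161658.195350Z#000000#002#000000
contextCSN: 20081122161658.193983Z#000000#003#000000
"""
server3 = """\
contextCSN: 20081122161708.935242Z#000000#001#000000
contextCSN: 20081122161658.193983Z#000000#003#000000
contextCSN: 20081122161658.195350Z#000000#002#000000
"""

print(same_context_csns(server2, server3))
```

Sorting by SID rather than by the raw string keeps the output stable 
even when the timestamps of the individual CSNs change between runs.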

I'm going to pack my suite of tests and put them on ftp.openldap.org 
(and eventually add them to OpenLDAP's test suite, specifically meant to 
test MMR), but first I need to polish them a little bit and isolate 
those that present issues, in order to open specific ITSes.

p.


Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Fax:     +39 0382 476497
Email:   ando@sys-net.it
-----------------------------------
