List:       cyrus-devel
Subject:    Re: [RFC] multiplexing cyrus replication with log/log-run sharding & multiple
From:       "Bron Gondwana" <brong () fastmailteam ! com>
Date:       2019-11-21 15:37:49
Message-ID: 5ce1a879-4e7e-4a16-8cc3-91d4eec50e8b () dogfood ! fastmail ! com

Oh, I should just add the other thing that you might be interested in, which I've \
got some initial stabs at: synchronous replication. The idea is to embed the \
sync_client logic into mailbox commit, so that any action that writes to a mailbox \
does a pass through, creates a "SYNC APPLY MAILBOX" dlist stanza and shoves it down \
the wire at a replica. There are a couple of bits missing so far. It needs a way to \
upload the message content as well, which I'm probably just going to embed directly \
in the RECORD component (it's kind of ugly, but it makes it a single DLIST). And \
obviously it needs to not fail entirely if the replica is down, so it might be fire \
and forget, or it might have a timeout after which it syslogs but returns.

It builds on top of the existing SINCE_MODSEQ and SINCE_UIDNEXT logic that's already \
in master, and will also want a "sync cache" - which will store the remote MAILBOX \
line for each mailbox, so you can generate a SYNC APPLY without first having to do a \
SYNC GET to find the current remote state. Assuming the replica hasn't changed in the \
meantime, this will allow changes to be applied in a single round trip rather than \
the current 4 round trips for an APPEND.

(truly, it's 4 round trips!)

S0 SYNCGET MAILBOX user.cassandane
S1 SYNCAPPLY RESERVE
S2 SYNCAPPLY MESSAGE
S3 SYNCAPPLY MAILBOX
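
The "sync cache" mentioned above can be pictured roughly as follows. This is only a minimal sketch, not Cyrus code: `RemoteState`, `SyncCache` and `needs_sync_get` are hypothetical names, and the two fields stand in for whatever the real remote MAILBOX line carries.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RemoteState:
    """Hypothetical subset of the replica's MAILBOX line."""
    uidnext: int
    highestmodseq: int


class SyncCache:
    """Hypothetical in-memory cache: mailbox name -> last known remote state."""

    def __init__(self) -> None:
        self._states: Dict[str, RemoteState] = {}

    def lookup(self, mailbox: str) -> Optional[RemoteState]:
        return self._states.get(mailbox)

    def record(self, mailbox: str, state: RemoteState) -> None:
        # Called once the replica has acknowledged an apply (or answered a GET).
        self._states[mailbox] = state

    def invalidate(self, mailbox: str) -> None:
        # Called when the replica turns out to have changed behind our back.
        self._states.pop(mailbox, None)


def needs_sync_get(cache: SyncCache, mailbox: str) -> bool:
    """True when the remote state is unknown and must be fetched first."""
    return cache.lookup(mailbox) is None


cache = SyncCache()
print(needs_sync_get(cache, "user.cassandane"))
cache.record("user.cassandane", RemoteState(uidnext=10, highestmodseq=42))
print(needs_sync_get(cache, "user.cassandane"))
```

The invalidate step matters for the "assuming the replica hasn't changed" caveat: if an apply is rejected, the cached entry is dropped and the next sync falls back to fetching the remote state first.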

In my plan, the "SYNCGET MAILBOX" would not be needed, because you'd already know the \
remote state. The "RESERVE" would not be needed because you'd already know from the \
local conversations.db that this message wasn't listed in any other mailbox or with a \
UID less than the UIDNEXT of the remote mailbox from that known remote state, so all \
you'd have left is the SYNCAPPLY MESSAGE and the SYNCAPPLY MAILBOX. So it's just a \
matter of merging those into a single round trip with some nice combined format, and \
you're done :)
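
As a rough illustration of that reasoning, here is a sketch of which commands an APPEND would still need. It is not Cyrus code: `plan_append` and its parameters are hypothetical, and the RESERVE condition is just one reading of the paragraph above (skip it when the message is in no other mailbox and its UID is not below the remote UIDNEXT).

```python
from typing import List, Optional


def plan_append(remote_uidnext: Optional[int], uid: int,
                in_other_mailbox: bool) -> List[str]:
    """Return the sync commands needed to replicate one appended message.

    remote_uidnext is None when the remote state is not known (no cache entry).
    """
    steps: List[str] = []
    if remote_uidnext is None:
        # Without the remote state we are back to the full exchange.
        steps.append("SYNCGET MAILBOX")
        steps.append("SYNCAPPLY RESERVE")
    elif in_other_mailbox or uid < remote_uidnext:
        # The replica might already hold this message, so still ask.
        steps.append("SYNCAPPLY RESERVE")
    steps.append("SYNCAPPLY MESSAGE")
    steps.append("SYNCAPPLY MAILBOX")
    return steps


print(plan_append(None, uid=7, in_other_mailbox=False))
print(plan_append(7, uid=7, in_other_mailbox=False))
```

With a known remote state and a brand new message, only MESSAGE and MAILBOX remain, which are the two that would then be merged into one combined stanza.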

Bron.

On Fri, Nov 22, 2019, at 02:25, Bron Gondwana wrote:
> Wow, interesting. That definitely works, though I'd probably normalise everything \
> to the user ID so that the seen and mailbox events for the same user got the same \
> channel. 
> We're looking at similar things for our setup too, either sharding or even per user \
> logs with a daemon which farms users out to multiple channels. 
> As for when we'd look at a sync daemon: probably next year. We're planning to land \
> uuid based storage soon, which means that renaming users and mailboxes is really \
> fast, then looking at replication channels on top of that would make more sense, \
> because otherwise user renames become tricky. 
> I'll have a look at the diff when it isn't 11:30pm for me.
> 
> Cheers,
> 
> Bron
> 
> On Thu, Nov 21, 2019, at 18:50, Thomas Cataldo wrote:
> > Hi,
> > 
> > In our workload, cyrus replication latency is pretty critical as we serve most \
> > read requests from the replica. Having a single network channel between master & \
> > replica is a big issue for us. 
> > Trying to improve our latency, we implemented the following approach : instead of \
> > writing "channel/log" we write "channel/log.<shard_index>". We compute our shard \
> > key this way : 
> > # cat log.0 
> > APPEND devenv.blue!user.tom.Sent
> > MAILBOX devenv.blue!user.tom.Sent
> > 
> > # cat log.2 
> > SEEN tom@devenv.blue 9f799278-a6cd-45b7-9546-0e861d5e15d6
> > root@bm1804:/var/lib/cyrus/sync/core# cat log.3 
> > …
> > APPEND devenv.blue!user.sga
> > MAILBOX devenv.blue!user.sga
> > 
> > We compute a hashcode of the first argument. We normalize it so that \
> > devenv.blue!user.tom.Sent and devenv.blue!user.tom have the same hashcode, then we \
> > take "hashcode % shard_count" to figure out which log file to use. We patched \
> > sync_client to add a "-i <shard_index>" option: sync_client -i 0 will process log.0 \
> > and use log-run.0, etc. 
> > We don't spawn sync_client from cyrus.conf; we prefer systemd tricks:
> > 
> > /lib/systemd/system/bm-cyrus-syncclient@.service is a template, which we then \
> > enable with "systemctl enable bm-cyrus-syncclient@{0..3}" to spawn 4 sync_clients.
> > 
> > 
> > Attached diff of what we changed. 
> > 
> > As a side note, our usage forbids moving a mailbox folder into another mailbox \
> > (i.e. moving user.tom.titi into user.sga.stuff is forbidden in our setup). I guess \
> > this approach would be problematic when moving a mailbox subfolder to another \
> > mailbox, as they might be sharded to separate log files. 
> > Any feedback on this approach ? I read that you planned to turn sync_client into \
> > a sync daemon. Any schedule estimate on that ? 
> > Regards,
> > Thomas.
> > 
> > 
> > sync_client systemd configuration template :
> > /lib/systemd/system/bm-cyrus-syncclient@.service (%i is expanded to 42 by systemd \
> > when you enable syncclient@42)
> > [Unit]
> > Description=BlueMind Cyrus sync_client service
> > After=bm-cyrus-imapd.service
> > PartOf=bm-cyrus-imapd.service
> > ConditionPathExists=!/etc/bm/bm-cyrus-imapd.disabled
> > 
> > [Service]
> > Type=forking
> > Environment=CONF=/etc/imapd.conf
> > ExecStartPre=/usr/bin/find /var/lib/cyrus/sync -name 'log*.%i' -type f -exec rm -f {} \;
> > ExecStart=/usr/sbin/sync_client -C $CONF -t 1800 -n core -i %i -l -r
> > SuccessExitStatus=75
> > RemainAfterExit=no
> > Restart=always
> > RestartSec=5s
> > TimeoutStopSec=20s
> > 
> > [Install]
> > WantedBy=bm-cyrus-imapd.service
> > 
> > 
> > 
> > 
> > 
> > Thomas Cataldo
> > Directeur Technique
> > 
> > (+33) 6 42 25 91 38
> > 
> > BlueMind
> > +33 (0)5 81 91 55 60
> > Hotel des Télécoms, 40 rue du village d'entreprises
> > 31670 Labège, France
> > www.bluemind.net / https://blog.bluemind.net/fr/
> > 
> > 
> > 
> > 
> > *Attachments:*
> > * replication_multiplexing.diff
> 
> --
> Bron Gondwana, CEO, Fastmail Pty Ltd
> brong@fastmailteam.com
> 
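
For reference, the shard selection discussed in the quoted messages (normalise the first argument to the user, hash it, take it modulo the shard count) could look something like the sketch below. It is not taken from the attached diff: `shard_key` and `shard_index` are hypothetical names, CRC32 is only a stand-in for whatever hash the patch uses, and it assumes names shaped like the log lines quoted above, with no dots inside the user part.

```python
import zlib


def shard_key(first_arg: str) -> str:
    """Normalise a sync log argument to a per-user key.

    "devenv.blue!user.tom.Sent" and "devenv.blue!user.tom" both become
    "tom@devenv.blue", the same form that SEEN lines already use.
    Anything that does not look like a user mailbox is returned unchanged.
    """
    if "!" in first_arg:
        domain, _, path = first_arg.partition("!")
        parts = path.split(".")
        if len(parts) >= 2 and parts[0] == "user":
            return f"{parts[1]}@{domain}"
    return first_arg


def shard_index(first_arg: str, shard_count: int) -> int:
    """Pick the log.<index> file for this argument (stable across processes)."""
    digest = zlib.crc32(shard_key(first_arg).encode("utf-8"))
    return digest % shard_count


for arg in ("devenv.blue!user.tom.Sent", "devenv.blue!user.tom", "tom@devenv.blue"):
    print(arg, "->", shard_key(arg), "-> log.%d" % shard_index(arg, 4))
```

Keying on the user rather than the mailbox name means the SEEN and MAILBOX events for one user land in the same log file, at the cost of one very busy user not being spread over several channels.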

--
 Bron Gondwana, CEO, Fastmail Pty Ltd
 brong@fastmailteam.com

