
List:       bacula-users
Subject:    [Bacula-users] Improving the speed of spooling attributes
From:       Dan Langille <dan () langille ! org>
Date:       2014-03-23 19:33:28
Message-ID: 9052464A-7B2D-4BC5-BA37-2D66E7508ED8 () langille ! org

In this email, I write about backup times growing over a few months, and about
trying to figure out why they were so slow.

Conclusion: give your database server as much RAM as you can.  Inserting into
the File table requires updating 5 indexes.  If those indexes can be held
entirely in RAM, those updates can occur without constant swapping to disk.
The amount of RAM you need to give it varies according to your database size.
Too much or too little can increase the time required.

Ref: Bacula 5.2.12 on FreeBSD 9.2, backing up to disk first, then copying to
tape.  Disk storage is raidz2 (more later in post).

The problem: slow backups.  Not slow as in the time to back up data, but slow
in putting the attributes into the database.

In this post, when I speak about time, I am referring to the time it takes to
spool the data attributes.  Take this sample job output:

###
23-Mar 05:02 crey-sd JobId 167020: Sending spooled attrs to the Director. Despooling 115,264,052 bytes ...
23-Mar 05:09 bacula-dir JobId 167020: Bacula bacula-dir 5.2.12 (12Sep12):
###

In that example, spooling time is 7 minutes (roughly speaking).
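
For anyone who wants to pull that figure out of many job logs, here is a
minimal sketch of the arithmetic.  It only handles the two line shapes quoted
above.  The helper names (parse_stamp, despool_bytes) are mine, and the year
is an assumption because the log lines do not carry one.

```python
from datetime import datetime

# The two log lines quoted above, one string per log entry.
LOG_LINES = [
    "23-Mar 05:02 crey-sd JobId 167020: Sending spooled attrs to the "
    "Director. Despooling 115,264,052 bytes ...",
    "23-Mar 05:09 bacula-dir JobId 167020: Bacula bacula-dir 5.2.12 (12Sep12):",
]


def parse_stamp(line, year=2014):
    """Return the timestamp at the start of a job log line.

    The year is not in the log, so it is passed in (assumed: 2014).
    """
    day_month, hhmm = line.split()[:2]
    return datetime.strptime(f"{day_month}-{year} {hhmm}", "%d-%b-%Y %H:%M")


def despool_bytes(line):
    """Return the byte count that follows the word 'Despooling'."""
    words = line.split()
    return int(words[words.index("Despooling") + 1].replace(",", ""))


elapsed = parse_stamp(LOG_LINES[1]) - parse_stamp(LOG_LINES[0])
minutes = elapsed.total_seconds() / 60
nbytes = despool_bytes(LOG_LINES[0])
print(f"{nbytes} bytes of attributes despooled in about {minutes:.0f} minutes")
```

The timestamps only have minute resolution, so "roughly speaking" applies
here too.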

Given that different size backups result in different amounts of data
spooling, I took to measuring the spooling process in MB/s.  From a high of
129 MB/s in early January, it dropped to 73 MB/s by the end of January, and by
mid-February it was 5 MB/s.

I suspected the file system, etc., but I was proven wrong.  It turned out to
be a database issue.

First, some facts:

* The File table contains about 172 million records. This size ballooned over
  this period because of increased backups.
* Logging was not being monitored on the database server
* Localhost connections were blocked by the firewall, thus preventing the
  autovacuum process from being initiated

The first problem to solve was dead tuples in the File table.  Firewall rules
were altered to allow autovacuum to run.

Various database tuning parameters were changed to get an initial vacuum to
run in decent time:

* RAM on this PostgreSQL 9.2.4 server is 16GB
* work_mem = 1GB
* maintenance_work_mem = 1GB
* checkpoint_segments = 512
* checkpoint_completion_target = 0.7

Once an autovacuum was done, things improved.  Spooling attributes now took
about 45 minutes, giving us 40 MB/s.  I figured we must be able to do better.

I started playing with SQL by creating my own database table to mirror the
temporary table, ‘batch’.  Then I started running the insert query to see what
optimizations I could make.  e.g. I ran this query manually:

INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq)
    SELECT B.FileIndex, 1, P.PathId, FN.FilenameId, B.LStat, B.MD5, B.DeltaSeq
      FROM my_batch B
      JOIN Path     P  ON (B.Path = P.Path)
      JOIN Filename FN ON (B.Name = FN.Name);

I always inserted into JobId = 1, which I knew was not a job still in history.
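
To show the shape of that statement without touching a real catalog, here is
a toy sketch using Python's built-in sqlite3 module instead of PostgreSQL.
The table definitions and sample rows are made up for illustration and are
not the real Bacula schema.  Only the INSERT ... SELECT itself follows the
query above, with the constant 1 supplied in the SELECT list as the JobId.

```python
import sqlite3

# In-memory toy catalog: hypothetical, cut-down tables, not the Bacula schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Path     (PathId INTEGER PRIMARY KEY, Path TEXT UNIQUE);
    CREATE TABLE Filename (FilenameId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    CREATE TABLE File     (FileId INTEGER PRIMARY KEY, FileIndex INTEGER,
                           JobId INTEGER, PathId INTEGER, FilenameId INTEGER,
                           LStat TEXT, MD5 TEXT, DeltaSeq INTEGER);
    CREATE TABLE my_batch (FileIndex INTEGER, JobId INTEGER, Path TEXT,
                           Name TEXT, LStat TEXT, MD5 TEXT, DeltaSeq INTEGER);

    INSERT INTO Path (Path) VALUES ('/etc/'), ('/usr/local/etc/');
    INSERT INTO Filename (Name) VALUES ('hosts'), ('rc.conf');

    -- Made-up batch rows; the JobId column here is ignored by the insert.
    INSERT INTO my_batch VALUES
        (1, 167020, '/etc/',           'hosts',   'lstat-a', 'md5-a', 0),
        (2, 167020, '/etc/',           'rc.conf', 'lstat-b', 'md5-b', 0),
        (3, 167020, '/usr/local/etc/', 'hosts',   'lstat-c', 'md5-c', 0);
""")

# Same shape as the manual query: path and file names are resolved to their
# ids through the joins, and every row is filed under JobId 1.
conn.execute("""
    INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq)
        SELECT B.FileIndex, 1, P.PathId, FN.FilenameId, B.LStat, B.MD5, B.DeltaSeq
          FROM my_batch B
          JOIN Path     P  ON (B.Path = P.Path)
          JOIN Filename FN ON (B.Name = FN.Name)
""")

rows = conn.execute(
    "SELECT FileIndex, JobId, PathId, FilenameId FROM File ORDER BY FileIndex"
).fetchall()
print(rows)
```

Using a fixed, unused JobId keeps the test rows easy to find and delete again
between runs.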

More details here: https://docs.google.com/document/d/1AVAIi6PmJZZE11N3PLLNtbuxuOES4vCNtXiqoxoP2Xk/edit


I found that these settings helped.  They are standard PostgreSQL settings to
optimize queries.

shared_buffers = 3GB (postgresql.conf setting)
kern.ipc.shmmax=4294967296 (/etc/sysctl.conf)
kern.ipc.shmall=4294967296

This dropped the insert time to about 6 minutes.  About half of this time is
constructing the query.

NOTE: Using 2.5GB or 3.5GB for shared_buffers decreased the throughput.
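
A quick sanity check of those numbers, as a sketch.  As far as I know,
PostgreSQL 9.2 takes its main shared memory from System V shared memory, so
kern.ipc.shmmax has to be larger than shared_buffers plus some overhead.  The
variable names are mine, and the overhead itself is not calculated here.

```python
GIB = 1024 ** 3

shared_buffers = 3 * GIB   # shared_buffers = 3GB in postgresql.conf
shmmax = 4294967296        # kern.ipc.shmmax from /etc/sysctl.conf, in bytes
total_ram = 16 * GIB       # RAM on the database server

# Room left under shmmax for PostgreSQL's other shared memory structures.
headroom = shmmax - shared_buffers
ram_fraction = shared_buffers / total_ram

print(f"shmmax is {shmmax // GIB} GiB")
print(f"headroom above shared_buffers: {headroom // GIB} GiB")
print(f"shared_buffers is {ram_fraction:.1%} of RAM")
```

With these values there is 1 GiB of headroom, so going much past 3GB would
also mean raising shmmax.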

Filesystem background:

This is where the backups are stored on disk (i.e. bacula-sd on server B):

$ zfs get all system/usr/local/bacula
NAME                     PROPERTY              VALUE                                         SOURCE
system/usr/local/bacula  type                  filesystem                                    -
system/usr/local/bacula  creation              Mon Jul 22 10:25 2013                         -
system/usr/local/bacula  used                  12.9T                                         -
system/usr/local/bacula  available             4.32T                                         -
system/usr/local/bacula  referenced            8.96T                                         -
system/usr/local/bacula  compressratio         1.25x                                         -
system/usr/local/bacula  mounted               yes                                           -
system/usr/local/bacula  quota                 none                                          default
system/usr/local/bacula  reservation           none                                          default
system/usr/local/bacula  recordsize            128K                                          default
system/usr/local/bacula  mountpoint            /usr/jails/crey.example.org/usr/local/bacula  local
system/usr/local/bacula  sharenfs              off                                           default
system/usr/local/bacula  checksum              fletcher4                                     inherited from system
system/usr/local/bacula  compression           lz4                                           local
system/usr/local/bacula  atime                 off                                           inherited from system
system/usr/local/bacula  devices               on                                            default
system/usr/local/bacula  exec                  on                                            default
system/usr/local/bacula  setuid                on                                            inherited from system/usr/local
system/usr/local/bacula  readonly              off                                           local
system/usr/local/bacula  jailed                off                                           default
system/usr/local/bacula  snapdir               hidden                                        default
system/usr/local/bacula  aclmode               discard                                       default
system/usr/local/bacula  aclinherit            restricted                                    default
system/usr/local/bacula  canmount              on                                            default
system/usr/local/bacula  xattr                 off                                           temporary
system/usr/local/bacula  copies                1                                             default
system/usr/local/bacula  version               5                                             -
system/usr/local/bacula  utf8only              off                                           -
system/usr/local/bacula  normalization         none                                          -
system/usr/local/bacula  casesensitivity       sensitive                                     -
system/usr/local/bacula  vscan                 off                                           default
system/usr/local/bacula  nbmand                off                                           default
system/usr/local/bacula  sharesmb              off                                           default
system/usr/local/bacula  refquota              none                                          default
system/usr/local/bacula  refreservation        none                                          default
system/usr/local/bacula  primarycache          all                                           default
system/usr/local/bacula  secondarycache        all                                           default
system/usr/local/bacula  usedbysnapshots       3.97T                                         -
system/usr/local/bacula  usedbydataset         8.96T                                         -
system/usr/local/bacula  usedbychildren        0                                             -
system/usr/local/bacula  usedbyrefreservation  0                                             -
system/usr/local/bacula  logbias               latency                                       default
system/usr/local/bacula  dedup                 off                                           default
system/usr/local/bacula  mlslabel                                                            -
system/usr/local/bacula  sync                  standard                                      default
system/usr/local/bacula  refcompressratio      1.30x                                         -
system/usr/local/bacula  written               19.1G                                         -
system/usr/local/bacula  logicalused           15.9T                                         -
system/usr/local/bacula  logicalreferenced     11.4T                                         -

The database is stored here (on server B):

$ zfs get all system/usr/local/pgsql
NAME                    PROPERTY              VALUE                  SOURCE
system/usr/local/pgsql  type                  filesystem             -
system/usr/local/pgsql  creation              Fri May  3  9:38 2013  -
system/usr/local/pgsql  used                  193G                   -
system/usr/local/pgsql  available             9.75T                  -
system/usr/local/pgsql  referenced            193G                   -
system/usr/local/pgsql  compressratio         2.10x                  -
system/usr/local/pgsql  mounted               yes                    -
system/usr/local/pgsql  quota                 none                   default
system/usr/local/pgsql  reservation           none                   default
system/usr/local/pgsql  recordsize            8K                     local
system/usr/local/pgsql  mountpoint            /usr/local/pgsql       inherited from system
system/usr/local/pgsql  sharenfs              off                    default
system/usr/local/pgsql  checksum              fletcher4              inherited from system
system/usr/local/pgsql  compression           lz4                    local
system/usr/local/pgsql  atime                 off                    inherited from system
system/usr/local/pgsql  devices               on                     default
system/usr/local/pgsql  exec                  on                     default
system/usr/local/pgsql  setuid                on                     inherited from system/usr/local
system/usr/local/pgsql  readonly              off                    default
system/usr/local/pgsql  jailed                off                    default
system/usr/local/pgsql  snapdir               hidden                 default
system/usr/local/pgsql  aclmode               discard                default
system/usr/local/pgsql  aclinherit            restricted             default
system/usr/local/pgsql  canmount              on                     local
system/usr/local/pgsql  xattr                 off                    temporary
system/usr/local/pgsql  copies                1                      default
system/usr/local/pgsql  version               5                      -
system/usr/local/pgsql  utf8only              off                    -
system/usr/local/pgsql  normalization         none                   -
system/usr/local/pgsql  casesensitivity       sensitive              -
system/usr/local/pgsql  vscan                 off                    default
system/usr/local/pgsql  nbmand                off                    default
system/usr/local/pgsql  sharesmb              off                    default
system/usr/local/pgsql  refquota              none                   default
system/usr/local/pgsql  refreservation        none                   default
system/usr/local/pgsql  primarycache          metadata               local
system/usr/local/pgsql  secondarycache        all                    default
system/usr/local/pgsql  usedbysnapshots       0                      -
system/usr/local/pgsql  usedbydataset         193G                   -
system/usr/local/pgsql  usedbychildren        0                      -
system/usr/local/pgsql  usedbyrefreservation  0                      -
system/usr/local/pgsql  logbias               latency                default
system/usr/local/pgsql  dedup                 off                    default
system/usr/local/pgsql  mlslabel                                     -
system/usr/local/pgsql  sync                  standard               default
system/usr/local/pgsql  refcompressratio      2.10x                  -
system/usr/local/pgsql  written               193G                   -
system/usr/local/pgsql  logicalused           137G                   -
system/usr/local/pgsql  logicalreferenced     137G                   -



-- 
Dan Langille - http://langille.org


