
List:       drbd-user
Subject:    [DRBD-user] Pacemaker drbd fail to promote to primary
From:       Neil Schneider <nschneider () awarepoint ! com>
Date:       2016-09-02 4:56:01
Message-ID: BY1PR0401MB1612741A7E59794F48DDE47EBEE50 () BY1PR0401MB1612 ! namprd04 ! prod ! outlook ! com

Below are my configs. 
The issue I'm experiencing is that when stonith reboots the primary, the secondary doesn't get promoted. I thought the handlers in drbd.conf were supposed to "handle" that. Anybody know what I'm missing?
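
For context, my understanding of the mechanism, and roughly what I check when the failover doesn't happen (just a sketch; the resource name mysql is from the config below):

# DRBD's view, run on each node
cat /proc/drbd
drbdadm role mysql      # local/peer roles, e.g. Secondary/Unknown
drbdadm cstate mysql    # connection state, e.g. WFConnection

# Pacemaker's view
crm_mon -1

My understanding is that the fence-peer handler adds a -INFINITY location constraint that forbids the Master role everywhere except the node it considers up to date, and the after-resync-target handler removes that constraint again once resync completes. There is a note about that constraint under the pcs output further down.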
I've been looking at the logs, but nothing stands out to me. The logging is pretty verbose; maybe I could make it a little less verbose, but I don't know where those options are.
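
If anyone can confirm: these are the places I think the verbosity is controlled from (guesswork on my part, on this CentOS 6 / cman stack):

# corosync: the logging {} section of /etc/corosync/corosync.conf
#           (debug: off and to_syslog are already set there, see below)
# pacemaker: /etc/sysconfig/pacemaker, e.g.
#           PCMK_debug=no
#           PCMK_logfile=/var/log/pacemaker.log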
I had this working with this identical configuration on the same nodes, but with simple hostnames. I switched to FQDN-style hostnames, on the nodes and in the configuration files below, to simulate a real-world change. This is a pair of "appliances" that run our proprietary software, and FQDNs are part of our requirements, so I, or someone I train, will have to configure this for each pair we deliver. The old HA setup was much simpler, but we have to keep moving forward.
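
One thing I have been double-checking since the hostname switch (noting it here in case it matters): DRBD matches the "on <name>" sections in drbd.conf against the node's hostname, and pacemaker has its own idea of each node's name, so on each node I compare:

uname -n       # what DRBD compares the "on <name>" sections against
hostname -f    # the FQDN the boxes were switched to
crm_node -n    # the node name as pacemaker/corosync sees it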

/etc/drbd.conf
global {
        usage-count no;
}
common {
        protocol C;
        handlers {
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm_unfence-peer.sh";
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        }

        startup {
        }
        disk {
                fencing resource-and-stonith;
                #on-io-error detach;
        }
        net {
                allow-two-primaries yes;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                rr-conflict disconnect;
        }
}
resource mysql {
        protocol C;
        meta-disk internal;
        device /dev/drbd1;
        syncer {
                verify-alg sha1;
                rate 33M;
                csums-alg sha1;
        }
on node1 {
        disk /dev/sda1;
        address 10.6.7.24:7789;
            }
on awpnode2 {
        disk /dev/sda1;
        address 10.6.7.27:7789;
            }
}
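
(How I sanity-check this file after editing it; a sketch, and it assumes the handler scripts really are present at the paths configured above.)

drbdadm dump mysql        # the resource as DRBD actually parses it, handlers included
ls -l /usr/lib/drbd/      # the fence/unfence/notify scripts referenced above
drbdadm -d adjust mysql   # dry run: what drbdadm would change to match the config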

/etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                member {
                        memberaddr: 10.6.7.24
                        }
                member  {
                        memberaddr: 10.6.7.27
                        }
                ringnumber: 0
                bindnetaddr: 10.6.7.0
                mcastport: 5405
                ttl: 1
        }
        transport: udpu
}
logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
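
(And how I check that the udpu ring actually formed with both members; a sketch with the corosync 1.x tools, and I'm not certain how much of this file is honored when the stack is brought up through cman.)

corosync-cfgtool -s             # ring 0 status and bound address
corosync-objctl | grep member   # both memberaddr entries should show up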

pcs status (excerpt)

 Master/Slave Set: mysql_data_clone [mysql_data]
     Masters: [ node2 ]
     Slaves: [ node1 ]

pcs config show
Cluster Name: awpcluster
Corosync Nodes:
 node1 node2
Pacemaker Nodes:
 node1 node2

Resources:
 Master: mysql_data_clone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: mysql_data (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=mysql
   Operations: start interval=0s timeout=240 (mysql_data-start-interval-0s)
               promote interval=0s timeout=90 (mysql_data-promote-interval-0s)
               demote interval=0s timeout=90 (mysql_data-demote-interval-0s)
               stop interval=0s timeout=100 (mysql_data-stop-interval-0s)
               monitor interval=30s (mysql_data-monitor-interval-30s)

Stonith Devices:
 Resource: fence_node1_kvm (class=stonith type=fence_virsh)
  Attributes: pcmk_host_list=node1 ipaddr=10.6.7.10 action=reboot login=root passwd=password port=node1
  Operations: monitor interval=30s (fence_node1_kvm-monitor-interval-30s)
 Resource: fence_node2_kvm (class=stonith type=fence_virsh)
  Attributes: pcmk_host_list=node2 ipaddr=10.6.7.12 action=reboot login=root passwd=password port=awpnode2 delay=15
  Operations: monitor interval=30s (fence_awpnode2_kvm-monitor-interval-30s)
Fencing Levels:

Location Constraints:
  Resource: mysql_data_clone
    Constraint: drbd-fence-by-handler-mysql-mysql_data_clone
      Rule: score=-INFINITY role=Master (id:drbd-fence-by-handler-mysql-rule-mysql_data_clone)
        Expression: #uname ne cleardata-awpnode2.awarepoint.com (id:drbd-fence-by-handler-mysql-expr-mysql_data_clone)
Ordering Constraints:
Colocation Constraints:

Resources Defaults:
 resource-stickiness: 200
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.14-8.el6_8.1-70404b0
 have-watchdog: false
 last-lrm-refresh: 1472768147
 no-quorum-policy: ignore
 stonith-enabled: true

Thank you.

Neil Schneider
DevOps Engineer

_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


