
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Patch for mysql RA
From:       Marek Marczykowski <marmarek () staszic ! waw ! pl>
Date:       2010-10-25 1:10:32
Message-ID: 4CC4D908.9000906 () staszic ! waw ! pl


On 19.08.2010 21:34, Florian Haas wrote:
> Marek,
> 
> I've finally found time to look into this. Sorry about the delay. So to
> recap, based on the patches list in
> http://marmarek.w.staszic.waw.pl/patches/ha-mysql-ra/,

I've also had trouble finding free time :/

> 05 has gone in but I think it ought to be reverted and replaced with a
> change in functionality, not documentation. Why not check whether the
> resource is configured as a M/S, and if yes, actually _start_ mysqld
> with --skip-slave-start rather than expecting the user to add this to
> the config?

Fixed; new patches are attached and also on the website. I'm also thinking
about passing the --read_only option at startup instead of starting in
read-write mode and setting read_only right after start (only in M/S mode,
of course). What do you think?
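
A minimal sketch of that startup idea, not the patch itself. It assumes mysqld's --read-only spelling (MySQL accepts dashes and underscores interchangeably in option names). The is_ms helper is a stand-in for the RA's ocf_is_ms and takes its answer as an argument only so the example runs outside a cluster.

```shell
# Stand-in for ocf_is_ms; the real helper inspects the resource's meta data.
is_ms() {
    [ "$1" = "yes" ]
}

mysql_extra_params() {
    # $1: "yes" if the resource is configured as master/slave
    local params=""
    if is_ms "$1"; then
        # Keep replication stopped and the server read-only until the
        # cluster has decided which role this instance gets.
        params="--skip-slave-start --read-only"
    fi
    echo "$params"
}

mysql_extra_params yes   # prints: --skip-slave-start --read-only
```

With this approach there is no window after startup in which a future slave accepts writes, which is the gap left by setting read_only only after the server is up.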

> 06 has not gone in, but I'm generally OK with it. But, please, "START
> SLAVE", not "SLAVE START". And on a different style note, no reason for
> trailing semicolons after ocf_log and return.

Fixed.

> 08 has not gone in. It's nice but I hate the way it's implemented with a
> state file. Why not use crm_attribute and stick transient attributes
> onto nodes? If we could get that patch rewritten to use transient node
> attributes I'd like to see this go in. But here too: "STOP SLAVE"
> please, not "SLAVE STOP".

Changed. I've used persistent node attributes so that the replication state
is kept even across a reboot.
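
The save/restore cycle can be sketched as follows. This is an illustration under assumptions, not the patch. The crm_attribute function is a file-backed stub so the example runs without Pacemaker. The attribute names are simplified (no per-instance suffix). save_repl_state and restore_repl_state are hypothetical helper names. As far as I know, the real crm_attribute -G prints a "name=... value=..." line unless -q is also given, so the sketch passes -q; check that against your Pacemaker version.

```shell
# Stub for Pacemaker's crm_attribute: one file per attribute in a temp dir.
# Only -n, -v, -G, -D, -q are imitated; -N and -l are accepted and ignored.
ATTR_DIR=$(mktemp -d)
crm_attribute() {
    local name="" value="" action=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -n) name=$2; shift ;;
            -v) value=$2; action=set; shift ;;
            -G) action=get ;;
            -D) action=delete ;;
            -N|-l) shift ;;
            -q) ;;
        esac
        shift
    done
    case "$action" in
        set)    printf '%s\n' "$value" > "$ATTR_DIR/$name" ;;
        get)    if [ -f "$ATTR_DIR/$name" ]; then cat "$ATTR_DIR/$name"; fi ;;
        delete) rm -f "$ATTR_DIR/$name" ;;
    esac
}

# "-l forever" asks for a persistent attribute, as in the patch.
CRM_ATTR="crm_attribute -N $(uname -n) -l forever"

save_repl_state() {    # $1 master host, $2 binlog file, $3 binlog position
    $CRM_ATTR -n master-host -v "$1"
    $CRM_ATTR -n master-log-file -v "$2"
    $CRM_ATTR -n master-log-pos -v "$3"
}

restore_repl_state() { # prints "host file pos", or nothing if incomplete
    local host file pos
    host=$($CRM_ATTR -n master-host -G -q)
    file=$($CRM_ATTR -n master-log-file -G -q)
    pos=$($CRM_ATTR -n master-log-pos -G -q)
    if [ -n "$host" ] && [ -n "$file" ] && [ -n "$pos" ]; then
        echo "$host $file $pos"
    fi
}

save_repl_state node1 mysql-bin.000003 1234
restore_repl_state   # prints: node1 mysql-bin.000003 1234
```

Restoring only when all three values are present avoids issuing CHANGE MASTER TO with a half-saved position.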

> 09: not in, comments on 06 and 05 apply here too.

Fixed (those semicolons came from patch 06...).

I've made some new patches:

10_mysql-ra-monitor-ms-get-ro-state.patch: In the monitor action, check
whether this instance is running as master based on the read_only MySQL
variable. This is better than the CRM variables because it reflects the
"real" state, not the desired one.
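
A small sketch of such a check, under assumptions: the mysql function below is a stub that imitates the client's tab-separated batch output, and FAKE_READ_ONLY is a hypothetical knob used only by this example. Unlike the attached patch, it queries the single variable with LIKE and matches the name exactly, so other variables whose names merely contain "read_only" cannot confuse the result.

```shell
# Stub for the mysql client: prints what SHOW VARIABLES LIKE 'read_only'
# returns in batch mode with column names suppressed (-N).
mysql() {
    printf 'read_only\t%s\n' "${FAKE_READ_ONLY:-OFF}"
}

get_read_only() {
    # Returns 0 when the server reports read_only=ON, 1 otherwise.
    local state
    state=$(mysql -N -e "SHOW VARIABLES LIKE 'read_only'" |
        awk '$1 == "read_only" {print $2}')
    [ "$state" = "ON" ]
}

FAKE_READ_ONLY=ON
if get_read_only; then
    echo "slave (read-only)"
else
    echo "master (read-write)"
fi
```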

11_mysql-ra-use-monitor-to-check-start.patch: In the start action, call the
detailed (OCF_CHECK_LEVEL=10) monitor action to check whether MySQL is
really working. This helps when the database is broken (and automatic
recovery has failed): the start then fails immediately instead of trying
to restart it.
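
The control flow can be sketched like this. It is a simplified illustration rather than the RA code: mysql_monitor is a stand-in whose result is driven by the hypothetical FAKE_MONITOR_RC variable, and mysql_start_check is a made-up name for the tail end of the start action. The numeric return codes are the standard OCF values.

```shell
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_RUNNING_MASTER=8

# Stand-in for the RA's monitor action.
mysql_monitor() {
    return "${FAKE_MONITOR_RC:-0}"
}

mysql_start_check() {
    local rc=0
    # Request the detailed check, then accept either "running" or
    # "running as master" as a successful start.
    OCF_CHECK_LEVEL=10
    mysql_monitor || rc=$?
    if [ "$rc" -ne "$OCF_SUCCESS" ] && [ "$rc" -ne "$OCF_RUNNING_MASTER" ]; then
        echo "Failed initial monitor action (rc=$rc)" >&2
        return "$rc"
    fi
    echo "MySQL started"
    return "$OCF_SUCCESS"
}

FAKE_MONITOR_RC=8
mysql_start_check   # prints: MySQL started
```

Returning the monitor's own code on failure lets the cluster see why the start failed instead of a generic error.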

-- 
Best Regards,
Marek Marczykowski          |   gg:2873965      | RLU #390519
marmarek at staszic waw pl  | xmpp:marmarek at staszic waw pl


["05_mysql-ra-slave-start-opts.patch" (text/plain)]

--- mysql-repl.org	2010-10-24 22:29:12.729659138 +0200
+++ mysql-repl	2010-10-24 22:34:03.445657761 +0200
@@ -766,12 +765,17 @@
     #chmod 0755 $OCF_RESKEY_datadir
     #chown -R $OCF_RESKEY_user $OCF_RESKEY_datadir
     #chgrp -R $OCF_RESKEY_group $OCF_RESKEY_datadir
+    mysql_extra_params=
+    if ocf_is_ms; then
+        mysql_extra_params="--skip-slave-start"
+    fi
 
     ${OCF_RESKEY_binary} --defaults-file=$OCF_RESKEY_config \
 		--pid-file=$OCF_RESKEY_pid \
 		--socket=$OCF_RESKEY_socket \
 		--datadir=$OCF_RESKEY_datadir \
-		--user=$OCF_RESKEY_user $OCF_RESKEY_additional_parameters >/dev/null 2>&1 &
+		--user=$OCF_RESKEY_user $OCF_RESKEY_additional_parameters \
+        $mysql_extra_params >/dev/null 2>&1 &
     rc=$?
 
     if [ $rc != 0 ]; then

["06_mysql-ra-start-slave-replication.patch" (text/plain)]

--- mysql-repl.org2	2010-10-24 22:42:25.694692401 +0200
+++ mysql-repl	2010-10-24 22:44:00.785673661 +0200
@@ -806,6 +806,18 @@
 	# in read only mode.
 	set_read_only on
 
+	master_host=`echo $OCF_RESKEY_CRM_meta_notify_master_uname`
+	if [ "$master_host" -a "$master_host" != `uname -n` ]; then
+	    ocf_log info "Changing MySQL configuration to replicate from $master_host."
+	    set_master $master_host
+	    ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
+		-e "START SLAVE"
+	    if [ $? -ne 0 ]; then
+		ocf_log err "Failed to start slave"
+		return $OCF_ERR_GENERIC
+	    fi
+	fi
+
 	# We also need to set a master preference, otherwise Pacemaker
 	# won't ever promote us in the absence of any explicit
 	# preference set by the administrator. We choose a low

["08_mysql-ra-repl-keep-state.patch" (text/plain)]

--- mysql-repl.org3	2010-10-24 23:14:12.785629803 +0200
+++ mysql-repl	2010-10-25 01:26:08.726235588 +0200
@@ -333,6 +333,9 @@
 MYSQL_OPTIONS_REPL="--user=$OCF_RESKEY_replication_user \
--password=$OCF_RESKEY_replication_passwd"  
 CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot "
+HOSTNAME=`uname -n`
+CRM_ATTR="${HA_SBIN_DIR}/crm_attribute -N $HOSTNAME -l forever"
+INSTANCE_ATTR_NAME=`echo ${OCF_RESOURCE_INSTANCE}| awk -F : '{print $1}'`
 
 #######################################################################
 
@@ -393,13 +396,15 @@
     return 1
 }
 
-check_slave() {
-    # Checks slave status
-    local rc
-    local tmpfile
+parse_slave_info() {
+    # Extracts field $1 from result of "SHOW SLAVE STATUS\G" from file $2
+    sed -ne "s/^.* $1: \(.*\)$/\1/p" < $2
+}
+
+get_slave_info() {
+    # Warning: this sets $tmpfile and leaves the file behind. You must delete it after use!
     local mysql_options
 
-    rc=1
     tmpfile=`mktemp ${HA_RSCTMP}/check_slave.${OCF_RESOURCE_INSTANCE}.XXXXXX`
 
     mysql_options="$MYSQL_OPTIONS_LOCAL --user=$OCF_RESKEY_replication_user --password=$OCF_RESKEY_replication_passwd"
@@ -407,23 +412,36 @@
     $MYSQL $mysql_options \
         -e 'SHOW SLAVE STATUS\G' > $tmpfile
 
-    local master_host
-    local master_user
-    local master_port
-    local slave_sql
-    local slave_io
-    local last_errno
-    local secs_behind
-
     if [ -s $tmpfile ]; then
-	master_host=`sed -ne 's/^.*Master_Host: \(.*\)$/\1/p' < $tmpfile`
-	master_user=`sed -ne 's/^.*Master_User: \(.*\)$/\1/p' < $tmpfile`
-	master_port=`sed -ne 's/^.*Master_Port: \(.*\)$/\1/p' < $tmpfile`
-	slave_sql=`sed -ne 's/^.*Slave_SQL_Running: \(.*\)$/\1/p' < $tmpfile`
-	slave_io=`sed -ne 's/^.*Slave_IO_Running: \(.*\)$/\1/p' < $tmpfile`
-	last_errno=`sed -ne 's/^.*Last_Errno: \(.*\)$/\1/p' < $tmpfile`
-	secs_behind=`sed -ne 's/^.*Seconds_Behind_Master: \(.*\)$/\1/p' < $tmpfile`
+	master_host=`parse_slave_info Master_Host $tmpfile`
+	master_user=`parse_slave_info Master_User $tmpfile`
+	master_port=`parse_slave_info Master_Port $tmpfile`
+	master_log_file=`parse_slave_info Master_Log_File $tmpfile`
+	master_log_pos=`parse_slave_info Read_Master_Log_Pos $tmpfile`
+	slave_sql=`parse_slave_info Slave_SQL_Running $tmpfile`
+	slave_io=`parse_slave_info Slave_IO_Running $tmpfile`
+	last_errno=`parse_slave_info Last_Errno $tmpfile`
+	secs_behind=`parse_slave_info Seconds_Behind_Master $tmpfile`
+
+        ocf_log debug "MySQL instance running as a replication slave"
+    else
+        # Instance produced an empty "SHOW SLAVE STATUS" output --
+        # instance is not a slave
+	ocf_log err "check_slave invoked on an instance that is not a replication slave."
+	return $OCF_ERR_GENERIC
+    fi
+
+    return $OCF_SUCCESS
+}
 
+check_slave() {
+    # Checks slave status
+    local rc
+
+    get_slave_info
+    rc=$?
+
+    if [ $rc -eq 0 ]; then
 	if [ $last_errno -ne 0 ]; then
 	    # Whoa. Replication ran into an error. This slave has
 	    # diverged from its master. Make sure this resource
@@ -482,18 +500,45 @@
 }
 
 set_master() {
+    local new_master_host master_log_file master_log_pos
+    local master_params
+
+    new_master_host=$1
+
+    # Keep replication position
+    get_slave_info
+
+    if [ "$master_log_file" -a "$new_master_host" = "$master_host" ]; then
+	master_params=", MASTER_LOG_FILE='$master_log_file', \
+	    MASTER_LOG_POS=$master_log_pos"
+	ocf_log debug "Kept master pos for $master_host : $master_log_file:$master_log_pos"
+    else
+	master_host=`$CRM_ATTR -n master-host-${INSTANCE_ATTR_NAME} -G`
+	master_log_file=`$CRM_ATTR -n master-log-file-${INSTANCE_ATTR_NAME} -G`
+	master_log_pos=`$CRM_ATTR -n master-log-pos-${INSTANCE_ATTR_NAME} -G`
+	if [ "$new_master_host" = "$master_host" -a -n "$master_log_file" -a -n "$master_log_pos" ]; then
+	    master_params=", MASTER_LOG_FILE='$master_log_file', \
+		MASTER_LOG_POS=$master_log_pos"
+	    ocf_log debug "Restored master pos for $master_host : $master_log_file:$master_log_pos"
+	fi
+    fi
+
     # Informs the MySQL server of the master to replicate
     # from. Accepts one mandatory argument which must contain the host
     # name of the new master host. The master must either be unchanged
     # from the laste master the slave replicated from, or freshly
     # reset with RESET MASTER.
-    local master_host
-    master_host=$1
 
     ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
-	-e "CHANGE MASTER TO MASTER_HOST='$master_host', \
+	-e "CHANGE MASTER TO MASTER_HOST='$new_master_host', \
                              MASTER_USER='$OCF_RESKEY_replication_user', \
-                             MASTER_PASSWORD='$OCF_RESKEY_replication_passwd'"
+                             MASTER_PASSWORD='$OCF_RESKEY_replication_passwd' $master_params"
+
+    # Remove state attributes - it will be invalid after START SLAVE
+    $CRM_ATTR -n master-host-${INSTANCE_ATTR_NAME} -D
+    $CRM_ATTR -n master-log-file-${INSTANCE_ATTR_NAME} -D
+    $CRM_ATTR -n master-log-pos-${INSTANCE_ATTR_NAME} -D
+    rm -f $tmpfile
 }
 
 unset_master(){
@@ -547,7 +592,14 @@
 	ocf_log err "Error stopping rest slave threads"
 	exit $OCF_ERR_GENERIC
     fi
-    
+
+    #Save current state
+    get_slave_info
+    $CRM_ATTR -n master-host-${INSTANCE_ATTR_NAME} -v $master_host
+    $CRM_ATTR -n master-log-file-${INSTANCE_ATTR_NAME} -v $master_log_file
+    $CRM_ATTR -n master-log-pos-${INSTANCE_ATTR_NAME} -v $master_log_pos
+    rm -f $tmpfile
+
     ocf_run $MYSQL $mysql_options \
 	-e "CHANGE MASTER TO MASTER_HOST=''" 
     if [ $? -gt 0 ]; then
@@ -756,6 +808,9 @@
 		ocf_log err "Failed to start slave"
 		return $OCF_ERR_GENERIC
 	    fi
+	else
+	    ocf_log info "No MySQL master present - clearing replication state"
+	    unset_master
 	fi
 
 	# We also need to set a master preference, otherwise Pacemaker
@@ -820,6 +875,8 @@
     if ( ! mysql_status ); then
 	return $OCF_NOT_RUNNING
     fi
+    ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
+	-e "STOP SLAVE"
     set_read_only off || return $OCF_ERR_GENERIC
 
     # Existing master gets a higher-than-default master preference, so
@@ -878,9 +935,7 @@
 	    fi
 
 	    if [ $master_host = `uname -n` ]; then
-		ocf_log info "Resetting MySQL replication configuration on new master $master_host"
-		ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
-		    -e 'RESET MASTER'
+		ocf_log info "This will be new master"
 	    else
 		ocf_log info "Changing MySQL configuration to replicate from $master_host"
 		set_master $master_host
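
The field extraction used by parse_slave_info in the patch above can be tried on its own. The sketch below reuses the same sed expression against a hand-written sample of SHOW SLAVE STATUS\G output; the host name and binlog values are invented for illustration.

```shell
parse_slave_info() {
    # Print the value of field $1 from "SHOW SLAVE STATUS\G" output in file $2.
    sed -ne "s/^.* $1: \(.*\)$/\1/p" < "$2"
}

status_file=$(mktemp)
cat > "$status_file" <<'EOF'
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: node1
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 1234
EOF

parse_slave_info Master_Host "$status_file"          # prints: node1
parse_slave_info Read_Master_Log_Pos "$status_file"  # prints: 1234
rm -f "$status_file"
```

The space before the field name in the pattern matters: it keeps Master_Log_File from also matching Relay_Master_Log_File, which appears in the same output.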


["09_mysql-ra-disable-slave-on-no-master.patch" (text/plain)]

--- /usr/lib/ocf/resource.d/heartbeat/mysql-repl.orig	2010-07-20 04:32:01.681369222 +0200
+++ /usr/lib/ocf/resource.d/heartbeat/mysql-repl	2010-07-20 04:33:06.374301483 +0200
@@ -802,6 +802,9 @@
 		ocf_log err "Failed to start slave"
 		return $OCF_ERR_GENERIC
 	    fi
+	else 
+	    ocf_log info "No MySQL master present - clearing replication state"
+	    unset_master
 	fi
 
 	# We also need to set a master preference, otherwise Pacemaker

["10_mysql-ra-slave-ms-get-ro-state.patch" (text/plain)]

--- mysql-repl.org7	2010-10-25 02:16:02.029634076 +0200
+++ mysql-repl	2010-10-25 02:16:22.302632453 +0200
@@ -361,6 +361,26 @@
 	-e "SET GLOBAL read_only=${ro_val}"
 }
 
+get_read_only() {
+    # Check if read-only is set
+    local mysql_options
+    local read_only_state
+
+    mysql_options="$MYSQL_OPTIONS_LOCAL"
+    if [ -n "$OCF_RESKEY_replication_user" ]; then
+	mysql_options="$mysql_options $MYSQL_OPTIONS_REPL"
+    fi
+
+    read_only_state=`$MYSQL $mysql_options \
+	-e "SHOW VARIABLES" | grep read_only | awk '{print $2}'`
+    
+    if [ "$read_only_state" = "ON" ]; then
+	return 0
+    else
+	return 1
+    fi
+}
+
 is_slave() {
     # Check whether this machine should be slave
 
@@ -366,8 +366,7 @@
 is_slave() {
     # Check whether this machine should be slave
 
-    master_host=`echo $OCF_RESKEY_CRM_meta_notify_promote_uname`
-    if [ -z "$master_host" -o "$master_host" == `uname -n` ]; then
+    if ! ocf_is_ms || ! get_read_only; then
 	return 1;
     fi
 
@@ -696,7 +716,7 @@
 	fi
     fi
 
-    if [ "$OCF_RESKEY_CRM_meta_role" = "Master" ]; then
+    if ocf_is_ms && ! get_read_only; then
 	    ocf_log info "MySQL monitor succeeded (master)";
 	    return $OCF_RUNNING_MASTER
     else

["11_mysql-ra-use-monitor-to-check-start.patch" (text/plain)]

--- mysql-repl.org5	2010-10-24 23:33:27.937656257 +0200
+++ mysql-repl	2010-10-24 23:49:35.585658022 +0200
@@ -695,7 +695,7 @@
     fi
 
 
-    if [ $OCF_CHECK_LEVEL -gt 0 ]; then
+    if [ $OCF_CHECK_LEVEL -gt 0 -a -n "$OCF_RESKEY_test_table" ]; then
 	# Check if this instance is configured as a slave, and if so
 	# check slave status
 	if is_slave; then
@@ -840,6 +840,15 @@
 	# greater-than-zero preference.
 	$CRM_MASTER -v 1
     fi
+
+    # Initial monitor action
+    OCF_CHECK_LEVEL=10
+    mysql_monitor
+    rc=$?
+    if [ $rc != $OCF_SUCCESS -a $rc != $OCF_RUNNING_MASTER ]; then
+	ocf_log err "Failed initial monitor action"
+	return $rc
+    fi
     
     ocf_log info "MySQL started"
     return $OCF_SUCCESS

["smime.p7s" (application/pkcs7-signature)]

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

