[prev in list] [next in list] [prev in thread] [next in thread] 

List:       nyphp-mysql
Subject:    Re: [mysql] fixing latin1 data stored in utf8 columns
From:       Rob Marscher <rmarscher () beaffinitive ! com>
Date:       2007-10-26 22:04:04
Message-ID: 48C22C35-9B1F-4BCD-95E5-A3C99D52EA12 () beaffinitive ! com
[Download RAW message or body]

Alright... so I have a solution... I can't be sure this works 100% or  
not... but has worked so far for me.

Basically, the idea is to do mysqldump reading the data as latin1 but  
in the output dump, you add 'set names utf8' so that when the data is  
read back in, it uses the utf8 character set to put the data back  
into the database correctly.

Only problem with this... mysqldump doesn't seem to support using any  
client-side character set than the correct one... so I had to write a  
php script to mimic what mysqldump does.

Here's the process -- make sure no other clients are modifying data  
in the database.  Run this script and pipe the output to a file.   
Drop all of the tables in the database.  Import back from the file.   
Done.

Here's the script.  I'm sure this will get totally mangled over  
email... I'll see about posting it publicly somewhere.


<?php

// put the correct values in here obviously
$host = 'localhost';
$user = 'root';
$password = 'somepass';

// select the database based on command-line input
if ($_SERVER['argv'][1]) {
	$dbName = $_SERVER['argv'][1];
} else {
	die("\nNo database selected\n\n");
}
mysql_connect($host, $user, $password);
mysql_select_db($dbName);
if (mysql_errno()) {
	die(mysql_error() . "\n");
}
$variablesRs = mysql_query('Show variables');
if (mysql_errno()) {
	die(mysql_error() . "\n");
}
while((list($variable, $value) = mysql_fetch_array($variablesRs)) !==  
false) {
	if ($variable == 'max_allowed_packet') {
		$maxAllowedPacket = $value;
		break;
	}
}
echo "-- Dumping $dbName\n\n";
echo "SET NAMES utf8;\n\n";
mysql_query('Set names latin1');
$tablesRs = mysql_query('Show table status');
if (mysql_errno()) {
	die(mysql_error() . "\n");
}
while((list($tableName, $engine, $version, $rowFormat, $rows,  
$avgRowLength) = mysql_fetch_array($tablesRs)) !== false) {
	list($tableName, $tableCreate) = mysql_fetch_row(mysql_query("Show  
create table `$tableName`"));
	$columns = 0;
	$tableDef = array();
	$tableDefRs = mysql_query("Describe `$tableName`");
	while(($columnDef = mysql_fetch_assoc($tableDefRs)) !== false) {
		$tableDef[] = $columnDef;
		$columns++;
	}
	echo "--\n-- Table structure for $tableName\n--\n\n";
	echo "$tableCreate;\n\n";
	echo "--\n-- Dumping data for $tableName\n--\n\n";
	echo "LOCK TABLES `$tableName` WRITE;\n";
	if ($engine == 'MyISAM') {
		echo "ALTER TABLE `$tableName` DISABLE KEYS;\n";
	}

	$start = 0;
	if ($avgRowLength > 0) {
		echo "max packet = $maxAllowedPacket, avg length = $avgRowLength\n";
		$rowBatchLimit = (int) ($maxAllowedPacket / $avgRowLength);
	} else {
		$rowBatchLimit = 10000;
	}
	echo "limit = $rowBatchLimit\n";
	$rowsRs = mysql_query("SELECT * FROM `$tableName` LIMIT $start,  
$rowBatchLimit");
	if (mysql_errno()) {
		die(mysql_error() . "\n");
	}
	while(mysql_numrows($rowsRs)) {
		echo "INSERT INTO `$tableName` VALUES ";
		$numRows = 0;
		while(($row = mysql_fetch_row($rowsRs)) !== false) {
			if ($numRows != 0) {
				echo ",";
			}
			echo "(";
			for ($i = 0; $i < $columns; $i++) {
				if (strpos($tableDef[$i]['Type'], 'int') === false
				    && strpos($tableDef[$i]['Type'], 'float') === false
				    && strpos($tableDef[$i]['Type'], 'dec') === false
				    && strpos($tableDef[$i]['Type'], 'numeric') === false
				    && strpos($tableDef[$i]['Type'], 'real') === false
				    && strpos($tableDef[$i]['Type'], 'bool') === false
				    && strpos($tableDef[$i]['Type'], 'bit') === false
				    && strpos($tableDef[$i]['Type'], 'double') === false) {
					if ($row[$i] !== null || $tableDef[$i]['NULL'] == 'NO') {
						$row[$i] = "'" . mysql_real_escape_string($row[$i]) .  
"'";				    	
					} else {
						$row[$i] = "'null'";
					}
				}
				if ($row[$i] === null || $row[$i] === '') {
					if ($tableDef[$i]['NULL'] == 'NO') {
						$row[$i] = 0;
					} else {
						$row[$i] = "NULL";
					}
				}
				if ($i != 0) {
					echo ",";
				}
				echo $row[$i];
			}
			echo ")";
			$numRows++;
		}
		echo ";\n";
		$start += $rowBatchLimit;
		$rowsRs = mysql_query("SELECT * FROM `$tableName` LIMIT $start,  
$rowBatchLimit");
	}
	echo ";\n";
	if ($engine == 'MyISAM') {
		echo "ALTER TABLE `$tableName` ENABLE KEYS;\n";
	}	
	echo "UNLOCK TABLES;\n\n";
}

_______________________________________________
New York PHP Community MySQL SIG
http://lists.nyphp.org/mailman/listinfo/mysql

NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com

Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic