List:       wikitech-l
Subject:    Re: [Wikitech-l] [mwdumper] new maintainer?
From:       Jamie Morken <jmorken () shaw ! ca>
Date:       2010-02-17 3:40:42
Message-ID: cf93ecdb16f76.4b7af4ba () shaw ! ca

Date: Tue, 16 Feb 2010 09:34:41 -0800
From: Brion Vibber <brion@pobox.com>
Subject: Re: [Wikitech-l] [mwdumper] new maintainer?
To: wikitech-l@lists.wikimedia.org
Message-ID: <hlekvf$nl0$1@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2/16/10 7:03 AM, Jamie Morken wrote:
> Ok, the simple question: how many people prefer XML or sql dumps?

I think we have a FAQ on this...

http://meta.wikimedia.org/wiki/Download#What_happened_to_the_SQL_dumps.3F


You *do* realize that such "SQL dumps" would have to be invented from 
whole cloth and couldn't just be dumped from the actual databases, right?

The raw databases include dozens of alternate clusters and have data 
from different revisions compressed together, including deleted items 
and private data, and can't simply be released by WMF even if someone 
actually wanted to figure out how to replicate Wikimedia's exact storage 
cluster layout to do a data import.

Most likely if they were created they'd simply be created by running the 
xml through a tool like mwdumper...

-- brion



Hi Brion,

I have not tried mwdumper yet. I have been looking at the various XML-to-SQL
conversion tools and reading about how people use them, and I will have to
try it to see for myself, but recreating an SQL database this way seems like
an overly complex task to me. When the Wikimedia dumps were in SQL format, I
think there were fewer dump problems than there are now, although the main
issue may simply be the growth in file sizes. I would bet it is simpler to
make an SQL dump than an XML dump; the older MediaWiki dumps were in SQL
format, after all.

To produce the Wikimedia dumps in SQL directly, I think the process would be
to merge the SQL databases and then make sure the private data is erased?
That might be simpler than converting to XML and then using mwdumper to get
back to SQL. There is also a bottleneck somewhere in the dump system (dumps
failing, etc.); maybe it is the XML part? I will get back to you after I try
mwdumper and/or:

php importDump.php <17gigabytefail> :)

cheers,
Jamie



