
List:       wikitech-l
Subject:    Re: [Wikitech-l] Public repositories for research dumps
From:       Bilal Abdul Kader <bilalak () gmail ! com>
Date:       2009-06-23 16:37:47
Message-ID: 61d9018b0906230937p566d9a3dte82490667da68182 () mail ! gmail ! com

Hi Felipe,

Thanks for the great effort. This will save us hours of
downloading and importing older dumps.

bilal


On Tue, Jun 23, 2009 at 12:26 PM, Felipe Ortega <glimmer_phoenix@yahoo.es> wrote:

>
> Hello.
>
> A few hours ago, a new public repository was created to host WikiXRay
> database dumps, containing information extracted from the public
> Wikipedia database dumps. The repository is hosted by RedIRIS (in short,
> the Spanish equivalent of Kennisnet in the Netherlands).
>
> http://sunsite.rediris.es/mirror/WKP_research
>
> ftp://ftp.rediris.es/mirror/WKP_research
>
> These new dumps are intended to save other researchers time and effort,
> since they won't need to parse the complete XML dumps to extract all
> relevant activity metadata. We used mysqldump to create the dumps from our
> databases.
>
> As of today, only some of the biggest Wikipedias are available. However,
> in the coming days the full set of available languages will be ready for
> download. The files will be updated regularly.
>
> The procedure is as follows:
>
> 1. Find the research dump you are interested in. Download and decompress
> it on your local system.
>
> 2. Create a local DB to import the information.
>
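> 3. Load the dump file, using a MySQL user with insert privileges. Pass -p
> on its own so that mysql prompts for the password; a word following "-p "
> is taken as the database name, not the password:
>
> $> mysql -u user -p myDB < dumpfile.sql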
>
> And you're done.
>
> A final warning: three fields in the revision table are not reliable yet:
>
> rev_num_inlinks
> rev_num_outlinks
> rev_num_trans
>
> All remaining fields and values are reliable (in particular rev_len,
> rev_num_words, and so forth).
>
> Regards,
>
> Felipe.
>
>
>
>
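For reference, the three steps in the quoted procedure could be scripted
roughly as below. This is only a sketch under assumptions: the message does
not say how the dumps are compressed, so gzip is assumed, and the archive
name, database name, and MySQL user in the usage line are made-up
placeholders. The run helper and the DRY_RUN variable are also just
illustrative; they let you preview the commands before executing them.

```shell
#!/bin/sh
# Sketch of the decompress / create DB / import steps.
# Assumptions: the dump is gzip-compressed and the database name is a
# plain identifier (it is interpolated into the SQL without quoting).

# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# import_dump ARCHIVE DB USER
import_dump() {
    archive=$1            # downloaded dump, e.g. something.sql.gz
    db=$2                 # local database to create and fill
    user=$3               # MySQL user with create/insert privileges
    sql=${archive%.gz}    # name of the file once decompressed

    # Step 1: decompress the dump (gzip -d replaces the .gz file).
    run gzip -d "$archive" || return 1

    # Step 2: create the local database; a bare -p prompts for the password.
    run mysql -u "$user" -p -e "CREATE DATABASE IF NOT EXISTS $db" || return 1

    # Step 3: load the dump with the mysql client's "source" command.
    run mysql -u "$user" -p "$db" -e "source $sql" || return 1
}
```

To preview without touching anything, something like
"DRY_RUN=1; import_dump eswiki_research.sql.gz wkp_research researcher"
prints the three commands; unset DRY_RUN to actually run them. mysql will
ask for the password twice, once per invocation.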
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l