
List:       postgis-users
Subject:    Re: [postgis-users] shp2pgsql segfault and gdb
From:       Amos Hayes <ahayes () polkaroo ! net>
Date:       2005-09-30 16:42:46
Message-ID: 62924978-B67F-4B70-8951-0F46E8914281 () polkaroo ! net


On 30-Sep-05, at 12:25 PM, strk@refractions.net wrote:

>
> <doc_snippet>
> The -W defines an encoding of the input data (dbf file).
> When specified all attributes of the dbf are converted to UTF-8.
> The resulting .sql script will contain a command to set CLIENT ENCODING
> to UTF-8, so that the backend will be able to reconvert from UTF-8
> to whatever encoding the database has been created with.
> </doc_snippet>
>
> Most likely you will have a Latin1 backend, so the conversion flow
> would be:
>
>     1: Latin1 -> UTF8 (shp2pgsql)
>     2: UTF8 -> Latin1 (sql session)
>
> If you have a UTF8 database:
>
>     1: Latin1 -> UTF8 (shp2pgsql)
>     2: NO CONVERSION (sql session)
>
> -strk;
>

That clears things up. Thank you very much. Might I suggest the  
following modified doc snippet? Also, could you list what character  
set arguments are allowed, or link to a list of them?

P.S. I think PostgreSQL now creates databases in "UNICODE" by default.  
Am I correct in thinking that UTF-8 is a subset of UNICODE, therefore  
no conversion will need to happen on import?


<doc_snippet>
The -W option specifies the encoding of the input data (dbf file).  
When used, all attributes of the dbf are converted from the specified  
encoding to UTF-8. The resulting SQL output will contain a command to  
set CLIENT ENCODING to UTF-8, so that the backend will be able to  
reconvert from UTF-8 to whatever encoding the database is configured  
to use internally.
</doc_snippet>
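
To check my understanding of the two-step flow strk describes, here is a
rough Python sketch. It only mimics the byte conversions; it is not how
shp2pgsql or the backend are implemented, and the function names and the
sample value are made up for illustration.

```python
# Illustrative only: mimics the two conversion steps in plain Python.
# Function names and the sample string are hypothetical, not part of shp2pgsql.

def shp2pgsql_step(raw_dbf_bytes: bytes, input_encoding: str) -> bytes:
    """Step 1: decode dbf bytes from the -W encoding and re-encode as UTF-8."""
    return raw_dbf_bytes.decode(input_encoding).encode("utf-8")


def backend_step(utf8_bytes: bytes, database_encoding: str) -> bytes:
    """Step 2: convert from the client encoding (UTF-8) to the database encoding."""
    return utf8_bytes.decode("utf-8").encode(database_encoding)


# A Latin1-encoded attribute value, as it might sit in the dbf file.
dbf_value = "Québec".encode("latin-1")

# What ends up in the generated SQL output (always UTF-8 when -W is used).
sql_value = shp2pgsql_step(dbf_value, "latin-1")

# Latin1 database: UTF-8 is converted back to Latin1 on load.
stored_latin1 = backend_step(sql_value, "latin-1")

# UTF-8 database: the bytes pass through unchanged.
stored_utf8 = backend_step(sql_value, "utf-8")
```

With a Latin1 database the stored bytes match the original dbf bytes; with
a UTF-8 database they match what is in the SQL output, i.e. no conversion.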

--
Amos
