
List:       postgresql-general
Subject:    Re: psql '\copy to' and unicode escapes
From:       "David G. Johnston" <david.g.johnston () gmail ! com>
Date:       2018-02-26 17:05:37
Message-ID: CAKFQuwbpm1QO20k2X8_qRnq1up0naFSF3mbLEp+onxKSc8L+ew () mail ! gmail ! com

On Mon, Feb 26, 2018 at 9:53 AM, Steven Hirsch <snhirsch@gmail.com> wrote:

> I fear that I'm missing something very obvious, but I cannot find a syntax
> that permits me to use an escaped hexadecimal representation in a CSV file
> and have that representation interpreted as the equivalent unicode
> character when inserting into the database.
>

There isn't one - in CSV format, COPY treats field values as literal data
and performs basically no processing on them. The system writing the CSV
file would have to write the actual UTF-8 encoded character, not the text of
its code point, directly into the document (i.e., a capable viewer would
display whatever 00b0 is on-screen, or a placeholder if it is a
non-printable character).
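
If the producing system cannot be changed, one workaround is to rewrite the
file before loading it. The following is only a rough sketch, not something
COPY or psql provides: it assumes the escapes look like \u00b0 (a backslash,
"u", and exactly four hex digits), which the original question does not
confirm, and the function names are made up for illustration.

```python
import re

# Assumption: escapes have the form \uXXXX with exactly four hex digits.
# Adjust the pattern if the file uses a different notation.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_escapes(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names.

    Limitation: surrogate pairs (characters outside the BMP) are not
    combined, so this sketch only suits four-digit code points.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path as UTF-8, decoding escapes line by line."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(decode_escapes(line))
```

The converted file then holds real UTF-8 characters and can be loaded with
\copy in the usual way. Everything else in the file, including quoting and
delimiters, passes through unchanged.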

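INSERT and COPY are two totally different animals. INSERT takes SQL
expressions and evaluates them; COPY takes data and stores it as-is:

INSERT INTO tbl (t) VALUES (trim('   jdjd   ')); -- stores jdjd

Put trim('   jdjd   ') in a CSV file, though, and you would store the
literal text "trim('   jdjd   ')".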

David J.



