List: pgsql-bugs
Subject: Re: [BUGS] BUG #8173: Inserting heap tuples in bulk in COPY patch return wrong line on failure 999 o
From: Heikki Linnakangas <hlinnakangas () vmware ! com>
Date: 2013-05-23 11:56:35
Message-ID: 519E03F3.70306 () vmware ! com
On 21.05.2013 18:20, lalbin@fhcrc.org wrote:
> The following bug has been logged on the website:
>
> Bug reference: 8173
> Logged by: Lloyd Albin
> Email address: lalbin@fhcrc.org
> PostgreSQL version: 9.2.4
> Operating system: SUSE Linux (64-bit)
> Description:
>
> During testing for our 9.2 upgrade, we found that the error messages we were
> expecting did not match what the program produced. Looking over the release
> notes from our current version, 9.0.12, through 9.2.4, which we are testing,
> I believe I have tracked the issue down to "Improve COPY performance by
> adding tuples to the heap in batches". When I looked at the patch code in
> http://www.postgresql.org/message-id/4E708759.40206@enterprisedb.com I found
> that you are inserting 1000 rows at a time. The problem is that on failure,
> the error context reports either line 1000 of the batch or the last line
> read, whichever comes first. This can be confusing, as the demo code below
> shows.
>
> CREATE TABLE public.table1 (
> key INTEGER,
> PRIMARY KEY(key)
> );
>
> Create a csv file with only one column of data, numbered from 1 to 1008.
>
> Make two copies of the file and name them csv_test.csv and csv_test2.csv.
>
> Edit csv_test.csv and change the entry 1000 to 500.
>
> Edit csv_test2.csv and change the entry 900 to 500.
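The file-preparation steps above can be scripted. This is a hypothetical sketch (it assumes GNU seq and sed; the filenames come from the steps above):

```shell
# One column of data, numbered 1 to 1008:
seq 1 1008 > csv_test.csv
# Make a second copy of the file:
cp csv_test.csv csv_test2.csv
# In csv_test.csv, change the entry 1000 to 500:
sed -i 's/^1000$/500/' csv_test.csv
# In csv_test2.csv, change the entry 900 to 500:
sed -i 's/^900$/500/' csv_test2.csv
```

(On BSD/macOS, `sed -i` needs an explicit suffix argument, e.g. `sed -i ''`.)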
>
> On 9.0.12 Server
>
> COPY public.table1 FROM 'csv_test.csv';
>
> ERROR: duplicate key value violates unique constraint "table1_pkey"
> DETAIL: Key (key)=(500) already exists.
> CONTEXT: COPY table1, line 1000: "500"
>
> COPY public.table1 FROM 'csv_test2.csv';
>
> ERROR: duplicate key value violates unique constraint "table1_pkey"
> DETAIL: Key (key)=(500) already exists.
> CONTEXT: COPY table1, line 900: "500"
>
> Both times the context gave us the correct information.
>
> Now try the same thing on 9.2.4 Server
>
> COPY public.table1 FROM 'csv_test.csv';
>
> ERROR: duplicate key value violates unique constraint "table1_pkey"
> DETAIL: Key (key)=(500) already exists.
> CONTEXT: COPY table1, line 1000: "500"
>
> COPY public.table1 FROM 'csv_test2.csv';
>
> ERROR: duplicate key value violates unique constraint "table1_pkey"
> DETAIL: Key (key)=(500) already exists.
> CONTEXT: COPY table1, line 1000: "1000"
>
> As you can see, the second test reported the last line of the batch of
> tuples being inserted, not the line that actually failed.
>
> Make a copy of csv_test2.csv and name it csv_test3.csv.
> Edit csv_test3.csv and remove all entries after 994.
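The third test file can be produced similarly. A hypothetical sketch (assumes GNU coreutils/sed; csv_test2.csv is regenerated here so the snippet is self-contained):

```shell
# csv_test2.csv as created earlier: 1..1008 with entry 900 changed to 500.
seq 1 1008 | sed 's/^900$/500/' > csv_test2.csv
# Copy it and remove all entries after 994 (i.e. keep the first 994 lines):
head -n 994 csv_test2.csv > csv_test3.csv
```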
>
> COPY public.table1 FROM 'csv_test3.csv';
>
> ERROR: duplicate key value violates unique constraint "table1_pkey"
> DETAIL: Key (key)=(500) already exists.
> CONTEXT: COPY table1, line 995: ""
>
> If the final batch contains fewer than 1000 lines, the error reports the
> line after the last line, with a content of "".
Hmm, yeah, it's quite evident what's happening: the server reports the
last line that was *read*, not the line where the error happened.

Committed a fix for this. Unfortunately, we only keep the most recently
read line buffered in text format, so after the fix you'll get only the
line number, not the contents of the failing line:
postgres=# copy foo from '/tmp/foo';
ERROR: duplicate key value violates unique constraint "foo_pkey"
DETAIL: Key (id)=(4500) already exists.
CONTEXT: COPY foo, line 4500
Thanks for the report!
- Heikki
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs