List: avro-user
Subject: Re: Avro versioning and SpecificDatum's
From: Alex Holmes <grep.alex () gmail ! com>
Date: 2011-09-21 23:55:26
Message-ID: CAK0mCrSW+LoxoY50wS4Q8-xhwfjCmOA7ETfMQBt1qBuvEgLFtg () mail ! gmail ! com
Thanks, that fixed my issue.
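
For anyone landing on this thread later, here is a toy model in plain Java (no Avro dependency) of the two symptoms quoted below: "new_field" picking up the values of the removed "id" field, and the "Bad index" exception with the one-field v2 schema. It is only a sketch of the behaviour described in the quoted messages. SchemaResolutionSketch, Field, readPositional and readResolved are made-up names, not Avro APIs, and real Avro resolution also covers type promotion, unions and more. The positional path mimics a reader that only knows the writer's schema; the resolved path mimics what passing Record.class makes possible, namely matching fields by name or alias and falling back to defaults.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Avro's implementation. All names here are hypothetical.
class SchemaResolutionSketch {

    /** A field in a toy schema: name, optional default (null = none), optional aliases. */
    static final class Field {
        final String name;
        final Object defaultValue;
        final List<String> aliases;

        Field(String name, Object defaultValue, String... aliases) {
            this.name = name;
            this.defaultValue = defaultValue;
            this.aliases = Arrays.asList(aliases);
        }
    }

    /** Reader knows only the writer's schema: the i-th written value lands in the i-th reader slot. */
    static Map<String, Object> readPositional(List<Field> writer, List<Field> reader, List<Object> values) {
        Object[] slots = new Object[reader.size()];
        for (int i = 0; i < writer.size(); i++) {
            if (i >= slots.length) {
                throw new IllegalStateException("Bad index: " + i);
            }
            slots[i] = values.get(i);
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < reader.size(); i++) {
            out.put(reader.get(i).name, slots[i]);
        }
        return out;
    }

    /** Reader knows both schemas: match by name or alias, skip unknown writer fields, apply defaults. */
    static Map<String, Object> readResolved(List<Field> writer, List<Field> reader, List<Object> values) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field rf : reader) {
            int pos = -1;
            for (int i = 0; i < writer.size(); i++) {
                String writerName = writer.get(i).name;
                if (writerName.equals(rf.name) || rf.aliases.contains(writerName)) {
                    pos = i;
                    break;
                }
            }
            if (pos >= 0) {
                out.put(rf.name, values.get(pos));
            } else if (rf.defaultValue != null) {
                out.put(rf.name, rf.defaultValue);
            } else {
                throw new IllegalStateException("No value or default for field " + rf.name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Field> v1 = Arrays.asList(new Field("name", null), new Field("id", null));
        List<Field> v2 = Arrays.asList(new Field("name_rename", null, "name"), new Field("new_field", 0));
        List<Object> row = Arrays.<Object>asList("r1", 1);

        System.out.println(readPositional(v1, v2, row)); // {name_rename=r1, new_field=1}
        System.out.println(readResolved(v1, v2, row));   // {name_rename=r1, new_field=0}
    }
}
```

With the one-field v2 from the second test case, readPositional fails with "Bad index: 1" while readResolved returns just name_rename, which lines up with the fix quoted below.
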
On Tue, Sep 20, 2011 at 2:51 PM, Scott Carey <scottcarey@apache.org> wrote:
> As Doug mentioned in the ticket, the problem is likely:
>
> new SpecificDatumReader<Record>()
>
>
> This should be
>
> new SpecificDatumReader<Record>(Record.class)
>
>
> Which sets the reader to resolve to the schema found in Record.class
>
>
>
> On 9/20/11 3:44 AM, "Alex Holmes" <grep.alex@gmail.com> wrote:
>
>>Created the following ticket:
>>
>>https://issues.apache.org/jira/browse/AVRO-891
>>
>>Thanks,
>>Alex
>>
>>On Tue, Sep 20, 2011 at 6:26 AM, Alex Holmes <grep.alex@gmail.com> wrote:
>>> Thanks, I'll add a bug.
>>>
>>> As an FYI, even without the alias (retaining the original field name),
>>> just removing the "id" field yields the exception.
>>>
>>> On Tue, Sep 20, 2011 at 2:22 AM, Scott Carey <scottcarey@apache.org> wrote:
>>>> That looks like a bug. What happens if there is no aliasing/renaming
>>>> involved? Aliasing is a newer feature than field addition, removal,
>>>> and promotion.
>>>>
>>>> This should be easy to reproduce, can you file a JIRA ticket? We
>>>> should discuss this further there.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> On 9/19/11 6:14 PM, "Alex Holmes" <grep.alex@gmail.com> wrote:
>>>>
>>>>>OK, I was able to reproduce the exception.
>>>>>
>>>>>v1:
>>>>>{"name": "Record", "type": "record",
>>>>> "fields": [
>>>>> {"name": "name", "type": "string"},
>>>>> {"name": "id", "type": "int"}
>>>>> ]
>>>>>}
>>>>>
>>>>>v2:
>>>>>{"name": "Record", "type": "record",
>>>>> "fields": [
>>>>> {"name": "name_rename", "type": "string", "aliases": ["name"]}
>>>>> ]
>>>>>}
>>>>>
>>>>>Step 1. Write Avro file using v1 generated class
>>>>>Step 2. Read Avro file using v2 generated class
>>>>>
>>>>>Exception in thread "main" org.apache.avro.AvroRuntimeException: Bad index
>>>>> at Record.put(Unknown Source)
>>>>> at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>>>>> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>>>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>>>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>>>>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>>>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>>>> at Read.readFromAvro(Unknown Source)
>>>>> at Read.main(Unknown Source)
>>>>>
>>>>>The code to write/read the avro file didn't change from below.
>>>>>
>>>>>On Mon, Sep 19, 2011 at 9:08 PM, Alex Holmes <grep.alex@gmail.com> wrote:
>>>>>> I'm trying to put together a simple test case to reproduce the
>>>>>> exception. While I was creating the test case, I hit this behavior
>>>>>> which doesn't seem right, but maybe it's my misunderstanding of how
>>>>>> forward/backward compatibility should work:
>>>>>>
>>>>>> Schema v1:
>>>>>>
>>>>>> {"name": "Record", "type": "record",
>>>>>> "fields": [
>>>>>> {"name": "name", "type": "string"},
>>>>>> {"name": "id", "type": "int"}
>>>>>> ]
>>>>>> }
>>>>>>
>>>>>> Schema v2:
>>>>>>
>>>>>> {"name": "Record", "type": "record",
>>>>>> "fields": [
>>>>>> {"name": "name_rename", "type": "string", "aliases": ["name"]},
>>>>>> {"name": "new_field", "type": "int", "default": 0}
>>>>>> ]
>>>>>> }
>>>>>>
>>>>>> In the 2nd version I:
>>>>>>
>>>>>> - removed field "id"
>>>>>> - renamed field "name" to "name_rename"
>>>>>> - added field "new_field"
>>>>>>
>>>>>> I write the v1 data file:
>>>>>>
>>>>>> public static Record createRecord(String name, int id) {
>>>>>>   Record record = new Record();
>>>>>>   record.name = name;
>>>>>>   record.id = id;
>>>>>>   return record;
>>>>>> }
>>>>>>
>>>>>> public static void writeToAvro(OutputStream outputStream)
>>>>>>     throws IOException {
>>>>>>   DataFileWriter<Record> writer =
>>>>>>       new DataFileWriter<Record>(new SpecificDatumWriter<Record>());
>>>>>>   writer.create(Record.SCHEMA$, outputStream);
>>>>>>
>>>>>>   writer.append(createRecord("r1", 1));
>>>>>>   writer.append(createRecord("r2", 2));
>>>>>>
>>>>>>   writer.close();
>>>>>>   outputStream.close();
>>>>>> }
>>>>>>
>>>>>> I wrote a version-agnostic Read class:
>>>>>>
>>>>>> public static void readFromAvro(InputStream is) throws IOException {
>>>>>>   DataFileStream<Record> reader = new DataFileStream<Record>(
>>>>>>       is, new SpecificDatumReader<Record>());
>>>>>>   for (Record a : reader) {
>>>>>>     System.out.println(ToStringBuilder.reflectionToString(a));
>>>>>>   }
>>>>>>   IOUtils.cleanup(null, is);
>>>>>>   IOUtils.cleanup(null, reader);
>>>>>> }
>>>>>>
>>>>>> Running the Read code against the v1 data file, and including the v1
>>>>>> code-generated classes in the classpath produced:
>>>>>>
>>>>>> Record@6a8c436b[name=r1,id=1]
>>>>>> Record@6baa9f99[name=r2,id=2]
>>>>>>
>>>>>> If I run the same code, but use just the v2 generated classes in the
>>>>>> classpath I get:
>>>>>>
>>>>>> Record@39dd3812[name_rename=r1,new_field=1]
>>>>>> Record@27b15692[name_rename=r2,new_field=2]
>>>>>>
>>>>>> The name_rename field seems to be good, but why would "new_field"
>>>>>> inherit the values of the deleted field "id"?
>>>>>>
>>>>>> Cheers,
>>>>>> Alex
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 19, 2011 at 12:35 PM, Doug Cutting <cutting@apache.org> wrote:
>>>>>>> On 09/19/2011 05:12 AM, Alex Holmes wrote:
>>>>>>>> I then modified my original schema by adding, deleting and renaming
>>>>>>>> some fields, creating version 2 of the schema. After re-creating the
>>>>>>>> Java classes I attempted to read the version 1 file using the
>>>>>>>> DataFileStream (with a SpecificDatumReader), and this is throwing an
>>>>>>>> exception.
>>>>>>>
>>>>>>> This should work. Can you provide more detail? What is the exception?
>>>>>>> A reproducible test case would be great to have.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Doug
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>
>
>