List: avro-user
Subject: Re: avro BinaryDecoder bug ?
From: Yang <teddyyyy123 () gmail ! com>
Date: 2011-09-01 1:57:06
Message-ID: CAAnh3_9Hqy8-oT4etSbnkVMG6tAB75wZSYUF1LUJ-Nhy3gx7rQ () mail ! gmail ! com
Yes, filed: https://issues.apache.org/jira/browse/AVRO-882
On Wed, Aug 31, 2011 at 6:52 PM, Scott Carey <scottcarey@apache.org> wrote:
> Looks like a bug to me.
>
> Can you file a JIRA ticket?
>
> Thanks!
>
> On 8/29/11 1:24 PM, "Yang" <teddyyyy123@gmail.com> wrote:
>
> >If I read an empty file with BinaryDecoder, I get EOF, which is good.
> >
> >But with the current code, if I read it again with the same decoder, I
> >get an IndexOutOfBoundsException, not EOF.
> >
> >It seems that always giving EOF would be the more desirable behavior.
> >
> >You can see this from the following test code:
> >
> >import static org.junit.Assert.assertEquals;
> >
> >import java.io.IOException;
> >
> >import org.apache.avro.specific.SpecificRecord;
> >import org.junit.Test;
> >
> >import myavro.Apple;
> >
> >import java.io.File;
> >import java.io.FileInputStream;
> >import java.io.FileNotFoundException;
> >import java.io.FileOutputStream;
> >import java.io.InputStream;
> >import java.io.OutputStream;
> >
> >import org.apache.avro.io.Decoder;
> >import org.apache.avro.io.DecoderFactory;
> >import org.apache.avro.io.Encoder;
> >import org.apache.avro.io.EncoderFactory;
> >import org.apache.avro.specific.SpecificDatumReader;
> >import org.apache.avro.specific.SpecificDatumWriter;
> >
> >class MyWriter {
> >
> >  SpecificDatumWriter<SpecificRecord> wr;
> >  Encoder enc;
> >  OutputStream ostream;
> >
> >  public MyWriter() throws FileNotFoundException {
> >    wr = new SpecificDatumWriter<SpecificRecord>(new Apple().getSchema());
> >    ostream = new FileOutputStream(new File("/tmp/testavro"));
> >    enc = EncoderFactory.get().binaryEncoder(ostream, null);
> >  }
> >
> >  public synchronized void dump(SpecificRecord event) throws IOException {
> >    wr.write(event, enc);
> >    enc.flush();
> >  }
> >
> >}
> >
> >class MyReader {
> >
> >  SpecificDatumReader<SpecificRecord> rd;
> >  Decoder dec;
> >  InputStream istream;
> >
> >  public MyReader() throws FileNotFoundException {
> >    rd = new SpecificDatumReader<SpecificRecord>(new Apple().getSchema());
> >    istream = new FileInputStream(new File("/tmp/testavro"));
> >    dec = DecoderFactory.get().binaryDecoder(istream, null);
> >  }
> >
> >  public synchronized SpecificRecord read() throws IOException {
> >    Object r = rd.read(null, dec);
> >    return (SpecificRecord) r;
> >  }
> >
> >}
> >
> >public class AvroWriteAndReadSameTime {
> >  @Test
> >  public void testWritingAndReadingAtSameTime() throws Exception {
> >
> >    MyWriter dumper = new MyWriter();
> >    final Apple apple = new Apple();
> >    apple.taste = "sweet";
> >    dumper.dump(apple);
> >
> >    final MyReader rd = new MyReader();
> >    rd.read();
> >
> >    try {
> >      rd.read();
> >    } catch (Exception e) {
> >      e.printStackTrace();
> >    }
> >
> >    // the second one somehow generates a NPE, we hope to get EOF...
> >    try {
> >      rd.read();
> >    } catch (Exception e) {
> >      e.printStackTrace();
> >    }
> >
> >  }
> >}
> >
> >
> >
> >
> >
> >The issue is in BinaryDecoder.readInt(): right now, even when it hits
> >EOF, it still advances the pos pointer.
> >None of the other APIs (readLong, readFloat, ...) do this. Changing
> >it to the following makes it work:
> >
> >
> >  @Override
> >  public int readInt() throws IOException {
> >    ensureBounds(5); // won't throw index out of bounds
> >    int len = 1;
> >    int b = buf[pos] & 0xff;
> >    int n = b & 0x7f;
> >    if (b > 0x7f) {
> >      b = buf[pos + len++] & 0xff;
> >      n ^= (b & 0x7f) << 7;
> >      if (b > 0x7f) {
> >        b = buf[pos + len++] & 0xff;
> >        n ^= (b & 0x7f) << 14;
> >        if (b > 0x7f) {
> >          b = buf[pos + len++] & 0xff;
> >          n ^= (b & 0x7f) << 21;
> >          if (b > 0x7f) {
> >            b = buf[pos + len++] & 0xff;
> >            n ^= (b & 0x7f) << 28;
> >            if (b > 0x7f) {
> >              throw new IOException("Invalid int encoding");
> >            }
> >          }
> >        }
> >      }
> >    }
> >    if (pos + len > limit) {
> >      throw new EOFException();
> >    }
> >    pos += len; // <== CHANGE: used to be above the EOF throw
> >
> >    return (n >>> 1) ^ -(n & 1); // back to two's-complement
> >  }
>
>
>
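
For anyone following the thread, here is a minimal self-contained sketch of why the ordering of the position update and the EOF check matters. It is a toy model, not Avro's BinaryDecoder: the names EofPositionDemo, ToyDecoder, advanceBeforeCheck and outcome are made up for illustration, and the single padding byte is an assumption standing in for the slack that the ensureBounds(5) call above is commented as providing. With the position advanced before the check, the first read past the end reports EOF but leaves the position beyond the limit, so the next read fails differently; with the check first, every read past the end reports EOF.

```java
import java.io.EOFException;
import java.util.Arrays;

public class EofPositionDemo {

    /** Toy one-byte-at-a-time decoder; a simplification, not Avro's BinaryDecoder. */
    static class ToyDecoder {
        private final byte[] buf;   // data plus one padding byte (assumption), so buf[limit] is readable
        private final int limit;    // number of valid bytes in buf
        private int pos = 0;        // next byte to read
        private final boolean advanceBeforeCheck;

        ToyDecoder(byte[] data, boolean advanceBeforeCheck) {
            this.buf = Arrays.copyOf(data, data.length + 1);
            this.limit = data.length;
            this.advanceBeforeCheck = advanceBeforeCheck;
        }

        int readByte() throws EOFException {
            int b = buf[pos] & 0xff;   // may touch the padding byte; validated below
            if (advanceBeforeCheck) {
                pos += 1;              // old ordering: position moves even if the read fails
                if (pos > limit) {
                    throw new EOFException();
                }
            } else {
                if (pos + 1 > limit) {
                    throw new EOFException();   // new ordering: position left untouched
                }
                pos += 1;
            }
            return b;
        }
    }

    /** Calls readByte() once and reports "ok" or the simple name of the exception thrown. */
    static String outcome(ToyDecoder dec) {
        try {
            dec.readByte();
            return "ok";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        ToyDecoder oldOrder = new ToyDecoder(new byte[0], true);
        System.out.println(outcome(oldOrder));   // EOFException
        System.out.println(outcome(oldOrder));   // ArrayIndexOutOfBoundsException

        ToyDecoder newOrder = new ToyDecoder(new byte[0], false);
        System.out.println(outcome(newOrder));   // EOFException
        System.out.println(outcome(newOrder));   // EOFException
    }
}
```

The point of the sketch is only the invariant: a read that fails with EOF should leave pos where it was, so that a caller retrying on the same decoder sees the same EOF again.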