[prev in list] [next in list] [prev in thread] [next in thread] 

List:       avro-user
Subject:    Re: avro BinaryDecoder bug ?
From:       Yang <teddyyyy123 () gmail ! com>
Date:       2011-09-01 1:57:06
Message-ID: CAAnh3_9Hqy8-oT4etSbnkVMG6tAB75wZSYUF1LUJ-Nhy3gx7rQ () mail ! gmail ! com
[Download RAW message or body]

yes https://issues.apache.org/jira/browse/AVRO-882

On Wed, Aug 31, 2011 at 6:52 PM, Scott Carey <scottcarey@apache.org> wrote:

> Looks like a bug to me.
>
> Can you file a JIRA ticket?
>
> Thanks!
>
> On 8/29/11 1:24 PM, "Yang" <teddyyyy123@gmail.com> wrote:
>
> >if I read on a empty file with BinaryDecoder, I get EOF, good,
> >
> >but with the current code, if I read it again with the same decoder, I
> >get a IndexOutofBoundException, not EOF.
> >
> >it seems that always giving EOF should be a more desirable behavior.
> >
> >you can see from this test code:
> >
> >import static org.junit.Assert.assertEquals;
> >
> >import java.io.IOException;
> >
> >import org.apache.avro.specific.SpecificRecord;
> >import org.junit.Test;
> >
> >import myavro.Apple;
> >
> >import java.io.File;
> >import java.io.FileInputStream;
> >import java.io.FileNotFoundException;
> >import java.io.FileOutputStream;
> >import java.io.InputStream;
> >import java.io.OutputStream;
> >
> >import org.apache.avro.io.Decoder;
> >import org.apache.avro.io.DecoderFactory;
> >import org.apache.avro.io.Encoder;
> >import org.apache.avro.io.EncoderFactory;
> >import org.apache.avro.specific.SpecificDatumReader;
> >import org.apache.avro.specific.SpecificDatumWriter;
> >
> >class MyWriter {
> >
> >    SpecificDatumWriter<SpecificRecord> wr;
> >    Encoder enc;
> >    OutputStream ostream;
> >
> >    public MyWriter() throws FileNotFoundException {
> >        wr = new SpecificDatumWriter<SpecificRecord>(new
> >Apple().getSchema());
> >        ostream = new FileOutputStream(new File("/tmp/testavro"));
> >        enc = EncoderFactory.get().binaryEncoder(ostream, null);
> >    }
> >
> >    public synchronized void dump(SpecificRecord event) throws
> >IOException {
> >        wr.write(event, enc);
> >        enc.flush();
> >    }
> >
> >}
> >
> >class MyReader {
> >
> >    SpecificDatumReader<SpecificRecord> rd;
> >    Decoder dec;
> >    InputStream istream;
> >
> >    public MyReader() throws FileNotFoundException {
> >        rd = new SpecificDatumReader<SpecificRecord>(new
> >Apple().getSchema());
> >        istream = new FileInputStream(new File("/tmp/testavro"));
> >        dec = DecoderFactory.get().binaryDecoder(istream, null);
> >    }
> >
> >    public synchronized SpecificRecord read() throws IOException {
> >        Object r = rd.read(null, dec);
> >        return (SpecificRecord) r;
> >    }
> >
> >}
> >
> >public class AvroWriteAndReadSameTime {
> >    @Test
> >    public void testWritingAndReadingAtSameTime() throws Exception {
> >
> >        MyWriter dumper = new MyWriter();
> >        final Apple apple = new Apple();
> >        apple.taste = "sweet";
> >        dumper.dump(apple);
> >
> >        final MyReader rd = new MyReader();
> >        rd.read();
> >
> >
> >        try {
> >        rd.read();
> >        } catch (Exception e) {
> >            e.printStackTrace();
> >        }
> >
> >        // the second one somehow generates a NPE, we hope to get EOF...
> >        try {
> >        rd.read();
> >        } catch (Exception e) {
> >            e.printStackTrace();
> >        }
> >
> >    }
> >}
> >
> >
> >
> >
> >
> >the issue is in BinaryDecoder.readInt(), right now even when it hits
> >EOF, it still advances the pos pointer.
> >all the other APIs (readLong readFloat ...) do not do this. changing
> >to the following  makes it work:
> >
> >
> >  @Override
> >  public int readInt() throws IOException {
> >    ensureBounds(5); // won't throw index out of bounds
> >    int len = 1;
> >    int b = buf[pos] & 0xff;
> >    int n = b & 0x7f;
> >    if (b > 0x7f) {
> >      b = buf[pos + len++] & 0xff;
> >      n ^= (b & 0x7f) << 7;
> >      if (b > 0x7f) {
> >        b = buf[pos + len++] & 0xff;
> >        n ^= (b & 0x7f) << 14;
> >        if (b > 0x7f) {
> >          b = buf[pos + len++] & 0xff;
> >          n ^= (b & 0x7f) << 21;
> >          if (b > 0x7f) {
> >            b = buf[pos + len++] & 0xff;
> >            n ^= (b & 0x7f) << 28;
> >            if (b > 0x7f) {
> >              throw new IOException("Invalid int encoding");
> >            }
> >          }
> >        }
> >      }
> >    }
> >    if (pos+len > limit) {
> >      throw new EOFException();
> >    }
> >    pos += len;             //<================== CHANGE, used to be
> >above the EOF throw
> >
> >    return (n >>> 1) ^ -(n & 1); // back to two's-complement
> >  }
>
>
>

[Attachment #3 (text/html)]

yes <a href="https://issues.apache.org/jira/browse/AVRO-882">https://issues.apache.org/jira/browse/AVRO-882</a><br><br><div \
class="gmail_quote">On Wed, Aug 31, 2011 at 6:52 PM, Scott Carey <span \
dir="ltr">&lt;<a href="mailto:scottcarey@apache.org">scottcarey@apache.org</a>&gt;</span> \
wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px \
#ccc solid;padding-left:1ex;">Looks like a bug to me.<br> <br>
Can you file a JIRA ticket?<br>
<br>
Thanks!<br>
<div><div></div><div class="h5"><br>
On 8/29/11 1:24 PM, &quot;Yang&quot; &lt;<a \
href="mailto:teddyyyy123@gmail.com">teddyyyy123@gmail.com</a>&gt; wrote:<br> <br>
&gt;if I read on a empty file with BinaryDecoder, I get EOF, good,<br>
&gt;<br>
&gt;but with the current code, if I read it again with the same decoder, I<br>
&gt;get a IndexOutofBoundException, not EOF.<br>
&gt;<br>
&gt;it seems that always giving EOF should be a more desirable behavior.<br>
&gt;<br>
&gt;you can see from this test code:<br>
&gt;<br>
&gt;import static org.junit.Assert.assertEquals;<br>
&gt;<br>
&gt;import java.io.IOException;<br>
&gt;<br>
&gt;import org.apache.avro.specific.SpecificRecord;<br>
&gt;import org.junit.Test;<br>
&gt;<br>
&gt;import myavro.Apple;<br>
&gt;<br>
&gt;import java.io.File;<br>
&gt;import java.io.FileInputStream;<br>
&gt;import java.io.FileNotFoundException;<br>
&gt;import java.io.FileOutputStream;<br>
&gt;import java.io.InputStream;<br>
&gt;import java.io.OutputStream;<br>
&gt;<br>
&gt;import org.apache.avro.io.Decoder;<br>
&gt;import org.apache.avro.io.DecoderFactory;<br>
&gt;import org.apache.avro.io.Encoder;<br>
&gt;import org.apache.avro.io.EncoderFactory;<br>
&gt;import org.apache.avro.specific.SpecificDatumReader;<br>
&gt;import org.apache.avro.specific.SpecificDatumWriter;<br>
&gt;<br>
&gt;class MyWriter {<br>
&gt;<br>
&gt;    SpecificDatumWriter&lt;SpecificRecord&gt; wr;<br>
&gt;    Encoder enc;<br>
&gt;    OutputStream ostream;<br>
&gt;<br>
&gt;    public MyWriter() throws FileNotFoundException {<br>
&gt;        wr = new SpecificDatumWriter&lt;SpecificRecord&gt;(new<br>
&gt;Apple().getSchema());<br>
&gt;        ostream = new FileOutputStream(new File(&quot;/tmp/testavro&quot;));<br>
&gt;        enc = EncoderFactory.get().binaryEncoder(ostream, null);<br>
&gt;    }<br>
&gt;<br>
&gt;    public synchronized void dump(SpecificRecord event) throws<br>
&gt;IOException {<br>
&gt;        wr.write(event, enc);<br>
&gt;        enc.flush();<br>
&gt;    }<br>
&gt;<br>
&gt;}<br>
&gt;<br>
&gt;class MyReader {<br>
&gt;<br>
&gt;    SpecificDatumReader&lt;SpecificRecord&gt; rd;<br>
&gt;    Decoder dec;<br>
&gt;    InputStream istream;<br>
&gt;<br>
&gt;    public MyReader() throws FileNotFoundException {<br>
&gt;        rd = new SpecificDatumReader&lt;SpecificRecord&gt;(new<br>
&gt;Apple().getSchema());<br>
&gt;        istream = new FileInputStream(new File(&quot;/tmp/testavro&quot;));<br>
&gt;        dec = DecoderFactory.get().binaryDecoder(istream, null);<br>
&gt;    }<br>
&gt;<br>
&gt;    public synchronized SpecificRecord read() throws IOException {<br>
&gt;        Object r = rd.read(null, dec);<br>
&gt;        return (SpecificRecord) r;<br>
&gt;    }<br>
&gt;<br>
&gt;}<br>
&gt;<br>
&gt;public class AvroWriteAndReadSameTime {<br>
&gt;    @Test<br>
&gt;    public void testWritingAndReadingAtSameTime() throws Exception {<br>
&gt;<br>
&gt;        MyWriter dumper = new MyWriter();<br>
&gt;        final Apple apple = new Apple();<br>
&gt;        apple.taste = &quot;sweet&quot;;<br>
&gt;        dumper.dump(apple);<br>
&gt;<br>
&gt;        final MyReader rd = new MyReader();<br>
&gt;        rd.read();<br>
&gt;<br>
&gt;<br>
&gt;        try {<br>
&gt;        rd.read();<br>
&gt;        } catch (Exception e) {<br>
&gt;            e.printStackTrace();<br>
&gt;        }<br>
&gt;<br>
&gt;        // the second one somehow generates a NPE, we hope to get EOF...<br>
&gt;        try {<br>
&gt;        rd.read();<br>
&gt;        } catch (Exception e) {<br>
&gt;            e.printStackTrace();<br>
&gt;        }<br>
&gt;<br>
&gt;    }<br>
&gt;}<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;the issue is in BinaryDecoder.readInt(), right now even when it hits<br>
&gt;EOF, it still advances the pos pointer.<br>
&gt;all the other APIs (readLong readFloat ...) do not do this. changing<br>
&gt;to the following  makes it work:<br>
&gt;<br>
&gt;<br>
&gt;  @Override<br>
&gt;  public int readInt() throws IOException {<br>
&gt;    ensureBounds(5); // won&#39;t throw index out of bounds<br>
&gt;    int len = 1;<br>
&gt;    int b = buf[pos] &amp; 0xff;<br>
&gt;    int n = b &amp; 0x7f;<br>
&gt;    if (b &gt; 0x7f) {<br>
&gt;      b = buf[pos + len++] &amp; 0xff;<br>
&gt;      n ^= (b &amp; 0x7f) &lt;&lt; 7;<br>
&gt;      if (b &gt; 0x7f) {<br>
&gt;        b = buf[pos + len++] &amp; 0xff;<br>
&gt;        n ^= (b &amp; 0x7f) &lt;&lt; 14;<br>
&gt;        if (b &gt; 0x7f) {<br>
&gt;          b = buf[pos + len++] &amp; 0xff;<br>
&gt;          n ^= (b &amp; 0x7f) &lt;&lt; 21;<br>
&gt;          if (b &gt; 0x7f) {<br>
&gt;            b = buf[pos + len++] &amp; 0xff;<br>
&gt;            n ^= (b &amp; 0x7f) &lt;&lt; 28;<br>
&gt;            if (b &gt; 0x7f) {<br>
&gt;              throw new IOException(&quot;Invalid int encoding&quot;);<br>
&gt;            }<br>
&gt;          }<br>
&gt;        }<br>
&gt;      }<br>
&gt;    }<br>
&gt;    if (pos+len &gt; limit) {<br>
&gt;      throw new EOFException();<br>
&gt;    }<br>
&gt;    pos += len;             //&lt;================== CHANGE, used to be<br>
&gt;above the EOF throw<br>
&gt;<br>
&gt;    return (n &gt;&gt;&gt; 1) ^ -(n &amp; 1); // back to two&#39;s-complement<br>
&gt;  }<br>
<br>
<br>
</div></div></blockquote></div><br>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic