
List:       avro-user
Subject:    Re: recursive types
From:       Lee Hambley <lee.hambley () gmail ! com>
Date:       2019-12-04 11:38:11
Message-ID: CAN_+VLViegt30ZCrLW5rJ8Z3HDW92w6RYjKdvW=w6tFHVnjOeA () mail ! gmail ! com

Hi Rog,

Good question. The answer lies in the docs, in the "Parsing Canonical Form
for Schemas" section, which states (among the other transformation rules):

> [ORDER] Order the appearance of fields of JSON objects as follows: *name*,
> type, *fields*, symbols, items, values, size. For example, if an object
> has type, name, and size fields, then the name field should appear first,
> followed by the type and then the size fields.


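(emphasis mine)

Since *name* is ordered ahead of *fields*, in the canonical form the name
"R" appears before the field that refers to it.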
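
To make the [ORDER] rule quoted above concrete, here is a rough sketch in
Python of what it does to your example schema. It uses only the standard
library, not the Avro library, and it implements only the [ORDER] rule, not
the other canonical-form transformations. The function name apply_order_rule
is made up for illustration, and keeping unlisted attributes at the end is my
own assumption rather than something the rule says.

```python
import json

# Attribute order taken from the [ORDER] rule quoted above.
CANONICAL_ORDER = ["name", "type", "fields", "symbols", "items", "values", "size"]


def apply_order_rule(node):
    """Recursively reorder JSON object keys per the [ORDER] rule only.

    Hypothetical helper for illustration; not part of any Avro library.
    """
    if isinstance(node, dict):
        listed = [k for k in CANONICAL_ORDER if k in node]
        # Assumption: attributes the rule does not list are kept, after the
        # listed ones, in their original order.
        unlisted = [k for k in node if k not in CANONICAL_ORDER]
        return {k: apply_order_rule(node[k]) for k in listed + unlisted}
    if isinstance(node, list):
        return [apply_order_rule(item) for item in node]
    return node


# The schema from the original question, with "name" written last.
schema = json.loads("""
{
    "type": "record",
    "fields": [
        {"name": "F", "type": ["null", "R"]}
    ],
    "name": "R"
}
""")

reordered = apply_order_rule(schema)
print(json.dumps(reordered))
# prints {"name": "R", "type": "record", "fields": [{"name": "F", "type": ["null", "R"]}]}
```

After reordering, "name": "R" is written ahead of "fields", so in a
left-to-right reading of the text the name comes before the reference to it
inside F.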

The canonical form becomes more relevant to Avro usage when you work with a
schema registry, for example. That is a really common use case, and I
consider a defined canonical form for schema comparisons to be a strength of
Avro compared with other serialization formats.

- https://avro.apache.org/docs/1.8.2/spec.html#Parsing+Canonical+Form+for+Schemas

HTH,

Lee Hambley
http://lee.hambley.name/
+49 (0) 170 298 5667


On Wed, 4 Dec 2019 at 12:17, roger peppe <rogpeppe@gmail.com> wrote:

> Hi,
>
> My apologies in advance if this topic has been well discussed before - the
> mailing list search tool appears to be broken (the link points to the
> expired domain name "search-hadoop.com").
>
> I'm trying to understand about recursive types in Avro, given that the
> specification says about names
> <http://avro.apache.org/docs/current/spec.html#names>:
>
>> a name must be defined before it is used ("before" in the depth-first,
>> left-to-right traversal of the JSON parse tree, where the types attribute
>> of a protocol is always deemed to come "before" the messages attribute.)
>
>
> By my reading, this would make the following Avro schema invalid, because
> the name "R" will not yet be defined when it's referenced inside the type
> of the field F, because in depth-first order, the leaf is traversed before
> the root.
>
> {
>     "type": "record",
>     "fields": [
>         {"name": "F", "type": ["null", "R"]}
>     ],
>     "name": "R"
> }
>
> It seems that types like this are valid in practice (I found the above
> example in an Avro test suite), so could someone enlighten me as to how
> this is allowed, please?
>
> Thanks for any info. If I'm asking in the wrong place, please advise me of
> a better forum!
>
>     rog.
>
>
>



