
List:       flume-user
Subject:    Re: Json over netcat source
From:       Deepak Subhramanian <deepak.subhramanian () gmail ! com>
Date:       2014-05-09 11:02:11
Message-ID: CA+UubijKn9nUu5vX8k9SFcGCTkfDF8boN59r1E_Kmhy7Nq0_5g () mail ! gmail ! com

Sorry. My mistake. It is loading JSON data properly after the temporary fix.


On Thu, May 8, 2014 at 6:24 PM, Deepak Subhramanian <
deepak.subhramanian@gmail.com> wrote:

> Hi Ashish,
> 
> Thanks for the solution. I made the changes and I can see the JSON message
> now. There is a JIRA raised on the same issue.
> 
> https://issues.apache.org/jira/browse/FLUME-2126
> 
> 
> From Hive, when I load JSON data, it automatically splits the JSON fields into
> different columns. For some reason the ESSink doesn't load it the same way.
> I am not sure if I am setting the correct type. There is a parameter,
> es.input.json, that I have to set to true in the Hive table. Is there a similar
> variable I have to set for ESSink?
> 
> Here is the raw data I am getting in Kibana.
> 
> {
>   "_index": "test-2014-05-08",
>   "_type": "parsed_logs",
>   "_id": "7qSBgRx-Q_GLaCDWARs_Cg",
>   "_score": null,
>   "_source": {
>     "@message": "{\"action\":{\"id\":\"00001\"}}",
>     "@timestamp": "2014-05-08T16:48:44.180Z",
>     "@type": "application/json",
>     "@fields": {
>       "_attachment_mimetype": "application/json",
>       "timestamp": "1399567724180",
>       "_type": "application/json",
>       "type": "application/json"
>     }
>   },
>   "sort": [
>     1399567724180
>   ]
> }
> 
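The "timestamp" value under "@fields" and the "@timestamp" value in the Kibana document above appear to be the same instant, once as epoch milliseconds and once as an ISO-8601 string. A minimal sketch to check that, using only java.time (the class and method names are made up for illustration and are not part of Flume or Elasticsearch):

```java
import java.time.Instant;

class TimestampSketch {
    // Convert an epoch-millisecond string (as found in the event header)
    // to its ISO-8601 UTC representation.
    static String toIso(String epochMillis) {
        return Instant.ofEpochMilli(Long.parseLong(epochMillis)).toString();
    }

    public static void main(String[] args) {
        // Header value taken from the Kibana output in this thread.
        System.out.println(toIso("1399567724180")); // prints 2014-05-08T16:48:44.180Z
    }
}
```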
> 
> 
> On Sun, Apr 13, 2014 at 4:56 PM, Ashish <paliwalashish@gmail.com> wrote:
> 
> > A little more on the issue:
> > 
> > builder.field(fieldName, tmp); calls the XContentBuilder API, where the class
> > type is determined and the appropriate method is called. Since tmp, which is an
> > instance of XContentBuilder, doesn't match any of the defined if conditions,
> > it falls through to the final else, where tmp.toString() is called and the
> > field(String, String) method is invoked, so we get the object address in the index.
> > 
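The fall-through described above can be sketched in plain Java. This is not the Elasticsearch source; TinyBuilder and storedText are hypothetical stand-ins that only mimic the pattern of "check known types, otherwise call toString()":

```java
// Hypothetical stand-in for a builder object. It does NOT override toString(),
// so Object.toString() applies (class name + "@" + hex hash code).
class TinyBuilder {
    private final String json;

    TinyBuilder(String json) {
        this.json = json;
    }

    // Plays the role of the string() call suggested in this thread.
    String string() {
        return json;
    }
}

class FieldDispatchSketch {
    // Simplified type dispatch: known types get their own branch,
    // anything else falls through to toString().
    static String storedText(Object value) {
        if (value instanceof String) {
            return (String) value;
        } else if (value instanceof Number || value instanceof Boolean) {
            return String.valueOf(value);
        } else {
            // No branch matched, so we end up with the default Object.toString().
            return value.toString();
        }
    }

    public static void main(String[] args) {
        TinyBuilder tmp = new TinyBuilder("{\"action\":{\"id\":\"00001\"}}");
        // Passing the builder itself: the stored text is an object identity string.
        System.out.println(storedText(tmp));
        // Passing the rendered string: the stored text is the JSON itself.
        System.out.println(storedText(tmp.string()));
    }
}
```

The first call prints something of the form TinyBuilder@ followed by a hash, which is the kind of "object address" value reported in the index; the second prints the JSON text.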
> > Replacing
> > builder.field(fieldName, tmp);
> > with
> > builder.field(fieldName, tmp.string());
> > 
> > shall make things work, but I am not sure if this would be the best way
> > to use the API.
> > 
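One reason to be unsure whether passing a string is the best use of the API: a JSON payload passed as a string is written as an escaped string value, not as a nested object. The sketch below shows the two shapes using plain string handling only; it is an illustration under that assumption, not Flume or Elasticsearch code, and the helper names are hypothetical.

```java
class JsonEmbeddingSketch {
    // Quote a string as a JSON string value. Only backslashes and double quotes
    // are escaped, which is enough for this example but not a full JSON escaper.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Payload written as a plain string value: the inner JSON gets escaped.
    static String asStringField(String name, String payload) {
        return "{" + quote(name) + ":" + quote(payload) + "}";
    }

    // Payload spliced in as a nested object. Assumes payload is already valid JSON.
    static String asObjectField(String name, String payload) {
        return "{" + quote(name) + ":" + payload + "}";
    }

    public static void main(String[] args) {
        String payload = "{\"action\":{\"id\":\"00001\"}}";
        System.out.println(asStringField("@message", payload));
        System.out.println(asObjectField("@message", payload));
    }
}
```

The first form has the same shape as the "@message" value in the Kibana output quoted earlier in this thread; the second is what a nested, per-field document would look like.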
> > Got the answer from ES user list :)
> > 
> > http://elasticsearch-users.115913.n3.nabble.com/Issue-with-posting-json-data-to-elastic-search-via-Flume-td4054017.html
> >  
> > Can ES experts comment on the best way forward?
> > 
> > 
> > 
> > On Sun, Apr 13, 2014 at 8:10 PM, Ashish <paliwalashish@gmail.com> wrote:
> > 
> > > I have been able to reproduce the problem locally using the existing test
> > > cases inside the ES Sink. The problem does exist.
> > > 
> > > I did some initial investigation: the framework is able to detect the JSON
> > > content and tries to add it as a complex field.
> > > The timestamp is added only if it is present in the header.
> > > 
> > > In the class org.apache.flume.sink.elasticsearch.ContentBuilderUtil
> > > 
> > > public static void addComplexField(XContentBuilder builder, String fieldName,
> > >     XContentType contentType, byte[] data) throws IOException {
> > >   XContentParser parser = null;
> > >   try {
> > >     XContentBuilder tmp = jsonBuilder();
> > >     parser = XContentFactory.xContent(contentType).createParser(data);
> > >     parser.nextToken();
> > >     tmp.copyCurrentStructure(parser);
> > >     builder.field(fieldName, tmp);  // <<<< This is where we might have an issue
> > >                                     // (the real action happens inside this method call)
> > > 
> > > Can someone familiar with this part look further into this? I shall
> > > debug further as soon as I have free cycles.
> > > 
> > > thanks
> > > ashish
> > > 
> > > 
> > > 
> > > On Fri, Apr 11, 2014 at 5:24 PM, Deepak Subhramanian <
> > > deepak.subhramanian@gmail.com> wrote:
> > > 
> > > > Thanks Simon. I am also struggling with no luck. I tried using the
> > > > latest Flume Elasticsearch sink jar built from 1.5SNAPSHOT, but still no
> > > > luck. I will try to see if it is an issue with the Elasticsearch API. When I
> > > > loaded JSON using Hive, it loaded the JSON properly. But we have to pass a
> > > > property, es.input.json, in Hive. Is there a way to pass the same in Flume?
> > > > 
> > > > CREATE EXTERNAL TABLE json (data STRING)
> > > > STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
> > > > TBLPROPERTIES('es.resource' = '...',
> > > >               'es.input.json' = 'yes');
> > > > 
> > > 
> > > 
> > > --
> > > thanks
> > > ashish
> > > 
> > > Blog: http://www.ashishpaliwal.com/blog
> > > My Photo Galleries: http://www.pbase.com/ashishpaliwal
> > > 
> > 
> > 
> > 
> > --
> > thanks
> > ashish
> > 
> > Blog: http://www.ashishpaliwal.com/blog
> > My Photo Galleries: http://www.pbase.com/ashishpaliwal
> > 
> 
> 
> 
> --
> Deepak Subhramanian
> 



-- 
Deepak Subhramanian




