
List:       solr-dev
Subject:    Re: Lucene Query Parser Syntax Specification
From:       Gus Heck <gus.heck () gmail ! com>
Date:       2020-11-12 21:07:45
Message-ID: CAEUNc492fynDiXpixpSkdfV_jAVcsMQqiFj9YNjaum+qMCVovA () mail ! gmail ! com

I have had the same thought regarding IDE support. I've had expressions
that run to over 100 lines when formatted for legibility, and adding
something in the middle that changes the indentation is truly painful at that
point. At the moment I have several irons in the fire and can't
possibly take this on. The current implementation
(org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser) is
hand-coded rather than generated from a grammar, so one would probably want to
correct that first; syntax changes could then be identified and carried over to
downstream syntax highlighters relatively easily. Unfortunately, when I
looked briefly at doing this for IntelliJ, I found that IntelliJ favors ANTLR,
whereas JavaCC and JFlex are what we tend to use in the Solr codebase.
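
To make the highlighting idea concrete, here is a minimal sketch of a table-driven tokenizer in Python. Everything in it is an assumption for illustration: the token classes, the regular expressions, and the `tokenize` helper are hypothetical and are not taken from StreamExpressionParser. The point is only that once the lexical rules live in one declarative table, a highlighter can consume the same table instead of re-deriving the rules from hand-written code.

```python
import re

# Hypothetical token classes for a streaming-expression-like syntax.
# Illustrative only; these are NOT the rules used by StreamExpressionParser.
TOKEN_SPEC = [
    ("STRING", r'"(?:[^"\\]|\\.)*"'),         # double-quoted, backslash escapes
    ("NAME",   r"[A-Za-z_][A-Za-z0-9_.]*"),   # function or parameter name
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("COMMA",  r","),
    ("EQUALS", r"="),
    ("WS",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme, offset) triples, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group(), match.start()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('search(collection1, q="*:*")'):
        print(token)
```

An editor plugin would map each token kind to a colour and use the offsets to mark the text; the same table could in principle be translated into a JFlex or ANTLR lexer specification.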

-Gus

On Thu, Nov 12, 2020 at 7:02 AM ufuk yılmaz <uyilmaz@vivaldi.net.invalid>
wrote:

> I wish something like this existed for streaming expressions.
>
> To have highlighting and validation in an editor would be great!
>
> *From: *Scott Guthery <sguthery@gmail.com>
> *Sent: *11 November 2020 23:54
> *To: *dev@lucene.apache.org
> *Subject: *Re: Lucene Query Parser Syntax Specification
>
>
>
> >> The source code is the de-facto specification
>
>
>
> Fair enough, although it does beg the question of which parser's source code,
> there being no shortage of Lucene/Solr/etc. query parsers, parser releases,
> and parser versions on GitHub. Anyway, below is my de jure yacc. I think
> it covers everything in the 2012 specification and rounds out the special
> cases a little.
>
>
>
> Your comments are solicited and will be greatly appreciated.
>
>
>
> Cheers, Scott
>
>
>
> P.S. yacc/bison can generate parsers in programming languages other than
> C, including Java.
>
>
>
> query : query TOK_AND query
>       | query TOK_OR query
>       | TOK_NOT query
>       | '(' query ')'
>       | term
>       ;
>
> term  : TOK_ALPHA
>       | TOK_WILD
>       | TOK_ALPHA ':' TOK_ALPHA
>       | TOK_ALPHA ':' TOK_WILD
>       | TOK_ALPHA '~'
>       | TOK_ALPHA '~' TOK_NUM
>       | TOK_ALPHA '^' TOK_NUM
>       | TOK_ALPHA ':' TOK_ALPHA '~'
>       | TOK_ALPHA ':' TOK_ALPHA '~' TOK_NUM
>       | TOK_ALPHA ':' TOK_ALPHA '^' TOK_NUM
>       | '"' TOK_ALPHA TOK_ALPHA '"' '~' TOK_NUM
>       | TOK_ALPHA ':' '[' TOK_NUM TOK_TO TOK_NUM ']'
>       | TOK_ALPHA ':' '{' TOK_ALPHA TOK_TO TOK_ALPHA '}'
>       | '+' TOK_ALPHA
>       | '-' TOK_ALPHA
>       ;
>
>
>
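
As a way of exercising the quoted grammar, the following is a rough Python sketch of a hand-written recognizer for a small subset of it: AND, OR, NOT, parentheses, bare terms, and field:term. The grammar as posted declares no operator precedence, so the ordering used here (NOT binds tightest, then AND, then OR) is an assumption, as are the `lex` and `parse` helper names and the tuple shape of the result. Fuzzy, boost, proximity, range, and +/- terms are left out.

```python
import re

KEYWORDS = {"AND", "OR", "NOT"}
WORD = re.compile(r"[A-Za-z0-9*?]+")


def lex(text):
    # Words (TOK_ALPHA / TOK_WILD lumped together) or single punctuation marks.
    # Characters outside this set are silently skipped in this sketch.
    return re.findall(r"[A-Za-z0-9*?]+|[():]", text)


def parse(text):
    """Parse a query into nested tuples; raise ValueError on bad input."""
    toks = lex(text)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take():
        nonlocal pos
        if pos >= len(toks):
            raise ValueError("unexpected end of query")
        pos += 1
        return toks[pos - 1]

    def word():
        tok = take()
        if tok in KEYWORDS or not WORD.fullmatch(tok):
            raise ValueError(f"expected a term, got {tok!r}")
        return tok

    def parse_or():                      # lowest precedence (assumed)
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():                     # highest precedence (assumed)
        if peek() == "NOT":
            take()
            return ("NOT", parse_not())
        return parse_primary()

    def parse_primary():
        if peek() == "(":
            take()
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        first = word()
        if peek() == ":":                # field ':' term
            take()
            return ("TERM", first, word())
        return ("TERM", None, first)

    tree = parse_or()
    if pos != len(toks):
        raise ValueError(f"unexpected token {toks[pos]!r}")
    return tree


if __name__ == "__main__":
    print(parse("title:foo AND NOT (bar OR ba*)"))
```

In a yacc/bison version the same disambiguation would normally be expressed with precedence declarations rather than by splitting the rule into layers as done here.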


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
