List:       lucene-user
Subject:    Re: Re: How to query for 'any word' in a phrase
From:       陈志祥 <zhixiang.czx () alibaba-inc ! com>
Date:       2020-01-09 16:29:26
Message-ID: 74505541-4de3-4ec4-a3df-ac4affef5a87.zhixiang.czx () alibaba-inc ! com

To be clearer: I think you need to build a custom PhraseQuery class that can set a
separate slop value between each pair of sub-terms. You would also need a special
wildcard term that matches any term and is used only within this custom PhraseQuery.

Or just scan the text with grep or a regex automaton?
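
For the scanning route, a minimal sketch in plain Java (standard library only, no
Lucene). It assumes words are separated by whitespace, which is cruder than a real
analyzer, and the class and method names (WildcardPhraseScan, compile, matches) are
made up for illustration. Each '*' is turned into "exactly one word":

```java
import java.util.regex.Pattern;

final class WildcardPhraseScan {
    // Hypothetical helper: turns a phrase with '*' placeholders into a regex
    // in which each '*' matches exactly one whitespace-delimited word.
    static Pattern compile(String phrase) {
        StringBuilder regex = new StringBuilder("(?<!\\S)"); // must start at a word start
        String[] words = phrase.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                regex.append("\\s+"); // whitespace between consecutive words
            }
            regex.append(words[i].equals("*") ? "\\S+" : Pattern.quote(words[i]));
        }
        regex.append("(?!\\S)"); // must end at a word end
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if the text contains the phrase, with every '*' filled by one word.
    static boolean matches(String phrase, String text) {
        return compile(phrase).matcher(text).find();
    }

    public static void main(String[] args) {
        String phrase = "the quick brown * jumps over the * *";
        // Two words follow the second "the", so both trailing '*' are filled: true
        System.out.println(matches(phrase, "The quick brown fox jumps over the lazy dog"));
        // Only one word follows the second "the": false
        System.out.println(matches(phrase, "The quick brown fox jumps over the dog"));
    }
}
```

This scans raw text linearly, so it only makes sense for small collections or as a
post-filter over candidates that an index query has already narrowed down.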

陈志祥
Alibaba, Map Engine, Algorithm Engineer
Phone: 057128223456-81124100
Email: zhixiang.czx@alibaba-inc.com
Address: Shentong Information Plaza, Changning, Shanghai

------------------------------------------------------------------
From: Jeroen Lauwers <Jeroen.Lauwers@CTLO.NET>
Date: 2020-01-09 23:41:37
To: java-user@lucene.apache.org
Subject: RE: Re: How to query for 'any word' in a phrase

I don't understand your question.

In general: can it be set? Yes, through the constructor
PhraseQuery(int slop, String field, BytesRef... terms):
https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/search/PhraseQuery.html#PhraseQuery-int-java.lang.String-org.apache.lucene.util.BytesRef...-
In my specific case: also yes. I'm parsing the query myself in a custom parser, so
I can do it.

As far as I understand, the slop is not specific to a position.
Please explain how this could help.

Jeroen

From: 陈志祥 <zhixiang.czx@alibaba-inc.com>
Sent: Thursday, 9 January 2020 16:31
To: java-user@lucene.apache.org
Subject: Re: How to query for 'any word' in a phrase

Could the slop parameter in PhraseQuery be set dynamically?

------------------------------------------------------------------
From: Jeroen Lauwers <Jeroen.Lauwers@CTLO.NET>
Date: 2020-01-09 23:17:37
To: java-user@lucene.apache.org
Subject: How to query for 'any word' in a phrase

Dear all,

Is there a way to construct a phrase search (with spans?) like the following:
the quick brown * jumps over the * *
where * = any word, but exactly 1 word

I put these *'s at specific positions, so a PhraseQuery with a slop of 2 is just
not good enough, and the two *'s at the end must be matched as well.
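
To pin down the semantics I mean, here is a minimal sketch in plain Java over an
already tokenized text. It is a toy model, not Lucene code: the class and method
names (FixedOffsetPhrase, matches) are invented for illustration, and a null entry
stands for a '*' slot. Every term has its own fixed offset, and the trailing slots
must be filled by real tokens:

```java
import java.util.Arrays;
import java.util.List;

final class FixedOffsetPhrase {
    // Toy model, not Lucene code: a null entry in the pattern is a slot that
    // accepts any single token; every other entry must match at its own offset.
    static boolean matches(List<String> pattern, List<String> tokens) {
        // Try every start position that leaves room for the whole pattern,
        // including the trailing wildcard slots.
        for (int start = 0; start + pattern.size() <= tokens.size(); start++) {
            boolean ok = true;
            for (int offset = 0; offset < pattern.size() && ok; offset++) {
                String expected = pattern.get(offset);
                ok = expected == null || expected.equals(tokens.get(start + offset));
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arrays.asList is used because it permits null elements.
        List<String> pattern = Arrays.asList(
                "the", "quick", "brown", null, "jumps", "over", "the", null, null);

        // Every slot is filled by exactly one word: true
        System.out.println(matches(pattern,
                Arrays.asList("the quick brown fox jumps over the lazy dog".split(" "))));
        // Only one word after the second "the": false
        System.out.println(matches(pattern,
                Arrays.asList("the quick brown fox jumps over the dog".split(" "))));
        // No word between "brown" and "jumps": false
        System.out.println(matches(pattern,
                Arrays.asList("the quick brown jumps over the lazy dog today".split(" "))));
    }
}
```

A single slop value cannot express this, because it is one budget for the whole
phrase and not a requirement attached to a particular slot.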

Is there such a thing as a Term or BytesRef that always matches everything?

Thanks,
Jeroen

