
List:       solr-user
Subject:    Help with suggester matching multiple fuzzy terms
From:       Andy Coulson <andy.coulson () epicor ! com>
Date:       2021-09-20 22:02:24
Message-ID: SA1PR18MB471127E485662C7D25047F04ECA09 () SA1PR18MB4711 ! namprd18 ! prod ! outlook ! com

I could use some advice on choosing and configuring the best analyzer lookup factory
for suggestions.

In my use case, I have indexed a document field containing:
"clutch release bearing"

I would like all of the following to provide that as a matching suggestion:
"clutch release"
"release clutch"
"clutc release"
(and various other permutations where the word order varies and misspellings result
in edit distances of 1 or so per term)
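
To make the matching I'm after concrete, here is a minimal Python sketch of the rule (this is not Solr code; the function names within_one_edit and wanted_match are made up for illustration, and it assumes plain whitespace tokenization and an edit distance of at most 1 per term):

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substituted character
    return a[i:] == b[i + 1:]  # exactly one extra character in b


def wanted_match(query: str, indexed: str) -> bool:
    """Every query term must fuzzily match a distinct indexed term, in any order."""
    remaining = indexed.lower().split()
    for term in query.lower().split():
        hit = next((t for t in remaining if within_one_edit(term, t)), None)
        if hit is None:
            return False
        remaining.remove(hit)  # each indexed term may be used only once
    return True


if __name__ == "__main__":
    for q in ("clutch release", "release clutch", "clutc release"):
        print(q, "->", wanted_match(q, "clutch release bearing"))
```

The sketch pairs terms greedily and ignores prefix completion of the last, partially typed term, so it only illustrates the order-independent, per-term fuzzy behavior described above.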

I get good results with AnalyzingInfixLookupFactory in terms of word order
(using the config below), but it doesn't appear to do any fuzzy token matching at
all. FuzzyLookupFactory provides fuzzy token matching, but starts returning
nothing as soon as I type beyond a single token.

I currently have this in my solrconfig.xml:

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
      <str name="field">LiteralName</str>
      <str name="suggestAnalyzerFieldType">cobraTextField</str>
    </lst>
  </searchComponent>
  <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">10</str>
      <str name="buildOnStartup">true</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>
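
For anyone who wants to reproduce this, a request against the handler above can be built roughly as follows. This is only a sketch: the base URL (host, port, and core name) and the helper name build_suggest_url are assumptions, while suggest.q and suggest.dictionary are the standard SuggestComponent request parameters and mySuggester is the name from the config above.

```python
from urllib.parse import urlencode


def build_suggest_url(core_url: str, text: str, dictionary: str = "mySuggester") -> str:
    """Build a GET URL for the /suggest handler of a Solr core."""
    params = {
        "suggest.q": text,                 # the text typed so far
        "suggest.dictionary": dictionary,  # suggester name from the searchComponent
        "wt": "json",
    }
    return f"{core_url.rstrip('/')}/suggest?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core URL; replace with your own host and core name.
    print(build_suggest_url("http://localhost:8983/solr/mycore", "clutc release"))
```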


Andy Coulson
Principal Software Engineer
Epicor Software Corporation
www.epicor.com
Tel.: (512) 328-2300
Cell: (512) 517-2494
E-Mail: andy.coulson@epicor.com




