List: solr-user
Subject: Help with suggester matching multiple fuzzy terms
From: Andy Coulson <andy.coulson () epicor ! com>
Date: 2021-09-20 22:02:24
Message-ID: SA1PR18MB471127E485662C7D25047F04ECA09 () SA1PR18MB4711 ! namprd18 ! prod ! outlook ! com
I could use some advice on choosing and configuring the best analyzer lookup
factory for suggestions.
In my use case, I have indexed a document field containing:
"clutch release bearing"
I would like all of the following to provide that as a matching suggestion:
"clutch release"
"release clutch"
"clutc release"
"clutc release"
"release clutch"
(and various other permutations where the word order varies and misspellings
result in an edit distance of 1 or so per term)
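To pin down the matching rule described above, here is a minimal Python sketch that is independent of Solr. It is only an illustration of the desired behaviour, not how any Solr lookup factory works internally. The function names, the default of one edit per term, and the decision to also accept prefixes of an indexed term (so that partially typed words match) are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def term_matches(query_term: str, indexed_term: str, max_edits: int = 1) -> bool:
    """True if query_term is within max_edits of indexed_term or of one of its prefixes."""
    # Assumption: prefixes count, so a partially typed word can still match.
    return any(
        edit_distance(query_term, indexed_term[:n]) <= max_edits
        for n in range(1, len(indexed_term) + 1)
    )


def suggestion_matches(query: str, suggestion: str, max_edits: int = 1) -> bool:
    """True if every query term fuzzily matches some suggestion term, in any order."""
    indexed_terms = suggestion.lower().split()
    # Simplification: two query terms are allowed to match the same indexed term.
    return all(
        any(term_matches(q, t, max_edits) for t in indexed_terms)
        for q in query.lower().split()
    )


if __name__ == "__main__":
    for q in ["clutch release", "release clutch", "clutc release", "brake pad"]:
        print(q, "->", suggestion_matches(q, "clutch release bearing"))
```

Under this rule the first three queries match "clutch release bearing" and "brake pad" does not. Very short query terms match almost anything within one edit, so a real implementation would probably want a minimum length before fuzziness applies.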
I get good results for word order with the AnalyzingInfixLookupFactory (using
the config below), but it doesn't appear to do any fuzzy token matching at all.
The FuzzyLookupFactory provides fuzzy token matching, but it starts returning
nothing as soon as I type beyond a single token.
I currently have this in my solrconfig.xml:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="field">LiteralName</str>
    <str name="suggestAnalyzerFieldType">cobraTextField</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="buildOnStartup">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
Andy Coulson
Principal Software Engineer
Epicor Software Corporation
www.epicor.com
Tel.: (512) 328-2300
Cell: (512) 517-2494
E-Mail: andy.coulson@epicor.com