List: lucene-user
Subject: RE: Can Lucene be used as Rules Engine?
From: "Karthick Sundaram" <karthick_s () trigent ! com ! INVALID>
Date: 2020-01-23 18:07:35
Message-ID: 004901d5d217$fdc981a0$f95c84e0$ () trigent ! com
Luwak (a stored-query engine that lets users efficiently match a stream of documents
against a large set of queries) seems to be the right candidate for my requirement.
Thanks for pointing this out to me. I will dig into it further.
Thanks,
Kart
-----Original Message-----
From: Diego Ceccarelli (BLOOMBERG/ LONDON) [mailto:dceccarelli4@bloomberg.net]
Sent: Thursday, January 23, 2020 3:22 AM
To: java-user@lucene.apache.org
Subject: Re: Can Lucene be used as Rules Engine?
> Now, I have another requirement, which is the reverse of the above requirement.
I might be wrong, but that smells like Luwak (available in Lucene 8.2.0).
Check out:
https://issues.apache.org/jira/browse/LUCENE-8766
It should allow you to index the rules and then use the document as a query to
retrieve the rules that match your document.
If you want to match the document only if at least one condition is matched, you can
just combine the conditions of a rule with OR and it might work.
Please note I'm not sure about this (I have never used Luwak :)); maybe somebody on the
list can comment more :)
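
To make the "document as a query" idea concrete, here is a minimal plain-Java sketch of the matching semantics described in this thread. It deliberately does not use Luwak or any Lucene API; the class and method names (ReverseMatchSketch, Rule, matchingRules, exampleRules) are hypothetical, and the rule shape (a list of WITH conditions plus a set of NOT WITH words) is an assumption taken from the two example rules in the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration of reverse matching; it does not use Luwak or Lucene. */
final class ReverseMatchSketch {

    /** Every WITH condition needs at least one word present; no NOT WITH word may be present. */
    static final class Rule {
        final List<Set<String>> with;  // one set of alternative words per WITH condition
        final Set<String> notWith;     // words that must be absent from the file

        Rule(List<Set<String>> with, Set<String> notWith) {
            this.with = with;
            this.notWith = notWith;
        }

        boolean matches(Set<String> words) {
            for (String banned : notWith) {
                if (words.contains(banned)) return false;
            }
            for (Set<String> alternatives : with) {
                boolean hit = false;
                for (String w : alternatives) {
                    if (words.contains(w)) { hit = true; break; }
                }
                if (!hit) return false; // this condition has no word in the file
            }
            return true;
        }
    }

    /** Splits file contents on whitespace, as in the space-separated files from the question. */
    static Set<String> tokenize(String contents) {
        return new HashSet<>(Arrays.asList(contents.trim().split("\\s+")));
    }

    /** Returns the names of all rules that match the given file contents, in insertion order. */
    static List<String> matchingRules(Map<String, Rule> rules, String contents) {
        Set<String> words = tokenize(contents);
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Rule> e : rules.entrySet()) {
            if (e.getValue().matches(words)) matched.add(e.getKey());
        }
        return matched;
    }

    /** The two example rules from the original question. */
    static Map<String, Rule> exampleRules() {
        Map<String, Rule> rules = new LinkedHashMap<>();
        rules.put("Rule 1", new Rule(
                List.of(Set.of("0001AAA", "0001AAC"), Set.of("PSBP06", "PSBP07"), Set.of("MFBP05")),
                Set.of()));
        rules.put("Rule 2", new Rule(
                List.of(Set.of("0001AAN", "0001AAC"), Set.of("PSBP06"), Set.of("PSBP08")),
                Set.of("MFBP05")));
        return rules;
    }
}
```

Calling ReverseMatchSketch.matchingRules(ReverseMatchSketch.exampleRules(), contents) with the words of text file 1 returns only "Rule 1". Note that this loop evaluates every rule for every file, which is unlikely to scale to millions of rules; a stored-query engine exists to avoid exactly that full scan.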
cheers
Diego
----- Original Message -----
From: Mikhail Khludnev <java-user@lucene.apache.org>
At: 23-Jan-2020 07:42:47
Hello, Kart.
I still don't fully understand the problem, but implementing a rule engine usually
requires CoveringQuery
(https://lucene.apache.org/core/7_3_1/sandbox/org/apache/lucene/search/CoveringQuery.html),
which checks the number of matched rule clauses against a count kept in a dedicated field.
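
The counting idea can be modelled in a few lines of plain Java. This is only a sketch of the principle and does not call Lucene's CoveringQuery; the names CoveringCountSketch, StoredRule, matchedClauses and covers are hypothetical. It also assumes the simplest case, where every condition of a rule is a single required word. With this naive encoding, a condition with several OR alternatives would be counted once per alternative found in the file, so such rules would need extra handling.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Plain-Java model of the clause-counting idea; it does not call Lucene's CoveringQuery. */
final class CoveringCountSketch {

    /** A stored rule: its required words plus the number of clauses, kept as a separate value. */
    static final class StoredRule {
        final Set<String> requiredWords;
        final int clauseCount; // plays the role of the "dedicated field"

        StoredRule(String... requiredWords) {
            this.requiredWords = new HashSet<>(Arrays.asList(requiredWords));
            this.clauseCount = this.requiredWords.size();
        }
    }

    /** Counts how many of the file's distinct words occur in the rule. */
    static int matchedClauses(StoredRule rule, Set<String> documentWords) {
        int count = 0;
        for (String word : documentWords) {
            if (rule.requiredWords.contains(word)) count++;
        }
        return count;
    }

    /** The rule matches when the match count reaches the stored clause count. */
    static boolean covers(StoredRule rule, String documentContents) {
        Set<String> words =
                new HashSet<>(Arrays.asList(documentContents.trim().split("\\s+")));
        return matchedClauses(rule, words) >= rule.clauseCount;
    }
}
```

For example, a simplified rule built as new CoveringCountSketch.StoredRule("0001AAA", "PSBP06", "MFBP05") is covered by the contents of text file 1 but not by text file 2, because text file 2 contains none of the three words.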
On Thu, Jan 23, 2020 at 12:12 AM Karthick Sundaram <karthick_s@trigent.com.invalid> wrote:
> Gentlemen:
>
>
>
> I am using Lucene as a search engine for the following requirement:
>
> There are millions of documents (text files).
>
> Each text file has thousands of words (plain strings separated by spaces).
>
> Example content of text file 1 (showing just a few words): 0001AAA
> 0001AAB 0001AAC 0061000 PSBP06 MFBP05 ...
>
> Example content of text file 2 (showing just a few words): 0001AAX
> 0001AAB 0001AAN 0061002 PSBP07 MFBP06 ...
>
>
>
> Then there are millions of rules stored in the database. For
> illustration, here are a couple of rules:
>
>
>
> Rule 1:
>
> CONDITION 1: WITH: 0001AAA OR 0001AAC
>
> CONDITION 2: WITH: PSBP06 OR PSBP07
>
> CONDITION 3: WITH: MFBP05
>
>
>
> Rule 2:
>
> CONDITION 1: WITH: 0001AAN OR 0001AAC
>
> CONDITION 2: WITH: PSBP06
>
> CONDITION 3: WITH: PSBP08
>
> CONDITION 4: NOT WITH: MFBP05
>
>
>
> The requirement is: for a given rule, find the text files that match at
> least one word in each condition of the rule.
>
> I indexed the contents of each text file as a Lucene document with a
> field "FileContents" and another field that just stores the file name.
>
> So, for Rule 1, I constructed the query as (0001AAA OR 0001AAC) AND
> (PSBP06 OR PSBP07) AND (MFBP05)
>
> And for Rule 2, the query is (0001AAN OR 0001AAC) AND (PSBP06) AND
> (PSBP08) AND NOT (MFBP05).
>
>
>
> The queries work and find the appropriate text files.
>
>
>
> Now, I have another requirement, which is the reverse of the above requirement.
>
> That is, for a given text file, I need to find the list of rules that
> match it.
>
> Example: For text file 1, "Rule 1" should match, because text file 1
> has 0001AAA, which satisfies condition 1; PSBP06, which satisfies
> condition 2; and MFBP05, which satisfies condition 3.
>
> Rule 1 has 3 conditions, and at least one word in each condition
> matches text file 1. So Rule 1 is good for text file 1.
>
> Rule 2 should not match text file 1 because PSBP08 is not in it.
>
>
>
> I don't know whether I can index the "Rule" information in Lucene. A
> rule can have one or more conditions, so I can't use a fixed number of
> fields to query on. Even with a fixed number of fields, the
> query would have to check that each field matches at least one word.
>
> Is it possible to handle this requirement using Lucene, or should I go
> for other options?
>
> I am new to Lucene; any help would be appreciated.
>
>
>
> Thanks,
>
> Kart
>
>
--
Sincerely yours
Mikhail Khludnev
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org