
List:       lucene-user
Subject:    RE: Can Lucene be used as Rules Engine?
From:       "Karthick Sundaram" <karthick_s () trigent ! com ! INVALID>
Date:       2020-01-23 18:07:35
Message-ID: 004901d5d217$fdc981a0$f95c84e0$ () trigent ! com

Luwak (a stored-query engine that lets users efficiently match a stream of documents
against a large set of queries) seems to be the right candidate for my requirement.

Thanks for pointing this out to me. I will dig into it further.

Thanks,
Kart

-----Original Message-----
From: Diego Ceccarelli (BLOOMBERG/ LONDON) [mailto:dceccarelli4@bloomberg.net] 
Sent: Thursday, January 23, 2020 3:22 AM
To: java-user@lucene.apache.org
Subject: Re: Can Lucene be used as Rules Engine?

> Now, I have another requirement which is reverse of above requirement.

I might be wrong, but that smells like Luwak (available in 8.2.0).

Check out:

https://issues.apache.org/jira/browse/LUCENE-8766

It should allow you to index the rules and then use the document as a query to
retrieve the rules that match your document.

If you want to match the document only if at least one condition is matched, you can
just encode the conditions of a rule with OR, and it might work.
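
To make the "use the document as a query" idea concrete, here is a minimal sketch in plain Java of what matching means for the rules in this thread. It is an assumption-laden illustration, not Luwak code: the class RuleMatchSketch and its methods are hypothetical names, and it scans every rule linearly, whereas the point of a stored-query engine is to avoid checking every rule for every document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not Lucene or Luwak API.
class RuleMatchSketch {

    /** A rule: every WITH condition needs at least one hit; no NOT WITH term may appear. */
    static final class Rule {
        final List<Set<String>> with;
        final Set<String> notWith;

        Rule(List<Set<String>> with, Set<String> notWith) {
            this.with = with;
            this.notWith = notWith;
        }

        boolean matches(Set<String> docTerms) {
            // Any excluded term present in the document rejects the rule.
            for (String term : notWith) {
                if (docTerms.contains(term)) {
                    return false;
                }
            }
            // Each condition is an OR group; all groups must be satisfied (AND).
            for (Set<String> condition : with) {
                boolean hit = false;
                for (String term : condition) {
                    if (docTerms.contains(term)) {
                        hit = true;
                        break;
                    }
                }
                if (!hit) {
                    return false;
                }
            }
            return true;
        }
    }

    static Set<String> terms(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    /** Rule 1 and Rule 2 as given in the original question. */
    static Map<String, Rule> exampleRules() {
        Map<String, Rule> rules = new LinkedHashMap<>();
        rules.put("Rule 1", new Rule(
                Arrays.asList(terms("0001AAA", "0001AAC"), terms("PSBP06", "PSBP07"), terms("MFBP05")),
                terms()));
        rules.put("Rule 2", new Rule(
                Arrays.asList(terms("0001AAN", "0001AAC"), terms("PSBP06"), terms("PSBP08")),
                terms("MFBP05")));
        return rules;
    }

    /** Brute-force scan over all rules; returns the names of those the document satisfies. */
    static List<String> matchingRules(Map<String, Rule> rules, Set<String> docTerms) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Rule> entry : rules.entrySet()) {
            if (entry.getValue().matches(docTerms)) {
                matched.add(entry.getKey());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Only the words shown in the question; the real files have thousands more.
        Set<String> file1 = terms("0001AAA", "0001AAB", "0001AAC", "0061000", "PSBP06", "MFBP05");
        Set<String> file2 = terms("0001AAX", "0001AAB", "0001AAN", "0061002", "PSBP07", "MFBP06");
        System.out.println("file 1: " + matchingRules(exampleRules(), file1));
        System.out.println("file 2: " + matchingRules(exampleRules(), file2));
    }
}
```

With only the words shown in the question, this reports Rule 1 for text file 1 and no rule for text file 2. With millions of rules a linear scan like this would likely be too slow, which is the gap a stored-query engine is meant to fill.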

Please note I'm not sure about this (I have never used Luwak :)); maybe somebody on the
list can comment more :)

cheers
Diego


Sent from Bloomberg Professional for Android

----- Original Message -----
From: Mikhail Khludnev <java-user@lucene.apache.org>
At: 23-Jan-2020 07:42:47


Hello, Kart.
I still don't fully understand the problem, but implementing a rule engine usually requires
https://lucene.apache.org/core/7_3_1/sandbox/org/apache/lucene/search/CoveringQuery.html
which checks the number of rule clauses stored in a dedicated field.
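
One way to read the "number of clauses in a dedicated field" idea is sketched below in plain Java. This is my assumption about how it could be applied to the rules in this thread, not something stated in the reply: each rule is expanded into one entry per combination of OR alternatives, each entry stores how many terms it requires, and an entry matches when the document hits at least that many of its terms. The class CoveringCountSketch and its methods are hypothetical and do not call Lucene; NOT WITH conditions are left out and would need a separate check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of count-based matching; it does not use the Lucene API.
class CoveringCountSketch {

    /** One flattened entry: terms that must all be present, plus the required count. */
    static final class Entry {
        final String ruleName;
        final Set<String> terms;
        final int requiredMatches; // plays the role of the dedicated numeric field

        Entry(String ruleName, Set<String> terms) {
            this.ruleName = ruleName;
            this.terms = terms;
            this.requiredMatches = terms.size();
        }
    }

    /** Expands AND-of-OR conditions into one entry per combination of alternatives. */
    static List<Entry> expand(String ruleName, List<List<String>> conditions) {
        List<Set<String>> combos = new ArrayList<>();
        combos.add(new LinkedHashSet<>());
        for (List<String> condition : conditions) {
            List<Set<String>> next = new ArrayList<>();
            for (Set<String> partial : combos) {
                for (String alternative : condition) {
                    Set<String> extended = new LinkedHashSet<>(partial);
                    extended.add(alternative);
                    next.add(extended);
                }
            }
            combos = next;
        }
        List<Entry> entries = new ArrayList<>();
        for (Set<String> combo : combos) {
            entries.add(new Entry(ruleName, combo));
        }
        return entries;
    }

    /** Positive (WITH) conditions of Rule 1 and Rule 2 from the original question. */
    static List<Entry> exampleIndex() {
        List<Entry> index = new ArrayList<>();
        index.addAll(expand("Rule 1", Arrays.asList(
                Arrays.asList("0001AAA", "0001AAC"),
                Arrays.asList("PSBP06", "PSBP07"),
                Arrays.asList("MFBP05"))));
        index.addAll(expand("Rule 2", Arrays.asList(
                Arrays.asList("0001AAN", "0001AAC"),
                Arrays.asList("PSBP06"),
                Arrays.asList("PSBP08"))));
        return index;
    }

    /** A rule matches if any of its entries has enough document terms hitting it. */
    static Set<String> matchingRules(List<Entry> index, Set<String> docTerms) {
        Set<String> matched = new TreeSet<>();
        for (Entry entry : index) {
            int hits = 0;
            for (String term : docTerms) {
                if (entry.terms.contains(term)) {
                    hits++;
                }
            }
            if (hits >= entry.requiredMatches) {
                matched.add(entry.ruleName);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Set<String> file1 = new HashSet<>(Arrays.asList(
                "0001AAA", "0001AAB", "0001AAC", "0061000", "PSBP06", "MFBP05"));
        System.out.println(matchingRules(exampleIndex(), file1));
    }
}
```

Note that the expansion multiplies the number of entries by the size of each OR group (Rule 1 becomes four entries), so this approach may only be practical when the groups are small.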

On Thu, Jan 23, 2020 at 12:12 AM Karthick Sundaram <karthick_s@trigent.com.invalid> wrote:

> Gentlemen:
> 
> 
> 
> I am using Lucene as a search engine for the requirement below:
> 
> 
> 
> There are millions of documents (text files).
> 
> Each text file has thousands of words (plain strings separated by spaces).
> 
> Example content of a text file 1 (just showing few words): 0001AAA 
> 0001AAB 0001AAC 0061000 PSBP06 MFBP05 ...
> 
> Example content of a text file 2 (just showing few words): 0001AAX 
> 0001AAB 0001AAN 0061002 PSBP07 MFBP06 ...
> 
> 
> 
> Then there are millions of rules stored in the database. For easy 
> understanding, I give a couple of rules below:
> 
> 
> 
> Rule 1:
> 
> CONDITION 1: WITH: 0001AAA OR 0001AAC
> 
> CONDITION 2: WITH: PSBP06 OR PSBP07
> 
> CONDITION 3: WITH: MFBP05
> 
> 
> 
> Rule 2:
> 
> CONDITION 1: WITH: 0001AAN OR 0001AAC
> 
> CONDITION 2: WITH: PSBP06
> 
> CONDITION 3: WITH: PSBP08
> 
> CONDITION 4: NOT WITH: MFBP05
> 
> 
> 
> The requirement is, for a given rule, to find the text files matching at 
> least one word in each condition of the rule.
> 
> I indexed the contents of each text file as a Lucene document with a 
> field "FileContents" and another field that just stores the file name.
> 
> So, for Rule 1, I constructed the query as (0001AAA OR 0001AAC) AND 
> (PSBP06 OR PSBP07) AND (MFBP05).
> 
> And for Rule 2, the query is (0001AAN OR 0001AAC) AND (PSBP06) AND 
> (PSBP08) AND NOT (MFBP05).
> 
> 
> 
> The queries work and find the appropriate text files.
> 
> 
> 
> Now, I have another requirement, which is the reverse of the requirement above.
> 
> That is, for a given text file, I need to find the list of rules that 
> match it.
> 
> Example: for text file 1, "Rule 1" should match, because 
> text file 1 has 0001AAA, which satisfies condition 1; PSBP06, which 
> satisfies condition 2; and MFBP05, which satisfies condition 3.
> 
> Rule 1 has 3 conditions, and at least one word in each condition 
> matches text file 1. So Rule 1 is good for text file 1.
> 
> Rule 2 should not match text file 1 because PSBP08 is not in it.
> 
> 
> 
> I don't know whether I can index the "Rule" information in Lucene. A 
> rule can have one or more conditions, so I can't use a fixed number of 
> fields to query on. Even with a fixed number of fields, the 
> query would have to check that each field matches at least one word.
> 
> Is it possible to handle this requirement using Lucene, or should I go 
> for other options?
> 
> I am new to Lucene, any help would be appreciated.
> 
> 
> 
> Thanks,
> 
> Kart
> 
> 

--
Sincerely yours
Mikhail Khludnev



