
List:       lucene-user
Subject:    Re: Loading WFST to Memory Mapped File in Lucene
From:       Dawid Weiss <dawid.weiss () gmail ! com>
Date:       2022-12-27 15:51:37
Message-ID: CAM21Rt-07Ox-q_LEUG063yXfQEk_Qz51pGa_y-tMQdbZ_iGrSQ () mail ! gmail ! com

Please feel free to provide a pull request that adds the ability to
load the FST off heap to WFSTCompletionLookup. I think it's an
oversight and it'd be a good addition.

Dawid
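
For readers unfamiliar with the on-heap / off-heap distinction discussed in
this thread, here is a minimal sketch using only the JDK, not Lucene. The
class and method names (HeapVsMapped, loadOnHeap, mapOffHeap) are made up for
illustration and are not part of any Lucene API. It contrasts reading a file
into a byte[] (a copy on the JVM heap) with memory-mapping the same file
(backed by the OS page cache). It does not show how WFSTCompletionLookup or
OffHeapFSTStore work internally.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class HeapVsMapped {

  /** Reads the whole file into a byte[]; the copy lives on the JVM heap. */
  public static byte[] loadOnHeap(Path file) throws IOException {
    return Files.readAllBytes(file);
  }

  /** Maps the file read-only; the contents stay outside the JVM heap. */
  public static MappedByteBuffer mapOffHeap(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      // A single mapping is limited to Integer.MAX_VALUE bytes.
      // The mapping remains valid after the channel is closed.
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical demo file; any readable file would do.
    Path file = Files.createTempFile("heap-vs-mapped", ".bin");
    file.toFile().deleteOnExit();
    Files.write(file, new byte[] {1, 2, 3, 4});

    byte[] heapCopy = loadOnHeap(file);
    MappedByteBuffer mapped = mapOffHeap(file);

    System.out.println("on-heap copy length: " + heapCopy.length);
    System.out.println("mapped buffer is direct: " + mapped.isDirect());
    System.out.println("mapped buffer capacity: " + mapped.capacity());
  }
}
```

With the first approach, heap usage grows with the file size, which matches
the behaviour reported below. With the second, the operating system pages the
data in and out as needed.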

On Tue, Dec 27, 2022 at 10:35 AM marcos rebelo <oleber@gmail.com> wrote:
> 
> I have the same impression: even though I'm using MMapDirectory, the data
> is on heap.
> 
> For my use case, it's a huge waste of memory :( 90% of my data could be
> correctly organised and kept on disk.
> 
> Thanks for the support
> 
> Best regards
> Marcos Rebelo
> 
> On Tue, 27 Dec 2022, 09:11 Dawid Weiss, <dawid.weiss@gmail.com> wrote:
> 
> > Looking at the code briefly, I think WFSTCompletionLookup uses an on-heap
> > store for the FST. You'd have to load it with an off-heap FST store instead:
> > 
> > 
> > https://github.com/apache/lucene/blob/1b9d98d6ec079e950bdd37137082f81400d3bc2e/lucene/core/src/java/org/apache/lucene/util/fst/OffHeapFSTStore.java
> >  
> > but I don't think there is an API in WFSTCompletionLookup that would allow
> > you to do that.
> > 
> > D.
> > 
> > On Fri, Dec 23, 2022 at 5:00 PM marcos rebelo <oleber@gmail.com> wrote:
> > 
> > > Hey all!
> > > 
> > > I'm loading multiple WFSTs of ~1.1 GB and the JVM memory increases
> > > proportionally. It looks like each file is held in memory, meaning
> > > memory-mapped files are not used at all.
> > > 
> > > Example code:
> > > 
> > > The following code sets up Lucene to use /tmp/deleteme2 for the
> > > memory-mapped files and loads the file from /tmp/deleteme/file.wfst
> > > via an InputStream.
> > > 
> > > After loading the file, I list the files in /tmp/deleteme2 and nothing
> > > is found, yet I'm able to query the WFST.
> > > 
> > > @Test
> > > @SneakyThrows
> > > void WFSTLoad() throws IOException {
> > >   // Directory passed to the lookup.
> > >   Path wfstPath = Paths.get("/tmp/deleteme2");
> > >   // Previously stored WFST file to load.
> > >   Path wfstFilePath = Paths.get("/tmp/deleteme/file.wfst");
> > >
> > >   var directory = new MMapDirectory(wfstPath);
> > >
> > >   WFSTCompletionLookup wfst =
> > >       new WFSTCompletionLookup(directory, "temp");
> > >
> > >   // Load the stored WFST from an InputStream.
> > >   try (var is = new FileInputStream(wfstFilePath.toFile())) {
> > >     wfst.load(is);
> > >     System.out.println("FILE LOADED");
> > >   }
> > >
> > >   // Files.list returns a stream that must be closed.
> > >   try (var files = Files.list(wfstPath)) {
> > >     files.forEach(System.out::println);
> > >   }
> > >   System.out.println("FILES LISTED");
> > >
> > >   assertThat(wfst.get("qwert123qwert")).isEqualTo(123);
> > > }
> > > 
> > > What am I doing wrong?
> > > 
> > > Thanks for the support
> > > 
> > > Best Regards
> > > Marcos Rebelo
> > > 
> > > --
> > > 
> > > *Marcos Bruno Gomes Rebelo Engineering Manager / Data Scientist /
> > > Software Engineer*
> > > Linkedin: https://www.linkedin.com/in/oleber/
> > > *Adding value to your data. Specialized in Search and Recommendation
> > > Systems*
> > > Technologies: Elastic, Spark, Scala, Jupyter Notebook, Python, ...
> > > 
> > 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

