'Re: parsing improvement'


List:       lua-l
Subject:    Re: parsing improvement
From:       Brigham Toskin <brighamtoskin () gmail ! com>
Date:       2015-05-30 15:31:49
Message-ID: CAFxs_oSJn8vZnO_9zfkSAYyyJoe7MHaUfY92PeLeda00k=ra_g () mail ! gmail ! com

Just thought I'd point out that pulling in something like LPeg will probably
increase your memory usage more than a bit of redundant string data would.
You might want to benchmark something like:

string.sub(data, string.find(data, "<.=/>"),
startpos):gsub(pattern_to_visit_fields, function(...) ... end )

Obviously this won't compile, and you'd probably need to break it up some to
update your bookkeeping, but you get the idea. It might be worth testing
against the other approaches in terms of memory and speed.
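
For what it's worth, here is a rough runnable sketch of that idea, not a
tested solution. The function names (visit_by_copy, visit_by_find), the sample
data, and the field pattern are all made up for illustration, and the pattern
is assumed to contain at least one capture and never match the empty string.
The first function copies the tail once and iterates with gsub; the second is
a copy-free loop over string.find that could serve as the other side of the
benchmark.

```lua
-- Hypothetical helper: copy the tail of `data` once, then let gsub walk it.
-- Returns the number of fields visited.
local function visit_by_copy(data, startpos, field_pattern, visit)
  -- One extra copy of everything from startpos onward.
  local tail = string.sub(data, startpos)
  -- gsub is used only to iterate. The callback returns nothing, so each
  -- match is kept as-is; gsub still builds a result string, which we drop.
  local _, count = string.gsub(tail, field_pattern, function(...)
    visit(...)
  end)
  return count
end

-- Hypothetical copy-free variant: string.find takes an init position,
-- so no substring is needed. Assumes field_pattern has captures.
local function visit_by_find(data, startpos, field_pattern, visit)
  local count = 0
  -- Receives (start, end, captures...) from string.find and returns the
  -- position to resume from, or nil when there are no more matches.
  local function step(s, e, ...)
    if not s then return nil end
    visit(...)
    count = count + 1
    return math.max(e, s) + 1  -- always move forward
  end
  local pos = startpos
  while pos do
    pos = step(string.find(data, field_pattern, pos))
  end
  return count
end

-- Usage with made-up data: locate the starting tag, then visit the fields.
local data = "<hdr/><row><f>one</f><f>two</f></row>"
local startpos = string.find(data, "<row>", 1, true)  -- plain-text search
local seen = {}
local n = visit_by_copy(data, startpos, "<f>(.-)</f>", function(value)
  seen[#seen + 1] = value
end)
print(n, table.concat(seen, ","))
```

Timing both with os.clock and comparing collectgarbage("count") before and
after on realistic input should show whether the single copy actually matters.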

On Sat, May 30, 2015 at 1:54 AM, Enrico Colombini <erix@erix.it> wrote:

> On 29-May-15 21:30, Lionel Duboeuf wrote:
>
>> But as i said, i need to specify a starting position (which doesn't
>> exist in gmatch and gsub functions) and can't afford to split my string.
>>
>
> As an aside, I encountered the same problem when I thought about adding
> features to my 50-line in-memory XML parser (for a small enough definition
> of 'XML', but it could read OpenOffice files).
>
> I always forgot to ask if adding an optional starting point for
> gmatch/gsub would be too costly.
>
> --
>   Enrico
>
>


-- 
Brigham Toskin



