List:       kde-bugs-dist
Subject:    [okular] [Bug 342504] Add possibility to copy formulas as MATHML/Latex Math/OO Math
From:       Yuri Chornoivan <yurchor () ukr ! net>
Date:       2015-01-05 20:36:31
Message-ID: bug-342504-17878-wLkJ6E0U8Q () http ! bugs ! kde ! org/

https://bugs.kde.org/show_bug.cgi?id=342504

Yuri Chornoivan <yurchor@ukr.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yurchor@ukr.net

--- Comment #2 from Yuri Chornoivan <yurchor@ukr.net> ---
(In reply to Christoph Feck from comment #1)
> Is there any other software able to extract formulas from PDF? To me it
> looks like a very hard problem, as soon as the formulas use multiple levels
> of text (fractions etc.)

MaxTract (development canceled) can do the extraction directly.

http://www.cs.bham.ac.uk/research/groupings/reasoning/sdag/maxtract.php

Infty Reader can do it using OCR.

Some thoughts on the problem can be found here (my tests confirm the
conclusions of this paper, and nothing seems to have changed since 2011):

http://www.cs.bham.ac.uk/~aps/research/papers/pdf/BaSeSoSu-ICDAR11-ComparingApproachesToMathematicalDocumentAnalysisFromPDF.pdf


IMHO, it is hard to expect that free OCR engines like Ocropus/Tesseract can
solve the problem in the near future. At least, I failed to train Tesseract
to recognize even rather simple formulas.


