
List:       cairo
Subject:    [cairo] Glyph utf8 string in svg output for OCR/OMR tasks
From:       Kwon-Young Choi <kwon-young.choi () hotmail ! fr>
Date:       2022-05-03 9:35:32
Message-ID: PA4PR02MB6671058EF0234BB0D825EE5AC3C09 () PA4PR02MB6671 ! eurprd02 ! prod ! outlook ! com

Hello,

I've just posted a new merge request at
https://gitlab.freedesktop.org/cairo/cairo/-/merge_requests/318 [1].

My interest is mainly in OCR/OMR (Optical Music Recognition) tasks, where the goal is 
to recover digital documents from images and PDFs.
My main goal is to use cairo to extract every drawing and glyph from PDFs so that I can 
train models for symbol classification/detection and semantic reconstruction.
The main piece of information currently missing from cairo's SVG output is the UTF-8 
string of each glyph.
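
To illustrate the gap, here is a minimal Python sketch that lists the glyph references
found in an SVG file. The sample markup is hand-written to approximate cairo-style output
(glyph outlines defined once, then referenced by id) and is an assumption, not a capture
from a real run; the exact element names and id format vary between cairo versions. The
helper name list_glyph_uses is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hand-written sample approximating cairo-style SVG text output (an assumption):
# each glyph outline is defined once and referenced by id through <use>.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     width="100pt" height="50pt" viewBox="0 0 100 50">
<defs>
<g>
<symbol overflow="visible" id="glyph0-1"><path d="M 1 0 L 1 -7 L 6 -7 Z"/></symbol>
<symbol overflow="visible" id="glyph0-2"><path d="M 0 0 L 3 -7 L 6 0 Z"/></symbol>
</g>
</defs>
<g fill="rgb(0%,0%,0%)">
<use xlink:href="#glyph0-1" x="10" y="20"/>
<use xlink:href="#glyph0-2" x="18" y="20"/>
</g>
</svg>
"""


def list_glyph_uses(svg_text):
    """Return (glyph_id, x, y) for every <use> element that references a glyph."""
    root = ET.fromstring(svg_text)
    uses = []
    for use in root.iter(f"{{{SVG_NS}}}use"):
        href = use.get(f"{{{XLINK_NS}}}href", "")
        # Assumed convention: glyph ids start with "glyph".
        if href.startswith("#glyph"):
            x = float(use.get("x", "0"))
            y = float(use.get("y", "0"))
            uses.append((href[1:], x, y))
    return uses


if __name__ == "__main__":
    for glyph_id, x, y in list_glyph_uses(SAMPLE_SVG):
        print(glyph_id, x, y)
```

The result gives a shape id and a position for each glyph, which is enough for detection
labels, but nothing in it says which character each glyph represents.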

My merge request aims to add this information with as few modifications as possible, 
both in the code and in the SVG output.

I hope this kind of use case will interest some people and help get this feature merged.

Let me know if there is anything I should do to improve my contribution.

Best regards,

Kwon-Young Choi

--------
[1] https://gitlab.freedesktop.org/cairo/cairo/-/merge_requests/318
