List:       koffice-devel
Subject:    indexing Postscript files
From:       Tomasz Grobelny <grotk () poczta ! onet ! pl>
Date:       2003-06-18 21:21:37

I ran some tests on indexing PostScript files; here are the results:
1. "(text to be indexed) pop" works with neither ps2ascii nor pstotext.
2. The following PostScript code:
%abbreviations
/g {glyphshow} def
/m {moveto} def
%select font
/Helvetica findfont
30 scalefont
setfont
%set point+render text
200 100 m
/A g
150 100 m
/Aogonek g
/B g
%end page
showpage

after processing with ps2ascii becomes "<some whitespace>AB<some whitespace>".
Note two things:
a) ps2ascii doesn't convert the /Aogonek character.
b) The output sequence of characters doesn't correspond to their position on
the page. The sequence in the input file and in the output file is the same,
which means that KWord/kotext would have to print (maybe it does already)
characters in text flow order.
The same code does not work with pstotext.
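
To illustrate point b), here is a minimal Python sketch (not part of the
tests above) comparing file order with position order for the three glyphs
in the sample. The glyph list and the reading_order helper are hypothetical,
and the x position of B is an assumption: it follows /Aogonek directly, so
its real position depends on the font's advance width.

```python
# Hypothetical model of the glyphs drawn by the sample above:
# (x, y, glyph name), listed in the order they appear in the file.
glyphs_in_file_order = [
    (200, 100, "A"),
    (150, 100, "Aogonek"),
    (170, 100, "B"),  # assumed x; the real value depends on the font metrics
]


def reading_order(glyphs):
    """Sort top-to-bottom (larger y first in PostScript), then left-to-right."""
    return [name for x, y, name in sorted(glyphs, key=lambda g: (-g[1], g[0]))]


# Order in which a tool that simply follows the file would emit the glyphs.
file_order = [name for _, _, name in glyphs_in_file_order]
# Order in which a reader sees them on the page.
page_order = reading_order(glyphs_in_file_order)

print("file order:", file_order)
print("page order:", page_order)
```

The two orders differ for this sample, which is why text extracted in file
order is only useful for indexing if the producer emits glyphs in text flow
order in the first place.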

Remaining questions:
1. What tools are used by indexing engines (Google, ...)?
2. Is Google capable of indexing non-ASCII PostScript (can it handle
/Aogonek)?
3. How do other PostScript renderers write non-ASCII characters? Thomas?

Tomek
_______________________________________________
koffice-devel mailing list
koffice-devel@mail.kde.org
http://mail.kde.org/mailman/listinfo/koffice-devel