QUESTION: how can I access the text matrix? #3071
Replies: 4 comments
-
Sorry, no, there is no way to access this.
-
Oh well, parsing the trace output it is then :) But I am on the right track, right? I mean, when a PDF file does something like this (fake font styles with fancy rendering techniques), is looking at matrix transformations inside a
-
You could use an XML package and read the trace output.
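A minimal sketch of that suggestion, using Python's standard `xml.etree.ElementTree`. The sample string is a hypothetical, simplified stand-in for the output of `mutool trace`; it assumes the output parses as a single XML document and that each `<span>` carries a `trm` attribute of whitespace-separated numbers with `<g unicode="...">` glyph children. Real output may differ, and `span_matrices` is an illustrative name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the trace output described in this thread.
# Real `mutool trace` output has more elements and attributes.
SAMPLE_TRACE = """<document>
<page>
<fill_text>
<span font="Serif" trm="12 0 0 12">
<g unicode="A"/>
<g unicode="b"/>
</span>
<span font="Serif" trm="12 0 3.2 12">
<g unicode="C"/>
</span>
</fill_text>
</page>
</document>"""


def span_matrices(trace_xml):
    """Return a list of (text, trm_values) pairs, one per <span> element."""
    root = ET.fromstring(trace_xml)
    result = []
    for span in root.iter("span"):
        trm = span.get("trm")
        if trm is None:
            continue  # skip spans without a text matrix attribute
        values = tuple(float(v) for v in trm.split())
        # Assumption: each glyph is a <g> child with a "unicode" attribute.
        text = "".join(g.get("unicode", "") for g in span.iter("g"))
        result.append((text, values))
    return result


for text, values in span_matrices(SAMPLE_TRACE):
    print(text, values)
```

For a real file, the trace output would first have to be captured, for example by redirecting the command's output to a file and reading that file into a string.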
-
Thanks for the help. What I ended up doing, at least for now, was to use The PDF file I have here seems to be quite simple (if we ignore the way it Italics :) ) and luckily that makes the Ideally, and this is what I plan on doing next, it is probably best to create a The only problem with this approach is that the callback might (and most of the time will) be called several times until a text block is fully formed, so I'm also going to have to deal with that.
-
Is it possible to access the text matrix?
I have a PDF file where I'm only interested in the parts of the text that are in Italic. But this particular PDF file uses matrix transformations to render the Italic text. That means I cannot rely on the flags to tell whether the text is in Italic or not.
When I use mupdf `mutool trace`, I can see that the `<span>` tag generated for the lines that are rendered in Italic has a specific value for the `trm` attribute. Unfortunately, that attribute is not put inside the pymupdf `span` dictionary. You guys seem to use it, only it is not put inside the dictionary.

Is there any other property that can give me the information I need?
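For illustration, here is one way a slant could be detected once the four linear components of a text matrix are available. This is a sketch under assumptions: it takes the values in PDF order `a b c d`, so the x axis maps to `(a, b)` and the y axis to `(c, d)`, and it treats the text as sheared when those two vectors are no longer perpendicular. The function name `is_sheared` and the tolerance are hypothetical choices, not anything provided by mupdf or pymupdf.

```python
import math


def is_sheared(a, b, c, d, tol=0.05):
    """Return True if the matrix [a b c d] skews the glyph axes.

    The check compares the directions of the transformed x axis (a, b)
    and y axis (c, d). Upright text, even if scaled or rotated, keeps
    them perpendicular, so the cosine of the angle between them is ~0.
    """
    len_x = math.hypot(a, b)
    len_y = math.hypot(c, d)
    if len_x == 0 or len_y == 0:
        return False  # degenerate matrix, no meaningful direction
    cos_angle = (a * c + b * d) / (len_x * len_y)
    return abs(cos_angle) > tol


print(is_sheared(12, 0, 0, 12))    # upright text
print(is_sheared(12, 0, 3.2, 12))  # y axis leans to the right
print(is_sheared(0, 12, -12, 0))   # rotated by 90 degrees, not slanted
```

Because the test uses the angle between the axes rather than checking `c != 0` directly, rotated but upright text is not misreported as Italic. The tolerance of 0.05 (about 3 degrees) is arbitrary and would need tuning against the actual file.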