Fonts

Each PDF font object is converted to a font element:
<font tag="font-tag" name="font-name" type="font-type" file="font-file" problematicForUnicode="yes|no">
  <size fontSize="font-size">
  ...
</font>
The font tag is an arbitrary identifier for this font; word elements use it to refer to the font.

The font name is the name as given in the PDF font object. It may include a font subset tag (e.g., "AAAAAA+Times-Roman").
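A consumer that wants the base font name can strip the subset tag. The following is a minimal sketch, assuming the tag follows the usual PDF convention of six uppercase letters followed by a plus sign; the helper name `strip_subset_tag` is made up for illustration and is not part of PDFdeconstruct.

```python
import re

# Assumed subset tag shape: six uppercase ASCII letters, then "+".
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def strip_subset_tag(font_name: str) -> str:
    """Return font_name without a leading subset tag, if one is present."""
    return _SUBSET_TAG.sub("", font_name, count=1)


print(strip_subset_tag("AAAAAA+Times-Roman"))
print(strip_subset_tag("Helvetica"))
```

Names without a subset tag pass through unchanged, so the helper can be applied to every name attribute.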

The font type is one of:

"(OT)" refers to OpenType. For more information on font types, see the PDF reference manual.

For embedded fonts, there will be an associated font file in the output directory. Its name is given by the file attribute. The font file will be in its native format, i.e., PDFdeconstruct does not do any font type conversion.

For non-embedded fonts, there will be no file attribute.
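The presence or absence of the file attribute is therefore enough to tell embedded fonts from non-embedded ones. The sketch below shows one way to check this with Python's standard `xml.etree.ElementTree`. The sample XML is invented: the wrapping `fonts` element, the tags `F1` and `F2`, the file name `font-0001.dat`, and the type value `placeholder-type` are all assumptions, as is the helper name `font_file_paths`.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Invented sample; real PDFdeconstruct output will differ in its details.
SAMPLE = """
<fonts>
  <font tag="F1" name="AAAAAA+Times-Roman" type="placeholder-type"
        file="font-0001.dat" problematicForUnicode="no"/>
  <font tag="F2" name="Helvetica" type="placeholder-type"
        problematicForUnicode="no"/>
</fonts>
"""


def font_file_paths(xml_text, output_dir):
    """Map each font tag to its font file path, or None if not embedded."""
    root = ET.fromstring(xml_text)
    result = {}
    for font in root.iter("font"):
        file_name = font.get("file")  # None when the attribute is absent
        result[font.get("tag")] = (
            Path(output_dir) / file_name if file_name else None
        )
    return result


paths = font_file_paths(SAMPLE, "out")
for tag, path in paths.items():
    print(tag, "embedded" if path else "not embedded", path)
```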

The problematicForUnicode attribute value is either "yes" or "no". A "yes" value indicates that the font is likely to be problematic when converting text to Unicode. Note that this is a heuristic; it's impossible to automatically detect problematic fonts with 100% certainty.

The size elements list all of the sizes at which this font was used, i.e., all unique fontSize values from word elements referencing this font.
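Putting the attributes together, a font element might be read into a small record as sketched below. This is an illustration under stated assumptions, not a reference parser: the sample XML is invented (the `doc` wrapper, the tags, the file name, and `placeholder-type` are hypothetical), the size elements are assumed to be self-closing children of the font element, and `FontInfo` and `parse_fonts` are names chosen for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FontInfo:
    tag: str
    name: str
    font_type: str
    file: Optional[str]            # None for non-embedded fonts
    problematic_for_unicode: bool  # True when the attribute is "yes"
    sizes: List[float] = field(default_factory=list)


# Invented sample; size elements are assumed to be self-closing.
SAMPLE = """
<doc>
  <font tag="F1" name="AAAAAA+Times-Roman" type="placeholder-type"
        file="font-0001.dat" problematicForUnicode="no">
    <size fontSize="10"/>
    <size fontSize="12.5"/>
  </font>
  <font tag="F2" name="Symbol" type="placeholder-type"
        problematicForUnicode="yes">
    <size fontSize="9"/>
  </font>
</doc>
"""


def parse_fonts(xml_text):
    """Collect one FontInfo per font element found in xml_text."""
    root = ET.fromstring(xml_text)
    fonts = []
    for el in root.iter("font"):
        fonts.append(FontInfo(
            tag=el.get("tag"),
            name=el.get("name"),
            font_type=el.get("type"),
            file=el.get("file"),
            problematic_for_unicode=(el.get("problematicForUnicode") == "yes"),
            sizes=[float(s.get("fontSize")) for s in el.findall("size")],
        ))
    return fonts


for info in parse_fonts(SAMPLE):
    print(info)
```

Keying the resulting records by tag gives a lookup table that word elements can be resolved against.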