PDF-Builder
   but we need to be able to suppress selected ligatures (e.g., 'ff' in the
   English word 'shelfful'). Support for swashes and alternate glyph choices 
   would be very nice to have (embedded markup language?). For complex scripts 
   like Arabic family and southern/southeastern Asian families, proper support 
   (Pango?) is vital.

   UPDATE: See Text::Layout and HarfBuzz::Shaper packages. Layout is usable
     with Builder (but no explicit support yet). Shaper is supported by Builder
     for ligatures and complex scripts. Need to see if it supports true small
     and petite capitals (included with font) as a sort of alternate glyphs.
     It's probably not feasible to decompose outline fonts and shrink them
     down nonlinearly (stroke widths reduced less than overall height/width)
     and recreate the new outlines as synthetic small/petite caps.

(A1.) Look at examples/HarfBuzz.pl to see some problems with ligatures. In some
     cases, such as "waffle", a PDF Reader can search for and find it even if
     "ffl" has been replaced by a ligature (single glyph). However, in other
     cases, such as "strasse", the Reader can NOT find the word when "ss" has
     been replaced by an eszett. My keyboard doesn't have an eszett, so I can't
     easily test if it can be searched for. I don't think there's anything in
     the PDF::Builder code which is substituting eszett for "ss". Interestingly,
     the "st" ligature in the same word does not present any problem.

B. Unification of font support: including character set and encoding support
   improvements [see CTS 16/#81 and RT 120048] to make more commonality between 
   using UTF-8 and single byte encodings, across all the font types (core, 
   TrueType, Type1/PS, etc.). One problem with core fonts is, even though most 
   core fonts are already TrueType, that only the Latin-1 glyph set has widths 
   defined, and only single byte encodings are possible (similar for Type1/PS 
   fonts). To support UTF-8 for core and PS, the font might have to be built on 
   the fly for a page (like a synthetic font), with translations to single 
   bytes for all glyphs. If the resulting font exceeds 256 characters, the
   page would have to be split internally into two or more sections, each
   with its own embedded virtual font. Glyph widths would have to be
   available for all characters.

   Add: start a subfont with the ASCII set and an empty top 128. Add new chars
   to it (from 0x80 to 0xFF) as single byte glyphs (not matching any standard
   encoding) and use this new subfont. When it fills up and more characters
   are needed on a page, start another subfont.
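
   The subfont idea above could be sketched roughly as follows. This is only
   an illustration, not PDF::Builder code: new_allocator() and allocate() are
   hypothetical names, and the sketch assumes that every subfont carries the
   full ASCII set in its lower half and that a character needed again after a
   new subfont has been started is simply given a slot in the new one.

```perl
use strict;
use warnings;

# Hypothetical allocator (not part of PDF::Builder): maps each character to a
# (subfont index, single byte) pair. ASCII keeps its own byte value in every
# subfont; anything else takes the next free slot in 0x80..0xFF.
sub new_allocator {
    return { subfonts => [ {} ], next_slot => 0x80 };
}

sub allocate {
    my ($alloc, $char) = @_;
    my $cp = ord($char);

    # ASCII is preloaded in every subfont, so stay in the current one.
    return ($#{ $alloc->{subfonts} }, $cp) if $cp < 0x80;

    # Reuse the slot if the current subfont already holds this character.
    my $current = $alloc->{subfonts}[-1];
    if (exists $current->{$cp}) {
        return ($#{ $alloc->{subfonts} }, $current->{$cp});
    }

    # Upper half is full: start another subfont.
    if ($alloc->{next_slot} > 0xFF) {
        push @{ $alloc->{subfonts} }, {};
        $alloc->{next_slot} = 0x80;
        $current = $alloc->{subfonts}[-1];
    }

    my $byte = $alloc->{next_slot}++;
    $current->{$cp} = $byte;
    return ($#{ $alloc->{subfonts} }, $byte);
}
```

   A caller would switch fonts in the text stream whenever the returned
   subfont index changes, and would still need glyph widths for every
   character placed in a slot.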

C. Improved documentation, possibly even a book giving detailed explanations
   and examples, as both a reference and a tutorial. Needless to say, there
   would have to be sufficient interest to warrant the time and expense of
   writing/editing and publishing (in any format) a book to be sold!

D. PDF/A (archival document management, RT 120375): this might take more than
   setting or overriding a few flags to force font embedding and forbid
   encryption/passwords. Other work may be needed to achieve recognition as a
   proper archival format (and there are apparently several archival formats).

E. JPEG2000 image file support (CTS 12/#97): I don't know if this is worth it,
   as there seems to be very little use of this, but if someone is interested,
   have at it. Are there any other newish image formats that PDF can support?

F. Fix Bar Code generation (CTS 1/#48): there seems to be something quite wrong
   with the current bar code generation, so it's possible that no one is using
   it in real documents yet. It's also possible that I'm not writing my test
   cases properly -- does anyone know if they work? I suspect that the use of 
   XForms (relocatable text and graphics) for the bar image is not scaling 
   nicely, and may have to be replaced by drawn graphics primitives (text and 
   graphics drawn in their final place). Many other 1D and 2D bar codes 
   (including QR) would be good, but perhaps the bar codes should go into a 
   separate module, due to their potentially large code size and use of "new"
   Perl modules. Even the existing 5 or so formats could be moved out, as 
   presumably no one is using them yet (if they are, in fact, broken). This is 
   in section I, as bar codes are already implemented in the base, but it's 
   possible that bar codes could be removed and reimplemented in section II as 
   a separate library or module.

   Update: Actually printing out the example bar codes separates the "merged"
   blobs into discrete lines. This may indicate rounding errors when presenting
   on a low-to-medium resolution computer screen. However, on a consumer-grade
   laser printer, the lines are still so close that I fear most scanners would
   have trouble reading the bar codes. I may have to reduce the bar widths a
   little to allow for irregular edges ("blotting").
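
   One possible way to trim the bars is sketched below. It is an assumption
   about how the compensation might work, not the current barcode code:
   bar_rectangles() is a hypothetical helper, the input is taken to be a list
   of alternating bar and space widths starting with a bar, and the trim
   amount would have to be found by experiment on real printers.

```perl
use strict;
use warnings;

# Hypothetical helper: turn alternating bar/space widths (starting with a bar)
# into [x, width] rectangles for the bars, trimming $trim from each bar.
# Half of the trim comes off each edge, so bar centers and the overall symbol
# width stay where they were; only the spaces get wider.
sub bar_rectangles {
    my ($widths, $trim) = @_;
    my @rects;
    my $x = 0;
    for my $i (0 .. $#$widths) {
        my $w = $widths->[$i];
        if ($i % 2 == 0) {                 # even index: a bar
            my $drawn = $w - $trim;
            $drawn = 0 if $drawn < 0;      # never draw a negative width
            push @rects, [ $x + ($w - $drawn) / 2, $drawn ];
        }
        $x += $w;                          # advance past the bar or space
    }
    return \@rects;
}
```

   Each returned pair could then be drawn as a filled rectangle in its final
   place, which would also avoid scaling an XForm.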

   Take a look at package PDF::QRCode. It uses Text::QRCode internally, but
   the interesting feature is that it monkey patches itself (like a
   retrovirus) into PDF::Builder, so it can be called as $gfx->qrcode(parms).

   Long term: consider a new package that generates an output-agnostic generic
   graphics list, along with sample GD, PS, and PDF::Builder drivers, as well
   as generic barcode support in PDF::Builder. Gather Perl and non-Perl open
   source generators and adapt their algorithms for the library, including
   PHP tecnickcom/tc-lib/barcode, Perl PDF::QRCode, etc. Allow qw/code1
   code2/ in the "use BarCodes" statement, as most users will just want to
   import a very small subset of available codes.

G. Fix Small Caps (and capitalization in general) for ligatures (CTS 13/#79):
   some ligatures given in Unicode or single byte encodings don't get properly
   uppercased. The probable solution would be to decompose ligatures to their
   individual letters before capitalization or Small/Petite Caps (if an
   uppercase version doesn't exist in the font), or to use GSUB processing to
   recreate a ligature from the capital letters. As Perl doesn't seem to handle
   capitalizing ligatures properly, a "capitals" function would need to be 
   offered, as well as improvements to the Small Caps in "synfonts". Various
   non-Latin single characters (e.g., Greek terminal/nonterminal sigma, German
   eszett, long s) also may need proper handling for capitalization.

   UPDATE: It may be better to use individual letters (rather than ligature
     Unicode points), allowing easy capitalization and small caps. Then use
     HarfBuzz::Shaper to replace lowercase letter sequences with true ligatures
     on the fly.
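
   A minimal sketch of the decompose-then-capitalize approach follows. The
   names decompose_ligatures() and capitals() are hypothetical (nothing like
   them exists in PDF::Builder yet), only the Latin ligatures in the Unicode
   range U+FB00 to U+FB06 are covered, and eszett, long s and Greek sigma are
   left out.

```perl
use strict;
use warnings;

# Latin ligature code points (Unicode Alphabetic Presentation Forms block)
# and the plain letters they stand for.
my %LIGATURE = (
    "\x{FB00}" => 'ff',
    "\x{FB01}" => 'fi',
    "\x{FB02}" => 'fl',
    "\x{FB03}" => 'ffi',
    "\x{FB04}" => 'ffl',
    "\x{FB05}" => 'st',   # long s + t; the long s is flattened to a plain s
    "\x{FB06}" => 'st',
);

# Replace each ligature code point with its individual letters.
sub decompose_ligatures {
    my ($text) = @_;
    $text =~ s/([\x{FB00}-\x{FB06}])/$LIGATURE{$1}/g;
    return $text;
}

# Hypothetical "capitals" function: split ligatures first, then uppercase.
sub capitals {
    my ($text) = @_;
    return uc(decompose_ligatures($text));
}
```

   The decomposed lowercase text could still be handed to
   HarfBuzz::Shaper afterwards, so that true ligatures are rebuilt where the
   font provides them.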

H. Fallback glyphs (CTS 5/#56) when a desired glyph is not found in one font,
   but can be found in another. This is similar to HTML when you give a font
   family list in CSS. Pango might help with this.

   UPDATE: This is being considered for Text::Layout, but nothing scheduled yet.
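
   The fallback lookup could work roughly like the sketch below, which
   mirrors a CSS font family list. It is an illustration only: fallback_runs()
   is a hypothetical helper, and each font is modeled as a plain hash of
   covered code points, where a real implementation would consult the font's
   own character map.

```perl
use strict;
use warnings;

# Compare two font names, either of which may be undef (no font found).
sub _same_font {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;
    return 0 if !defined $x || !defined $y;
    return $x eq $y ? 1 : 0;
}

# Hypothetical helper: split $text into runs of [font name, substring],
# taking each character from the first font in @$fonts that covers it.
# Each font is { name => ..., has => { code point => 1, ... } }.
sub fallback_runs {
    my ($text, $fonts) = @_;
    my @runs;
    for my $char (split //, $text) {
        my ($font) = grep { $_->{has}{ ord($char) } } @$fonts;
        my $name = $font ? $font->{name} : undef;   # undef: no font has it

        # Extend the previous run when the font did not change.
        if (@runs && _same_font($runs[-1][0], $name)) {
            $runs[-1][1] .= $char;
        } else {
            push @runs, [ $name, $char ];
        }
    }
    return \@runs;
}
```

   Each run would then be output with its own font selection; a run whose
   font name is undef marks characters that no listed font can supply.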

I. Support for tagged structure (CTS 17/#76 and RT 120375). At the least, don't 
   corrupt an existing tagged PDF file when extracting pages.

J. Adding comment fields to any object (and possibly standalone comments as
   their own objects). An example would be an image object with a comment
   giving the source image file, for debugging purposes.

K. Text method to move to arbitrary points: relative or absolute movement
   horizontally and vertically (a range of units), including tab support