Data-Edit-Xml-Xref


Build.PL

  dist_abstract         => 'Cross reference Dita XML, match topics and ameliorate missing references.',
  dist_author           => 'philiprbrenan@gmail.com',
  license               => 'perl',
  module_name           => 'Data::Edit::Xml::Xref',
  requires              => {
     perl               => '5.26.1',
     Carp               => 0,
    'Data::Dump'        => 0,
    'Data::Edit::Xml'                => 20200218,
    'Data::Table::Text'              => 20200418,
    'Dita::GB::Standard'             => 20190911,
    'Test::More'        => 0,
    'Test2::API'        => 0,
     utf8               => 0,
   },
 );

$b->create_build_script();

META.json

         "requires" : {
            "Module::Build" : "0.4224"
         }
      },
      "runtime" : {
         "requires" : {
            "Carp" : "0",
            "Data::Dump" : "0",
            "Data::Edit::Xml" : "20200218",
            "Data::Table::Text" : "20200418",
            "Dita::GB::Standard" : "20190911",
            "Test2::API" : "0",
            "Test::More" : "0",
            "perl" : "v5.26.1",
            "utf8" : "0"
         }
      }
   },
   "provides" : {
      "Data::Edit::Xml::Xref" : {
         "file" : "lib/Data/Edit/Xml/Xref.pm",

META.yml

name: Data-Edit-Xml-Xref
provides:
  Data::Edit::Xml::Xref:
    file: lib/Data/Edit/Xml/Xref.pm
    version: '20200424'
requires:
  Carp: '0'
  Data::Dump: '0'
  Data::Edit::Xml: '20200218'
  Data::Table::Text: '20200418'
  Dita::GB::Standard: '20190911'
  Test2::API: '0'
  Test::More: '0'
  perl: v5.26.1
  utf8: '0'
resources:
  license: http://dev.perl.org/licenses/
version: '20200424'
x_serialization_backend: 'CPAN::Meta::YAML version 0.018'

lib/Data/Edit/Xml/Xref.pm


package Data::Edit::Xml::Xref;
our $VERSION = 20200424;
use v5.26;
use warnings FATAL => qw(all);
use strict;
use Carp qw(confess cluck);
use Data::Dump qw(dump);
use Data::Edit::Xml;
use Data::Table::Text qw(:all);
use Dita::GB::Standard;
use Storable qw(store retrieve);
use Time::HiRes qw(time);
use utf8;

#sub improvementLength      {80}                                                 #P Maximum length of the text of an improvement suggestion
sub classificationMapSuffix{q(_classification.ditamap)}                         #P Suffix to add to map files to create corresponding classification map file

#D1 Cross reference                                                             # Check the cross references in a set of Dita files and report the results.

sub newXref(%)                                                                  #P Create a new cross referencer

lib/Data/Edit/Xml/Xref.pm  view on Meta::CPAN

    fixedFolder                         => undef,                               #I Fixed files are placed in this folder.
    fixedFolderTemp                     => undef,                               #I Fixed files are placed in this folder if we are on aws but not the session leader - this folder is then copied back to L<fixedFolder> on the session leader.
    fixedRefsBad                        => [],                                  # [] hrefs and conrefs from L<fixRefs|/fixRefs> which were moved to the "xtrf" attribute as requested by the L<fixBadHrefs|/fixBadHrefs> attribute because the reference w...
    fixedRefsGB                         => [],                                  # [] files fixed to the Gearhart-Brenan file naming standard
    fixedRefsGood                       => [],                                  # [] hrefs and conrefs from L<fixRefs|/fixRefs> which were invalid but have been fixed by L<deguidizing|/deguidize> them to a valid file name.
    fixedRefsNoAction                   => [],                                  # [] hrefs and conrefs from L<fixRefs|/fixRefs> for which no action was taken.
    fixRefs                             => {},                                  # {file}{ref} where the href or conref target is not valid.
    fixRelocatedRefs                    => undef,                               #I Fix references to topics that have been moved around in the out folder structure assuming that all file names are unique which they will be if they have been renamed t...
    fixXrefsByTitle                     => undef,                               #I Try to fix invalid xrefs by the Gearhart Title Method enhanced by the Monroe map method if true
    flattenFiles                        => {},                                  # {old full file name} = file renamed to Gearhart-Brenan file naming standard
    flattenFolder                       => undef,                               #I Files are renamed to the Gearhart standard and placed in this folder if set.  References to the unflattened files are updated to references to the flattened files.  Th...
    getFileUrl => qq(/cgi-bin/uiSelfServiceXref/client.pl?getFile=),            #I A url to retrieve a specified file from the server running xref used in generating html reports. The complete url is obtained by appending the fully qualified file nam...
    goodImageFiles                      => {},                                  # {file}++ : number of references to each good image
    goodNavTitles                       => {},                                  # Details of nav titles that were resolved.
    guidHrefs                           => {},                                  # {file}{href} = location where href starts with GUID- and is thus probably a guid.
    guidToFile                          => {},                                  # {topic id which is a guid} = file defining topic id.
    hrefUrlEncoding                     => {},                                  # Hrefs that need url encoding because they contain white space.
    html                                => undef,                               #I Generate html version of reports in this folder if supplied
    idNotReferenced                     => {},                                  # {file}{id}++ - id in a file that is not referenced
    idReferencedCount                   => {},                                  # {file}{id}++ - the number of times this id in this file is referenced from the rest of the corpus
    ids                                 => {},                                  # {file}{id}   - id definitions across all files.
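
The `ids`, `idReferencedCount` and `idNotReferenced` attributes listed above share a `{file}{id}` layout. How the module fills them is not shown in this excerpt, so the following is only a minimal sketch of how such tallies could be derived: `countIdReferences` and the `[file, id]` pair format of its input are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tally references to each defined id, then list the ids
# that no reference reaches.
#   $ids     : {file}{id} - id definitions
#   @targets : [file, id] pairs, one per reference found in the corpus
sub countIdReferences
 {my ($ids, @targets) = @_;
  my (%referenced, %notReferenced);

  for my $target (@targets)
   {my ($file, $id) = @$target;
    next unless $ids->{$file} and exists $ids->{$file}{$id};   # Ignore references to undefined ids
    $referenced{$file}{$id}++;
   }

  for my $file (sort keys %$ids)
   {for my $id (sort keys %{$ids->{$file}})
     {next if $referenced{$file} and $referenced{$file}{$id};  # Referenced at least once
      $notReferenced{$file}{$id}++;
     }
   }

  return (\%referenced, \%notReferenced);
 }

my ($referenced, $notReferenced) = countIdReferences
 ({q(a.dita) => {p1 => 1, p2 => 1}},
  [q(a.dita), q(p1)], [q(a.dita), q(p1)]);

print "p1 referenced $referenced->{'a.dita'}{p1} times\n";
print "unreferenced: ", join(q(, ), sort keys %{$notReferenced->{'a.dita'}}), "\n";
```

Keeping the two tallies in separate hashes mirrors the separate attributes above, so a report on unreferenced ids does not need to filter out zero counts.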

lib/Data/Edit/Xml/Xref.pm


  if ($@)                                                                       # Check we were able to parse the xml
   {$xref->parseFailed->{$iFile}++;
    return $xref;
   }

  my $md5 = $xref->md5Sum->{$iFile} = -M $x;                                    # Md5 sum for parse tree

  if ($Xref->flattenFolder)
   {$xref->flattenFiles->{$iFile} =                                             # Record correspondence between existing file and its GB Standard file name
      Dita::GB::Standard::gbStandardFileName($source, fe($iFile), md5=>$md5);
   }

  my $saveReference = sub                                                       # Save a reference so it can be integrity checked later
   {my ($ref) = @_;                                                             # Reference
    return if externalReference($ref);                                          # Looks like an external reference
    $xref->references->{$iFile}{$ref}++;                                        # Save reference
   };

  my $isADitaMap = $x->isADitaMap;                                              # Map
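
The `$saveReference` closure above counts each reference made by a file unless the reference looks external. A minimal, self-contained sketch of that filter-and-count pattern follows. `looksExternal` is a hypothetical stand-in for the module's `externalReference`, whose actual test is not shown in this excerpt, and the list of URL prefixes is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for externalReference(): the prefixes tested are assumed.
sub looksExternal
 {my ($ref) = @_;
  return $ref =~ m(\A(?:https?://|ftp://|mailto:|www\.))i ? 1 : 0;
 }

# Count the internal references made by a file, skipping external ones.
#   $references : {file}{ref}++ as in the excerpt above
sub saveReferences
 {my ($references, $file, @refs) = @_;
  for my $ref (@refs)
   {next if looksExternal($ref);                    # Not checked for integrity
    $references->{$file}{$ref}++;                   # Save reference
   }
  return $references;
 }

my $references = saveReferences({}, q(a.dita),
  q(b.dita#b/p1), q(https://example.com), q(b.dita#b/p1));

for my $ref (sort keys %{$references->{'a.dita'}})
 {print "$ref => $references->{'a.dita'}{$ref}\n";
 }
```

Counting rather than merely recording each reference lets a later integrity check report how often a broken target is used.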

lib/Data/Edit/Xml/Xref.pm

B<changeBadXrefToPh> - Change xrefs being placed in B<M3> by L<fixBadRefs> to B<ph>.

B<classificationMaps> - Create classification maps if true

B<deguidize> - Set true to replace guids in dita references with file name. Given reference B<g1#g2/id> convert B<g1> to a file name by locating the topic with topicId B<g2>.  This requires the guids to be genuinely unique. SDL guids are thought to b...

B<deleteUnusedIds> - Delete ids (except on topics) that are not referenced in any reference in the corpus regardless of the file component of any such reference.

B<fixBadRefs> - Fix any remaining bad references after all allowed attempts have been made to fix failing references by moving the failing reference to the B<xtrf> attribute i.e. placing it in B<M3> possibly renaming the tag to B<ph> if L<changeB...

B<fixDitaRefs> - Fix references in a corpus of L<Dita|http://docs.oasis-open.org/dita/dita/v1.3/os/part2-tech-content/dita-v1.3-os-part2-tech-content.html> documents that have been converted to the L<GB Standard|http://metacpan.org/pod/Dita::GB::Stan...

B<fixRelocatedRefs> - Fix references to topics that have been moved around in the out folder structure assuming that all file names are unique which they will be if they have been renamed to the GB Standard.

B<fixXrefsByTitle> - Try to fix invalid xrefs by the Gearhart Title Method enhanced by the Monroe map method if true

B<fixedFolder> - Fixed files are placed in this folder.

B<fixedFolderTemp> - Fixed files are placed in this folder if we are on aws but not the session leader - this folder is then copied back to L<fixedFolder> on the session leader.

B<flattenFolder> - Files are renamed to the Gearhart standard and placed in this folder if set.  References to the unflattened files are updated to references to the flattened files.  This option will eventually be deprecated as the Dita::GB::Standar...

B<getFileUrl> - A url to retrieve a specified file from the server running xref used in generating html reports. The complete url is obtained by appending the fully qualified file name to this value.

B<html> - Generate html version of reports in this folder if supplied

B<indexWords> - Index words to topics and topics to words if true.

B<indexWordsFolder> - Folder into which to save words to topic and topics to word indexes if L<indexWords> is true.

B<inputFolder> - A folder containing the dita and ditamap files to be cross referenced.
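
The B<deguidize> option described above turns a reference of the form B<g1#g2/id> into a file reference by looking up the topic whose id is B<g2>. The following is a minimal sketch of that lookup only, not the module's implementation: `deguidizeRef` is a hypothetical name, the lookup table follows the `{topic id which is a guid} = file` layout of `guidToFile`, and the guids and file names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the guid file component (g1) of a reference
# with the file that defines the topic whose id is g2.
sub deguidizeRef
 {my ($guidToFile, $ref) = @_;
  my ($g1, $rest) = split /#/, $ref, 2;             # g1 and "g2/id"
  return $ref unless defined $rest and length $rest;# No topic id to look up
  my ($g2) = split m(/), $rest, 2;                  # Topic id precedes any element id
  my $file = $guidToFile->{$g2};
  return $ref unless defined $file;                 # Unknown guid: leave the reference unchanged
  return join q(#), $file, $rest;
 }

my %guidToFile = (q(GUID-2222) => q(topics/install.dita));

print deguidizeRef(\%guidToFile, q(GUID-1111#GUID-2222/p1)), "\n";
print deguidizeRef(\%guidToFile, q(GUID-1111#GUID-9999/p1)), "\n";
```

Leaving an unresolvable reference untouched keeps it available for later fixing steps such as B<fixBadRefs>.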

lib/Data/Edit/Xml/Xref.pm

=head2 fixOneFileGB($xref, $file)

Fix one file to the Gearhart-Brenan standard

     Parameter  Description
  1  $xref      Xref results
  2  $file      File to fix

=head2 fixFilesGB($xref)

Rename files to the L<GB Standard|http://metacpan.org/pod/Dita::GB::Standard>

     Parameter  Description
  1  $xref      Xref results

=head2 analyzeOneFileParallel($Xref, $iFile)

Analyze one input file

     Parameter  Description
  1  $Xref      Xref request

lib/Data/Edit/Xml/Xref.pm

43 L<createUrlTests|/createUrlTests> - Check urls

44 L<createWordsToFilesTest|/createWordsToFilesTest> - Index words to file

45 L<deleteVariableFields|/deleteVariableFields> - Remove time and other fields that do not affect the end results

46 L<editXml|/editXml> - Edit an xml file retaining any existing XML headers and lint trailers

47 L<externalReference|/externalReference> - Check for an external reference

48 L<fixFilesGB|/fixFilesGB> - Rename files to the L<GB Standard|http://metacpan.org/pod/Dita::GB::Standard>

49 L<fixingRun|/fixingRun> - A fixing run fixes problems where it can and thus induces changes which might make the updated output different from the incoming source.

50 L<fixOneFileGB|/fixOneFileGB> - Fix one file to the Gearhart-Brenan standard

51 L<fixReferences|/fixReferences> - Fix just the file containing references using a number of techniques and report those references that cannot be so fixed.

52 L<fixReferencesInOneFile|/fixReferencesInOneFile> - Fix one file by moving unresolved references to the xtrf attribute

53 L<fixReferencesParallel|/fixReferencesParallel> - Fix the references in one file


