HTML-Toc

 view release on metacpan or  search on metacpan

t/anchors.t  view on Meta::CPAN


HTML

$tocInsertor->insert($toc, $content, {output => \$output});
eq_or_diff($output, <<'HTML', 'conflicting anchor names due to encoding of forbidden characters', {max_width => 120});
<ul>
   <li><a href="#L.25.25">.25%</a></li>
   <li><a href="#L.25.25_2">%.25</a></li>
   <li><a href="#L.25">.25</a></li>
   <li><a href="#L.25_2">%</a></li>
   <li><a href="#Yes...">Yes...</a></li>
   <li><a href="#L.25_3">%</a></li>
   <li><a href="#The_Big_Step">%</a></li>
   <li><a href="#The_big_step_2">The big step</a></li>
   <li><a href="#The_Big_Step_2_2">The Big Step 2</a></li>
</ul>
<!-- End of generated Table of Contents -->
<h1><a name="L.25.25"></a>.25%</h1>
  <h1><a name="L.25.25_2"></a>%.25</h1>
  <h1><a name="L.25"></a>.25</h1>
  <h1><a name="L.25_2"></a>%</h1>
  <h1><a name="Yes..."></a>Yes...</h1>
  <h1><a name="L.25_3"></a>%</h1>
  Per http://www.w3.org/TR/REC-html40/types.html#type-name,
  &#8220;ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").&#8221;,
  and MediaWiki does that too (see http://en.wikipedia.org/wiki/Hierarchies#Ethics.2C_behavioral_psychology.2C_philosophies_of_identity)

  <h1><a name="The_Big_Step"></a>The Big Step</h1>
  <h1><a name="The_big_step_2"></a>The big step</h1>
  Per http://www.w3.org/TR/REC-html40/struct/links.html#h-12.2.1, <br />
  &#8220;Anchor names must be unique within a document. Anchor names that differ only in case may not appear in the same document.&#8221;<br />
  <h1><a name="The_Big_Step_2_2"></a>The Big Step 2</h1>
  MediaWiki fails here, see http://en.wikipedia.org/w/index.php?title=User:Dandv/Sandbox&oldid=274553709#The_Big_Step_2

HTML

}  # TODO tests



# ------------------------------------------------------------------------
# --- Comprehensive test of character set in anchor names
# ------------------------------------------------------------------------
$toc->setOptions({
    header => '',  # by default, \n<!-- Table of Contents generated by Perl - HTML::Toc -->\n
    templateAnchorName => \&assembleAnchorName,
});
$content = <<'HTML';
{{toc}}<br />
  <h1>The Big Step 1</h1>
  The first heading text goes here<br />
  <h1>The Big Step 2</h1>
  This is the second heading text<br />
    <h2>second header, first subheader</h2>
    Some subheader text here<br />
    <h2>second header, second subheader</h2>
    Another piece of subheader text here<br />
  <h1>The Big Step</h1>
  Third text for heading h1 #3<br />
  <h1>The Big Step #6</h1>
  Per the XHTML 1.0 spec, the number/hash sign is NOT allowed in fragments; in practice, the fragment starts with the first hash.<br />
  Such anchors also work in Firefox 3 and IE 6.<br />
  <h1>Calculation #7: 7/5&gt;3 or &lt;2?</h1>
  Hail the spec, http://www.w3.org/TR/REC-html40/types.html#type-name:
  ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
  <h1>#8: start with a number (hash) [pound] {comment} sign</h1>
  <h1>Lots of gibberish here: &#8220;!&#8221;#$%&amp;&#39;()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</h1>
  Note how the straight quotes were replaced by smart quotes, which are invalid in id attributes for <span class="caps">XHTML</span> 1.0 (!)
HTML

$tocInsertor->insert($toc, $content, {output => \$output});
eq_or_diff($output, <<'EOT', 'comprehensive test of character set in anchor names', {max_width => 50});
<ul>
   <li><a href="#The_Big_Step_1">The Big Step 1</a></li>
   <li><a href="#The_Big_Step_2">The Big Step 2</a>
      <ul>
         <li><a href="#second_header.2C_first_subheader">second header, first subheader</a></li>
         <li><a href="#second_header.2C_second_subheader">second header, second subheader</a></li>
      </ul>
   </li>
   <li><a href="#The_Big_Step">The Big Step</a></li>
   <li><a href="#The_Big_Step_.236">The Big Step #6</a></li>
   <li><a href="#Calculation_.237:_7.2F5.3E3_or_.3C2.3F">Calculation #7: 7/5&gt;3 or &lt;2?</a></li>
   <li><a href="#L.238:_start_with_a_number_.28hash.29_.5Bpound.5D_.7Bcomment.7D_sign">#8: start with a number (hash) [pound] {comment} sign</a></li>
   <li><a href="#Lots_of_gibberish_here:_.E2.80.9C.21.E2.80.9D.23.24.25.26.27.28.29.2A.2B.2C-..2F:.3B.3C.3D.3E.3F.40.5B.5C.5D.5E_.60.7B.7C.7D.7E">Lots of gibberish here: &#8220;!&#8221;#$%&amp;&#39;()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</a></li>
</ul>
<!-- End of generated Table of Contents -->
<br />
  <h1><a name="The_Big_Step_1"></a>The Big Step 1</h1>
  The first heading text goes here<br />
  <h1><a name="The_Big_Step_2"></a>The Big Step 2</h1>
  This is the second heading text<br />
    <h2><a name="second_header.2C_first_subheader"></a>second header, first subheader</h2>
    Some subheader text here<br />
    <h2><a name="second_header.2C_second_subheader"></a>second header, second subheader</h2>
    Another piece of subheader text here<br />
  <h1><a name="The_Big_Step"></a>The Big Step</h1>
  Third text for heading h1 #3<br />
  <h1><a name="The_Big_Step_.236"></a>The Big Step #6</h1>
  Per the XHTML 1.0 spec, the number/hash sign is NOT allowed in fragments; in practice, the fragment starts with the first hash.<br />
  Such anchors also work in Firefox 3 and IE 6.<br />
  <h1><a name="Calculation_.237:_7.2F5.3E3_or_.3C2.3F"></a>Calculation #7: 7/5&gt;3 or &lt;2?</h1>
  Hail the spec, http://www.w3.org/TR/REC-html40/types.html#type-name:
  ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
  <h1><a name="L.238:_start_with_a_number_.28hash.29_.5Bpound.5D_.7Bcomment.7D_sign"></a>#8: start with a number (hash) [pound] {comment} sign</h1>
  <h1><a name="Lots_of_gibberish_here:_.E2.80.9C.21.E2.80.9D.23.24.25.26.27.28.29.2A.2B.2C-..2F:.3B.3C.3D.3E.3F.40.5B.5C.5D.5E_.60.7B.7C.7D.7E"></a>Lots of gibberish here: &#8220;!&#8221;#$%&amp;&#39;()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</h1>
  Note how the straight quotes were replaced by smart quotes, which are invalid in id attributes for <span class="caps">XHTML</span> 1.0 (!)
EOT


# ------------------------------------------------------------------------
# --- range of header levels to make TOC out of: 1-1
# ------------------------------------------------------------------------
$content = <<'HTML';
<div class="ToC">{{toc 1-1}}</div>
  <h1>The Big Step 1</h1>
  The first heading text goes here<br />
  <h1>The Big Step 2</h1>
  This is the second heading text<br />
    <h2>second header, first subheader</h2>
    Some subheader text here<br />
    <h2>second header, second subheader</h2>
    Another piece of subheader text here<br />
  <h1>The Big Step #3</h1>
  another h1
    <h2>Second level heading</h2>
      <h3>Third level heading</h3>
        <h4>fourth level heading</h4>
        header text level 4
          <h5>Fifth level heading</h5>
  <h1>Back to level one with an interrobang&#x203D;</h1>
  '&#x203D;' is an interrobang.
</div>
HTML


$toc->setOptions({
    header => '',  # by default, \n<!-- Table of Contents generated by Perl - HTML::Toc -->\n
    templateAnchorName => \&assembleAnchorName,
    levelToToc => "[1-1]",
    insertionPoint => 'replace {{toc \[?\d*-?\d*\]?}}'
});
$tocInsertor->insert($toc, $content, {output => \$output});
eq_or_diff($output, <<'HTML', 'range of header levels to make TOC out of: 1-1', {max_width => 120});
<div class="ToC"><ul>
   <li><a href="#The_Big_Step_1">The Big Step 1</a></li>
   <li><a href="#The_Big_Step_2">The Big Step 2</a></li>
   <li><a href="#The_Big_Step_.233">The Big Step #3</a></li>
   <li><a href="#Back_to_level_one_with_an_interrobang.E2.80.BD">Back to level one with an interrobang&#x203D;</a></li>
</ul>
<!-- End of generated Table of Contents -->
</div>
  <h1><a name="The_Big_Step_1"></a>The Big Step 1</h1>
  The first heading text goes here<br />
  <h1><a name="The_Big_Step_2"></a>The Big Step 2</h1>
  This is the second heading text<br />
    <h2>second header, first subheader</h2>
    Some subheader text here<br />
    <h2>second header, second subheader</h2>
    Another piece of subheader text here<br />



( run in 0.722 second using v1.01-cache-2.11-cpan-bbe5e583499 )