HTML-ExtractContent


t/input1.html

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja">
<head>
  <link rel="start" href="http://orezdnu.org/" />
  <link rev="made" href="http://orezdnu.org/" />
  <title>Sample for content extraction test (1)</title>
</head>
<body>
  <div id="content">
    <h1>Sample for content extraction test (1)</h1>
    <p>This file is for a simple test that the single content of the page can

t/input2.html

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja">
<head>
  <link rel="start" href="http://orezdnu.org/" />
  <link rev="made" href="http://orezdnu.org/" />
  <title>Sample for content extraction test (2)</title>
</head>
<body>
  <div id="content">
    <h1>Sample for content extraction test (2)</h1>


