EBook-Tools


Changes

  * Some advertisements are deleted from metadata automatically

 Bug fixes:
  * Lots of bugs in epub generation fixed, including automatic
    generation of NCX files
  * fix_links() no longer breaks when encountering mailto:, news:, or
    backwards directory traversal links.
  * fix_links() should no longer attempt to add the same href
    multiple times.
  * hrefs are decoded to give actual filesystem filenames
  * XHTML 1.1 source files are no longer backwards-converted to XHTML
    1.0
  * Mobipocket UTF-8 HTML generation bugs fixed
  * Fixed bugs that could cause Mobipocket filepos anchors to have
    wrong ids
  * 'ebook genepub' options now match the documentation
  * OPF encoding is now autodetected instead of assumed as UTF-8
  * Compatible with Perl 5.20
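
The href decoding mentioned in the bug fixes above can be illustrated with a
minimal sketch. This is not the code used by fix_links(); the function name
href_to_filename is hypothetical, and the sketch assumes hrefs are
percent-encoded UTF-8 and that any #fragment should be dropped before the
result is treated as a filename.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn a percent-encoded href into a filename.
# Assumes the escaped bytes are UTF-8 (an assumption, not taken from
# the EBook::Tools source).
sub href_to_filename {
    my ($href) = @_;
    $href =~ s/#.*\z//s;                           # drop any fragment
    $href =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX -> raw byte
    return decode('UTF-8', $href);                 # bytes -> characters
}

print href_to_filename('My%20Book/ch%201.html#top'), "\n";
```

With this approach an href such as My%20Book/ch%201.html#top is looked up on
disk as "My Book/ch 1.html" rather than under its escaped form.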

 Behavior changes:
  * Unpacking books other than .lit now causes the OPF filename to

Changes

0.4.7

 Bug fixes:
  * Mobipocket unpacks now correctly account for the extra data that
    can be appended to PalmDoc-compressed text records that should not
    be made part of the decompression process.
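
For illustration, here is a sketch of how such trailing data can be measured
and removed before decompression. It follows the scheme used by common
community Mobipocket unpackers and is not taken from the EBook::Tools source.
The function names are hypothetical, and $flags is assumed to be the
extra-data flags value read from the MOBI header.

```perl
use strict;
use warnings;

# Read one backward-encoded variable-width size, scanning from the end
# of the first $size bytes of $data.  The byte with the high bit set
# terminates the number.
sub trailing_entry_size {
    my ($data, $size) = @_;
    my ($bitpos, $result) = (0, 0);
    return 0 if $size <= 0;
    while (1) {
        my $v = ord(substr($data, $size - 1, 1));
        $result |= ($v & 0x7F) << $bitpos;
        $bitpos += 7;
        $size--;
        return $result if ($v & 0x80) || $bitpos >= 28 || $size == 0;
    }
}

# Total number of trailing bytes that are not part of the text.
sub trailing_data_size {
    my ($data, $flags) = @_;
    my $size = length($data);
    my $num  = 0;
    my $test = $flags >> 1;
    while ($test) {               # one trailing entry per set bit above bit 0
        $num += trailing_entry_size($data, $size - $num) if $test & 1;
        $test >>= 1;
    }
    if (($flags & 1) && $size - $num > 0) {
        # Bit 0: multibyte overlap; low two bits of the last byte give size - 1
        $num += (ord(substr($data, $size - $num - 1, 1)) & 0x3) + 1;
    }
    return $num;
}

# Return only the part of the record that should be decompressed.
sub strip_trailing_data {
    my ($data, $flags) = @_;
    my $extra = trailing_data_size($data, $flags);
    $extra = length($data) if $extra > length($data);
    return substr($data, 0, length($data) - $extra);
}
```

In this scheme each trailing entry stores its own total length (including the
length bytes) at its end, so entries are peeled off from the back of the
record one at a time.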

0.4.6

 Bug fixes:
  * EReader HTML conversion now creates (semi-valid) XHTML output and
    better handles paragraphs
  * EReader font marker handling improved
  * Missing config file options are properly handled
  * Documentation fixes

0.4.5

 Bug fixes:
  * user script tests avoid smoke tests that tend to break on
    non-libraries

README.Helpers.txt

Some functionality in EBook::Tools is only available with additional
helper applications.  This is a quick guide to what they are and how
to find them.


====
Tidy
====

This tool is used to clean up HTML files, making them conformant to a
given HTML/XHTML specification.  The main development page for Tidy
is:

http://tidy.sourceforge.net/

An MSWin32 executable (and GUI) are available from:

http://www.paehl.com/open_source/?HTML_Tidy_for_Windows
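
As a rough sketch of how Tidy might be called from Perl once it is installed
and on the PATH (this is not the code EBook::Tools uses; the function names
tidy_command and run_tidy are hypothetical):

```perl
use strict;
use warnings;

# Build the argument list for a quiet tidy run that converts $infile
# to XHTML, treats it as UTF-8, and writes the result to $outfile.
sub tidy_command {
    my ($infile, $outfile) = @_;
    return ('tidy', '-q', '-asxhtml', '-utf8', '-o', $outfile, $infile);
}

# Run tidy and return its exit status:
# 0 = no problems, 1 = warnings, 2 = errors.
sub run_tidy {
    my ($infile, $outfile) = @_;
    my @cmd = tidy_command($infile, $outfile);
    system(@cmd);
    die "could not run tidy: $!\n" if $? == -1;
    return $? >> 8;
}
```

Writing to a separate output file leaves the original untouched if Tidy
reports errors, so the caller can inspect the status before replacing
anything.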


==========

lib/EBook/Tools.pm


    my ($filebase,$filedir,$fileext);
    my ($fh_html,$fh_htmlout,$fh_pre);
    my $htmloutfile;
    my @preblocks;
    my @prefiles = ();
    my $prefile;
    my $count = 0;

    my $htmlheader = <<'END';
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title></title>
</head>
<body>
END

    ($filebase,$filedir,$fileext) = fileparse($htmlfile,'\.\w+$');
    $outfilebase = "$filebase-pre" if(!$outfilebase);

lib/EBook/Tools.pm

    }
    else {
        croak($caller," child exited with value ",$CHILD_ERROR >> 8,":\n ",
              join(' ',@syscmd),"\n");
    }
}


=head2 C<system_tidy_xhtml($infile,$outfile)>

Runs tidy on an XHTML file semi-safely (using a secondary file).

Converts HTML to XHTML if necessary.

=head3 Arguments

=over

=item $infile

The filename to tidy

=item $outfile

lib/EBook/Tools/EReader.pm


=cut

sub html
{
    my $self = shift;
    my $subname = ( caller(0) )[3];
    debug(2,"DEBUG[",$subname,"]");

    my $header = <<"END";
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
  <title>$self->{title}</title>
</head>
<body>
END
    my $footer = "</body>\n</html>\n";

    return

lib/EBook/Tools/IMP.pm

    }
    $textlength = length($$textref);

    if(!$textlength)
    {
        carp($subname,"(): no text extracted from DATA.FRK resource!\n");
        return;
    }

    $self->{text} = <<'END';
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="CONTENT-TYPE" content="text/html; charset=windows-1252" />
END

    $self->{text} .= "  <title>$self->{title}</title>\n";
    $self->{text} .= "</head>\n<body>\n";

scripts/ebook.pl

    $args{infile} = $infile;
    $args{outfile} = $outfile;
    $args{noscript} = $opt{noscript};

    strip_script(%args);
    return 0;
}

=head2 C<tidyxhtml>

Run tidy on an HTML file to enforce valid XHTML output (required by the
OPF 2.0 specification).

=cut

sub tidyxhtml
{
    my ($inputfile,$tidyfile) = @_;
    my $retval;

    if(!$inputfile)

t/mobi/test.html

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Test</title>
</head>

<body>
<h1 id="part1">Test</h1>
<p>Test text</p>

t/test-part1.html

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Part I</title>
  <link rel="stylesheet" href="parts.css" type="text/css" />
</head>

<body>
<h1 id="part1">Part I</h1>

t/test-part2.html

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Part II</title>
  <link rel="stylesheet" href="parts.css" type="text/css" />
</head>

<body>
<h1 id="part2">Part II</h1>


