App-jupiter

script/jupiter

=head2 What about the JSON file?

There's a JSON file that gets generated and updated as you run Planet Jupiter.
Its name depends on the OPML files used. It records metadata for every feed in
the OPML file that isn't stored in the feeds themselves.

C<message> is the HTTP status message, or a similar message such as "No entry
newer than 90 days." This is set when you update the feeds in your cache.

C<code> is the HTTP status code; this code could be the real status code from
the server (such as 404 for a "not found" status) or one generated by Jupiter
such that it matches the status message (such as 206 for a "partial content"
status when there aren't any recent entries in the feed). This is set when
you update the feeds in your cache.

C<title> is the site's title. When you update the feeds in your cache, it is
taken from the OPML file. That's how the feed can have a title even if the
download failed. When you generate the HTML, the feeds in the cache are parsed
and if a title is provided, it is stored in the JSON file and overrides the
title in the OPML file.

C<link> is the site's link for humans. When you generate the HTML, the feeds in
the cache are parsed and if a link is provided, it is stored in the JSON file.
If the OPML element contained a C<htmlURL> attribute, however, that takes
precedence. The reasoning is that when a podcast is hosted on a platform which
generates a link that you don't like, and you know the link to the
human-readable blog elsewhere, you can use the C<htmlURL> attribute in the OPML
file to override it.

C<last_modified> and C<etag> are two caching headers taken from the HTTP
response; data in the feed cannot change them.
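
As a rough sketch of how such values are typically used (this illustrates the
general HTTP technique and is an assumption, not a copy of Jupiter's code), a
client sends them back as C<If-None-Match> and C<If-Modified-Since> request
headers so that the server can answer with 304 when nothing has changed. The
C<conditional_headers> helper is hypothetical:

    use strict;
    use warnings;

    # Hypothetical helper: turn stored metadata into conditional request
    # headers. Fields that were never stored are simply left out.
    sub conditional_headers {
      my ($meta) = @_;
      my %headers;
      $headers{'If-None-Match'}     = $meta->{etag}          if $meta->{etag};
      $headers{'If-Modified-Since'} = $meta->{last_modified} if $meta->{last_modified};
      return \%headers;
    }

    # Only an etag was stored, so only If-None-Match is set.
    my $headers = conditional_headers({ etag => '"abc123"' });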

If we run into problems downloading a feed, this setup allows us to still link
to the feeds that aren't working, using their correct names, and describing the
error we encountered.
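
To make this concrete, here is a minimal sketch that lists the feeds whose last
update did not succeed. It assumes that the file is a JSON object keyed by feed
URL with the fields described above; the actual layout may differ, and the
sample data is made up. It uses the core module L<JSON::PP> so that it runs
without any other dependency.

    use strict;
    use warnings;
    use JSON::PP qw(decode_json);

    # Hypothetical sample data; the real file may be laid out differently.
    my $json = q({
      "https://example.org/feed.xml": {
        "title": "Example Blog", "code": 404, "message": "Not Found"
      },
      "https://example.net/atom.xml": {
        "title": "Another Blog", "code": 200, "message": "OK"
      }
    });

    my $feeds = decode_json($json);

    # Keep the feeds whose last update did not end with status 200.
    my @failing = grep { $feeds->{$_}{code} != 200 } sort keys %$feeds;

    # Report each failing feed by title, with the error that was recorded.
    for my $url (@failing) {
      my $feed = $feeds->{$url};
      print "$feed->{title} ($url): $feed->{code} $feed->{message}\n";
    }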

=head2 Logging

Use the C<--log=LEVEL> option to set the log level. Valid values for LEVEL are
debug, info, warn, error, and fatal.

=head1 LICENSE

GNU Affero General Public License

=head1 INSTALLATION

Using C<cpan>:

    cpan App::jupiter

Manual install:

    perl Makefile.PL
    make
    make install

=head2 Dependencies

To run Jupiter on Debian we need:

C<libmodern-perl-perl> for L<Modern::Perl>

C<libmojolicious-perl> for L<Mojo::Template>, L<Mojo::UserAgent>, L<Mojo::Log>,
L<Mojo::JSON>, and L<Mojo::Util>

C<libxml-libxml-perl> for L<XML::LibXML>

C<libfile-slurper-perl> for L<File::Slurper>

C<libdatetime-perl> for L<DateTime>

C<libdatetime-format-mail-perl> for L<DateTime::Format::Mail>

C<libdatetime-format-iso8601-perl> for L<DateTime::Format::ISO8601>

Unfortunately, L<Mojo::UserAgent::Role::Queued> isn't packaged for Debian.
Therefore, let's build it and install it as a Debian package.

    sudo apt-get install libmodule-build-tiny-perl
    sudo apt-get install dh-make-perl
    sudo dh-make-perl --build --cpan Mojo::UserAgent::Role::Queued
    sudo dpkg --install libmojo-useragent-role-queued-perl_1.15-1_all.deb

To generate the C<README.md> from the source file, you need F<pod2markdown>
which you get in C<libpod-markdown-perl>.

=head1 FILES

There are a number of files in the F<share> directory which you can use as
starting points.

F<template.html> is the HTML template.

F<default.css> is a small CSS file used by F<template.html>.

F<personalize.js> is a small JavaScript file used by F<template.html> to let
visitors jump from one article to the next using C<J> and C<K>.

F<jupiter.png> is used by F<template.html> as the icon.

F<jupiter.svg> is used by F<template.html> as the logo.

F<feed.png> is used by F<template.html> as the icon for the feeds in the
sidebar.

F<feed.rss> is the feed template.

=head1 OPTIONS

HTML generation uses a template, C<template.html>. It is written for
C<Mojo::Template> and you can find it in the F<share> directory of your
distribution. The default templates use other files, such as the logo, the feed
icon, a CSS file, and a small JavaScript snippet to enable navigation using the
C<J> and C<K> keys (see above).

You can specify a different HTML file to generate:

B<jupiter html> I<your.html feed.opml>

If you specify two HTML files, the first is the HTML file to generate and the
second is the template to use. Both must use the C<.html> extension.

B<jupiter html> I<your.html your-template.html feed.opml>

Feed generation uses a template, C<feed.rss>. It writes all the entries into a
file called C<feed.xml>. Again, the template is written for C<Mojo::Template>.

You can specify up to two XML, RSS or Atom files. They must use one of these
three extensions: C<.xml>, C<.rss>, or C<.atom>. The first is the name of the
feed to generate, the second is the template to use:

B<jupiter html> I<atom.xml template.xml planet.html template.html feed.opml>

In the above case, Planet Jupiter will write a feed called F<atom.xml> based on
F<template.xml> and an HTML file called F<planet.html> based on
F<template.html>, using the cached entries matching the feeds in F<feed.opml>.
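
How the arguments are told apart is not shown in this section. One way to
picture the rules above is the following sketch, which sorts arguments into
roles by file extension. The C<classify_args> helper and its key names are
hypothetical and this is not necessarily how Jupiter implements it. The name of
the HTML file generated by default is not stated here, so it is left undefined.

    use strict;
    use warnings;

    # Hypothetical helper: sort command-line arguments into roles by extension.
    sub classify_args {
      my @args = @_;
      my @html = grep { /\.html$/ } @args;
      my @feed = grep { /\.(?:xml|rss|atom)$/ } @args;
      my @opml = grep { /\.opml$/ } @args;
      return {
        page          => $html[0],                     # undef if not given
        page_template => $html[1] // 'template.html',  # second .html file
        feed          => $feed[0] // 'feed.xml',       # first feed file
        feed_template => $feed[1] // 'feed.rss',       # second feed file
        opml          => \@opml,
      };
    }

    # The command word "html" has no extension and matches none of the roles.
    my $roles = classify_args(
      qw(html atom.xml template.xml planet.html template.html feed.opml));

With these arguments, C<feed> is F<atom.xml>, C<feed_template> is
F<template.xml>, C<page> is F<planet.html> and C<page_template> is
F<template.html>, matching the example above.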

=cut

use DateTime;
use DateTime::Format::Mail;
use DateTime::Format::ISO8601;
use File::Basename;
use File::Slurper qw(read_binary write_binary read_text write_text);
use List::Util qw(uniq min shuffle);
use Modern::Perl;
use Mojo::Log;
use Mojo::JSON qw(decode_json encode_json);
use Mojo::Template;
use Mojo::UserAgent;
use Pod::Simple::Text;
use XML::LibXML;
use Mojo::Util qw(slugify trim xml_escape html_unescape);
use File::ShareDir 'dist_file';

use vars qw($log);
our $log = Mojo::Log->new;

my $xpc = XML::LibXML::XPathContext->new;
$xpc->registerNs('atom', 'http://www.w3.org/2005/Atom');
$xpc->registerNs('html', 'http://www.w3.org/1999/xhtml');
$xpc->registerNs('dc', 'http://purl.org/dc/elements/1.1/');
$xpc->registerNs('itunes', 'http://www.itunes.com/dtds/podcast-1.0.dtd');

my $undefined_date = DateTime->from_epoch( epoch => 0 );

my (%wday, %month, $wday_re, $month_re);
%wday = qw (lun. Mon mar. Tue mer. Wed jeu. Thu ven. Fri sam. Sat dim. Sun);
%month = qw (janv. Jan févr. Feb mars Mar avr. Apr mai May juin Jun
	     juil. Jul août Aug sept. Sep oct. Oct nov. Nov déc. Dec);
$wday_re = join('|', map { quotemeta } keys %wday) unless $wday_re;
$month_re = join('|', map { quotemeta } keys %month) unless $month_re;

# Our tests don't want to call main
__PACKAGE__->main unless caller;

sub main {
  my ($log_level) = grep /^--log=/, @ARGV;
  $log->level(substr($log_level, 6)) if $log_level;
  my ($command) = grep /^[a-z]+$/, @ARGV;
  $command ||= 'help';
  if ($command eq 'update') {
    update_cache(@ARGV);
  } elsif ($command eq 'html') {
    make_html(@ARGV);
  } else {
    my $parser = Pod::Simple::Text->new();
    $parser->parse_file($0);
  }
}

sub update_cache {
  my ($feeds, $files) = read_opml(@_);
  make_directories($feeds);
  load_feed_metadata($feeds, $files);
  my $ua = Mojo::UserAgent->new->with_roles('+Queued')
      ->max_redirects(3)
      ->max_active(5);
  make_promises($ua, $feeds);
  fetch_feeds($feeds);
  save_feed_metadata($feeds, $files);
  cleanup_cache($feeds);
}

sub make_promises {
  my $ua = shift;
