looks results from the CPAN

App-jupiter
#! /usr/bin/env perl
# Planet Jupiter is a feed aggregator that creates a single HTML file
# Copyright (C) 2020â€“2021  Alex Schroeder <alex@gnu.org>

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.

package Jupiter;

use Encode::Locale;
use Encode;
use utf8;

binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");

=encoding UTF8

=head1 NAME

jupiter - turn a list of feeds into a HTML page, a river of news

=head1 SYNOPSIS

To update the feeds from one or more OPML files:

B<jupiter update> I<feed.opml> â€¦ [I</regex/> â€¦]

To generate F<index.html>:

B<jupiter html> I<feed.opml>

=head1 DESCRIPTION

Planet Jupiter is used to pull together the latest updates from a bunch of other
sites and display them on a single web page, the "river of news". The sites we
get our updates from are defined in an OPML file.

A river of news, according to Dave Winer, is a feed aggregator. New items appear
at the top and old items disappear at the bottom. When it's gone, it's gone.
There is no count of unread items. The goal is to fight the I<fear of missing
out> (FOMO).

Each item looks similar to every other: headline, link, an extract, maybe a date
and an author. Extracts contain but the beginning of the article's text; all
markup is removed; no images. The goal is to make the page easy to skim.
Scroll down until you find something interesting and follow the link to the
original article if you want to read it.

=head2 The OPML file

You B<need> an OPML file. It's an XML file linking to I<feeds>. Here's an
example listing just one feed. In order to add more, add more C<outline>
elements with the C<xmlUrl> attribute. The exact order and nesting does not
matter. People can I<import> these OPML files into their own feed readers and
thus it may make sense to spend a bit more effort in making it presentable.

    <opml version="2.0">
      <body>
        <outline title="Alex Schroeder"
                 xmlUrl="https://alexschroeder.ch/view/index.rss"/>
      </body>
    </opml>

=head2 Update the feeds in your cache

This is how you update the feeds in a file called C<feed.opml>. It downloads all
the feeds linked to in the OPML file and stores them in the cache directory.

    jupiter update feed.opml

The directory used to keep a copy of all the feeds in the OPML file has the same
name as the OPML file but without the .opml extension. In other words, if your
OPML file is called C<feed.opml> then the cache directory is called C<feed>.

This operation takes long because it requests an update from all the sites
listed in your OPML file. Don't run it too often or you'll annoy the site
owners.

The OPML file must use the .opml extension. You can update the feeds for
multiple OPML files in one go.

=head2 Adding just one feed

After a while, the list of feeds in your OPML starts getting unwieldy. When you
add a new feed, you might not want to fetch all of them. In this case, provide a
regular expression surrounded by slashes to the C<update> command:

    jupiter update feed.opml /example/

Assuming a feed with a URL or title that matches the regular expression is
listed in your OPML file, only that feed is going to get updated.

There is no need to escape slashes in the regular expression: C<//rss/> works
just fine. Beware shell escaping, however. Most likely, you need to surround the
regular expression with single quotes if it contains spaces:

    jupiter update feed.opml '/Halberds & Helmets/'

Notice how we assume that named entities such as C<&amp;> have already been
parsed into the appropriate strings.

=head2 Generate the HTML
( run in 0.426 second using v1.01-cache-2.11-cpan-39bf76dae61 )