App-FeedDeduplicator


lib/App/FeedDeduplicator/Deduplicator.pm

deduplicated entries are stored in the $deduplicated attribute.

It is designed to be used in conjunction with the Aggregator and Publisher
classes to provide a complete feed deduplication and publishing solution.

=head2 find_canonical

Finds the canonical link for a given entry. It fetches the page at the
entry's link using LWP::UserAgent and parses the HTML content using
HTML::TreeBuilder::XPath.

It looks for the <link rel="canonical"> tag in the HTML content and returns
the canonical URL if one is found. If no canonical link is found, it returns
undef.

It is used during the deduplication process to determine the unique
identifier for each entry.
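
The lookup described above can be sketched roughly as follows. This is a
minimal illustration and not the module's actual code: the function names
canonical_from_html and find_canonical_sketch, the 10-second timeout, and the
choice to return undef on a failed fetch are all assumptions made for the
example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder::XPath;

# Hypothetical helper: extract the canonical URL from an HTML string.
# Returns undef when no <link rel="canonical"> tag is present.
sub canonical_from_html {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse($html);
    $tree->eof;

    # Take the href of the first matching <link> element, if any.
    my ($href) = $tree->findvalues('//link[@rel="canonical"]/@href');
    $tree->delete;    # free the parse tree

    return ( defined $href && length $href ) ? $href : undef;
}

# Hypothetical wrapper: fetch the page, then look for the canonical link.
# Returning undef on a failed request is an assumption of this sketch.
sub find_canonical_sketch {
    my ($url) = @_;

    my $ua  = LWP::UserAgent->new( timeout => 10 );
    my $res = $ua->get($url);
    return undef unless $res->is_success;

    return canonical_from_html( $res->decoded_content );
}
```

A caller might fall back to the entry's own link when the sketch returns
undef, so that every entry still has an identifier for deduplication.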

=cut

package App::FeedDeduplicator::Deduplicator; # For MetaCPAN


