NAME
App::WebSearchUtils - CLI utilities related to searching with search
engines
VERSION
This document describes version 0.001 of App::WebSearchUtils (from Perl
distribution App-WebSearchUtils), released on 2022-10-10.
SYNOPSIS
This distribution provides the following utilities:
* web-search
FUNCTIONS
web_search
Usage:
web_search(%args) -> [$status_code, $reason, $payload, \%result_meta]
Open web search page in browser.
This utility can save you time when you want to open multiple queries
(with added common prefix/suffix words) or specify options like a
time limit. It formulates the search URL(s) and then opens them for
you in a browser. You can also choose to print the URLs instead.
Aside from standard web search, you can also generate/open other
searches like image, video, news, or map.
This function is not exported.
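
Since the function is not exported, it must be called with its fully
qualified name. A minimal sketch of such a call (assuming the
distribution is installed; the "queries" argument name is an
assumption, as the full argument list is not shown here, while
"action", "engine", and "append" are documented below):

    use App::WebSearchUtils;

    # Print search URLs instead of opening a browser.
    my $res = App::WebSearchUtils::web_search(
        queries => ["perl testing", "perl debugging"],  # assumed argument name
        engine  => "google",
        append  => "tutorial",      # appended to each query
        action  => "print_url",
    );

    # The result is an envelope: [$status_code, $reason, $payload, \%result_meta]
    my ($status, $reason) = @$res;
    warn "web_search failed: $status $reason\n" unless $status == 200;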
Arguments ('*' denotes required arguments):
* action => *str* (default: "open_url")
What to do with the URLs.
Instead of opening the queries in a browser ("open_url"), you can
also perform another action instead.
Printing search URLs: "print_url" will print the search URL.
"print_html_link" will print the HTML link (the <a> tag). And
"print_org_link" will print the Org-mode link, e.g.
"[[url...][query]]".
Saving search result HTMLs: "save_html" will first visit each search
URL (currently using Firefox::Marionette) then save each result page
to a file named "<num>-<query>.html" in the current directory.
Existing files will not be overwritten; the utility will save to
"*.html.1", "*.html.2" and so on instead.
Extracting search result links: "print_result_link" will first
visit each search URL (currently using Firefox::Marionette), then
extract the result links and print them. "print_result_html_link"
and "print_result_org_link" are similar but will instead format each
link as an HTML and Org link, respectively.
The "print_result_*link" actions are not very useful for some search
engines like Google, because the result HTML page is obfuscated, so
we can only extract all links on each page instead of selecting (via
the DOM) only the actual search result entry links.
If you want to filter the links further by domain, path, etc. you
can use grep-url.
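
For example, a sketch of combining the two utilities in a pipeline
(the exact "web-search" option syntax is an assumption based on the
documented arguments; see grep-url's own documentation for its
filtering options):

    # Print extracted result links, then filter them with grep-url
    web-search --action print_result_link "perl web scraping" | grep-url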
* append => *str*
String to add at the end of each query.
* delay => *duration*
Delay between opening each query.
As an alternative to the "--delay" option, you can also use
"--min-delay" and "--max-delay" to set a random delay between a
minimum and maximum value.
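
For example, a sketch of using a randomized delay (the "--min-delay"
and "--max-delay" option names come from the text above; the duration
value syntax shown is an assumption):

    # Open each query's search page, waiting a random 5-15 seconds between them
    web-search --min-delay 5s --max-delay 15s "query one" "query two"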
* engine => *str* (default: "google")
Search engine to use.
* max_delay => *duration*
Maximum delay between opening each query (used together with
"min_delay" to pick a random delay between the two values).