WWW-Mechanize-Chrome-DOMops
my %default_mech_params = (
    headless => 1,
    # log => $mylogger,
    launch_arg => [
        '--window-size=600,800', # chrome expects width,height (comma-separated)
        '--password-store=basic', # do not ask me for stupid chrome account password
        # '--remote-debugging-port=9223',
        # '--enable-logging', # see also log above
        '--disable-gpu',
        '--no-sandbox',
        '--ignore-certificate-errors',
        '--disable-background-networking',
        '--disable-client-side-phishing-detection',
        '--disable-component-update',
        '--disable-hang-monitor',
        '--disable-save-password-bubble',
        '--disable-default-apps',
        '--disable-infobars',
        '--disable-popup-blocking',
    ],
);
my $mech_obj = eval {
    WWW::Mechanize::Chrome->new(%default_mech_params)
};
die $@ if $@;
# This transfers all javascript code's console.log(...)
# messages to perl's warn()
# we need to keep $console var in scope!
my $console = $mech_obj->add_listener('Runtime.consoleAPICalled', sub {
    warn
        "js console: "
        . join ", ",
            map { $_->{value} // $_->{description} }
            @{ $_[0]->{params}->{args} };
});
# and now fetch a page
my $URL = '...';
my $retmech = $mech_obj->get($URL);
die "failed to fetch $URL" unless defined $retmech;
$mech_obj->sleep(1); # let it settle
# now the mech object has loaded the URL and has a DOM hopefully.
# You can pass it on to domops_find() or domops_zap() to operate on the DOM.
SECURITY WARNING
WWW::Mechanize::Chrome invokes the google-chrome executable on behalf
of the current user. Headless or not, google-chrome is invoked.
Depending on the launch parameters, either a fresh, new browser session
will be created or the session of the current user with their profile,
data, cookies, passwords, history, etc. will be used. The latter case
is very dangerous.
This behaviour is controlled by WWW::Mechanize::Chrome's constructor
parameters which, in turn, are used for launching the google-chrome
executable. Specifically, see WWW::Mechanize::Chrome#separate_session,
WWW::Mechanize::Chrome#data_directory and
WWW::Mechanize::Chrome#incognito.
Unless you really need to mechsurf with your current session, aim to
launch the browser with a fresh new session. This is the safest
option.
Do not rely on default behaviour as this may change over time. Be
explicit.
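Being explicit could look like the minimal sketch below. It only uses
the constructor parameters named above (separate_session and incognito,
plus headless from the earlier example). The helper names
fresh_session_params() and launch_fresh_mech() are hypothetical and are
not part of either module. Check the parameter semantics against the
WWW::Mechanize::Chrome version you have installed.

```perl
use strict;
use warnings;

# Build constructor parameters that ask for a fresh session explicitly,
# instead of relying on whatever the defaults happen to be.
# (Hypothetical helper, not part of WWW::Mechanize::Chrome.)
sub fresh_session_params {
    my %extra = @_;
    return (
        headless         => 1,   # default, the caller may override it
        %extra,
        # These two come last so that the caller cannot switch them off.
        separate_session => 1,   # ask for a new, empty session
        incognito        => 1,   # ask for incognito mode
    );
}

# Chrome is launched only when this is called, never as a side effect
# of loading the code. (Hypothetical helper.)
sub launch_fresh_mech {
    require WWW::Mechanize::Chrome;
    return WWW::Mechanize::Chrome->new( fresh_session_params(@_) );
}
```

Calling launch_fresh_mech(launch_arg => [...]) would then pass your own
launch arguments through while keeping the session settings fixed.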
Also, be warned that WWW::Mechanize::Chrome::DOMops executes javascript
code on that google-chrome instance. This is done internally with
javascript code hardcoded into WWW::Mechanize::Chrome::DOMops's
package files.
On top of that, WWW::Mechanize::Chrome::DOMops allows user-specified
javascript code to be executed on that google-chrome instance, for
example the callbacks run on each element found.
This is an example of what can go wrong if you are not using a fresh
google-chrome session:
You have just used google-chrome to access your yahoo webmail and you
did not log out. So there will be an access cookie in google-chrome
when you later invoke it via WWW::Mechanize::Chrome (remember, you have
not told it to use a fresh session).
If you allow unchecked user-specified (or copy-pasted from ChatGPT)
javascript code in WWW::Mechanize::Chrome::DOMops's domops_find(),
domops_zap(), etc. then it is, theoretically, possible that this
javascript code initiates an XHR to yahoo, fetches your emails and
passes them on to your perl code.
But there is another problem: the integrity of the javascript code
embedded in WWW::Mechanize::Chrome::DOMops may have been compromised in
order to exploit your current session.
This is very likely with a Windows installation which, being the
security swiss cheese it is, makes it possible for anyone to compromise
your module's code. It is less likely in Linux, if your modules are
installed by root and are read-only for normal users. But, still, it is
possible to be compromised (by root).
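One rough way to check the "read-only for normal users" condition is
sketched below. It locates a module's file through @INC without loading
it and reports whether the current user could modify it. The helper
names module_path() and writable_by_me() are hypothetical. This is only
a permissions check, not a proof that the file is untampered.

```perl
use strict;
use warnings;

# Locate an installed module's .pm file by searching @INC,
# without loading the module. (Hypothetical helper.)
sub module_path {
    my ($module) = @_;
    ( my $rel = "$module.pm" ) =~ s{::}{/}g;
    for my $dir ( grep { !ref } @INC ) {    # skip @INC hooks
        my $path = "$dir/$rel";
        return $path if -f $path;
    }
    return;
}

# Return those of the given files that the current user can write to.
# (Hypothetical helper.)
sub writable_by_me {
    return grep { -e $_ && -w $_ } @_;
}

my $path = module_path('WWW::Mechanize::Chrome::DOMops');
warn "$path can be modified by the current user\n"
    if defined $path && writable_by_me($path);
```

Note that -w is always true for root, so run the check as the user who
will be driving the browser.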
Another issue is with the saved passwords and the browser's auto-fill
when landing on a login form.
Therefore, for all these reasons, it is advised not to invoke (via
WWW::Mechanize::Chrome) google-chrome with your
current/usual/everyday/email-access/bank-access identity so that it
does not have access to your cookies, passwords, history etc.
It is better to create a fresh google-chrome identity/profile and use
that for your WWW::Mechanize::Chrome::DOMops needs.
No matter what identity you use, you may want to erase the cookies and
history of google-chrome upon its exit. That's a good practice.
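A throwaway profile covers both points: it starts empty and it is
deleted afterwards. The sketch below assumes the data_directory
constructor parameter mentioned earlier and uses File::Temp from the
Perl core. The helper name throwaway_profile_params() and the directory
name template are hypothetical.

```perl
use strict;
use warnings;
use File::Temp ();

# Build constructor parameters that point chrome at a throwaway profile.
# The File::Temp object is returned as well: the directory is deleted
# when that object goes out of scope, so keep it alive for as long as
# the browser is running. (Hypothetical helper.)
sub throwaway_profile_params {
    my $profile = File::Temp->newdir( 'domops-profile-XXXXXX', TMPDIR => 1 );
    my %params  = (
        headless       => 1,
        data_directory => $profile->dirname,
    );
    return ( $profile, %params );
}

{
    my ( $profile, %params ) = throwaway_profile_params();
    # my $mech_obj = WWW::Mechanize::Chrome->new(%params);
    # ... work with $mech_obj ...
    # undef $mech_obj;    # let the browser exit before the cleanup
}   # $profile goes out of scope here and the directory is removed
```

The cleanup can only remove everything if the browser has already
exited, hence the suggestion to destroy the mech object first.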
It is also advised to review the javascript code you provide via
WWW::Mechanize::Chrome::DOMops callbacks if it is taken from a 3rd
party, human or not, e.g. ChatGPT.
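A crude first pass for such a review can be automated. The sketch below
flags javascript features that are able to send data elsewhere or read
session secrets, such as the XHR in the yahoo example above. The
pattern list and the helper name risky_js_features() are hypothetical
and far from exhaustive, so a clean result is not proof that the code
is safe. It only tells you where to look first.

```perl
use strict;
use warnings;

# Patterns for javascript features that can talk to the network or
# read cookies. Illustrative only, easily defeated by obfuscation.
my @RISKY_JS = (
    [ qr/\bXMLHttpRequest\b/         => 'XMLHttpRequest' ],
    [ qr/\bfetch\s*\(/               => 'fetch()' ],
    [ qr/\bWebSocket\b/              => 'WebSocket' ],
    [ qr/\bsendBeacon\b/             => 'sendBeacon' ],
    [ qr/\bdocument\s*\.\s*cookie\b/ => 'document.cookie' ],
    [ qr/\beval\s*\(/                => 'eval()' ],
);

# Return the names of the risky features found in a piece of
# javascript source code. (Hypothetical helper.)
sub risky_js_features {
    my ($js) = @_;
    return map { $_->[1] } grep { $js =~ $_->[0] } @RISKY_JS;
}

# Example: vet a callback before handing it over.
my $callback_js = 'console.log("found element " + htmlElement.id);';
my @hits        = risky_js_features($callback_js);
die "refusing to run callback, it uses: @hits\n" if @hits;
```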
Additionally, make sure that the current installation of