WWW-Mechanize-Chrome-DOMops

README

        my %default_mech_params = (
            headless => 1,
        #   log => $mylogger,
            launch_arg => [
                    '--window-size=600,800', # chrome expects width,height separated by a comma
                    '--password-store=basic', # do not ask me for stupid chrome account password
        #           '--remote-debugging-port=9223',
        #           '--enable-logging', # see also log above
                    '--disable-gpu',
                    '--no-sandbox',
                    '--ignore-certificate-errors',
                    '--disable-background-networking',
                    '--disable-client-side-phishing-detection',
                    '--disable-component-update',
                    '--disable-hang-monitor',
                    '--disable-save-password-bubble',
                    '--disable-default-apps',
                    '--disable-infobars',
                    '--disable-popup-blocking',
            ],
        );
    
        my $mech_obj = eval {
            WWW::Mechanize::Chrome->new(%default_mech_params)
        };
        die $@ if $@;
    
        # This transfers all javascript code's console.log(...)
        # messages to perl's warn()
        # we need to keep $console var in scope!
        my $console = $mech_obj->add_listener('Runtime.consoleAPICalled', sub {
              warn
                  "js console: "
                . join ", ",
                  map { $_->{value} // $_->{description} }
                  @{ $_[0]->{params}->{args} };
            })
        ;
    
        # and now fetch a page
        my $URL = '...';
        my $retmech = $mech_obj->get($URL);
        die "failed to fetch $URL" unless defined $retmech;
        $mech_obj->sleep(1); # let it settle
        # now the mech object has loaded the URL and has a DOM hopefully.
        # You can pass it on to domops_find() or domops_zap() to operate on the DOM.

SECURITY WARNING

    WWW::Mechanize::Chrome invokes the google-chrome executable on behalf
    of the current user. Headless or not, google-chrome is invoked.
    Depending on the launch parameters, either a fresh, new browser session
    will be created or the session of the current user with their profile,
    data, cookies, passwords, history, etc. will be used. The latter case
    is very dangerous.

    This behaviour is controlled by WWW::Mechanize::Chrome's constructor
    parameters which, in turn, are used for launching the google-chrome
    executable. Specifically, see WWW::Mechanize::Chrome#separate_session,
    WWW::Mechanize::Chrome#data_directory and
    WWW::Mechanize::Chrome#incognito.

    Unless you really need to mechsurf with your current session, aim to
    launch the browser with a fresh new session. This is the safest
    option.

    Do not rely on default behaviour as this may change over time. Be
    explicit.
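
    As a minimal sketch of what "being explicit" could look like, the
    constructor call below asks for a fresh session by name instead of
    relying on defaults. It uses only the separate_session and incognito
    parameters mentioned above; the variable names are illustrative, and
    you should check the exact semantics of each parameter against the
    documentation of your installed WWW::Mechanize::Chrome version.

```perl
use strict;
use warnings;
use WWW::Mechanize::Chrome;

# Ask explicitly for a fresh session; do not rely on defaults.
my %session_params = (
    headless         => 1,
    separate_session => 1, # start a new, empty session (no existing cookies)
    incognito        => 1, # additionally run the browser in incognito mode
);

my $mech_obj = eval { WWW::Mechanize::Chrome->new(%session_params) };
die "failed to launch google-chrome: $@" if $@ || !defined $mech_obj;
```

    These keys can be merged into the %default_mech_params hash shown
    earlier so that every browser launch carries them.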

    Also, be warned that WWW::Mechanize::Chrome::DOMops executes javascript
    code on that google-chrome instance. This is done internally with
    javascript code hardcoded into WWW::Mechanize::Chrome::DOMops's
    package files.

    On top of that WWW::Mechanize::Chrome::DOMops allows for user-specified
    javascript code to be executed on that google-chrome instance. For
    example the callbacks on each element found, etc.

    This is an example of what can go wrong if you are not using a fresh
    google-chrome session:

    You have just used google-chrome to access your Yahoo webmail and you
    did not log out. So, there will be an access cookie in google-chrome
    when you later invoke it via WWW::Mechanize::Chrome (remember you have
    not told it to use a fresh session).

    If you allow unchecked user-specified (or copy-pasted from ChatGPT)
    javascript code in WWW::Mechanize::Chrome::DOMops's domops_find(),
    domops_zap(), etc. then it is, theoretically, possible that this
    javascript code initiates an XHR to Yahoo, fetches your emails and
    passes them on to your perl code.

    There is another problem: the integrity of the javascript code
    embedded in WWW::Mechanize::Chrome::DOMops may itself have been
    compromised in order to exploit your current session.

    This is more likely on a Windows installation where, being the
    security swiss cheese it is, anyone may be able to modify your
    module's code. It is less likely on Linux if your modules are
    installed by root and are read-only for normal users. Still, a
    compromise is possible there too (by root).

    Another issue is with the saved passwords and the browser's auto-fill
    when landing on a login form.

    Therefore, for all these reasons, it is advised not to invoke (via
    WWW::Mechanize::Chrome) google-chrome with your
    current/usual/everyday/email-access/bank-access identity so that it
    does not have access to your cookies, passwords, history etc.

    It is better to create a fresh google-chrome identity/profile and use
    that for your WWW::Mechanize::Chrome::DOMops needs.

    No matter what identity you use, you may want to erase the cookies and
    history of google-chrome upon its exit. That's a good practice.
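
    One possible way to get this clean-up for free is to point the
    browser at a throwaway data directory that is deleted when your
    script exits. The sketch below assumes the data_directory constructor
    parameter referenced above and uses File::Temp from the perl core;
    the directory template name is made up for illustration. Note that
    some session parameters may take precedence over data_directory, so
    check the documentation of your installed version.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use WWW::Mechanize::Chrome;

# Throwaway directory under the system temp dir,
# removed automatically when this perl process exits.
my $tmp_data_dir = tempdir('domops-chrome-XXXXXX', TMPDIR => 1, CLEANUP => 1);

my $mech_obj = WWW::Mechanize::Chrome->new(
    headless       => 1,
    data_directory => $tmp_data_dir, # cookies, history etc. end up here
);

# ... fetch pages and run domops_find() / domops_zap() here ...

# Release the object before exit so that the browser it launched is
# (normally) shut down before the directory is cleaned up.
undef $mech_obj;
```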

    It is also advised to review the javascript code you provide via
    WWW::Mechanize::Chrome::DOMops callbacks if it comes from a third
    party, human or not, e.g. ChatGPT.

    Additionally, make sure that the current installation of


