Firefox-Marionette
accepts a filesystem path to a bookmarks file and imports all the
bookmarks in that file. It can deal with backups from Firefox
<https://support.mozilla.org/en-US/kb/export-firefox-bookmarks-to-backup-or-transfer>,
Chrome <https://support.google.com/chrome/answer/96816?hl=en> or Edge.
use Firefox::Marionette();
use v5.10;
my $firefox = Firefox::Marionette->new()->import_bookmarks('/path/to/bookmarks_file.html');
This method returns itself to aid in chaining methods.
images
returns a list of all of the following elements;
* img <https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img>
* image inputs
<https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/image>
as Firefox::Marionette::Image objects.
This method is subject to the implicit timeout, which, by default, is 0
seconds.
use Firefox::Marionette();
use v5.10;
my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
if (my $image = $firefox->images()) {
say "Found an image with width " . $image->width() . "px and height " . $image->height() . "px from " . $image->URL();
}
If no elements are found, this method will return undef.
install
accepts the following as the first parameter;
* path to an xpi file
<https://developer.mozilla.org/en-US/docs/Mozilla/XPI>.
* path to a directory containing firefox extension source code
<https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Your_first_WebExtension>.
This directory will be packaged up as an unsigned xpi file.
* path to a top level file (such as manifest.json
<https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Anatomy_of_a_WebExtension#manifest.json>)
in a directory containing firefox extension source code
<https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Your_first_WebExtension>.
This directory will be packaged up as an unsigned xpi file.
and an optional true/false second parameter to indicate if the xpi file
should be a temporary extension
<https://extensionworkshop.com/documentation/develop/temporary-installation-in-firefox/>
(just for the existence of this browser instance). Unsigned xpi files
may only be loaded temporarily
<https://wiki.mozilla.org/Add-ons/Extension_Signing> (except for
nightly firefox installations
<https://www.mozilla.org/en-US/firefox/channel/desktop/#nightly>). It
returns the GUID for the addon which may be used as a parameter to the
uninstall method.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new();
my $extension_id = $firefox->install('/full/path/to/gnu_terry_pratchett-0.4-an+fx.xpi');
# OR downloading and installing source code
system { 'git' } 'git', 'clone', 'https://github.com/kkapsner/CanvasBlocker.git';
if ($firefox->nightly()) {
$extension_id = $firefox->install('./CanvasBlocker'); # permanent install for unsigned packages in nightly firefox
} else {
$extension_id = $firefox->install('./CanvasBlocker', 1); # temp install for normal firefox
}
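Because install returns the GUID of the addon, a temporary installation can be paired with a later call to uninstall. The following is a minimal sketch of that round trip, not a recipe taken from the module itself. The helper name with_temporary_extension and its callback convention are hypothetical, and the ./CanvasBlocker directory is assumed to exist as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Firefox::Marionette): install an extension
# temporarily, run a callback, then always uninstall the extension again.
sub with_temporary_extension {
    my ( $firefox, $path, $callback ) = @_;
    my $extension_id = $firefox->install( $path, 1 );    # 1 => temporary install
    my $ok    = eval { $callback->( $firefox, $extension_id ); 1 };
    my $error = $@;
    $firefox->uninstall($extension_id);    # the GUID returned by install
    die $error if !$ok;
    return $extension_id;
}

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox = Firefox::Marionette->new();
    with_temporary_extension( $firefox, './CanvasBlocker',
        sub { $_[0]->go('https://metacpan.org/') } );
}
```

The callback's exception (if any) is re-thrown only after uninstall has been called, so the extension does not linger when the callback fails.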
interactive
returns true if document.readyState === "interactive" or if loaded is
true.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
$firefox->find_id('metacpan_search-input')->type('Type::More');
$firefox->await(sub { $firefox->find_class('autocomplete-suggestion'); })->click();
while(!$firefox->interactive()) {
# redirecting to Test::More page
}
is_displayed
accepts an element as the first parameter. This method returns true or
false depending on if the element is displayed
<https://firefox-source-docs.mozilla.org/testing/marionette/internals/interaction.html#interaction.isElementDisplayed>.
is_enabled
accepts an element as the first parameter. This method returns true or
false depending on if the element is enabled
<https://w3c.github.io/webdriver/#is-element-enabled>.
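As a minimal sketch of how is_displayed and is_enabled might be combined before interacting with an element: the helper name element_state is hypothetical, and the 'metacpan_search-input' id is borrowed from the interactive example above.

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical helper: report whether an element is displayed and enabled.
sub element_state {
    my ( $firefox, $element ) = @_;
    my %state = (
        displayed => $firefox->is_displayed($element) ? 1 : 0,
        enabled   => $firefox->is_enabled($element)   ? 1 : 0,
    );
    return \%state;
}

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
    my $element = $firefox->find_id('metacpan_search-input');
    my $state   = element_state( $firefox, $element );
    if ( $state->{displayed} && $state->{enabled} ) {
        $element->type('Test::More');
    }
    else {
        say 'The search box cannot be used yet';
    }
}
```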
is_selected
accepts an element as the first parameter. This method returns true or
false depending on if the element is selected
<https://w3c.github.io/webdriver/#dfn-is-element-selected>. Note that
this method only makes sense for checkbox
<https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/checkbox>
or radio
<https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/radio>
inputs or option
<https://developer.mozilla.org/en-US/docs/Web/HTML/Element/option>
* developer - only allow a developer edition
<https://www.mozilla.org/en-US/firefox/developer/> to be launched.
This defaults to "0" (off).
* devtools - begin the session with the devtools
<https://developer.mozilla.org/en-US/docs/Tools> window opened in a
separate window.
* geo - setup the browser preferences
<http://kb.mozillazine.org/About:config> to allow the Geolocation API
<https://developer.mozilla.org/en-US/docs/Web/API/Geolocation_API> to
work. If the value for this key is a URI object or a string matching
the pattern '^(?:data|http)', this object will be retrieved using the json
method and the response will be used to build a GeoLocation object,
which will be sent to the geo method. If the value for this key is a
hash, the hash will be used to build a GeoLocation object, which will
be sent to the geo method.
* height - set the height
<http://kb.mozillazine.org/Command_line_arguments#List_of_command_line_arguments_.28incomplete.29>
of the initial firefox window
* har - begin the session with the devtools
<https://developer.mozilla.org/en-US/docs/Tools> window opened in a
separate window. The HAR Export Trigger
<https://addons.mozilla.org/en-US/firefox/addon/har-export-trigger/>
addon will be loaded into the new session automatically, which means
that -safe-mode
<http://kb.mozillazine.org/Command_line_arguments#List_of_command_line_arguments_.28incomplete.29>
will not be activated for this session AND this functionality will
only be available for Firefox 61+.
* host - use ssh <https://man.openbsd.org/ssh.1> to create and
automate firefox on the specified host. See REMOTE AUTOMATION OF
FIREFOX VIA SSH and NETWORK ARCHITECTURE. The user will default to
the current user name (see the user parameter to change this).
Authentication should be via public keys loaded into the local
ssh-agent <https://man.openbsd.org/ssh-agent>.
* implicit - a shortcut to allow directly providing the implicit
timeout, instead of needing to use timeouts from the capabilities
parameter. This overrides the longer ways of setting the implicit
timeout.
* index - a parameter to allow the user to specify a specific firefox
instance to survive and reconnect to. It does not do anything else at
the moment. See the survive parameter.
* insecure - this is a shortcut method for setting the
accept_insecure_certs option in the capabilities parameter above.
* kiosk - start the browser in kiosk
<https://support.mozilla.org/en-US/kb/firefox-enterprise-kiosk-mode>
mode.
* mime_types - any MIME types that Firefox will encounter during this
session. MIME types that are not specified will result in a hung
browser (the File Download popup will appear).
* nightly - only allow a nightly release
<https://www.mozilla.org/en-US/firefox/channel/desktop/#nightly> to
be launched. This defaults to "0" (off).
* port - if the "host" parameter is also set, use ssh
<https://man.openbsd.org/ssh.1> to create and automate firefox via
the specified port. See REMOTE AUTOMATION OF FIREFOX VIA SSH and
NETWORK ARCHITECTURE.
* page_load - a shortcut to allow directly providing the page_load
timeout, instead of needing to use timeouts from the capabilities
parameter. This overrides the longer ways of setting the page_load
timeout.
* profile - create a new profile based on the supplied profile. NOTE:
firefox ignores any changes made to the profile on the disk while it
is running; instead, use the set_pref and clear_pref methods to make
changes while firefox is running.
* profile_name - pick a specific existing profile to automate, rather
than creating a new profile. Firefox <https://firefox.com> refuses to
allow more than one instance of a profile to run at the same time.
Profile names can be obtained by using the
Firefox::Marionette::Profile::names() method. One of the following
conditions must be met to use an existing profile;
* the preference security.webauth.webauthn_enable_softtoken must be
set to true in the profile OR
* the webauthn parameter to this method must be set to 0
NOTE: firefox ignores any changes made to the profile on the disk
while it is running; instead, use the set_pref and clear_pref methods
to make changes while firefox is running.
* proxy - this is a shortcut method for setting a proxy using the
capabilities parameter above. It accepts a proxy URL, with the
following allowable schemes, 'http' and 'https'. It also allows a
reference to a list of proxy URLs which will function as list of
proxies that Firefox will try in left to right order
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Proxy_servers_and_tunneling/Proxy_Auto-Configuration_PAC_file#description>
until a working proxy is found. See REMOTE AUTOMATION OF FIREFOX VIA
SSH, NETWORK ARCHITECTURE and SETTING UP SOCKS SERVERS USING SSH.
* reconnect - an experimental parameter to allow a reconnection to a
firefox instance whose connection has been discontinued. See the survive
parameter.
* scp - force the scp protocol when transferring files to remote
hosts via ssh. See REMOTE AUTOMATION OF FIREFOX VIA SSH and the
--scp-only option in the ssh-auth-cmd-marionette
<https://metacpan.org/pod/ssh-auth-cmd-marionette> script in this
distribution.
* script - a shortcut to allow directly providing the script timeout,
instead of needing to use timeouts from the capabilities parameter.
This overrides the longer ways of setting the script timeout.
* seer - this option is switched off "0" by default. When it is
switched on "1", it will activate the various speculative and
pre-fetch options for firefox. NOTE: this option only works when
profile_name/profile is not specified.
* sleep_time_in_ms - the amount of time (in milliseconds) that this
module should sleep when unsuccessfully calling the subroutine
provided to the await or bye methods. This defaults to "1"
millisecond.
* stealth - stops navigator.webdriver
<https://developer.mozilla.org/en-US/docs/Web/API/Navigator/webdriver>
from being accessible by the current web page. This is achieved by
loading an extension
<https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions>,
which will automatically switch on the addons parameter for the new
method. This is extremely experimental. See IMITATING OTHER BROWSERS
for a discussion.
* survive - if this is set to a true value, firefox will not
automatically exit when the object goes out of scope. See the
reconnect parameter for an experimental technique for reconnecting.
* system_access - firefox after version 138
<https://bugzilla.mozilla.org/show_bug.cgi?id=1944565> allows
disabling system access for javascript. By default, this module will
turn on system access.
* trust - give a path to a root certificate
<https://en.wikipedia.org/wiki/Root_certificate> encoded as a PEM
encoded X.509 certificate
<https://datatracker.ietf.org/doc/html/rfc7468#section-5> that will
be trusted for this session.
* timeouts - a shortcut to allow directly providing a timeout object,
instead of needing to use timeouts from the capabilities parameter.
Overrides the timeouts provided (if any) in the capabilities
parameter.
* trackable - if this is set, profile preferences will be set to make
it harder to be tracked by the browser's fingerprint
<https://en.wikipedia.org/wiki/Device_fingerprint#Browser_fingerprint>
across browser restarts. This is on by default, but may be switched
off by setting it to 0.
* user - if the "host" parameter is also set, use ssh
<https://man.openbsd.org/ssh.1> to create and automate firefox with
the specified user. See REMOTE AUTOMATION OF FIREFOX VIA SSH and
NETWORK ARCHITECTURE. The user will default to the current user name.
Authentication should be via public keys loaded into the local
ssh-agent <https://man.openbsd.org/ssh-agent>.
* via - specifies a proxy jump box
<https://man.openbsd.org/ssh_config#ProxyJump> to be used to connect
to a remote host. See the host parameter.
* visible - should firefox be visible on the desktop. This defaults
to "0". When moving from an X11 platform to another X11 platform, you
can set visible to 'local' to enable X11 forwarding
<https://man.openbsd.org/ssh#X>. See X11 FORWARDING WITH FIREFOX.
* waterfox - only allow a binary that looks like a waterfox version
<https://www.waterfox.net/> to be launched.
* webauthn - a boolean parameter to determine whether or not to add a
webauthn authenticator after the connection is established. The
default is to add a webauthn authenticator for Firefox after version
118.
* width - set the width
<http://kb.mozillazine.org/Command_line_arguments#List_of_command_line_arguments_.28incomplete.29>
of the initial firefox window
This method returns a new Firefox::Marionette object, connected to an
instance of firefox <https://firefox.com>. In a non MacOS/Win32/Cygwin
environment, if necessary (no DISPLAY variable can be found and the
visible parameter to the new method has been set to true) and possible
(Xvfb can be executed successfully), this method will also
automatically start an Xvfb <https://en.wikipedia.org/wiki/Xvfb>
instance.
use Firefox::Marionette();
my $remote_darwin_firefox = Firefox::Marionette->new(
debug => 'timestamp,nsHttp:1',
host => '10.1.2.3',
trust => '/path/to/root_ca.pem',
binary => '/Applications/Firefox.app/Contents/MacOS/firefox'
); # start a temporary profile for a remote firefox and load a new CA into the temp profile
...
foreach my $profile_name (Firefox::Marionette::Profile->names()) {
my $firefox_with_existing_profile = Firefox::Marionette->new( profile_name => $profile_name, visible => 1 );
...
}
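The shortcut parameters described above can be combined in a single call to new. The following sketch passes the three timeout shortcuts and a list of proxies. It assumes that timeout values are given in milliseconds and that the timeouts method returns an object with an implicit accessor. The proxy host names are placeholders.

```perl
use strict;
use warnings;
use v5.10;

# ASSUMPTION: timeout values are in milliseconds; proxy hosts are placeholders.
my %new_args = (
    implicit  => 2_000,     # used when finding elements
    page_load => 60_000,    # used when navigating
    script    => 10_000,    # used when running javascript
    proxy     => [          # tried in left to right order
        'https://proxy1.example.org:3128',
        'https://proxy2.example.org:3128',
    ],
);

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox = Firefox::Marionette->new(%new_args);
    say 'Implicit timeout is ' . $firefox->timeouts()->implicit();
    $firefox->quit();
}
```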
new_window
accepts an optional hash as the parameter. Allowed keys are below;
* focus - a boolean field representing if the new window should be
opened in the foreground (focused) or background (not focused). Defaults to
false.
* private - a boolean field representing if the new window should be
a private window. Defaults to false.
* type - the type of the new window. Can be one of 'tab' or 'window'.
Defaults to 'tab'.
Returns the window handle for the new window.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new();
my $window_handle = $firefox->new_window(type => 'tab');
$firefox->switch_to_window($window_handle);
new_session
creates a new WebDriver session. It is expected that the caller
performs the necessary checks on the requested capabilities to be
WebDriver conforming. The WebDriver service offered by Marionette does
not match or negotiate capabilities beyond type and bounds checks.
nightly
returns true if the current version of firefox is a nightly release
<https://www.mozilla.org/en-US/firefox/channel/desktop/#nightly> (does
the minor version number end with an 'a1'?)
paper_sizes
returns a list of all the recognised names for paper sizes, such as A4
or LEGAL.
pause
accepts a parameter in milliseconds and returns a corresponding action
for the perform method that will cause a pause in the chain of actions
given to the perform method.
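As a sketch of how pause might sit inside a chain of actions, the following builds a key press for each character with a pause between the key going down and coming back up. The helper name slow_typing_actions is hypothetical, and the example assumes that key_down and key_up each accept a single character.

```perl
use strict;
use warnings;

# Hypothetical helper: build an action chain that presses each character of
# $text in turn, pausing $delay_ms milliseconds between key down and key up.
sub slow_typing_actions {
    my ( $firefox, $text, $delay_ms ) = @_;
    my @actions;
    foreach my $character ( split //, $text ) {
        push @actions,
          $firefox->key_down($character),
          $firefox->pause($delay_ms),
          $firefox->key_up($character);
    }
    return @actions;
}

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
    $firefox->find_id('metacpan_search-input')->click();    # give the input focus
    $firefox->perform( slow_typing_actions( $firefox, 'Test', 100 ) );
    $firefox->release();
}
```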
pdf
accepts an optional hash as the first parameter with the following
allowed keys;
* landscape - Paper orientation. Boolean value. Defaults to false.
* margin - A hash describing the margins. The hash may have the
following optional keys, 'top', 'left', 'right' and 'bottom'. All
these keys are in cm and default to 1 (~0.4 inches)
* page - A hash describing the page. The hash may have the following
keys; 'height' and 'width'. Both keys are in cm and default to US
letter size. See the 'size' key.
* page_ranges - A list of the pages to print. Available for Firefox
96
<https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/96#webdriver_conformance_marionette>
and after.
* print_background - Print background graphics. Boolean value.
Defaults to false.
* raw - rather than a file handle containing the PDF, the binary PDF
will be returned.
* scale - Scale of the webpage rendering. Defaults to 1.
shrink_to_fit should be disabled to make scale work.
* size - The desired size (width and height) of the pdf, specified by
name. See the page key for an alternative and the paper_sizes method
for a list of accepted page size names.
* shrink_to_fit - Whether or not to override page size as defined by
CSS. Boolean value. Defaults to true.
returns a File::Temp object containing a PDF encoded version of the
current page for printing.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
my $handle = $firefox->pdf();
foreach my $paper_size ($firefox->paper_sizes()) {
$handle = $firefox->pdf(size => $paper_size, landscape => 1, margin => { top => 0.5, left => 1.5 });
...
}
print $firefox->pdf(page => { width => 21, height => 27 }, raw => 1);
my $firefox = Firefox::Marionette->new();
foreach my $credential ($firefox->webauthn_credentials()) {
say "Credential host is " . $credential->host();
}
# OR
my $authenticator = $firefox->add_webauthn_authenticator( transport => Firefox::Marionette::WebAuthn::Authenticator::INTERNAL(), protocol => Firefox::Marionette::WebAuthn::Authenticator::CTAP2() );
foreach my $credential ($firefox->webauthn_credentials($authenticator)) {
say "Credential host is " . $credential->host();
}
webauthn_set_user_verified
This method accepts a boolean for the is_user_verified field and an
optional authenticator (the default authenticator will be used
otherwise). It sets the is_user_verified field to the supplied boolean
value.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new();
$firefox->webauthn_set_user_verified(1);
wheel
accepts an element parameter, or a ( x => 0, y => 0 ) type hash manually
describing exactly where to move the mouse from and returns an action
for use in the perform method that corresponds to such a wheel
action, either to the specified co-ordinates or to the middle of the
supplied element parameter. Other parameters that may be passed are
listed below;
* origin - the origin of the (x => 0, y => 0) co-ordinates. Should
be either viewport, pointer or an element.
* duration - Number of milliseconds over which to distribute the
move. If not defined, the duration defaults to 0.
* deltaX - the change in X co-ordinates during the wheel. If not
defined, deltaX defaults to 0.
* deltaY - the change in Y co-ordinates during the wheel. If not
defined, deltaY defaults to 0.
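A minimal sketch of a wheel action handed to the perform method follows. It assumes that the extra parameters are passed as key/value pairs after the element. The helper name scroll_from_element is hypothetical and the element id is only an example.

```perl
use strict;
use warnings;

# Hypothetical helper: scroll by $delta_y from the middle of $element.
sub scroll_from_element {
    my ( $firefox, $element, $delta_y ) = @_;
    # ASSUMPTION: deltaY and duration follow the element as key/value pairs.
    my $wheel = $firefox->wheel( $element, deltaY => $delta_y, duration => 200 );
    $firefox->perform($wheel);
    $firefox->release();
    return;
}

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox = Firefox::Marionette->new()->go('https://metacpan.org/');
    scroll_from_element( $firefox, $firefox->find_id('metacpan_search-input'), 400 );
}
```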
win32_organisation
accepts a parameter of a Win32 product name and returns the matching
organisation. Only of interest when sub-classing.
win32_product_names
returns a hash of known Windows product names (such as 'Mozilla
Firefox') with priority orders. The priority determines the order in
which this module will check for the existence of each product, with
lower priorities checked first. Only of interest when sub-classing.
window_handle
returns the current window's handle. On desktop this typically
corresponds to the currently selected tab. Returns an opaque
server-assigned identifier to this window that uniquely identifies it
within this Marionette instance. This can be used to switch to this
window at a later point. This is the same as the window
<https://developer.mozilla.org/en-US/docs/Web/API/Window> object in
Javascript.
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new();
my $original_window = $firefox->window_handle();
my $javascript_window = $firefox->script('return window'); # only works for Firefox 121 and later
if ($javascript_window ne $original_window) {
die "That was unexpected!!! What happened?";
}
window_handles
returns a list of top-level browsing contexts. On desktop this
typically corresponds to the set of open tabs for browser windows, or
the window itself for non-browser chrome windows. Each window handle is
assigned by the server and is guaranteed unique, however the return
array does not have a specified ordering.
use Firefox::Marionette();
use 5.010;
my $firefox = Firefox::Marionette->new();
my $original_window = $firefox->window_handle();
$firefox->new_window( type => 'tab' );
$firefox->new_window( type => 'window' );
my @tabs = $firefox->window_handles();
say "There are " . scalar(@tabs) . " tabs open in total";
my @chrome_windows = $firefox->chrome()->window_handles();
$firefox->content(); # switch back to the content context
say "Across " . scalar(@chrome_windows) . " chrome windows";
window_rect
accepts an optional position and size as a parameter, sets the current
browser window to that position and size and returns the previous
position, size and state of the browser window. If no parameter is
supplied, it returns the current position, size and state of the
browser window.
window_type
returns the current window's type. This should be 'navigator:browser'.
xvfb_pid
returns the pid of the xvfb process if it exists.
xvfb_display
returns the value for the DISPLAY environment variable if one has been
generated for the xvfb environment.
xvfb_xauthority
returns the value for the XAUTHORITY environment variable if one has
been generated for the xvfb environment.
NETWORK ARCHITECTURE
This module allows for a complicated network architecture, including
SSH and HTTP proxies.
my $firefox = Firefox::Marionette->new(
host => 'Firefox.runs.here',
via => 'SSH.Jump.Box',
trust => '/path/to/ca-for-squid-proxy-server.crt',
proxy => 'https://Squid.Proxy.Server:3128'
)->go('https://Target.Web.Site');
produces the following effect, with an ascii box representing a
separate network node.
---------          ----------         -----------
| Perl  |  SSH     | SSH    |  SSH    | Firefox |
| runs  |--------->| Jump   |-------->| runs    |
| here  |          | Box    |         | here    |
operating systems, including recent versions of Windows 10 or Windows
Server 2019
<https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse>,
OS X, and Linux and BSD distributions. It expects to be able to login
to the remote node via public key authentication. It can be further
secured via the command
<https://man.openbsd.org/sshd#command=_command_> option in the OpenSSH
<https://www.openssh.com/> authorized_keys
<https://man.openbsd.org/sshd#AUTHORIZED_KEYS_FILE_FORMAT> file such
as;
no-agent-forwarding,no-pty,no-X11-forwarding,permitopen="127.0.0.1:*",command="/usr/local/bin/ssh-auth-cmd-marionette" ssh-rsa AAAA ... == user@server
As an example, the ssh-auth-cmd-marionette
<https://metacpan.org/pod/ssh-auth-cmd-marionette> command is provided
as part of this distribution.
The module will expect to access private keys via the local ssh-agent
<https://man.openbsd.org/ssh-agent> when authenticating.
When using ssh, Firefox::Marionette will attempt to pass the TMPDIR
<https://en.wikipedia.org/wiki/TMPDIR> environment variable across the
ssh connection to make cleanups easier. In order to allow this, the
AcceptEnv <https://man.openbsd.org/sshd_config#AcceptEnv> setting in
the remote sshd configuration should be set to allow TMPDIR, which will
look like;
AcceptEnv TMPDIR
This module uses ControlMaster
<https://man.openbsd.org/ssh_config#ControlMaster> functionality when
using ssh, for a useful speedup of executing remote commands.
Unfortunately, when using ssh to move from a cygwin
<https://gcc.gnu.org/wiki/SSH_connection_caching>, Windows 10 or
Windows Server 2019
<https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse>
node to a remote environment, we cannot use ControlMaster, because at
this time, Windows does not support ControlMaster
<https://github.com/Microsoft/vscode-remote-release/issues/96>.
This type of automation is therefore still possible, but slower than
on other client platforms.
The NETWORK ARCHITECTURE section has an example of a more complicated
network design.
WEBGL
There are a number of steps to getting WebGL
<https://en.wikipedia.org/wiki/WebGL> to work correctly;
1. The addons parameter to the new method must be set. This will
disable -safe-mode
<http://kb.mozillazine.org/Command_line_arguments#List_of_command_line_arguments_.28incomplete.29>.
2. The visible parameter to the new method must be set. This is due to
an existing bug in Firefox
<https://bugzilla.mozilla.org/show_bug.cgi?id=1375585>.
3. It can be tricky getting WebGL <https://en.wikipedia.org/wiki/WebGL>
to work with an Xvfb <https://en.wikipedia.org/wiki/Xvfb> instance.
glxinfo <https://dri.freedesktop.org/wiki/glxinfo/> can be useful to
help debug issues in this case. The mesa-dri-drivers rpm is also
required for Red Hat systems.
With all those conditions being met, WebGL
<https://en.wikipedia.org/wiki/WebGL> can be enabled like so;
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new( addons => 1, visible => 1 );
if ($firefox->script(q[let c = document.createElement('canvas'); return c.getContext('webgl2') ? true : c.getContext('experimental-webgl') ? true : false;])) {
$firefox->go("https://get.webgl.org/");
} else {
die "WebGL is not supported";
}
FILE UPLOADS
Uploading files in forms is accomplished by using the type command to
enter the full path of the file you want to upload. An example is shown
below;
use Firefox::Marionette();
use File::Spec();
use Cwd();
my $firefox = Firefox::Marionette->new();
my $firefox_marionette_directory = Cwd::cwd();
$firefox->go("https://practice.expandtesting.com/upload");
while($firefox->percentage_visible($firefox->find_id("fileSubmit")) < 90) {
sleep 1;
}
$firefox->find_id("fileInput")->type(File::Spec->catfile($firefox_marionette_directory, qw(t 04-uploads.t)));
$firefox->find_id("fileSubmit")->click();
FINDING ELEMENTS IN A SHADOW DOM
One aspect of Web Components
<https://developer.mozilla.org/en-US/docs/Web/API/Web_components> is
the shadow DOM
<https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM>.
When you need to explore the structure of a custom element
<https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_custom_elements>,
you need to access it via the shadow DOM. The following is an example
of navigating the shadow DOM via a html file included in the test suite
of this package.
use Firefox::Marionette();
use Cwd();
my $firefox = Firefox::Marionette->new();
my $firefox_marionette_directory = Cwd::cwd();
$firefox->go("file://$firefox_marionette_directory/t/data/elements.html");
my $shadow_root = $firefox->find_tag('custom-square')->shadow_root();
my $outer_div = $firefox->find_id('outer-div', $shadow_root);
So, this module is designed to allow you to navigate the shadow DOM
using normal find methods, but you must get the shadow element's shadow
root and use that as the root for the search into the shadow DOM. An
There is a collection of methods and techniques that may be useful if
you would like to change your geographic location or how the browser
appears to your web site.
* the stealth parameter of the new method. This parameter will stop the
browser reporting itself as a robot and will also (when combined with
the agent method) change other javascript characteristics to match
the User Agent
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent>
string.
* the agent method, which if supplied a recognisable User Agent
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent>,
will attempt to change other attributes to match the desired browser.
This is extremely experimental and feedback is welcome.
* the geo method, which allows the modification of the Geolocation
<https://developer.mozilla.org/en-US/docs/Web/API/Geolocation>
reported by the browser, but not the location produced by mapping the
external IP address used by the browser (see the NETWORK ARCHITECTURE
section for a discussion of different types of proxies that can be
used to change your external IP address).
* the languages method, which can change the requested languages
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language>
for your browser session.
* the tz method, which can change the timezone
<https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List>
for your browser session.
This list of methods may grow.
WEBSITES THAT BLOCK AUTOMATION
Marionette by design
<https://developer.mozilla.org/en-US/docs/Web/API/Navigator/webdriver>
allows web sites to detect that the browser is being automated. Firefox
no longer (since version 88)
<https://bugzilla.mozilla.org/show_bug.cgi?id=1632821> allows you to
disable this functionality while you are automating the browser, but
this can be overridden with the stealth parameter for the new method.
This is extremely experimental and feedback is welcome.
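One way to check the effect of the stealth parameter is to ask the page what navigator.webdriver reports. The following is a sketch under that assumption. The helper name reports_automation is hypothetical, and a false result here does not mean that a web site has no other way of detecting automation.

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical helper: true if the page sees navigator.webdriver as true.
sub reports_automation {
    my ($firefox) = @_;
    return $firefox->script('return navigator.webdriver === true;') ? 1 : 0;
}

# Only drive a real browser when run as a script with the module installed.
if ( !caller() && eval { require Firefox::Marionette; 1 } ) {
    my $firefox  = Firefox::Marionette->new( stealth => 1 )->go('https://metacpan.org/');
    my $detected = reports_automation($firefox);
    say $detected
      ? 'The page can see that the browser is automated'
      : 'navigator.webdriver is hidden from the page';
}
```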
If the web site you are trying to automate mysteriously fails when you
are automating a workflow, but it works when you perform the workflow
manually, you may be dealing with a web site that is hostile to
automation. I would be very interested if you can supply a test case.
At the very least, under these circumstances, it would be a good idea
to be aware that there's an ongoing arms race
<https://en.wikipedia.org/wiki/Web_scraping#Methods_to_prevent_web_scraping>,
and potential legal issues
<https://en.wikipedia.org/wiki/Web_scraping#Legal_issues> in this area.
X11 FORWARDING WITH FIREFOX
X11 Forwarding <https://man.openbsd.org/ssh#X> allows you to launch a
remote firefox via ssh and have it visually appear in your local X11
desktop. This can be accomplished with the following code;
use Firefox::Marionette();
my $firefox = Firefox::Marionette->new(
host => 'remote-x11.example.org',
visible => 'local',
debug => 1,
);
$firefox->go('https://metacpan.org');
Feedback is welcome on any odd X11 workarounds that might be required
for different platforms.
UBUNTU AND FIREFOX DELIVERED VIA SNAP
Ubuntu 22.04 LTS
<https://ubuntu.com/blog/ubuntu-22-04-lts-whats-new-linux-desktop> is
packaging firefox as a snap <https://ubuntu.com/blog/whats-in-a-snap>.
This breaks the way that this module expects to be able to run,
specifically, being able to set up a firefox profile in a system's
temporary directory (/tmp or $TMPDIR in most Unix based systems) and
allow the operating system to cleanup old directories caused by
exceptions / network failures / etc.
Because of this design decision, attempting to run a snap version of
firefox will simply result in firefox hanging, unable to read its
custom profile directory and hence unable to read the marionette port
configuration entry.
This would be workable, except that there does not appear to be _any_
way to detect that a snap firefox will run (/usr/bin/firefox is a bash
shell script which eventually runs the snap firefox), so there is no way to
know (heuristics aside) if a normal firefox or a snap firefox will be
launched by execing 'firefox'.
It seems the only way to fix this issue (as documented in more than a
few websites) is;
1. sudo snap remove firefox
2. sudo add-apt-repository -y ppa:mozillateam/ppa
3. sudo apt update
4. sudo apt install -t 'o=LP-PPA-mozillateam' firefox
5. echo -e "Package: firefox*\nPin: release o=LP-PPA-mozillateam\nPin-Priority: 501" >/tmp/mozillateamppa
6. sudo mv /tmp/mozillateamppa /etc/apt/preferences.d/mozillateamppa
If anyone is aware of a reliable method to detect if a snap firefox is
going to launch vs a normal firefox, I would love to know about it.
This technique is used in the setup-for-firefox-marionette-build.sh
script in this distribution.
DIAGNOSTICS
Failed to correctly setup the Firefox process
The module was unable to retrieve a session id and capabilities from
Firefox when it requests a new_session as part of the initial setup
of the connection to Firefox.
Failed to correctly determined the Firefox process id through the
initial connection capabilities
The module found that firefox is reporting, through its
Capabilities object, a different process id than this module was
using. This is probably a bug in this module's logic. Please report
as described in the BUGS AND LIMITATIONS section below.
'%s --version' did not produce output that could be parsed. Assuming
modern Marionette is available:%s
The Firefox binary did not produce a version number that could be