Config-Model-Systemd


lib/Config/Model/models/Systemd/Section/Service.pod


=head2 PrivateMounts

Takes a boolean parameter. If set, the processes of this unit will be run in their own private
file system (mount) namespace with all mount propagation from the processes towards the host's main file system
namespace turned off. This means any file system mount points established or removed by the unit's processes
will be private to them and not be visible to the host. However, file system mount points established or
removed on the host will be propagated to the unit's processes. See L<mount_namespaces(7)> for
details on file system namespaces. Defaults to off.

When turned on, this executes three operations for each invoked process: a new
C<CLONE_NEWNS> namespace is created, after which all existing mounts are remounted to
C<MS_SLAVE> to disable propagation from the unit's processes to the host (but leaving
propagation in the opposite direction in effect). Finally, the mounts are remounted again to the propagation
mode configured with C<MountFlags>, see below.

File system namespaces are set up individually for each process forked off by the service manager. Mounts
established in the namespace of the process created by C<ExecStartPre> will hence be cleaned
up automatically as soon as that process exits and will not be available to subsequent processes forked off for
C<ExecStart> (and similar applies to the various other commands configured for
units). Similarly, C<JoinsNamespaceOf> does not permit sharing kernel mount namespaces between
units; it only enables sharing of the C</tmp/> and C</var/tmp/>
directories.

Other file system namespace unit settings — C<PrivateTmp>,
C<PrivateDevices>, C<ProtectSystem>,
C<ProtectHome>, C<ReadOnlyPaths>,
C<InaccessiblePaths>, C<ReadWritePaths>,
C<BindPaths>, C<BindReadOnlyPaths>, … — also enable file system
namespacing in a fashion equivalent to this option. Hence it is primarily useful to explicitly
request this behaviour if none of the other settings are used. I<Optional. Type boolean.>

=over 4

=item upstream_default value :

no

=back
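
As an illustrative sketch only (the C<example-daemon> binary and its path are hypothetical and
not taken from this documentation), a unit that uses none of the other file system namespace
settings could request a private mount namespace explicitly like this:

    [Service]
    # Hypothetical service binary, for illustration only.
    ExecStart=/usr/bin/example-daemon
    # Mounts made by this unit's processes stay invisible to the host.
    PrivateMounts=yes

With such a fragment, mounts established on the host would still be propagated to the unit's
processes, as described above.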



=head2 MountFlags

Takes a mount propagation setting: C<shared>, C<slave> or
C<private>, which controls whether file system mount points in the file system namespaces set up
for this unit's processes will receive or propagate mounts and unmounts from other file system namespaces. See
L<mount(2)>
for details on mount propagation, and the three propagation flags in particular.

This setting only controls the final propagation setting in effect on all mount
points of the file system namespace created for each process of this unit. Other file system namespacing unit
settings (see the discussion in C<PrivateMounts> above) will implicitly disable mount and
unmount propagation from the unit's processes towards the host by changing the propagation setting of all mount
points in the unit's file system namespace to C<slave> first. Setting this option to
C<shared> does not reestablish propagation in that case.

If not set, but file system namespaces are enabled through another file system namespace unit
setting, C<shared> mount propagation is used. However, as mentioned above, C<slave> is applied
first, so propagation from the unit's processes to the host remains turned off.

It is not recommended to use C<private> mount propagation for units, as this means
temporary mounts (such as removable media) of the host will stay mounted and thus indefinitely busy in forked
off processes, as unmount propagation events will not be received by the file system namespace of the unit.

Usually, it is best to leave this setting unmodified, and use higher level file system namespacing
options instead, in particular C<PrivateMounts>, see above. I<Optional. Type uniline.>

=head2 SystemCallFilter

Takes a space-separated list of system call names or system call groups. If this
setting is used, all system calls executed by the unit's processes, except for the listed ones,
will be denied (allow-listing). If the first character of the list is
C<~>, the effect is inverted: only the listed system calls will be denied
(deny-listing). This option may be specified more than once, in which case the filter masks are
merged. If the empty string is assigned, the filter is reset and all prior assignments have no
effect.
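
The two modes can be sketched as follows. These are two alternative fragments, not one unit,
and the chosen system call names are purely illustrative:

    # Alternative A: allow-listing. Only the listed calls (plus the
    # implicitly allowed ones) are permitted.
    [Service]
    SystemCallFilter=read write

    # Alternative B: deny-listing. Only the listed calls are denied.
    [Service]
    SystemCallFilter=~ptrace

An allow list this short is unlikely to be sufficient for a real service; it is shown only to
contrast the two syntaxes.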

Commands prefixed with C<+> are not subject to filtering. The
execve(), exit(), exit_group(),
getrlimit(), rt_sigreturn(),
sigreturn() system calls and the system calls for querying time and sleeping are
implicitly allow-listed and do not need to be listed explicitly.

The default action when a system call is denied is to terminate the processes with a
C<SIGSYS> signal. This can be changed using C<SystemCallErrorNumber>,
see below. In addition, deny-listed system calls and system call groups may optionally be suffixed
with a colon (C<:>) and an argument in the same format as
C<SystemCallErrorNumber>, to take this action when the matching system call is made.
This takes precedence over the action specified in C<SystemCallErrorNumber>.
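
A minimal sketch of the per-call suffix, assuming that errno names such as C<EPERM> and
C<EACCES> are accepted values for C<SystemCallErrorNumber> (this section does not spell out
the value format):

    [Service]
    # Assumed value format: denied calls fail with EPERM instead of
    # the process being terminated.
    SystemCallErrorNumber=EPERM
    # ptrace carries its own suffix and would fail with EACCES;
    # settimeofday has no suffix and falls back to EPERM.
    SystemCallFilter=~ptrace:EACCES settimeofday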

This feature makes use of the Secure Computing Mode 2 interfaces of the kernel ('seccomp
filtering') and is useful for enforcing a minimal sandboxing environment.

Note that on systems supporting multiple ABIs (such as x86/x86-64) it is recommended to turn
off alternative ABIs for services, so that they cannot be used to circumvent the restrictions of this
option. Specifically, it is recommended to combine this option with
C<SystemCallArchitectures=native> or similar.
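
For illustration, a deny list combined with the recommended architecture restriction might
look like this (the filtered call is an arbitrary example):

    [Service]
    # Deny-list a single call, chosen arbitrarily for this sketch.
    SystemCallFilter=~ptrace
    # Permit only the native ABI so that the filter cannot be
    # bypassed through an alternative ABI.
    SystemCallArchitectures=native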

Note that strict system call filters may impact execution and error handling code paths of the
service invocation. Specifically, access to the execve() system call is required
for the execution of the service binary; if it is blocked, service invocation will necessarily fail.
Also, if execution of the service binary fails for some reason (for example: missing service
executable), the error handling logic might require access to an additional set of system calls in
order to process and log this failure correctly. It might be necessary to temporarily disable system
call filters in order to allow debugging of such failures.

If you specify both types of this option (i.e. allow-listing and deny-listing), the first
encountered will take precedence and will dictate the default action (termination or approval of a
system call). Then the next occurrences of this option will add or delete the listed system calls
from the set of the filtered system calls, depending on its type and the default action. (For
example, if you have started with an allow list rule for read() and
write(), and right after it add a deny list rule for write(),
then write() will be removed from the set.)
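
The parenthetical example above corresponds to a fragment like the following sketch:

    [Service]
    # First occurrence: an allow list, so the default action is to deny.
    SystemCallFilter=read write
    # Second occurrence: a deny list, which removes write from the set.
    SystemCallFilter=~write

Under the rules described above, the resulting allow list would contain read() but no longer
write(), in addition to the implicitly allowed calls.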

As the number of possible system calls is large, predefined groups of system calls are
provided. A group starts with the C<@> character, followed by the name of the set.

Currently predefined system call sets:

=over 4

=item C<@aio>

Asynchronous I/O (L<io_setup(2)>, L<io_submit(2)>, and related calls)

=item C<@basic-io>

System calls for basic I/O: reading, writing, seeking, file descriptor duplication and closing
(L<read(2)>, L<write(2)>, and related calls)

=item C<@chown>

Changing file ownership (L<chown(2)>, L<fchownat(2)>, and related calls)

=item C<@clock>

System calls for changing the system clock (L<adjtimex(2)>, L<settimeofday(2)>, and related calls)

=item C<@cpu-emulation>

System calls for CPU emulation functionality (L<vm86(2)> and related calls)

=item C<@debug>

Debugging, performance monitoring and tracing functionality (L<ptrace(2)>, L<perf_event_open(2)> and related

=back


