Config-Model-Systemd


lib/Config/Model/models/Systemd/Common/Exec.pl

Usually, it is best to leave this setting unmodified, and use higher level file system namespacing
options instead, in particular C<PrivateMounts>, see above.",
        'type' => 'leaf',
        'value_type' => 'uniline'
      },
      'SystemCallFilter',
      {
        'cargo' => {
          'type' => 'leaf',
          'value_type' => 'uniline'
        },
        'description' => "Takes a space-separated list of system call names or system call groups. If this
setting is used, any system call made by the unit's processes other than the listed ones will be
denied (allow-listing). If the first character of the list is
C<~>, the effect is inverted: only the listed system calls will be denied
(deny-listing). This option may be specified more than once, in which case the filter masks are
merged. If the empty string is assigned, the filter is reset and all prior assignments have no
effect.
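
As a minimal sketch of the reset behaviour, a drop-in could first clear any filter set earlier and
then assign its own. The group name here is only an illustrative choice:

    [Service]
    SystemCallFilter=
    SystemCallFilter=\@system-service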

Commands prefixed with C<+> are not subject to filtering. The
execve(), exit(), exit_group(),
getrlimit(), rt_sigreturn(),
sigreturn() system calls and the system calls for querying time and sleeping are
implicitly allow-listed and do not need to be listed explicitly.

The default action when a system call is denied is to terminate the processes with a
C<SIGSYS> signal. This can be changed using C<SystemCallErrorNumber>,
see below. In addition, deny-listed system calls and system call groups may optionally be suffixed
with a colon (C<:>) and an argument in the same format as
C<SystemCallErrorNumber>, to take this action when the matching system call is made.
This takes precedence over the action specified in C<SystemCallErrorNumber>.
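
A sketch of the suffix syntax, assuming a deny list in which one entry carries its own error
number. The choice of C<\@clock>, ptrace() and the two error names is purely illustrative:

    [Service]
    SystemCallFilter=~\@clock ptrace:EACCES
    SystemCallErrorNumber=EPERM

Under these assumptions, calls in C<\@clock> would fail with C<EPERM>, while ptrace() would fail
with C<EACCES>, because the per-call suffix takes precedence.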

This feature makes use of the Secure Computing Mode 2 interfaces of the kernel ('seccomp
filtering') and is useful for enforcing a minimal sandboxing environment.

Note that on systems supporting multiple ABIs (such as x86/x86-64) it is recommended to turn
off alternative ABIs for services, so that they cannot be used to circumvent the restrictions of this
option. Specifically, it is recommended to combine this option with
C<SystemCallArchitectures=native> or similar.

Note that strict system call filters may impact execution and error handling code paths of the
service invocation. Specifically, access to the execve() system call is required
for the execution of the service binary; if it is blocked, service invocation will necessarily fail.
Also, if execution of the service binary fails for some reason (for example: missing service
executable), the error handling logic might require access to an additional set of system calls in
order to process and log this failure correctly. It might be necessary to temporarily disable system
call filters in order to allow debugging of such failures.

If you specify both types of this option (i.e. allow-listing and deny-listing), the first
encountered will take precedence and will dictate the default action (termination or approval of a
system call). Then the next occurrences of this option will add or delete the listed system calls
from the set of the filtered system calls, depending on its type and the default action. (For
example, if you have started with an allow list rule for read() and
write(), and right after it add a deny list rule for write(),
then write() will be removed from the set.)
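
The example in parentheses can be written as the following unit fragment. This is a sketch only;
a real service would need many more system calls to run:

    [Service]
    SystemCallFilter=read write
    SystemCallFilter=~write

The first line makes allow-listing the default mode. The second line then removes write() from
the allowed set, so of the two calls only read() remains permitted.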

As the number of possible system calls is large, predefined groups of system calls are
provided. A group starts with the C<\@> character, followed by the name of the set.

Currently predefined system call sets:

C<\@aio>: Asynchronous I/O (L<io_setup(2)>, L<io_submit(2)>, and related calls)

C<\@basic-io>: System calls for basic I/O: reading, writing, seeking, file descriptor duplication
and closing (L<read(2)>, L<write(2)>, and related calls)

C<\@chown>: Changing file ownership (L<chown(2)>, L<fchownat(2)>, and related calls)

C<\@clock>: System calls for changing the system clock (L<adjtimex(2)>, L<settimeofday(2)>, and
related calls)

C<\@cpu-emulation>: System calls for CPU emulation functionality (L<vm86(2)> and related calls)

C<\@debug>: Debugging, performance monitoring and tracing functionality (L<ptrace(2)>,
L<perf_event_open(2)> and related calls)

C<\@file-system>: File system operations: opening, creating files and directories for read and
write, renaming and removing them, reading file properties, or creating hard and symbolic links

C<\@io-event>: Event loop system calls (L<poll(2)>, L<select(2)>, L<epoll(7)>, L<eventfd(2)> and
related calls)

C<\@ipc>: Pipes, SysV IPC, POSIX Message Queues and other IPC (L<mq_overview(7)>, L<svipc(7)>)

C<\@keyring>: Kernel keyring access (L<keyctl(2)> and related calls)

C<\@memlock>: Locking of memory in RAM (L<mlock(2)>, L<mlockall(2)> and related calls)

C<\@module>: Loading and unloading of kernel modules (L<init_module(2)>, L<delete_module(2)> and
related calls)

C<\@mount>: Mounting and unmounting of file systems (L<mount(2)>, L<chroot(2)>, and related calls)

C<\@network-io>: Socket I/O (including local AF_UNIX): L<socket(7)>, L<unix(7)>

C<\@obsolete>: Unusual, obsolete or unimplemented (L<create_module(2)>, L<gtty(2)>, \x{2026})

C<\@pkey>: System calls that deal with memory protection keys (L<pkeys(7)>)

C<\@privileged>: All system calls which need super-user capabilities (L<capabilities(7)>)

C<\@process>: Process control, execution, namespacing operations (L<clone(2)>, L<kill(2)>,
L<namespaces(7)>, \x{2026})

C<\@raw-io>: Raw I/O port access (L<ioperm(2)>, L<iopl(2)>, pciconfig_read(), \x{2026})

C<\@reboot>: System calls for rebooting and reboot preparation (L<reboot(2)>, kexec(), \x{2026})

C<\@resources>: System calls for changing resource limits, memory and scheduling parameters
(L<setrlimit(2)>, L<setpriority(2)>, \x{2026})

C<\@sandbox>: System calls for sandboxing programs (L<seccomp(2)>, Landlock system calls, \x{2026})

C<\@setuid>: System calls for changing user ID and group ID credentials (L<setuid(2)>,
L<setgid(2)>, L<setresuid(2)>, \x{2026})

C<\@signal>: System calls for manipulating and handling process signals (L<signal(2)>,
L<sigprocmask(2)>, \x{2026})

C<\@swap>: System calls for enabling/disabling swap devices (L<swapon(2)>, L<swapoff(2)>)

C<\@sync>: Synchronizing files and memory to disk (L<fsync(2)>, L<msync(2)>, and related calls)

C<\@system-service>: A reasonable set of system calls used by common system services, excluding
any special purpose calls. This is the recommended starting point for allow-listing system calls
for system services, as it contains what is typically needed by system services, but excludes
overly specific interfaces. For example, the following APIs are excluded: C<\@clock>,
C<\@mount>, C<\@swap>, C<\@reboot>.

C<\@timer>: System calls for scheduling operations by time (L<alarm(2)>, L<timer_create(2)>,
\x{2026})

C<\@known>: All system calls defined by the kernel. This list is defined statically in systemd
based on a kernel version that was available when this systemd version was released. It will
become progressively more out-of-date as the kernel is updated.

Note that as new system calls are added to the kernel, additional system calls might be added to the groups
above. Contents of the sets may also change between systemd versions. In addition, the list of system calls
depends on the kernel version and architecture for which systemd was compiled. Use
systemd-analyze\x{a0}syscall-filter to list the actual system calls in each
filter.

Generally, allow-listing system calls (rather than deny-listing) is the safer mode of
operation. It is recommended to enforce system call allow lists for all long-running system
services. Specifically, the following lines are a relatively safe basic choice for the majority of
system services:

    [Service]
    SystemCallFilter=\@system-service
    SystemCallErrorNumber=EPERM

Note that various kernel system calls are defined redundantly: there are multiple system calls
for executing the same operation. For example, the pidfd_send_signal() system
call may be used to execute operations similar to what can be done with the older
kill() system call, hence blocking the latter without the former only provides
weak protection. Since new system calls are added regularly to the kernel as development progresses,
keeping system call deny lists comprehensive requires constant work. It is thus recommended to use
allow-listing instead, which offers the benefit that new system calls are by default implicitly
blocked until the allow list is updated.

Also note that a number of system calls are required to be accessible for the dynamic linker to
work. The dynamic linker is required for running most regular programs (specifically: all dynamic ELF
binaries, which is how most distributions build packaged programs). This means that blocking these
system calls (which include open(), openat() or
mmap()) will make most programs typically shipped with generic distributions
unusable.

It is recommended to combine the file system namespacing related options with
C<SystemCallFilter=~\@mount>, in order to prohibit the unit's processes to undo the


