Capabilities
Difficulty: Advanced
Time: Approximately 30 minutes
In this lab you’ll learn the basics of capabilities in the Linux kernel. You’ll learn how they work with Docker, some basic commands to view and manage them, as well as how to add and remove capabilities in new containers.
You will complete the following steps as part of this lab.
- Step 1 - Introduction to capabilities
- Step 2 - Working with Docker and capabilities
- Step 3 - Testing Docker capabilities
- Step 4 - Extra for experts
Prerequisites
You will need all of the following to complete this lab:
- A Linux-based Docker Host running Docker 1.13 or higher
Step 1: Introduction to capabilities
In this step you’ll learn the basics of capabilities.
The Linux kernel is able to break down the privileges of the root
user into distinct units referred to as capabilities. For example, the CAP_CHOWN capability is what allows the root use to make arbitrary changes to file UIDs and GIDs. The CAP_DAC_OVERRIDE capability allows the root user to bypass kernel permission checks on file read, write and execute operations. Almost all of the special powers associated with the Linux root user are broken down into individual capabilities.
This breaking down of root privileges into granular capabilities allows you to:
- Remove individual capabilities from the
root
user account, making it less powerful/dangerous. - Add privileges to non-root users at a very granular level.
Capabilities apply to both files and threads. File capabilities allow users to execute programs with higher privileges. This is similar to the way the setuid
bit works. Thread capabilities keep track of the current state of capabilities in running programs.
The Linux kernel lets you set capability bounding sets that impose limits on the capabilities that a file/thread can gain.
Docker imposes certain limitations that make working with capabilities much simpler. For example, file capabilities are stored within a file’s extended attributes, and extended attributes are stripped out when Docker images are built. This means you will not normally have to concern yourself too much with file capabilities in containers.
It is of course possible to get file capabilities into containers at runtime, however this is not recommended.
In an environment without file based capabilities, it’s not possible for applications to escalate their privileges beyond the bounding set (a set beyond which capabilities cannot grow). Docker sets the bounding set before starting a container. You can use Docker commands to add or remove capabilities to or from the bounding set.
By default, Docker drops all capabilities except [those needed] using a whitelist approach.The following list contains all capabilities that are enabled by default when you run a docker container with their descriptions from the capabilities man page:
Capabilities | Definition |
---|---|
CHOWN | Make arbitrary changes to file UIDs and GIDs |
DAC_OVERRIDE | Discretionary access control (DAC) - Bypass file read, write, and execute permission checks |
FSETID | Don’t clear set-user-ID and set-group-ID mode bits when a file is modified; set the set-group-ID bit for a file whose GID does not match the file system or any of the supplementary GIDs of the calling process. |
FOWNER | Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file, excluding those operations covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. |
MKNOD | MKNOD - Create special files using mknod(2) |
NET_RAW | Use RAW and PACKET sockets; bind to any address for transparent proxying. |
SETGID | Make arbitrary manipulations of process GIDs and supplementary GID list; forge GID when passing socket credentials via UNIX domain sockets; write a group ID mapping in a user namespace. |
SETUID | Make arbitrary manipulations of process UIDs; forge UID when passing socket credentials via UNIX domain sockets; write a user ID mapping in a user namespace. |
SETFCAP | Set file capabilities. |
SETPCAP | If file capabilities are not supported: grant or remove any capability in the caller’s permitted capability set to or from any other process. |
NET_BIND_SERVICE | Bind a socket to Internet domain privileged ports (port numbers less than 1024). |
SYS_CHROOT | Use chroot(2) to change to a different root directory. |
KILL | Bypass permission checks for sending signals. This includes use of the ioctl(2) KDSIGACCEPT operation. |
AUDIT_WRITE | Write records to kernel auditing log. |
Step 2: Working with Docker and capabilities
In this step you’ll learn the basic approach to managing capabilities with Docker. You’ll also learn the Docker commands used to manage capabilities for a container’s root account.
As of Docker 1.12 you have 3 high level options for using capabilities:
- Run containers as root with a large set of capabilities and try to manage capabilities within your container manually.
- Run containers as root with limited capabilities and never change them within a container.
- Run containers as an unprivileged user with no capabilities.
Option 2 as the most realistic as of Docker 1.12. Option 3 would be ideal but not realistic. Option 1 should be avoided wherever possible.
Note: Another option may be added in future versions of Docker that will allow you to run containers as a non-root user with added capabilities. The correct way of doing this requires ambient capabilities which was added to the Linux kernel in version 4.3. Whether it is possible for Docker to approximate this behavior in older kernels requires more research.
In the following commands, $CAP
will be used to indicate one or more individual capabilities.
-
To drop capabilities from the
root
account of a container.$sudo docker run --rm -it --cap-drop $CAP alpine sh
-
To add capabilities to the
root
account of a container.$ sudo docker run --rm -it --cap-add $CAP alpine sh
-
To drop all capabilities and then explicitly add individual capabilities to the
root
account of a container.$ sudo docker run --rm -it --cap-drop ALL --cap-add $CAP alpine sh
The Linux kernel prefixes all capability constants with “CAP_”. For example, CAP_CHOWN, CAP_NET_ADMIN, CAP_SETUID, CAP_SYSADMIN etc. Docker capability constants are not prefixed with “CAP_” but otherwise match the kernel’s constants.
For more information on capabilities, including a full list, see the capabilities man page
Step 3: Testing Docker capabilities
In this step you will start various new containers. Each time you will use the commands learned in the previous step to tweak the capabilities associated with the account used to run the container.
-
Start a new container and prove that the container’s root account can change the ownership of files.
$ docker container run --rm -it alpine chown nobody /
The command gives no return code indicating that the operation succeeded. The command works because the default behavior is for new containers to be started with a root user. This root user has the CAP_CHOWN capability by default.
-
Start another new container and drop all capabilities for the containers root account other than the CAP_CHOWN capability. Remember that Docker does not use the “CAP_” prefix when addressing capability constants.
$ docker container run --rm -it --cap-drop ALL --cap-add CHOWN alpine chown nobody /
This command also gives no return code, indicating a successful run. The operation succeeds because although you dropped all capabilities for the container’s
root
account, you added thechown
capability back. Thechown
capability is all that is needed to change the ownership of a file. -
Start another new container and drop only the
CHOWN
capability form its root account.$ docker container run --rm -it --cap-drop CHOWN alpine chown nobody / chown: /: Operation not permitted
This time the command returns an error code indicating it failed. This is because the container’s root account does not have the
CHOWN
capability and therefore cannot change the ownership of a file or directory. -
Create another new container and try adding the
CHOWN
capability to the non-root user callednobody
. As part of the same command try and change the ownership of a file or folder.$ docker container run --rm -it --cap-add chown -u nobody alpine chown nobody / chown: /: Operation not permitted
The above command fails because Docker does not yet support adding capabilities to non-root users.
In this step you have added and removed capabilities to a range of new containers. You have seen that capabilities can be added and removed from the root user of a container at a very granular level. You also learned that Docker does not currently support adding capabilities to non-root users.
Step 4: Extra for experts
The remainder of this lab will show you additional tools for working with capabilities form the Linux shell.
There are two main sets of tools for managing capabilities:
- libcap focuses on manipulating capabilities.
- libcap-ng has some useful tools for auditing.
Below are some useful commands from both.
You may need to manually install the packages required for some of these commands.
sudo apt-get install libcap-dev
,sudo apt-get install libcap-ng-dev
,sudo apt-get install libcap-ng-utils
.
libcap
capsh
- lets you perform capability testing and limited debuggingsetcap
- set capability bits on a filegetcap
- get the capability bits from a file
libcap-ng
pscap
- list the capabilities of running processesfilecap
- list the capabilities of filescaptest
- test capabilities as well as list capabilities for current process
The remainder of this step will show you some examples of libcap
and libcap-ng
.
Listing all capabilities
The following command will start a new container using Alpine Linux, install the libcap
package and then list capabilities.
$ docker container run --rm -it alpine sh -c 'apk add -U libcap; capsh --print'
(1/1) Installing libcap (2.25-r0)
Executing busybox-1.24.2-r9.trigger
OK: 5 MiB in 12 packages
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
In the output above, Current is multiple sets separated by spaces. Multiple capabilities within the same set are separated by commas ,
. The letters following the +
at the end of each set are as follows:
e
is effectivei
is inheritablep
is permitted
For information on what these mean, see the capabilities manpage.
Experimenting with capabilities
The capsh
command can be useful for experimenting with capabilities. capsh --help
shows how to use the command:
$ sudo capsh --help
usage: capsh [args ...]
--help this message (or try 'man capsh')
--print display capability relevant state
--decode=xxx decode a hex string to a list of caps
--supports=xxx exit 1 if capability xxx unsupported
--drop=xxx remove xxx,.. capabilities from bset
--caps=xxx set caps as per cap_from_text()
--inh=xxx set xxx,.. inheritiable set
--secbits=<n> write a new value for securebits
--keep=<n> set keep-capabability bit to <n>
--uid=<n> set uid to <n> (hint: id <username>)
--gid=<n> set gid to <n> (hint: id <username>)
--groups=g,... set the supplemental groups
--user=<name> set uid,gid and groups to that of user
--chroot=path chroot(2) to this path
--killit=<n> send signal(n) to child
--forkfor=<n> fork and make child sleep for <n> sec
== re-exec(capsh) with args as for --
-- remaing arguments are for /bin/bash
(without -- [capsh] will simply exit(0))
Warning:
--drop
sounds like what you want to do, but it only affects the bounding set. This can be very confusing because it doesn’t actually take away the capability from the effective or inheritable set. You almost always want to use--caps
,sudo apt-get install attr
.
Modifying capabilities
Libcap and libcap-ng can both be used to modify capabilities.
-
Use libcap to modify the capabilities on a file.
The command below shows how to set the CAP_NET_RAW capability as effective and permitted on the file represented by
$file
. Thesetcap
command calls on libcap to do this.$ sudo setcap cap_net_raw=ep $file
-
Use libcap-ng to set the capabilities of a file.
The
filecap
command calls on libcap-ng.$ filecap /absolute/path net_raw
Note:
filecap
requires absolute path names. Shortcuts like./
are not permitted.
Auditing
There are multiple ways to read out the capabilities from a file.
-
Using libcap:
$ getcap $file $file = cap_net_raw+ep
-
Using libcap-ng:
$ filecap /absolue/path/to/file file capabilities /absolute/path/to/file net_raw
-
Using extended attributes (attr package):
$ getfattr -n security.capability $file # file: $file security.capability=0sAQAAAgAgAAAAAAAAAAAAAAAAAAA=
Tips
Docker images cannot have files with capability bits set. This reduces the risk of Docker containers using capabilities to escalate privileges. However, it is possible to mount volumes that contain files with capability bits set into containers. Therefore you should use caution if doing this.
- You can audit directories for capability bits with the following commands:
# with libcap
$ getcap -r /
# with libcap-ng
$ filecap -a
- To remove capability bits you can use.
# with libcap
$ setcap -r $file
# with libcap-ng
$ filecap /path/to/file none
Further reading:
This article explains capabilities in a lot of detail. It will help you understand how capability sets interact with each other, and is very useful if you plan to run privileged docker containers and manage capabilities manually inside of them.
This is the man page for capabilities. Most of the complex interactions between capability sets don’t affect Docker containers as long as there are no files with capability bits set.