The Problem: I wanted isolated build environments for my home CI/CD pipeline so I could build different versions of my application without the overhead of spinning up virtual machines. Running Docker inside an unprivileged LXC container on Proxmox looked like the ideal method, but I hit a wall of permission denied errors and a missing /proc/keys that broke every docker build.
The Constraints: The LXC must remain unprivileged for security reasons. The Docker daemon, however, needs cgroups, kernel keyrings, and mount capabilities that a standard LXC container hides by default. Most published how-to guides either skip the AppArmor dance or tell you to make the container privileged, which I refuse to do.
The Solution: I got the Docker daemon running in an unprivileged LXC by enabling three things on the container: nesting, keyctl, and a user-defined AppArmor profile that allows overlay mounts, pivot_root, and keyctl (one added line). I also confirmed cgroup v2 and overlay2 inside the container before installing Docker. I have run this setup as a production system for 6 months without a crash.
Quick Summary:
- Enable nesting and keyctl in the container's features.
- Add lxc.mount.auto: cgroup:rw proc:rw sys:rw to the container configuration.
- Create an AppArmor profile allowing overlay mounts, pivot_root and keyctl.
- Verify the existence of cgroup v2 and overlay2 inside the LXC prior to installing Docker.
Tested on: Proxmox VE 8.2, Debian 12, LXC Template, Kernel 6.8.4-2-pve. This configuration has been used so far to run Jenkins Agents and private registries.
Prerequisites & Planning
Host Environment Requirements
The host must use the cgroup v2 (unified) hierarchy. Verify:
mount | grep cgroup
Verify that cgroup2 is mounted at /sys/fs/cgroup. If the host is still running v1, you must add systemd.unified_cgroup_hierarchy=1 to the kernel command line and reboot the server. See the kernel.org cgroup v2 docs for the reasoning; the short version is that this setup depends on Docker Engine seeing the unified v2 hierarchy.
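The host check can be scripted. The following is a minimal sketch, assuming a helper named cgroup_mode (a hypothetical name, not part of Proxmox) that reads the output of mount on stdin and reports which hierarchy it sees:

```shell
# Hypothetical helper: classify the cgroup layout from `mount` output on stdin.
# Prints "v2" when only cgroup2 is mounted, "hybrid" when cgroup2 and legacy
# cgroup mounts both appear, "v1" for legacy only, and "none" otherwise.
cgroup_mode() {
    local v1=0 v2=0 line
    while IFS= read -r line; do
        case "$line" in
            *" type cgroup2 "*) v2=1 ;;
            *" type cgroup "*)  v1=1 ;;
        esac
    done
    if [ "$v2" -eq 1 ] && [ "$v1" -eq 0 ]; then
        echo "v2"
    elif [ "$v2" -eq 1 ]; then
        echo "hybrid"
    elif [ "$v1" -eq 1 ]; then
        echo "v1"
    else
        echo "none"
    fi
}
```

On the Proxmox host this would be called as mount | cgroup_mode. Anything other than v2 suggests the kernel command line change described above is still needed.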
Install AppArmor userspace tools.
apt update && apt install apparmor-profiles apparmor-utils
Your container template must be systemd-based and have an AppArmor profile patched for nesting. Using Debian 12 straight out of the box should work.
Container Feature Flags
Before starting your container, open Options → Features for the container in the Proxmox web UI and make sure that Nesting and keyctl are checked. If you plan to use FUSE storage, check FUSE as well.
From the CLI, you can confirm these are set by running:
pct config CTID | grep features
The output will contain nesting=1,keyctl=1.
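If you manage several containers, eyeballing the grep output gets tedious. Here is a small sketch, assuming a helper named has_docker_features (hypothetical) that reads pct config output on stdin and succeeds only when both flags are present:

```shell
# Hypothetical helper: read `pct config CTID` output on stdin and return 0
# only if the first features line enables both nesting and keyctl.
has_docker_features() {
    local features
    features=$(sed -n 's/^features:[[:space:]]*//p' | head -n 1)
    case "$features" in *nesting=1*) ;; *) return 1 ;; esac
    case "$features" in *keyctl=1*)  ;; *) return 1 ;; esac
    return 0
}
```

Usage on the host would look like pct config CTID | has_docker_features && echo "features ok", with CTID replaced by your container ID.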
Pre‑flight Checklist
As root on the Proxmox host, run the following:
- mount | grep cgroup → cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
- aa-status → shows the profiles loaded with no errors.
- pct config CTID → the features line should have nesting=1,keyctl=1.
Core Setup: Docker Inside an Unprivileged LXC Container on Proxmox
BEFORE YOU CHANGE ANY CONFIGURATION, MAKE A BACKUP OF YOUR PROXMOX HOST! At the very least, keep a copy of the container’s configuration file:
mkdir -p ~/backup && cp /etc/pve/lxc/CTID.conf ~/backup/CTID.conf.bak
Always keep 2 shell sessions open as root on the Proxmox host in case you need to roll back.
Modifying the Container Configuration File
Edit /etc/pve/lxc/CTID.conf directly (or through pct) and append the following lines at the bottom of the container configuration file:
lxc.mount.auto: cgroup:rw proc:rw sys:rw
lxc.apparmor.profile: unconfined
features: nesting=1,keyctl=1
The lxc.mount.auto line is the important part. It instructs LXC to automatically mount the cgroup, proc, and sysfs file systems inside the container with read and write permissions; the lxc.container.conf man page documents this under lxc.mount.auto. Without the cgroup:rw part, Docker’s cgroup detection fails.
Set lxc.apparmor.profile to unconfined only temporarily. You will later swap it for a custom profile. If the file already contains a features line, edit that line instead of adding a second one. Once the configuration is complete, start your container.
Confirm the configuration was applied: inside the container, run cat /proc/1/mountinfo | grep cgroup and check the output for cgroup2 mounted at /sys/fs/cgroup.
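Since a missing or mistyped lxc.mount.auto line is the most likely failure here, it can help to lint the config file before starting the container. This is a sketch only, assuming a helper named check_mount_auto (hypothetical) that takes the path to the config file:

```shell
# Hypothetical helper: verify that a container config file has an
# lxc.mount.auto line that includes cgroup:rw.
# Usage: check_mount_auto /etc/pve/lxc/CTID.conf
check_mount_auto() {
    local conf="$1"
    if grep -Eq '^lxc\.mount\.auto:.*[[:space:]]cgroup:rw([[:space:]]|$)' "$conf"; then
        echo "ok: lxc.mount.auto grants cgroup:rw"
    else
        echo "missing: no lxc.mount.auto line with cgroup:rw in $conf"
        return 1
    fi
}
```

The check is purely textual. It does not prove that the mount succeeded, so the mountinfo check inside the container is still worth doing.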
Enabling the Kernel Keyring Inside the Container
The keyctl=1 feature flag sets up /proc/keys automatically. Run pct exec CTID -- cat /proc/keys on the host to validate that the keyring has been set up properly (pct exec runs a single command in the container, whereas pct enter only opens a shell).
The output should have at least one keyring entry, like mine below. (I have removed the bulk of the hex identifiers for readability.)
0a3f6c5d I--Q--- 1 perm 1f3f0000 0 65534 keyring _uid_ses.0: empty
2f9b7d45 I--Q--- 2 perm 1f3f0000 0 65534 keyring _uid.0: empty
If this command returned no output, verify your features line and restart the container.
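To turn that visual check into something a script can act on, the keyring entries can be counted. A minimal sketch, assuming a helper named count_keyrings (hypothetical) and the /proc/keys column layout shown above, where the eighth field is the key type:

```shell
# Hypothetical helper: count entries of type "keyring" in /proc/keys output
# read from stdin. Assumes the key type is the 8th whitespace-separated field,
# as in the sample output above.
count_keyrings() {
    awk '$8 == "keyring" { n++ } END { print n + 0 }'
}
```

From the host, pct exec CTID -- cat /proc/keys | count_keyrings should then print a number of at least 1. A result of 0 points back at the features line.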
Starting and Verifying cgroups v2 Inside the Container
Next, confirm which cgroup filesystem type is mounted:
pct exec CTID -- stat -fc %T /sys/fs/cgroup
The output should be cgroup2fs. If not, then the lxc.mount.auto line is either missing or has the wrong value.
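The stat result can be mapped to a verdict in a script. This sketch assumes a helper named classify_cgroup_fs (hypothetical), and it assumes that a tmpfs at /sys/fs/cgroup means a v1 or hybrid layout, which is the usual case but not something this guide verified on every distribution:

```shell
# Hypothetical helper: interpret the output of `stat -fc %T /sys/fs/cgroup`.
# Assumption: tmpfs at that path indicates a v1 or hybrid hierarchy.
classify_cgroup_fs() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1-or-hybrid" ;;
        *)         echo "unknown" ;;
    esac
}
```

Inside the container this would be called as classify_cgroup_fs "$(stat -fc %T /sys/fs/cgroup)".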
Installing Docker and Validating Overlay2 Storage
Install Docker CE in the container using the convenience installation script or the Debian repository. I use:
curl -fsSL https://get.docker.com | sh
After completing the installation, check which storage driver has been used:
docker info | grep -i storage
The output should include Storage Driver: overlay2. The official Docker OverlayFS docs list the supported backing filesystems. The full storage section should look like this:
Storage Driver: overlay2 <--
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
If it has fallen back to vfs, there is an issue with your mount configuration.
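For a CI agent it is useful to fail early when Docker has silently fallen back to vfs. The following is a sketch under that assumption, using two hypothetical helpers that parse docker info output from stdin:

```shell
# Hypothetical helper: print the storage driver named in `docker info` output.
storage_driver() {
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: succeed only when the driver is overlay2.
check_storage_driver() {
    local driver
    driver=$(storage_driver)
    if [ "$driver" = "overlay2" ]; then
        echo "ok: overlay2"
    else
        echo "warning: storage driver is '${driver:-unknown}', expected overlay2"
        return 1
    fi
}
```

Inside the container, docker info | check_storage_driver returns a non-zero status on vfs, so it can gate a build job.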
Optimization & Best Practices
Crafting a Custom AppArmor Profile for Docker‑Inside‑LXC
- The profile must stay restrictive enough to be secure while still permitting what Docker needs to run inside LXC.
- To create a new profile, first copy the existing lxc-default-cgns profile to a new file:
cp /etc/apparmor.d/lxc/lxc-default-cgns /etc/apparmor.d/lxc.docker-lxc
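- After doing that, rename the profile declared at the top of the copy to lxc.docker-lxc (it still carries the name from the original file), then add the following lines at the end of the profile block.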
mount fstype=overlay options=(rw,lowerdir,upperdir,workdir) -> /var/lib/docker/**,
mount fstype=tmpfs,
pivot_root,
capability sys_admin,
capability sys_ptrace,
unix (bind,listen,accept) type=stream,
signal (send) peer=dockerd,
keyctl,
- Beyond letting Docker run correctly, the biggest changes here are allowing Docker to create and use overlay mounts on its storage dir, allowing pivot_root so containers can switch into their root filesystem, and allowing keyctl to access the kernel keyrings.
- Load your new profile using apparmor_parser -r /etc/apparmor.d/lxc.docker-lxc.
- To update the container’s configuration file, do the following:
lxc.apparmor.profile: lxc.docker-lxc
- After completing the above tasks, restart your container and confirm that the new policy has been applied:
aa-status | grep docker-lxc
# should show docker-lxc (enforce)
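Before loading the profile, a quick lint can catch a rule that was dropped while editing. This is a rough sketch, assuming a helper named profile_missing_rules (hypothetical) and assuming the three substrings below are a good enough fingerprint of the rules discussed above. It does not validate AppArmor syntax.

```shell
# Hypothetical helper: report which of the expected rule fragments are absent
# from a profile file. Comment lines are ignored. Returns 1 if any is missing.
# Usage: profile_missing_rules /etc/apparmor.d/lxc.docker-lxc
profile_missing_rules() {
    local file="$1" rule missing=0
    for rule in 'fstype=overlay' 'pivot_root' 'keyctl'; do
        if ! awk -v r="$rule" \
            '!/^[ \t]*#/ && index($0, r) { found = 1 } END { exit !found }' \
            "$file"; then
            echo "missing: $rule"
            missing=1
        fi
    done
    return "$missing"
}
```

An empty output with exit status 0 means all three fragments were found in non-comment lines.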
Tuning Storage and Performance
Use ext4 or xfs for the container’s disk image, especially the one backing /var/lib/docker. If your storage backend allows it, add noatime to the host mount point to cut down I/O overhead.
What Didn’t Work For Me
What Went Wrong: The Keyctl Denial That Broke docker build
I had cloned a working container, and the IP address had changed. Inside the clone I was getting Operation not permitted errors.
Running strace -f docker build ... showed me which syscall had failed inside the container:
[pid 12345] add_key("encrypted", "user:keyring", ...) = -1 EACCES (Permission denied) <--
Before looking at anything outside the container, I checked /proc/keys inside it, expecting to see keyring entries. There were none.
So even though I assumed the clone had carried over the features line, the cloned container did not have the keyctl checkbox selected, and therefore there was no keyctl=1. docker build worked immediately after I added the flag and restarted the container.
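A check like the following would have caught this right after cloning. It is a sketch, assuming two hypothetical helpers that compare the first features line of two container config files (the first one belongs to the current config rather than to a snapshot section):

```shell
# Hypothetical helper: print the first features value from a config file.
features_of() {
    sed -n 's/^features:[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical helper: compare the features of a source container and a clone.
# Usage: compare_features /etc/pve/lxc/SRC.conf /etc/pve/lxc/CLONE.conf
compare_features() {
    local src clone
    src=$(features_of "$1")
    clone=$(features_of "$2")
    if [ "$src" = "$clone" ]; then
        echo "features match: ${src:-none}"
    else
        echo "features differ: source='${src}' clone='${clone}'"
        return 1
    fi
}
```

SRC and CLONE are placeholders for the two container IDs. The comparison is a plain string match, so the same flags in a different order would be reported as a difference.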
Common Mistakes & Edge Cases
Nested Containerization Issues & User Namespace Clashes
If the Docker daemon inside the LXC uses its own user namespace, the extra nesting level can put it under a stricter AppArmor profile that blocks some syscalls. You will see these blocks as denials in your container’s journal:
audit: type=1400 apparmor="DENIED" operation="mount" info="failed perms" error=-13 profile="lxc.docker-lxc" name="/" pid=2345 comm="dockerd" <--
You can work around this by running the inner Docker containers with --userns=host (a docker run option that opts a container out of the daemon’s user namespace remapping), or by ensuring your custom AppArmor profile includes capability sys_admin and a pivot_root rule (which we already did). The Arch Wiki LXC page has a breakdown of this nesting issue.
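When the log is noisy, it helps to reduce each denial to the three fields that matter for fixing the profile. A minimal sketch, assuming a helper named apparmor_denials (hypothetical) and audit lines shaped like the sample above, with operation, profile, and comm appearing in that order:

```shell
# Hypothetical helper: read log lines on stdin and print
# "<profile> <operation> <comm>" for every AppArmor denial.
# Assumes operation=, profile= and comm= appear in that order on the line.
apparmor_denials() {
    sed -n '/apparmor="DENIED"/s/.*operation="\([^"]*\)".*profile="\([^"]*\)".*comm="\([^"]*\)".*/\2 \1 \3/p'
}
```

Feed it whichever log holds the audit lines on your system, for example journalctl -k | apparmor_denials | sort | uniq -c to see which operations are denied most often.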
ZFS‑Backed Container Storage and overlay2
If your Proxmox host uses ZFS, the container’s root disk will be a ZFS dataset. You cannot use a ZFS dataset directly as the overlay2 location, because ZFS does not provide d_type support in some configurations. Instead, either create a regular ext4-formatted file on the ZFS volume and mount it at /var/lib/docker, or create a dataset directory that holds an ext4 image attached through a loopback device. The Docker storage driver docs list the supported file systems.
Frequently Asked Questions
Why does systemd-detect-virt say “lxc” while Docker still has issues with cgroups?
Docker’s cgroup detection looks at the cgroup hierarchy it can actually see, not at the virtualization type. When lxc.mount.auto: cgroup:rw is missing, Docker detects a hybrid or v1 cgroup hierarchy. You can check this with docker info | grep -i "cgroup"; if you see the warning about missing controllers, the mount was not successful. Provided your host runs cgroup v2, the fix is the same line we added to the container configuration file.
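The missing-controllers warning can also be checked directly against the cgroup v2 interface file cgroup.controllers, which holds a space-separated list of available controllers. This sketch assumes a helper named missing_controllers (hypothetical), and the set cpu, memory, pids, io is my assumption about what a Docker build host typically wants, not an official list:

```shell
# Hypothetical helper: given the contents of cgroup.controllers as one
# argument, print the controllers from an assumed wanted set that are absent.
# Assumption: cpu, memory, pids and io are the controllers of interest.
missing_controllers() {
    local available="$1" c missing=""
    for c in cpu memory pids io; do
        case " $available " in
            *" $c "*) ;;
            *) missing="$missing $c" ;;
        esac
    done
    echo "${missing# }"
}
```

Inside the container, missing_controllers "$(cat /sys/fs/cgroup/cgroup.controllers)" prints an empty line when nothing from that set is missing.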
Can I run Kubernetes (k3s) inside this Docker‑enabled LXC container?
You can run k3s in a Docker-enabled LXC container, but it can be messy. First, k3s requires the kernel modules br_netfilter and overlay to be loaded on the host machine. You will also have to open up what k3s needs via lxc.cgroup2.devices.allow on the host machine and probably run the container in privileged mode. Once the container is privileged, you have given up the security of the unprivileged model, and you will need to weigh the security implications.
How do I update the Docker daemon without breaking the LXC nesting?
Updates to the Docker CE daemon will succeed as long as the /var/lib/docker location is persistent. After you update, verify that the AppArmor profile still covers the paths the new Docker daemon uses:
aa-status | grep -A5 docker-lxc
If any of the new paths show up in denials, add them to the profile and then reload the AppArmor profile.