vhaudiquet 4f44af4449 feat: add --kernel flag for QEMU system emulation mode
Add --kernel <PATH> option to boot extracted rootfs in a QEMU virtual
machine instead of namespace/chroot mode. The rootfs is converted to an
ext4 disk image using mke2fs and booted with the provided kernel.
2026-06-16 19:16:34 +02:00

# ecr - implementation specification
## Synopsis
```
ecr [OPTIONS] <DISTRO[:VERSION]> [-- COMMAND...]
```
## CLI Interface
### Positional Arguments
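- `<DISTRO>` (required): Distribution name or OCI image reference
- `<VERSION>` (optional): Distribution version, codename, or image tag, appended to the distro after a colon (for example `ubuntu:noble`)
- `COMMAND...` (optional): Command to run, given after `--`; a shell is started when it is omitted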
### Options
| Flag | Default | Description |
|------|---------|-------------|
| `-a, --arch <arch>` | host arch | Target architecture |
| `--bind <path>` | cwd | Directory to overlay-mount (can be specified multiple times) |
| `--bind-rw <path>` | none | Read-write bind mount at `/mnt/<basename>` (can be specified multiple times, overrides `--bind` for same path) |
| `--no-cache` | false | Download fresh tarball, ignore cache |
| `--no-bind` | false | Skip mounting any directory |
| `--kernel <path>` | none | Boot with QEMU system emulation using specified kernel (triggers disk image creation) |
| `-m, --memory <size>` | 2G | Memory size for QEMU VM (only used with `--kernel`) |
| `-v, --verbose` | false | Print diagnostic messages |
| `-h, --help` | - | Show help |
| `-V, --version` | - | Show version |
## File Layout
### Cache Directory
```
~/.cache/ecr/
├── ubuntu-noble-amd64.tar.gz
├── alpine-latest-x86_64.tar.gz
├── debian-bookworm-amd64.tar.gz
└── ...
```
No metadata files are stored. A tarball is downloaded once and reused on every later run. To fetch a fresh copy, delete the file manually or pass `--no-cache`.
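
The cache lookup can be sketched as follows. This is a minimal sketch that infers the `<distro>-<version>-<arch>.tar.gz` pattern from the listing above; `cache_path` and `needs_download` are hypothetical helper names, and how OCI references containing `/` are turned into file names is not covered here.

```python
from pathlib import Path
from typing import Optional


def cache_path(distro: str, version: str, arch: str,
               cache_dir: Optional[Path] = None) -> Path:
    """Return the tarball path for a distro/version/arch triple.

    The file name pattern is inferred from the cache listing; it is an
    assumption, not a documented contract.
    """
    base = cache_dir if cache_dir is not None else Path.home() / ".cache" / "ecr"
    return base / f"{distro}-{version}-{arch}.tar.gz"


def needs_download(path: Path, no_cache: bool) -> bool:
    """Download when --no-cache is set or no tarball exists yet."""
    return no_cache or not path.is_file()
```

Because there are no metadata files, the only cache state is whether the tarball exists.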
### Config File
`~/.config/ecr.yaml`:
```yaml
dns:
- 1.1.1.1
```
## Distro Sources
### Direct Tarball Downloads
| Distro | Version Format | Source |
|--------|----------------|--------|
| Ubuntu | noble, jammy, mantic or 26.04, 25.10, 22.04, latest, lts | cdimage.ubuntu.com |
| Alpine | 3.20, 3.19, latest, edge | dl-cdn.alpinelinux.org |
### Docker Hub (OCI Registry)
All other distributions use Docker Hub images via OCI registry API:
| Distro | Image Reference |
|--------|-----------------|
| Debian | `library/debian` |
| Arch | `library/archlinux` |
| Fedora | `library/fedora` |
| Gentoo | `gentoo/stage3` |
| Custom | `<image>[:tag]` or `<registry>/<image>[:tag]` |
### Custom Image References
Users can specify any OCI-compatible image:
```
ecr debian:bookworm -- ./build.sh
ecr gentoo/stage3 -- emerge --sync
ecr gcr.io/my-project/my-image:v1.0 -- /app/test
```
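
One way to split such a reference into registry, repository, and tag is sketched below. It is an illustration only: `parse_reference`, `ImageRef`, and the `latest` default tag are assumptions, and the registry test (first path component contains a dot or colon, or is `localhost`) follows the usual Docker convention rather than anything this spec states. Ubuntu and Alpine are resolved through direct downloads and are not handled here.

```python
from typing import NamedTuple, Optional

# Short names from the Docker Hub table above.
ALIASES = {
    "debian": "library/debian",
    "arch": "library/archlinux",
    "fedora": "library/fedora",
    "gentoo": "gentoo/stage3",
}


class ImageRef(NamedTuple):
    registry: Optional[str]  # None means Docker Hub
    repository: str
    tag: str


def parse_reference(ref: str, default_tag: str = "latest") -> ImageRef:
    name, tag = ref, default_tag
    # A colon counts as a tag separator only in the last path component,
    # so "localhost:5000/img" keeps its port.
    if ":" in ref.rsplit("/", 1)[-1]:
        name, tag = ref.rsplit(":", 1)
    registry = None
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, name = first, rest
    repository = ALIASES.get(name, name) if registry is None else name
    return ImageRef(registry, repository, tag)
```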
### Architecture Mapping
| ecr | Ubuntu | Alpine | Docker Hub |
|-----|--------|--------|------------|
| amd64 | amd64 | x86_64 | amd64 |
| arm64 | arm64 | aarch64 | arm64 |
| armhf | armhf | armv7 | arm/v7 |
| riscv64 | riscv64 | riscv64 | riscv64 |
| ppc64el | ppc64el | ppc64le | ppc64le |
| s390x | s390x | s390x | s390x |
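
The table translates directly into a lookup structure. The sketch below is one possible encoding; the `ARCH_MAP` and `map_arch` names and the source keys (`ubuntu`, `alpine`, `dockerhub`) are hypothetical.

```python
# Rows and columns mirror the architecture mapping table.
ARCH_MAP = {
    "amd64":   {"ubuntu": "amd64",   "alpine": "x86_64",  "dockerhub": "amd64"},
    "arm64":   {"ubuntu": "arm64",   "alpine": "aarch64", "dockerhub": "arm64"},
    "armhf":   {"ubuntu": "armhf",   "alpine": "armv7",   "dockerhub": "arm/v7"},
    "riscv64": {"ubuntu": "riscv64", "alpine": "riscv64", "dockerhub": "riscv64"},
    "ppc64el": {"ubuntu": "ppc64el", "alpine": "ppc64le", "dockerhub": "ppc64le"},
    "s390x":   {"ubuntu": "s390x",   "alpine": "s390x",   "dockerhub": "s390x"},
}


def map_arch(ecr_arch: str, source: str) -> str:
    """Translate an ecr architecture name into the name a source uses."""
    try:
        return ARCH_MAP[ecr_arch][source]
    except KeyError:
        raise ValueError(f"no mapping for arch {ecr_arch!r} on source {source!r}") from None
```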
### OCI Image Download
For Docker Hub images:
1. Get anonymous bearer token from `https://auth.docker.io/token`
2. Query manifest list: `GET https://registry.hub.docker.com/v2/<repo>/manifests/<tag>`
3. Select manifest matching target architecture
4. Download layer blobs with authentication
5. Extract layers to rootfs
If the target architecture is not in the manifest list, exit with an error that lists the available architectures:
```
Error: No manifest found for architecture 'riscv64'. Available: amd64, arm64, ppc64le, s390x
```
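
Steps 1 to 3 and the error case can be sketched without any network access. The snippet assumes the manifest list has already been fetched and decoded from JSON; `token_url`, `select_manifest`, and `ManifestNotFound` are hypothetical names, and the `service` and `scope` query parameters are the ones Docker Hub's token endpoint normally expects for anonymous pulls. The caller is expected to add the `Error: ` prefix.

```python
from urllib.parse import urlencode

# Media types to send in the Accept header when asking for a manifest list.
MANIFEST_LIST_TYPES = (
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
)


class ManifestNotFound(Exception):
    pass


def token_url(repo: str) -> str:
    """URL for an anonymous pull token for one repository."""
    query = urlencode({"service": "registry.docker.io",
                       "scope": f"repository:{repo}:pull"})
    return f"https://auth.docker.io/token?{query}"


def platform_name(platform: dict) -> str:
    """Render a platform as in the Docker Hub column, e.g. 'arm/v7'."""
    arch = platform.get("architecture", "")
    variant = platform.get("variant")
    return f"{arch}/{variant}" if arch == "arm" and variant else arch


def select_manifest(index: dict, target: str) -> dict:
    """Pick the Linux manifest entry matching the target architecture."""
    available = []
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") != "linux":
            continue  # skips attestation entries with unknown platform
        name = platform_name(platform)
        if name == target:
            return entry
        available.append(name)
    raise ManifestNotFound(
        f"No manifest found for architecture '{target}'. "
        f"Available: {', '.join(available)}"
    )
```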
## Execution Flow
1. Parse CLI arguments
2. Resolve distro/version/arch to image source
3. Check cache for existing tarball
4. If not cached, download tarball (direct or OCI)
5. Create temp directory for extraction
6. Extract tarball to temp directory
7. Create namespaces: user, pid, mount, uts
8. Set up mounts: /proc, /sys (ro), /dev, /dev/pts
9. Write /etc/resolv.conf with DNS servers
10. Set up overlay mounts for bind paths
11. Set up read-write bind mounts
12. Set environment variables
13. Exec shell or command in chroot
14. On exit, clean up temp directory
## Namespace Setup
### Namespaces (Always Created)
- **user**: Map current user to root (UID 0) inside
- **pid**: Isolated process tree
- **mount**: Private mounts for chroot setup
- **uts**: Hostname set to `ecr-<distro>-<random>`
### Network
The host network namespace is shared; there is no network isolation.
### User Namespace Mapping
```
uid_map: 0 <current_uid> 1
gid_map: 0 <current_gid> 1
```
This makes the user appear as root inside the chroot while remaining unprivileged on the host.
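
A sketch of how those map files might be written is shown below. `id_map_writes` and `apply_id_maps` are hypothetical names. Two kernel rules matter here: an unprivileged process must write `deny` to `setgroups` before it may write `gid_map`, and the UID and GID have to be captured before the user namespace is created, because afterwards the process sees the overflow IDs until the maps are written.

```python
from pathlib import Path


def id_map_writes(uid: int, gid: int) -> list:
    """Return (file name, content) pairs in the order they must be written."""
    return [
        ("uid_map", f"0 {uid} 1"),
        ("setgroups", "deny"),  # required before gid_map when unprivileged
        ("gid_map", f"0 {gid} 1"),
    ]


def apply_id_maps(uid: int, gid: int, proc_dir: Path = Path("/proc/self")) -> None:
    """Write the maps; call this after entering the new user namespace.

    uid and gid are the host IDs recorded before the namespace was created.
    """
    for name, content in id_map_writes(uid, gid):
        (proc_dir / name).write_text(content)
```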
### Mounts Inside Chroot
| Path | Type | Options |
|------|------|---------|
| /proc | proc | defaults |
| /sys | sysfs | ro,nosuid,nodev,noexec |
| /dev | devtmpfs | nosuid |
| /dev/pts | devpts | nosuid,noexec |
| `/root/<basename>` | overlay | `lowerdir=<bind_path>`, `upperdir=<temp>`, `workdir=<temp>` |
| `/mnt/<basename>` | bind | rw (for `--bind-rw`) |
| /etc/resolv.conf | file | written with DNS |
## QEMU Integration
### Foreign Architecture Detection
If `--arch` differs from the host architecture, QEMU user-mode emulation is required to run the foreign binaries inside the chroot.
### binfmt_misc Check
Before entering the chroot, verify that binfmt_misc is registered for the target architecture by checking that `/proc/sys/fs/binfmt_misc/qemu-<arch>` exists.
If it is not registered, exit with this message:
```
Error: binfmt_misc not registered for riscv64
Install QEMU user emulation:
Ubuntu/Debian: sudo apt install qemu-user-static
Arch: sudo pacman -S qemu-user-static-binfmt
Alpine: sudo apk add qemu-user-static
```
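
The check can be sketched as below. The function names are hypothetical, and the `binfmt_dir` parameter exists only so the logic can be exercised against a fake directory. Reading the `flags:` line to detect the `F` flag goes slightly beyond the existence check described above; it relies on the entry format the kernel prints for binfmt_misc registrations.

```python
from pathlib import Path

BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")


def binfmt_registered(qemu_arch: str, binfmt_dir: Path = BINFMT_DIR) -> bool:
    """True if an entry named qemu-<arch> exists."""
    return (binfmt_dir / f"qemu-{qemu_arch}").is_file()


def has_fix_binary_flag(qemu_arch: str, binfmt_dir: Path = BINFMT_DIR) -> bool:
    """True if the entry's 'flags:' line contains F (fix binary)."""
    try:
        text = (binfmt_dir / f"qemu-{qemu_arch}").read_text()
    except OSError:
        return False
    for line in text.splitlines():
        if line.startswith("flags:"):
            return "F" in line.split(":", 1)[1]
    return False
```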
### QEMU Binary
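No action is required. Modern qemu-user-static packages register binfmt_misc with the `F` (fix binary) flag, so the kernel opens the interpreter when the entry is registered rather than at exec time. Foreign binaries therefore run transparently inside the chroot even though the QEMU binary is not present in the rootfs.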
## QEMU System Emulation Mode
When `--kernel` is specified, ecr switches from namespace/chroot mode to QEMU system emulation. The extracted rootfs is converted to a disk image and booted with the provided kernel.
### Usage
```sh
ecr --kernel /boot/vmlinuz ubuntu:noble
ecr --kernel /boot/vmlinuz --memory 4G alpine
ecr --kernel /boot/vmlinuz debian -- /bin/sh -c "echo hello"
```
### Execution Flow
1. Download/cache rootfs tarball (same as namespace mode)
2. Extract tarball to temporary directory
3. Create ext4 disk image from rootfs using `mke2fs -d` (requires `e2fsprogs`)
4. Launch QEMU with:
- `-kernel <path>` - provided kernel
- `-append "root=/dev/vda rw console=ttyS0"` - kernel command line
- `-m <memory>` - memory size (default 2G)
- `-nographic` - console on stdio
- `-drive file=rootfs.img,format=raw,if=virtio` - rootfs disk
- `-netdev user,id=net0 -device virtio-net-pci,netdev=net0` - network
5. Wait for QEMU to exit
6. Clean up temporary files
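
The QEMU invocation in step 4 can be assembled as an argument vector. This is a sketch: `qemu_command` is a hypothetical helper, and the comma doubling reflects QEMU's rule that a literal comma inside an option value must be written as `,,`.

```python
def qemu_command(binary: str, kernel: str, image: str, memory: str = "2G") -> list:
    """Build the argv for booting the rootfs image with the given kernel."""
    escaped_image = image.replace(",", ",,")  # QEMU option-value escaping
    return [
        binary,
        "-kernel", kernel,
        "-append", "root=/dev/vda rw console=ttyS0",
        "-m", memory,
        "-nographic",
        "-drive", f"file={escaped_image},format=raw,if=virtio",
        "-netdev", "user,id=net0",
        "-device", "virtio-net-pci,netdev=net0",
    ]
```

Passing the list to `subprocess.run` without a shell avoids quoting problems in the kernel command line.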
### Disk Image Creation
The rootfs directory is converted to an ext4 disk image with `mke2fs -t ext4 -d <rootfs> <image>`, which populates the new filesystem from the directory without needing root or a loop mount. This requires the `e2fsprogs` package:
- Ubuntu/Debian: `sudo apt install e2fsprogs`
- Arch: `sudo pacman -S e2fsprogs`
- Alpine: `sudo apk add e2fsprogs`
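
A possible way to build the `mke2fs` call is sketched below. The spec does not say how the image size is chosen, so the headroom factor and minimum size are pure assumptions, as are the helper names and the extra `-q` (quiet) and `-F` (force, since the target is a regular file) flags.

```python
import math
import os


def estimate_size_mib(rootfs: str, headroom: float = 1.5, minimum_mib: int = 256) -> int:
    """Sum entry sizes under rootfs and add headroom (assumed policy)."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(rootfs):
        for name in dirnames + filenames:
            # lstat so symlinks are counted, not followed
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return max(minimum_mib, math.ceil(total * headroom / (1024 * 1024)))


def mke2fs_command(rootfs: str, image: str, size_mib: int) -> list:
    """Build the argv that creates an ext4 image populated from rootfs."""
    return ["mke2fs", "-q", "-F", "-t", "ext4", "-d", rootfs, image, f"{size_mib}M"]
```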
### Architecture Support
| ecr Arch | QEMU System Binary |
|----------|-------------------|
| amd64/x86_64 | qemu-system-x86_64 |
| arm64/aarch64 | qemu-system-aarch64 |
| armhf/armv7 | qemu-system-arm |
| riscv64 | qemu-system-riscv64 |
| ppc64el | qemu-system-ppc64 |
| s390x | qemu-system-s390x |
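
As a sketch, the table maps onto a dictionary that accepts both spellings of each architecture. `QEMU_SYSTEM`, `qemu_system_binary`, and `find_qemu` are hypothetical names; `shutil.which` returns `None` when the emulator is not installed.

```python
import shutil
from typing import Optional

QEMU_SYSTEM = {
    "amd64": "qemu-system-x86_64", "x86_64": "qemu-system-x86_64",
    "arm64": "qemu-system-aarch64", "aarch64": "qemu-system-aarch64",
    "armhf": "qemu-system-arm", "armv7": "qemu-system-arm",
    "riscv64": "qemu-system-riscv64",
    "ppc64el": "qemu-system-ppc64",
    "s390x": "qemu-system-s390x",
}


def qemu_system_binary(arch: str) -> str:
    """Name of the QEMU system emulator for an ecr architecture."""
    try:
        return QEMU_SYSTEM[arch]
    except KeyError:
        raise ValueError(f"unsupported architecture: {arch}") from None


def find_qemu(arch: str) -> Optional[str]:
    """Full path of the emulator on PATH, or None if it is missing."""
    return shutil.which(qemu_system_binary(arch))
```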
### Requirements
- QEMU system emulator installed (`qemu-system-<arch>`)
- `e2fsprogs` for disk image creation
- Kernel with virtio support (for disk and network drivers)
### Differences from Namespace Mode
| Feature | Namespace Mode | QEMU Mode |
|---------|---------------|-----------|
| Isolation | User, PID, mount, and UTS namespaces | Full VM |
| Performance | Near-native | Emulated (slower) |
| Host root required | No | No |
| Foreign arch | binfmt_misc required | Built-in emulation |
| Bind mounts | Overlay/bind | Not supported |
| Network | Host network | User-mode network |
## File Handling
### Overlay Mount (Default)
By default, the current working directory is mounted as an overlay filesystem at `/root/<basename>` inside the chroot, where `<basename>` is the name of the current directory.
Overlay configuration:
- `lowerdir`: the source directory (read-only)
- `upperdir`: temp directory for modifications
- `workdir`: temp directory required by overlayfs
Changes made inside the chroot are written to upperdir and discarded on exit. The host directory is never modified.
Multiple `--bind` paths can be specified; each one creates its own overlay at `/root/<basename>`.
Example:
```
$ cd ~/projects/myapp
$ ecr ubuntu:noble -- make build
# ~/projects/myapp mounted at /root/myapp
# Build artifacts written to overlay, discarded on exit
```
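
The overlay parameters for one bind path might be derived as sketched here. `overlay_mount` is a hypothetical helper, and the `upper` and `work` subdirectory layout is an assumption. Overlayfs requires that `upperdir` and `workdir` be separate directories on the same filesystem.

```python
import os


def overlay_mount(bind_path: str, temp_dir: str) -> dict:
    """Describe the overlay mount for one --bind path."""
    source = os.path.abspath(bind_path)
    name = os.path.basename(source)
    upper = os.path.join(temp_dir, name, "upper")  # receives all writes
    work = os.path.join(temp_dir, name, "work")    # overlayfs scratch space
    return {
        "target": f"/root/{name}",
        "fstype": "overlay",
        "options": f"lowerdir={source},upperdir={upper},workdir={work}",
    }
```

Two bind paths that share a basename would map to the same target under this scheme; the spec does not say how that case is resolved.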
### Read-Write Bind Mount
`--bind-rw <path>` creates a true read-write bind mount at `/mnt/<basename>`. This modifies the host filesystem directly. Use with caution.
Multiple `--bind-rw` paths can be specified. If a path is specified in both `--bind` and `--bind-rw`, the read-write mount takes precedence.
If no path is specified, defaults to current working directory.
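
The precedence rule between `--bind` and `--bind-rw` can be sketched as a small planning step. `plan_mounts` is a hypothetical helper that takes the already-collected path lists; how the default working directory and `--no-bind` feed into those lists is left to the caller.

```python
import os


def plan_mounts(bind: list, bind_rw: list) -> list:
    """Return (kind, source, target) tuples; read-write wins on conflicts."""
    rw = []
    for path in bind_rw:
        path = os.path.abspath(path)
        if path not in rw:
            rw.append(path)
    plan = []
    seen = set(rw)
    for path in bind:
        path = os.path.abspath(path)
        if path in seen:
            continue  # overridden by --bind-rw, or a duplicate --bind
        seen.add(path)
        plan.append(("overlay", path, f"/root/{os.path.basename(path)}"))
    for path in rw:
        plan.append(("bind-rw", path, f"/mnt/{os.path.basename(path)}"))
    return plan
```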
### No Mount
`--no-bind` skips mounting any directory.
## DNS
The default DNS server is 1.1.1.1. It is written to `/etc/resolv.conf` inside the chroot:
```
nameserver 1.1.1.1
```
Override with config file (`~/.config/ecr.yaml`):
```yaml
dns:
- 8.8.8.8
- 8.8.4.4
```
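
Generating the file content from the parsed configuration could look like the sketch below. It assumes the YAML has already been loaded into a dictionary by whatever parser ecr uses; `resolv_conf` is a hypothetical name, and validating each entry as an IP address is an added safeguard, not something the spec requires.

```python
import ipaddress

DEFAULT_DNS = ["1.1.1.1"]


def resolv_conf(config: dict) -> str:
    """Render /etc/resolv.conf from the 'dns' list, or the default server."""
    servers = config.get("dns") or DEFAULT_DNS
    lines = []
    for server in servers:
        address = ipaddress.ip_address(str(server))  # raises ValueError if invalid
        lines.append(f"nameserver {address}\n")
    return "".join(lines)
```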
## Environment Variables
Default environment inside chroot:
- `HOME=/root`
- `USER=root`
- `SHELL=/bin/bash` (or `/bin/sh` if bash is unavailable)
- `TERM=<from host>`
- `PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin`
Host environment is not inherited.
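
A sketch of building that environment from scratch follows. `build_env` is a hypothetical helper; probing `<rootfs>/bin/bash` to decide on the shell, and leaving `TERM` unset when the host has none, are assumptions about details the spec leaves open.

```python
import os

DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"


def build_env(host_env: dict, rootfs: str) -> dict:
    """Return the environment for the chroot; only TERM comes from the host."""
    has_bash = os.path.lexists(os.path.join(rootfs, "bin", "bash"))
    env = {
        "HOME": "/root",
        "USER": "root",
        "SHELL": "/bin/bash" if has_bash else "/bin/sh",
        "PATH": DEFAULT_PATH,
    }
    if "TERM" in host_env:
        env["TERM"] = host_env["TERM"]
    return env
```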
## Signal Handling
Forward SIGINT, SIGTERM, SIGHUP, SIGQUIT to child process. Wait for child to exit before cleanup.
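
The forwarding behaviour can be illustrated with a minimal sketch. `run_forwarding` is a hypothetical wrapper, not ecr's implementation: it installs handlers before starting the child so no signal is missed, relays each listed signal, and returns only after the child has exited, at which point the caller can clean up. It has to run in the main thread, since Python only allows signal handlers to be installed there.

```python
import signal
import subprocess

FORWARDED = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT)


def run_forwarding(argv: list) -> int:
    """Run argv, relaying FORWARDED signals to it; return its exit status."""
    child = None

    def forward(signum, _frame):
        if child is not None:
            child.send_signal(signum)  # no-op if the child already exited

    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        child = subprocess.Popen(argv)
        return child.wait()
    finally:
        # Restore the original handlers before the caller cleans up.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

A negative return value means the child was terminated by that signal number.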
## Security Requirements
### User Namespace Required
`ecr` requires unprivileged user namespaces. If they are unavailable (sysctl `kernel.unprivileged_userns_clone=0` or AppArmor restrictions), exit with:
```
Error: User namespaces not available
Enable with:
sysctl -w kernel.unprivileged_userns_clone=1
Or check AppArmor profile restrictions.
```
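
The sysctl half of this check can be sketched as below. `userns_disabled_by_sysctl` is a hypothetical name. The `unprivileged_userns_clone` sysctl only exists on kernels that carry the corresponding distribution patch, so a missing file is treated as "not disabled"; AppArmor restrictions cannot be detected this way and would only show up when the namespace is actually created.

```python
from pathlib import Path

USERNS_SYSCTL = Path("/proc/sys/kernel/unprivileged_userns_clone")


def userns_disabled_by_sysctl(path: Path = USERNS_SYSCTL) -> bool:
    """True only if the sysctl file exists and is set to 0."""
    try:
        return path.read_text().strip() == "0"
    except OSError:
        return False  # sysctl not present on this kernel
```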