vhaudiquet 92132bc37a
spec: update spec
- Default `p -- <cmd>` now streams logs via SSH (like tail -f)
   - Ctrl+C detaches from stream; job keeps running on worker
   - Add `-d/--detach` flag to start job without streaming
   - Remove `p attach` command (use `p logs -f` instead)
   - Remove p-agent daemon; jobs launched via nohup over SSH
   - Simplify worker requirements: only rsync needed (no tmux, no agent)
   - Jobs managed via ad-hoc SSH: kill $(cat pid), tail -f output.log
2026-06-06 02:12:53 +02:00


# `p` — push jobs to worker
A small Rust CLI utility to push command-line jobs to remote worker machines,
with directory sync, job management, and log streaming.
## Motivation
The common developer workflow of "run this build/test/script on a more powerful
remote machine" currently requires manually chaining `rsync` and `ssh` with
a way to keep the job alive in the background (e.g. `nohup`, `tmux`).
`p` wraps that entire flow into a single ergonomic command, while adding proper
job tracking and log capture.
## Core Concepts
### Worker
A remote machine accessible via SSH. Workers are registered locally with a
name and a connection string. One worker is designated as the **default**.
### Job
A command submitted to a worker, along with an (optionally synced) working
directory. Each job has:
- A UUID
- The command run
- The worker it ran on
- The original client CWD
- Start time, end time, exit code
- Captured output log
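
As a rough illustration, the fields listed above could map onto a record like the following. This is a minimal sketch, not the project's actual type: the struct name, field names, and the `log_path` field are assumptions, and the UUID is kept as a plain string to avoid an external crate. The 8-character short ID mirrors the IDs shown in the `p ls` example.

```rust
use std::time::SystemTime;

/// One job record. All names here are hypothetical.
#[derive(Debug, Clone)]
pub struct Job {
    pub id: String,                   // UUID, stored as text in this sketch
    pub command: String,              // the command that was run
    pub worker: String,               // name of the worker it ran on
    pub client_cwd: String,           // original client CWD
    pub started_at: SystemTime,
    pub ended_at: Option<SystemTime>, // None while the job is still running
    pub exit_code: Option<i32>,       // None until the job has finished
    pub log_path: String,             // where the captured output lives (assumed field)
}

impl Job {
    /// Short form of the ID for listings: the first 8 characters.
    pub fn short_id(&self) -> &str {
        let end = self
            .id
            .char_indices()
            .nth(8)
            .map_or(self.id.len(), |(i, _)| i);
        &self.id[..end]
    }

    /// A job counts as finished once an exit code has been recorded.
    pub fn is_finished(&self) -> bool {
        self.exit_code.is_some()
    }
}
```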
## CLI Reference
### Running jobs
```
p -- <command>
```
Sync the current directory to the default worker and run `<command>` on it.
Streams the job's output directly to the terminal (like `tail -f`). This feels
like running the command locally.
- `Ctrl+C` detaches from the log stream; the job keeps running on the worker.
  Use `p logs -f <job-id>` to resume watching.
- Use `p stop <job-id>` to kill a running job.
- If the network connection drops, the job keeps running on the worker.
  Use `p logs -f <job-id>` to resume watching.

When the job finishes, `p` prints the exit code and exits:
```
[Job done: exit 0]
```
```
p <worker> -- <command>
```
Same, but targets a specific named worker.
```
p [-n | --no-sync] -- <command>
```
Run `<command>` on the worker without syncing the current directory first.
Useful for commands that need no local files.
```
p [-d | --detach] -- <command>
```
Run `<command>` and immediately detach — do not stream output to the terminal.
The job starts on the worker and `p` prints the job ID. Useful for fire-and-forget
jobs. Use `p logs -f <job-id>` to watch later.
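
The run forms above share one shape: optional worker name, optional flags, `--`, then the command. A minimal parsing sketch for that shape follows. It assumes the flags and the worker name may be combined in any order before `--` (the spec does not say), leaves out subcommand dispatch (`ls`, `logs`, and so on), and uses hypothetical names (`RunArgs`, `parse_run_args`).

```rust
/// Parsed form of `p [<worker>] [-n|--no-sync] [-d|--detach] -- <command>`.
#[derive(Debug, PartialEq, Eq)]
pub struct RunArgs {
    pub worker: Option<String>, // None means "use the default worker"
    pub sync: bool,             // false when -n / --no-sync was given
    pub detach: bool,           // true when -d / --detach was given
    pub command: Vec<String>,   // everything after `--`
}

/// Parses the arguments that follow the program name.
pub fn parse_run_args(args: &[&str]) -> Result<RunArgs, String> {
    let sep = args
        .iter()
        .position(|a| *a == "--")
        .ok_or_else(|| "missing `--` before the command".to_string())?;
    let (opts, rest) = args.split_at(sep);
    // rest[0] is the `--` separator itself.
    let command: Vec<String> = rest[1..].iter().map(|s| s.to_string()).collect();
    if command.is_empty() {
        return Err("no command given after `--`".to_string());
    }
    let mut parsed = RunArgs { worker: None, sync: true, detach: false, command };
    for opt in opts {
        match *opt {
            "-n" | "--no-sync" => parsed.sync = false,
            "-d" | "--detach" => parsed.detach = true,
            flag if flag.starts_with('-') => {
                return Err(format!("unknown flag: {}", flag));
            }
            name if parsed.worker.is_none() => parsed.worker = Some(name.to_string()),
            extra => return Err(format!("unexpected argument: {}", extra)),
        }
    }
    Ok(parsed)
}
```

With this sketch, `parse_run_args(&["beefy", "-d", "--", "cargo", "test"])` yields worker `beefy`, `detach = true`, `sync = true`, and the command `["cargo", "test"]`.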
### Job management
```
p ls
```
List **running jobs** across all workers. Pass `-a` / `--all` to also show
completed jobs (done, failed, stopped).
Shows: ID (short), worker, original CWD, command, status, duration.
Style inspired by `docker ps` / `lxc list`.
```
p logs <job-id>
```
Print the captured output of a job (running or finished). Supports `-f` to
follow a running job's output in real-time. `Ctrl+C` detaches without stopping
the job.
```
p stop <job-id>
```
Kill a running job.
```
p pull <job-id> <remote-path> [<local-dest>]
```
Copy a specific file or directory from a job's work directory back to the
client. Used to retrieve build artifacts.
```
p rm <job-id>
```
Remove a job record and its remote work directory. Refuses to remove a
running job without `--force`.
```
p prune
```
Remove all finished job records (status: done, failed, stopped) and their
remote work directories. Jobs with status `running` or `unknown` are left
untouched. Pass `--force` to also include `unknown` jobs.
Pass `--dry-run` to preview what would be removed without deleting anything.
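
The selection rule for `p prune` can be sketched as a small predicate. The `JobStatus` enum and function names below are assumptions for illustration; the rule itself follows the text above. A `--dry-run` could print the returned IDs instead of deleting anything.

```rust
/// Job states named in this spec. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus {
    Running,
    Done,
    Failed,
    Stopped,
    Unknown,
}

/// Returns true if `p prune` should remove a job with this status.
pub fn is_prunable(status: JobStatus, force: bool) -> bool {
    match status {
        JobStatus::Done | JobStatus::Failed | JobStatus::Stopped => true,
        JobStatus::Unknown => force, // only with --force
        JobStatus::Running => false, // never pruned
    }
}

/// Picks the IDs of all jobs that a prune run would remove.
pub fn select_for_prune<'a>(jobs: &'a [(String, JobStatus)], force: bool) -> Vec<&'a str> {
    jobs.iter()
        .filter(|(_, status)| is_prunable(*status, force))
        .map(|(id, _)| id.as_str())
        .collect()
}
```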
### Worker management
```
p worker register <connection-string> [-n <name>]
```
Register a worker. The connection string is an SSH target (`user@host`,
`user@host:port`, or an SSH config alias). If `-n` is omitted, the hostname
is used as the name. The first registered worker becomes the default.
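
One possible way to split a connection string and derive the default name is sketched below. It assumes the `user@host[:port]` forms listed above, treats a bare string as a host or SSH config alias, and does not handle IPv6 literals. `SshTarget`, `parse_connection`, and `default_worker_name` are hypothetical names.

```rust
/// Pieces of an SSH target such as `user@host`, `user@host:port`, or an alias.
#[derive(Debug, PartialEq, Eq)]
pub struct SshTarget {
    pub user: Option<String>,
    pub host: String, // hostname, IP address, or SSH config alias
    pub port: Option<u16>,
}

/// Splits a connection string into user, host, and port.
pub fn parse_connection(s: &str) -> Result<SshTarget, String> {
    let (user, rest) = match s.split_once('@') {
        Some((u, r)) => (Some(u.to_string()), r),
        None => (None, s),
    };
    let (host, port) = match rest.rsplit_once(':') {
        Some((h, p)) => {
            let port = p
                .parse::<u16>()
                .map_err(|e| format!("invalid port {:?}: {}", p, e))?;
            (h, Some(port))
        }
        None => (rest, None),
    };
    if host.is_empty() {
        return Err("empty host".to_string());
    }
    Ok(SshTarget { user, host: host.to_string(), port })
}

/// Name used when `-n` is omitted: the host part of the target.
pub fn default_worker_name(target: &SshTarget) -> &str {
    &target.host
}
```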
```
p worker ls
```
List registered workers with their name and connection string.
Pass `--check` / `-c` to also probe reachability over SSH (slow).
```
p worker rm <name>
```
Unregister a worker. Refuses if the worker has running jobs.
```
p worker default <name>
```
Set the default worker.

---
## Directory Sync
- Uses `rsync` over SSH.
- Respects `.gitignore` by default (via `rsync --filter=':- .gitignore'`).
- `.git/` is **included** — some workflows depend on it (e.g. reading the
current commit SHA or latest tag).
- Each job gets its own isolated work directory on the worker:
`~/.p/workdirs/<job-uuid>/`
- No automatic sync-back after job completion. Use `p pull` to retrieve
specific artifacts.
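
The sync step described above could translate into an `rsync` argument vector along these lines. This is a sketch under assumptions: `-a` and `-z` are added as common defaults (the spec does not list them), the target is a plain `user@host` (a custom port would additionally need `-e "ssh -p <port>"`), and the remote `.p/workdirs/` parent is assumed to exist already. The function name is hypothetical.

```rust
/// Builds rsync arguments for syncing `local_dir` into the job's work directory.
pub fn rsync_args(local_dir: &str, ssh_target: &str, job_uuid: &str) -> Vec<String> {
    vec![
        // Archive mode plus compression (assumed defaults, not from the spec).
        "-az".to_string(),
        // Per-directory merge rule: honor each .gitignore as an exclude list.
        "--filter=:- .gitignore".to_string(),
        // Trailing slash: copy the directory's contents, not the directory itself.
        format!("{}/", local_dir.trim_end_matches('/')),
        // Relative remote paths resolve against the remote home directory.
        format!("{}:.p/workdirs/{}/", ssh_target, job_uuid),
    ]
}
```

The resulting vector can be handed to `std::process::Command::new("rsync").args(...)`. Because no exclude rule for `.git/` is added, the repository metadata is transferred along with the sources.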
## Execution Model
No persistent agent daemon is needed. Jobs are launched and managed via
ad-hoc SSH commands:
1. `p -- <command>` syncs the directory, then runs via SSH:
```
nohup sh -c '( <command> ) > output.log 2>&1; echo $? > exitcode' > /dev/null 2>&1 < /dev/null & echo $! > pid
```
2. The client streams `output.log` in real-time over a separate SSH connection.
3. `Ctrl+C` closes the SSH stream — the job keeps running.
4. `p stop <job-id>` runs `kill $(cat pid)` over SSH.
5. `p logs -f <job-id>` tails the log file over SSH.
6. `p ls` reads the local job DB and SSH-polls to reconcile state when needed.
> **Worker requirements:** `rsync` must be available on the worker.
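
Embedding an arbitrary user command inside `sh -c '...'` needs shell quoting, otherwise a command containing a single quote breaks the launch line. The sketch below builds a launch line of the kind shown in step 1, redirecting output into `output.log` so that `$?` reflects the command's own exit status, and detaching the standard streams so the SSH session can return immediately. The function names are hypothetical.

```rust
/// Quotes `s` as a single POSIX shell word by wrapping it in single quotes
/// and rewriting each embedded single quote as '\''.
pub fn sh_quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', r"'\''"))
}

/// Builds the remote launch line for `user_command`.
pub fn launch_line(user_command: &str) -> String {
    // The subshell keeps compound commands (`a; b`) under one redirection.
    let inner = format!(
        "( {} ) > output.log 2>&1; echo $? > exitcode",
        user_command
    );
    format!(
        "nohup sh -c {} > /dev/null 2>&1 < /dev/null & echo $! > pid",
        sh_quote(&inner)
    )
}
```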
---
## Job Status & Tracking
The client maintains a local job database (`~/.local/share/p/jobs/<uuid>.json`).
`p ls` reads from this local store for fast output.
### State reconciliation
When a job is running, the client periodically checks whether `exitcode` exists on
the worker. If the client was offline or the connection dropped, the next
`p ls` SSH-polls the workers to reconcile state. Jobs whose state cannot be
determined (for example, because the worker is unreachable) are marked `unknown`.
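
The reconciliation rule can be sketched as a pure function over what a single SSH poll observed. The types below are assumptions: the spec does not define how `done`, `failed`, and `stopped` are told apart, so this sketch only distinguishes running, finished with an exit code, and unknown. Treating an unreachable worker or an unparsable `exitcode` file as unknown is also an assumption.

```rust
/// Outcome of reconciling one job. Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Reconciled {
    Running,
    Finished(i32),
    Unknown,
}

/// What one SSH poll observed. `exitcode: None` means the file is absent.
pub enum Probe<'a> {
    Unreachable,
    Reached { exitcode: Option<&'a str> },
}

/// Maps a poll result to a job state.
pub fn reconcile(probe: &Probe<'_>) -> Reconciled {
    match probe {
        Probe::Unreachable => Reconciled::Unknown,
        // No exitcode file yet: the job has not completed.
        Probe::Reached { exitcode: None } => Reconciled::Running,
        Probe::Reached { exitcode: Some(text) } => match text.trim().parse::<i32>() {
            Ok(code) => Reconciled::Finished(code),
            // Empty or garbled file, e.g. read in the middle of a write.
            Err(_) => Reconciled::Unknown,
        },
    }
}
```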
### Connection drops during streaming
If the SSH connection drops while `p` is streaming output, `p` exits with an
error message showing the job ID. The job continues running on the worker.
Resume watching with `p logs -f <job-id>`.
## Worker-side Layout
All data lives under `~/.p/` on the worker (no root access required).
```
~/.p/
  jobs/
    <uuid>/
      cmd           # command string
      cwd           # original client CWD (display only)
      worker        # worker name (display only)
      started_at    # unix timestamp
      output.log    # combined stdout+stderr, always captured
      exitcode      # written on completion; absent = still running
      pid           # process ID of the running job
  workdirs/
    <uuid>/         # rsync'd copy of client CWD for this job
```
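
A small helper can keep these paths in one place. The sketch below assumes the layout shown above and takes the worker's home directory as an input; `WorkerLayout` and its methods are hypothetical names.

```rust
use std::path::{Path, PathBuf};

/// Path helper for the worker-side `~/.p/` tree.
pub struct WorkerLayout {
    root: PathBuf,
}

impl WorkerLayout {
    /// `home` is the worker user's home directory.
    pub fn new(home: &Path) -> Self {
        Self { root: home.join(".p") }
    }

    /// Metadata directory of one job: `~/.p/jobs/<uuid>/`.
    pub fn job_dir(&self, uuid: &str) -> PathBuf {
        self.root.join("jobs").join(uuid)
    }

    /// A file inside the job directory, e.g. `exitcode`, `pid`, `output.log`.
    pub fn job_file(&self, uuid: &str, name: &str) -> PathBuf {
        self.job_dir(uuid).join(name)
    }

    /// Synced working copy of one job: `~/.p/workdirs/<uuid>/`.
    pub fn workdir(&self, uuid: &str) -> PathBuf {
        self.root.join("workdirs").join(uuid)
    }
}
```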
## Configuration
File: `~/.config/p/config.yaml`
```yaml
default_worker: beefy
workers:
  - name: beefy
    connection: user@192.168.1.50
  - name: cloud
    connection: user@cloud-host.example.com
```
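
Once loaded, this configuration has to answer one question for every run: which worker to use. The sketch below shows one way to resolve an explicit worker name with a fallback to `default_worker`. The struct names are assumptions, and YAML parsing is left out because it would need an external crate.

```rust
/// One entry of the `workers` list.
pub struct Worker {
    pub name: String,
    pub connection: String,
}

/// In-memory form of `config.yaml` (hypothetical type).
pub struct Config {
    pub default_worker: Option<String>,
    pub workers: Vec<Worker>,
}

impl Config {
    /// Resolves an explicitly requested worker, or falls back to the default.
    pub fn resolve(&self, requested: Option<&str>) -> Result<&Worker, String> {
        let name = requested
            .or(self.default_worker.as_deref())
            .ok_or_else(|| "no worker given and no default configured".to_string())?;
        self.workers
            .iter()
            .find(|w| w.name == name)
            .ok_or_else(|| format!("unknown worker: {}", name))
    }
}
```

With the example configuration above, `resolve(None)` would return the `beefy` entry and `resolve(Some("cloud"))` the `cloud` entry.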
## `p ls` Output (example)
```
ID        WORKER  CWD              COMMAND          STATUS     DURATION
--------  ------  ---------------  ---------------  ---------  --------
a3f2b091  beefy   ~/projects/foo   make             running    0:02:14
7c91d302  beefy   ~/projects/bar   cargo test       done [0]   0:01:03
b004f123  cloud   ~/scripts        ./bench.sh       done [1]   0:00:47
```
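
The DURATION column uses an `H:MM:SS` format. A minimal formatting sketch follows; the column widths are taken from the dashed rule in the example, and the function names are hypothetical.

```rust
use std::time::Duration;

/// Formats a duration as `H:MM:SS`, as in the DURATION column.
pub fn format_duration(d: Duration) -> String {
    let total = d.as_secs();
    format!("{}:{:02}:{:02}", total / 3600, (total % 3600) / 60, total % 60)
}

/// Formats one listing row with left-aligned, fixed-width columns.
pub fn format_row(
    id: &str,
    worker: &str,
    cwd: &str,
    command: &str,
    status: &str,
    elapsed: Duration,
) -> String {
    format!(
        "{:<8}  {:<6}  {:<15}  {:<15}  {:<9}  {}",
        id,
        worker,
        cwd,
        command,
        status,
        format_duration(elapsed)
    )
}
```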
## Open Questions
- **Multiple jobs from the same CWD**: each gets its own `workdirs/<uuid>/`,
so they're fully isolated. This may use significant disk space — `p rm`
should prompt to clean up.
- **Non-Linux workers**: path conventions may differ on macOS workers. Out of
scope for now.