initial commit (spec)
219
SPEC.md
Normal file
@@ -0,0 +1,219 @@
# `p`: push jobs to worker

A small Rust CLI utility to push command-line jobs to remote worker machines,
with directory sync, job management, and attach/detach support.

## Motivation

The common developer workflow of "run this build/test/script on a more powerful
remote machine" currently requires manually chaining `rsync`, `ssh`, and `tmux`.
`p` wraps that entire flow into a single ergonomic command, while adding proper
job tracking, log capture, and attach/detach mechanics.

## Core Concepts

### Worker
A remote machine accessible via SSH. Workers are registered locally with a
name and a connection string. One worker is designated as the **default**.

### Job
A command submitted to a worker, along with an (optionally synced) working
directory. Each job has:
- A UUID
- The command run
- The worker it ran on
- The original client CWD
- Start time, end time, exit code
- Captured output log

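The job fields listed above map naturally onto a plain record type. The following is a minimal sketch, not the actual implementation: the names `Job`, `JobStatus`, `client_cwd`, `log_path`, and `short_id` are assumptions made for illustration, the UUID is kept as a string to avoid an external crate, and the 4-character short ID merely mirrors the `p ls` example.

```rust
/// Lifecycle state derived from the recorded exit code (illustrative names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum JobStatus {
    Running,
    Done { exit_code: i32 },
}

/// One job record, mirroring the fields listed in the spec.
#[derive(Debug, Clone)]
pub struct Job {
    pub id: String,             // UUID in its textual form
    pub command: String,        // the command run
    pub worker: String,         // name of the worker it ran on
    pub client_cwd: String,     // original client CWD
    pub started_at: u64,        // unix seconds
    pub ended_at: Option<u64>,  // None while the job is running
    pub exit_code: Option<i32>, // None while the job is running
    pub log_path: String,       // where the captured output lives
}

impl Job {
    /// A job without an exit code is treated as still running.
    pub fn status(&self) -> JobStatus {
        match self.exit_code {
            Some(code) => JobStatus::Done { exit_code: code },
            None => JobStatus::Running,
        }
    }

    /// First four characters of the ID (assumed length), or the whole ID if shorter.
    pub fn short_id(&self) -> &str {
        self.id.get(..4).unwrap_or(self.id.as_str())
    }
}
```

Deriving the status from `exit_code` instead of storing it separately keeps the record consistent with the worker-side convention that a missing exit code means "still running".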
### p-agent
A small Rust (?) binary, automatically uploaded and started by `p` on first use
of a worker. It manages job lifecycle, log capture, and status tracking on
the worker side. The user never manually installs or configures it.
`p` checks the agent version on each connection and re-uploads if outdated.
Communication happens over SSH port forwarding, so no extra open ports are needed.

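The version check could be as simple as comparing the version reported by the remote agent with the one bundled into `p`. The sketch below assumes a `MAJOR.MINOR.PATCH` version format and hypothetical function names (`parse_version`, `agent_needs_upload`); the spec does not fix either.

```rust
/// Parses "MAJOR.MINOR.PATCH" into a comparable tuple.
/// Returns None for anything that does not match that (assumed) format.
fn parse_version(s: &str) -> Option<(u32, u32, u32)> {
    let mut parts = s.trim().split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Decides whether the agent should be (re-)uploaded.
/// `remote` is None when no agent answered on the worker.
pub fn agent_needs_upload(remote: Option<&str>, bundled: &str) -> bool {
    let Some(bundled) = parse_version(bundled) else {
        return true; // cannot compare, so upload to be safe
    };
    match remote.and_then(parse_version) {
        Some(remote) => remote < bundled, // tuples compare component by component
        None => true,                     // missing or unparseable remote version
    }
}
```

Comparing parsed tuples rather than raw strings avoids treating `0.10.0` as older than `0.9.3`.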
## CLI Reference

### Running jobs

```
p -- <command>
```
Sync the current directory to the default worker and run `<command>` on it.
Attaches to the job's output by default. `Ctrl+C` detaches without killing the job.

```
p <worker> -- <command>
```
Same, but targets a specific named worker.

```
p [-n | --no-sync] -- <command>
```
Run `<command>` on the worker without syncing the current directory first.
Useful for commands that need no local files (e.g. `p -n -- htop`).

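The three invocation forms above share one shape: optional worker name and flags, then `--`, then the command. A minimal parsing sketch follows; `RunRequest` and `parse_run` are hypothetical names, and the sketch assumes that subcommands such as `ls` or `attach` have already been dispatched before this function is reached.

```rust
/// Parsed form of `p [<worker>] [-n | --no-sync] -- <command>` (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub struct RunRequest {
    pub worker: Option<String>, // None means "use the default worker"
    pub sync: bool,             // false when -n / --no-sync was given
    pub command: Vec<String>,   // everything after `--`
}

/// Parses the arguments that follow the program name.
pub fn parse_run(args: &[&str]) -> Result<RunRequest, String> {
    let sep = args
        .iter()
        .position(|a| *a == "--")
        .ok_or_else(|| "expected `--` before the command".to_string())?;
    let (head, tail) = args.split_at(sep);
    // tail[0] is the `--` separator itself.
    let command: Vec<String> = tail[1..].iter().map(|s| s.to_string()).collect();
    if command.is_empty() {
        return Err("missing command after `--`".to_string());
    }

    let mut worker: Option<String> = None;
    let mut sync = true;
    for arg in head {
        match *arg {
            "-n" | "--no-sync" => sync = false,
            flag if flag.starts_with('-') => return Err(format!("unknown option: {flag}")),
            name if worker.is_none() => worker = Some(name.to_string()),
            extra => return Err(format!("unexpected argument: {extra}")),
        }
    }
    Ok(RunRequest { worker, sync, command })
}
```

For example, `parse_run(&["beefy", "-n", "--", "htop"])` yields a request for worker `beefy` with syncing disabled. Requiring the `--` separator keeps flags meant for the remote command (such as `make -j8`) from being interpreted by `p`.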
### Job management

```
p ls
```
List jobs across all workers. Shows: ID (short), worker, original CWD,
command, status, duration. Style inspired by `docker ps` / `lxc list`.

```
p attach <job-id>
```
Re-attach to the console of a running job (via tmux). Supports partial IDs.

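Partial ID support is usually implemented as unique-prefix matching, in the style of `docker` or `git`. The spec does not say how ambiguity is handled, so the sketch below is an assumption: `IdLookup` and `resolve_job_id` are hypothetical names, and an ambiguous prefix is reported rather than resolved to an arbitrary job.

```rust
/// Outcome of resolving a (possibly partial) job ID (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub enum IdLookup<'a> {
    Found(&'a str),
    NotFound,
    Ambiguous(Vec<&'a str>),
}

/// Resolves `prefix` against the known job IDs, ignoring ASCII case.
/// An empty prefix never matches, to avoid selecting a job by accident.
pub fn resolve_job_id<'a>(known: &[&'a str], prefix: &str) -> IdLookup<'a> {
    if prefix.is_empty() {
        return IdLookup::NotFound;
    }
    let needle = prefix.to_ascii_lowercase();
    let matches: Vec<&'a str> = known
        .iter()
        .copied()
        .filter(|id| id.to_ascii_lowercase().starts_with(&needle))
        .collect();
    match matches.len() {
        0 => IdLookup::NotFound,
        1 => IdLookup::Found(matches[0]),
        _ => IdLookup::Ambiguous(matches),
    }
}
```

Returning the list of candidates in the ambiguous case lets the CLI print them so the user can type a longer prefix.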
```
p logs <job-id>
```
Print the captured output of a job (running or finished). Supports `-f` to
follow a running job's output without attaching to its TTY.

```
p stop <job-id>
```
Kill a running job.

```
p pull <job-id> <remote-path> [<local-dest>]
```
Copy a specific file or directory from a job's work directory back to the
client. Used to retrieve build artifacts.

```
p rm <job-id>
```
Remove a job record and its remote work directory. Refuses to remove a
running job without `--force`.

### Workers

```
p register [<name>] <connection-string>
```
Register a worker. The connection string is an SSH target (`user@host`,
`user@host:port`, or an SSH config alias). If `<name>` is omitted, the
hostname is used. The first registered worker becomes the default.

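The three connection-string forms can be told apart with two splits. This is a sketch under stated assumptions: `SshTarget`, `parse_connection`, and `default_worker_name` are hypothetical names, a bare alias is treated as a host with no user or port, and bracketed IPv6 literals are not handled.

```rust
/// Components of a connection string (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub struct SshTarget {
    pub user: Option<String>,
    pub host: String, // hostname, IP address, or SSH config alias
    pub port: Option<u16>,
}

/// Parses `user@host`, `user@host:port`, or a bare alias.
pub fn parse_connection(s: &str) -> Result<SshTarget, String> {
    let (user, rest) = match s.split_once('@') {
        Some((u, r)) if !u.is_empty() => (Some(u.to_string()), r),
        Some(_) => return Err("empty user before `@`".to_string()),
        None => (None, s),
    };
    let (host, port) = match rest.rsplit_once(':') {
        Some((h, p)) => {
            let port = p
                .parse::<u16>()
                .map_err(|_| format!("invalid port: {p}"))?;
            (h, Some(port))
        }
        None => (rest, None),
    };
    if host.is_empty() {
        return Err("empty host".to_string());
    }
    Ok(SshTarget { user, host: host.to_string(), port })
}

/// Name used when `<name>` is omitted: the host part of the target.
pub fn default_worker_name(target: &SshTarget) -> &str {
    &target.host
}
```

Keeping the port separate matters because `ssh` itself does not accept `user@host:port`; the port would have to be passed along as `-p <port>`.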
```
p workers
```
List registered workers with their name, connection string, and reachability
status.

```
p default <worker>
```
Set the default worker.

## Directory Sync

- Uses `rsync` over SSH.
- Respects `.gitignore` by default (via `rsync --filter=':- .gitignore'`).
- `.git/` is **included**, because some workflows depend on it (e.g. reading the
  current commit SHA or latest tag).
- Each job gets its own isolated work directory on the worker:
  `~/.p/workdirs/<job-uuid>/`
- No automatic sync-back after job completion. Use `p pull` to retrieve
  specific artifacts.

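The sync rules above translate into a short `rsync` argument list. The sketch below builds that list without running anything; `rsync_args` is a hypothetical helper, the use of `-a` (archive mode) is an assumption, and the parent directory `~/.p/workdirs/` is assumed to exist already, since `rsync` only creates the final path component on its own.

```rust
/// Builds the argument vector for an `rsync` invocation that uploads
/// `local_dir` into the job's work directory on the worker.
pub fn rsync_args(
    local_dir: &str,
    ssh_target: &str,
    port: Option<u16>,
    job_uuid: &str,
) -> Vec<String> {
    let mut args = vec![
        "-a".to_string(),                     // archive mode (assumed choice)
        "--filter=:- .gitignore".to_string(), // per-directory excludes from .gitignore
    ];
    if let Some(p) = port {
        // A non-default SSH port goes through rsync's remote-shell option.
        args.push("-e".to_string());
        args.push(format!("ssh -p {p}"));
    }
    // Trailing slash: copy the directory's contents, not the directory itself.
    args.push(format!("{}/", local_dir.trim_end_matches('/')));
    // Remote paths are relative to the home directory, i.e. ~/.p/workdirs/<uuid>/.
    args.push(format!("{ssh_target}:.p/workdirs/{job_uuid}/"));
    args
}
```

Note that no exclude for `.git/` is added, matching the rule that `.git/` is included in the sync.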
## Worker-side Layout

All data lives under `~/.p/` on the worker (no root access required).

```
~/.p/
  bin/
    p-agent            # auto-uploaded by p, versioned
  jobs/
    <uuid>/
      cmd              # command string
      cwd              # original client CWD (display only)
      worker           # worker name (display only)
      started_at       # unix timestamp
      output.log       # combined stdout+stderr, always captured
      exitcode         # written on completion; absent = still running
      tmux_session     # tmux session name (e.g. "p-<short-uuid>")
  workdirs/
    <uuid>/            # rsync'd copy of client CWD for this job
```

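With this layout, a job's state can be read from the filesystem alone: the `exitcode` file is absent while the job runs and holds the exit code afterwards. A minimal sketch of that check, with the hypothetical names `RemoteStatus` and `read_status`, and assuming the file contains a single decimal integer:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Job state as derived from the job directory (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub enum RemoteStatus {
    Running,
    Done(i32),
}

/// Reads `<job_dir>/exitcode`. A missing file means the job is still running.
/// Note that a missing job directory is also reported as Running here.
pub fn read_status(job_dir: &Path) -> io::Result<RemoteStatus> {
    match fs::read_to_string(job_dir.join("exitcode")) {
        Ok(text) => {
            let code = text
                .trim()
                .parse::<i32>()
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
            Ok(RemoteStatus::Done(code))
        }
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(RemoteStatus::Running),
        Err(e) => Err(e),
    }
}
```

One consequence of this convention is that the writer should create `exitcode` atomically (for example by writing a temporary file and renaming it), otherwise a reader may see an empty file and get an `InvalidData` error.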
## Attach / Detach Mechanics

Jobs run inside a `tmux` session on the worker (requirement: `tmux` must be
installed on the worker). Output is simultaneously captured to `output.log`
via `tmux pipe-pane` or a `tee` wrapper.

- `p attach <id>` → `ssh -t worker "tmux attach -t p-<id>"`
- `Ctrl+C` while attached → sends detach signal to tmux, **not** SIGINT to
  the job. The job keeps running.
- `p attach` only works on **running** jobs. For finished jobs, use `p logs`.

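The attach command above can be assembled from the job UUID and the worker's SSH target. The following sketch only builds the arguments for `ssh`; `tmux_session_name` and `attach_argv` are hypothetical helpers, and the 8-character short UUID is an assumption, since the spec only says `p-<short-uuid>`. The `Ctrl+C` remapping is not covered here.

```rust
/// Derives the tmux session name for a job, e.g. "p-a3f2c1d4".
/// The short form is assumed to be the first 8 characters of the UUID.
pub fn tmux_session_name(job_uuid: &str) -> String {
    let short: String = job_uuid.chars().take(8).collect();
    format!("p-{short}")
}

/// Arguments to pass to `ssh` in order to attach to a job's session.
/// `-t` forces TTY allocation, which tmux needs to attach.
pub fn attach_argv(ssh_target: &str, session: &str) -> Vec<String> {
    vec![
        "-t".to_string(),
        ssh_target.to_string(),
        // The session name comes from a UUID (hex digits and dashes),
        // so it needs no shell quoting on the remote side.
        format!("tmux attach -t {session}"),
    ]
}
```

The resulting vector can be handed to `std::process::Command::new("ssh").args(...)`, which avoids a local shell and its quoting rules.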
> **Note:** Beyond a standard POSIX environment, the worker needs `tmux` and
> `rsync` installed (`rsync` is standard on most Linux systems). The `p-agent`
> binary is also required, but `p` ensures it is present automatically.

## Job Status & Notification

The **p-agent** runs as a lightweight background process on the worker
(started automatically, not a system service). It:

- Manages job launch and tmux session creation
- Tees output to `output.log`
- Writes `exitcode` on completion
- Notifies the client over the SSH reverse tunnel when a job finishes

The client maintains a local job database (`~/.local/share/p/jobs.db`,
SQLite) mirroring job state. `p ls` reads from this local DB (fast, no SSH).
The DB is updated in real time while the client is attached to a job, and via
agent notifications otherwise.

### Degraded mode (agent unreachable / client was offline)
If the client missed a completion notification, `p ls` marks the affected jobs
as `unknown`. The next `p ls` run then polls, over SSH, every worker that has
jobs last known to be running, and reconciles their state.

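The reconciliation step boils down to two operations on the local job list: pick the workers worth polling, then overwrite stale entries with what the worker reports. A minimal sketch follows, with hypothetical names throughout (`LocalStatus`, `LocalJob`, `workers_to_poll`, `reconcile`); treating `unknown` jobs the same as running ones when choosing workers is an assumption.

```rust
use std::collections::BTreeSet;

/// Job state as recorded in the client's local DB (illustrative type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LocalStatus {
    Running,
    Unknown,
    Done(i32),
}

pub struct LocalJob {
    pub id: String,
    pub worker: String,
    pub status: LocalStatus,
}

/// Workers that still have unfinished (running or unknown) jobs.
/// Workers whose jobs are all done need no SSH round trip.
pub fn workers_to_poll(jobs: &[LocalJob]) -> BTreeSet<&str> {
    jobs.iter()
        .filter(|j| matches!(j.status, LocalStatus::Running | LocalStatus::Unknown))
        .map(|j| j.worker.as_str())
        .collect()
}

/// Applies the worker's answer: an exit code means done, none means running.
pub fn reconcile(job: &mut LocalJob, remote_exit: Option<i32>) {
    job.status = match remote_exit {
        Some(code) => LocalStatus::Done(code),
        None => LocalStatus::Running,
    };
}
```

Collecting worker names into a set means each worker is contacted once, however many unfinished jobs it holds.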
## Configuration

File: `~/.config/p/config.yaml`

```yaml
default_worker: beefy
workers:
  - name: beefy
    connection: user@192.168.1.50
  - name: cloud
    connection: user@cloud-host.example.com
```

## `p ls` Output (example)

```
ID    WORKER  CWD             COMMAND     STATUS    DURATION
a3f2  beefy   ~/projects/foo  make        running   0:02:14
7c91  beefy   ~/projects/bar  cargo test  done [0]  0:01:03
b004  cloud   ~/scripts       ./bench.sh  done [1]  0:00:47
```

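The `STATUS` and `DURATION` columns in the example follow a simple pattern: `running` or `done [<exit code>]`, and `H:MM:SS`. A small sketch of the two formatters, using the hypothetical names `format_duration` and `status_label`:

```rust
/// Formats a duration in seconds as H:MM:SS, with unpadded hours.
pub fn format_duration(total_secs: u64) -> String {
    let hours = total_secs / 3600;
    let minutes = (total_secs % 3600) / 60;
    let seconds = total_secs % 60;
    format!("{hours}:{minutes:02}:{seconds:02}")
}

/// Renders the STATUS column: the exit code is shown once the job is done.
pub fn status_label(exit_code: Option<i32>) -> String {
    match exit_code {
        Some(code) => format!("done [{code}]"),
        None => "running".to_string(),
    }
}
```

For instance, 134 seconds renders as `0:02:14`, matching the first row of the example.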
## Open Questions

- **Worker arch detection**: `p-agent` must be compiled for the worker's
  architecture. Options: (a) ship common targets and detect via SSH, (b)
  compile on the worker if a Rust toolchain is present, (c) require user to
  specify arch in worker config.

  Maybe the agent core, or at least its entry point, could instead be a shell
  script that does some detection and installation? Implementation will show
  what works. A small Rust binary seems nice, but we want support for amd64,
  aarch64, and riscv64.

- **Multiple jobs from the same CWD**: each gets its own `workdirs/<uuid>/`,
  so they're fully isolated. This may use significant disk space, so `p rm`
  should prompt to clean up.

- **Non-Linux workers**: tmux availability and path conventions may differ on
  macOS workers. Out of scope for Phase 1.

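For option (a) of the arch-detection question, the client could run `uname -m` over SSH and map the answer to a prebuilt binary. The sketch below shows only the mapping; `target_triple` is a hypothetical helper, and the choice of `-gnu` Rust target triples is an assumption (statically linked `-musl` builds may be preferable for portability).

```rust
/// Maps the output of `uname -m` on the worker to a Rust target triple.
/// Returns None for architectures with no prebuilt agent.
pub fn target_triple(uname_m: &str) -> Option<&'static str> {
    match uname_m.trim() {
        "x86_64" | "amd64" => Some("x86_64-unknown-linux-gnu"),
        "aarch64" | "arm64" => Some("aarch64-unknown-linux-gnu"),
        "riscv64" => Some("riscv64gc-unknown-linux-gnu"),
        _ => None,
    }
}
```

A `None` result is the natural place to fall back to options (b) or (c), or to report that the worker's architecture is unsupported.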