{"kind":"Skill","metadata":{"namespace":"community","name":"crabbox","version":"0.1.0"},"spec":{"description":"Use the Crabbox wrapper for OpenClaw remote validation across Linux, macOS, Windows, and WSL2, including delegated Blacksmith Testbox proof. Report the actual provider and id.","files":{"SKILL.md":"---\nname: crabbox\ndescription: Use the Crabbox wrapper for OpenClaw remote validation across Linux, macOS, Windows, and WSL2, including delegated Blacksmith Testbox proof. Report the actual provider and id.\n---\n\n# Crabbox\n\nUse the Crabbox wrapper when OpenClaw needs remote Linux proof for broad tests,\nCI-parity checks, secrets, hosted services, Docker/E2E/package lanes, warmed\nreusable boxes, sync timing, logs/results, cache inspection, or lease cleanup.\n\nCrabbox is the transport/orchestration surface. The actual backend can be:\n\n- brokered AWS Crabbox: direct provider, `provider=aws`, lease ids like\n  `cbx_...`, `syncDelegated=false`\n- Blacksmith Testbox through Crabbox: delegated provider,\n  `provider=blacksmith-testbox`, ids like `tbx_...`, `syncDelegated=true`\n\nFor OpenClaw maintainer broad `pnpm` gates, Blacksmith Testbox through the\nCrabbox wrapper is acceptable and often preferred when the standing Testbox\nrules apply. Do not describe those runs as \"AWS Crabbox\"; report them as\nTestbox-through-Crabbox with the `tbx_...` id and Actions run.\n\nUse the repo `.crabbox.yaml` brokered AWS path when the task specifically needs\ndirect AWS Crabbox behavior, persistent direct-provider leases, `--fresh-pr`,\n`--full-resync`, environment forwarding, capture/download support, or provider\ncomparison. Use `--provider blacksmith-testbox` when the task needs OpenClaw\nmaintainer Testbox proof, prepared CI environment, broad/heavy pnpm gates, or\nthe user asks for Testbox/Blacksmith.\n\n## First Checks\n\n- Run from the repo root. Crabbox sync mirrors the current checkout.\n- Check the wrapper and providers before remote work:\n\n```sh\ncommand -v crabbox\n../crabbox/bin/crabbox --version\npnpm crabbox:run -- --help | sed -n '1,120p'\n../crabbox/bin/crabbox desktop launch --help\n../crabbox/bin/crabbox webvnc --help\n```\n\n- OpenClaw scripts prefer `../crabbox/bin/crabbox` when present. The user PATH\n  shim can be stale.\n- Check `.crabbox.yaml` for direct-provider defaults. Omitting `--provider`\n  means brokered AWS today.\n- For broad OpenClaw maintainer `pnpm` gates, prefer the repo wrapper with\n  `--provider blacksmith-testbox` or the repo Testbox helpers when the standing\n  Testbox policy applies.\n- Always report the actual provider and id. `cbx_...` means AWS Crabbox;\n  `tbx_...` means Blacksmith Testbox through Crabbox. If the output only says\n  `blacksmith testbox list`, use `blacksmith testbox list --all` before\n  concluding no box exists.\n- If a warm direct-provider lease smells stale, retry with `--full-resync`\n  (alias `--fresh-sync`) before replacing the lease. This resets the remote\n  workdir, skips the fingerprint fast path, reseeds Git when possible, and\n  uploads the checkout from scratch.\n- For live/provider bugs, use the configured secret workflow before downgrading\n  to mocks. Copy only the exact needed key into the remote process environment\n  for that one command. Do not print it, do not sync it as a repo file, and do\n  not leave it in remote shell history or logs. If no secret-safe injection path\n  is available, say true live provider auth is blocked instead of silently using\n  a fake key.\n- Prefer local targeted tests for tight edit loops. Broad gates belong remote.\n- Do not treat inherited shell env as operator intent. In particular,\n  `OPENCLAW_LOCAL_CHECK_MODE=throttled` from the local shell is not permission\n  to move broad `pnpm check:changed`, `pnpm test:changed`, full `pnpm test`, or\n  lint/typecheck fan-out onto the laptop.\n- Only use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` when the user explicitly\n  asks for local proof in the current task. If Testbox is queued or capacity is\n  constrained, report the blocker and keep only targeted local edit-loop checks\n  running.\n\n## macOS And Windows Targets\n\nUse these only when the task needs an existing non-Linux host. OpenClaw broad\nLinux validation uses the repo Crabbox config unless a provider is explicitly\nrequested.\n\nWhen the user explicitly asks for brokered macOS runners, use Crabbox AWS\nmacOS only after confirming the deployed coordinator supports EC2 Mac host\nlifecycle/image routes and the operator has AWS EC2 Mac Dedicated Host quota\nand IAM. Prefer `CRABBOX_HOST_ID` for a known Crabbox-managed Dedicated Host,\nor run the no-spend preflight first:\n\n```sh\ncrabbox admin hosts quota --provider aws --target macos --region eu-west-1 --type mac2.metal --json\ncrabbox admin hosts allocate --provider aws --target macos --region eu-west-1 --type mac2.metal --dry-run --json\nCRABBOX_MACOS_TYPES=all scripts/macos-host-region-preflight.sh\n```\n\nDo not silently substitute AWS macOS for normal OpenClaw Linux proof. Report\npaid-host blockers as quota, IAM, coordinator deployment, or host availability\ninstead of falling back to local macOS.\n\nCrabbox supports static SSH targets:\n\n```sh\n../crabbox/bin/crabbox run --provider ssh --target macos --static-host mac-studio.local -- xcodebuild test\n../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local -- pwsh -NoProfile -Command \"dotnet test\"\n../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local -- pnpm test\n```\n\n- `target=macos` and `target=windows --windows-mode wsl2` use the POSIX SSH,\n  bash, Git, rsync, and tar contract.\n- Native Windows uses OpenSSH, PowerShell, Git, and tar; sync is manifest tar\n  archive transfer into `static.workRoot`. Direct native Windows runs support\n  `--script*`, `--env-from-profile`, `--preflight`, and PowerShell `--shell`.\n- `crabbox actions hydrate/register` are Linux-only today; use plain\n  `crabbox run` loops for static macOS and Windows hosts.\n- Live proof needs a reachable, operator-managed SSH host. Without one, verify\n  with `../crabbox/bin/crabbox run --help`, config/flag tests, and the Crabbox\n  Go test suite.\n\n## Direct Brokered AWS Backend\n\nUse this when the task needs direct AWS Crabbox semantics rather than the\nprepared Blacksmith Testbox CI environment.\n\nChanged gate:\n\n```sh\npnpm crabbox:run -- \\\n  --idle-timeout 90m \\\n  --ttl 240m \\\n  --timing-json \\\n  --shell -- \\\n  \"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed\"\n```\n\nFull suite:\n\n```sh\npnpm crabbox:run -- \\\n  --idle-timeout 90m \\\n  --ttl 240m \\\n  --timing-json \\\n  --shell -- \\\n  \"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test\"\n```\n\nFocused rerun:\n\n```sh\npnpm crabbox:run -- \\\n  --idle-timeout 90m \\\n  --ttl 240m \\\n  --timing-json \\\n  --shell -- \\\n  \"env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test \u003cpath-or-filter\u003e\"\n```\n\nRead the JSON summary. Useful fields:\n\n- `provider`: `aws`\n- `leaseId`: `cbx_...`\n- `syncDelegated`: `false`\n- `commandPhases`: populated when the command prints `CRABBOX_PHASE:\u003cname\u003e`\n- `commandMs` / `totalMs`\n- `exitCode`\n\nCrabbox should stop one-shot AWS leases automatically after the run. Verify\ncleanup when a run fails, is interrupted, or the command output is unclear:\n\n```sh\n../crabbox/bin/crabbox list --provider aws\n```\n\n## Blacksmith Testbox Through Crabbox\n\nUse this for OpenClaw maintainer broad/heavy `pnpm` gates when the prepared CI\nenvironment is the right proof surface:\n\n```sh\nnode scripts/crabbox-wrapper.mjs run \\\n  --provider blacksmith-testbox \\\n  --blacksmith-org openclaw \\\n  --blacksmith-workflow .github/workflows/ci-check-testbox.yml \\\n  --blacksmith-job check \\\n  --blacksmith-ref main \\\n  --idle-timeout 90m \\\n  --ttl 240m \\\n  --timing-json \\\n  -- \\\n  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_REMOTE_RUN=1 pnpm check:changed\n```\n\nRead the JSON summary and the Testbox line. Useful fields:\n\n- `provider`: `blacksmith-testbox`\n- `leaseId`: `tbx_...`\n- `syncDelegated`: `true`\n- `syncPhases`: delegated/skipped because Blacksmith owns checkout/sync\n- Actions run URL/id from the Testbox output\n- `exitCode`\n\n`blacksmith testbox list` may hide hydrating or ready boxes. Use:\n\n```sh\nblacksmith testbox list --all\nblacksmith testbox status \u003ctbx_id\u003e\n```\n\n## Observability Flags\n\nUse these on debugging runs before inventing ad hoc logging:\n\n- `--preflight`: prints run context, workspace mode, SSH target, remote user/cwd,\n  and target-specific tool probes. Defaults cover `git`, `tar`, `node`, `npm`,\n  `corepack`, `pnpm`, `yarn`, `bun`, `docker`, plus POSIX\n  `sudo`/`apt`/`bubblewrap` and native Windows\n  `powershell`/`execution_policy`/`longpaths`/`temp`/`pwsh`. Add\n  `--preflight-tools node,bun,docker`, `CRABBOX_PREFLIGHT_TOOLS`, or repo\n  `run.preflightTools` to replace the list. `default` expands built-ins; `none`\n  prints only the workspace summary. Preflight is diagnostic only; install\n  toolchains through Actions hydration, images, devcontainer/Nix/mise/asdf, or\n  the run script. On `blacksmith-testbox`, this prints a delegated-unsupported\n  note because the workflow owns setup.\n- `CRABBOX_ENV_ALLOW=NAME,...`: forwards only listed local env vars for direct\n  providers and prints `set len=N secret=true` style summaries. On\n  `blacksmith-testbox`, env forwarding is unsupported; put secrets in the\n  Testbox workflow instead.\n- `--env-from-profile \u003cfile\u003e` plus `--allow-env NAME`: loads simple\n  `export NAME=value` / `NAME=value` lines from a local profile without\n  executing it, then forwards only allowlisted names. `--allow-env` is\n  repeatable and comma-separated. Profile values override ambient allowlisted\n  env values for that run. Direct POSIX, WSL2, and native Windows runs are\n  supported; delegated providers are not. Crabbox probes the uploaded profile\n  remotely and prints redacted presence/length metadata before the command.\n- `--env-helper \u003cname\u003e`: with `--env-from-profile` on POSIX SSH targets,\n  persists `.crabbox/env/\u003cname\u003e` and `.crabbox/env/\u003cname\u003e.env` so follow-up\n  commands on the same lease can run through `./.crabbox/env/\u003cname\u003e \u003ccommand\u003e`.\n  Use only on leases you control; the profile stays until cleanup, lease reset,\n  or `--full-resync`.\n- `--script \u003cfile\u003e` / `--script-stdin`: upload a local script into\n  `.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute\n  directly on POSIX; scripts without a shebang run through `bash`. Native\n  Windows uploads run through Windows PowerShell, and Crabbox appends `.ps1`\n  when needed. Arguments after `--` become script args.\n- `--fresh-pr owner/repo#123|URL|number`: skip dirty local sync and create a\n  fresh remote checkout of the GitHub PR. Bare numbers use the current repo's\n  GitHub origin. Add `--apply-local-patch` only when the current local\n  `git diff --binary HEAD` should be applied on top of that PR checkout.\n- `--full-resync` / `--fresh-sync`: reset a stale direct-provider workdir\n  before syncing. Use after sync fingerprints look wrong, SSH times out before\n  sync, or rsync watchdog output suggests it. It is redundant with\n  `--fresh-pr`, incompatible with `--no-sync`, and unsupported by delegated\n  providers.\n- `--capture-stdout \u003cpath\u003e` / `--capture-stderr \u003cpath\u003e`: write remote streams to\n  local files and keep binary/noisy output out of retained logs. Parent\n  directories must already exist. These are direct-provider only.\n- `--capture-on-fail`: on non-zero direct-provider exits, downloads\n  `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`,\n  `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed.\n- `--keep-on-failure`: leave a failed one-shot lease alive for live debugging\n  until idle/TTL expiry. Useful on direct providers and delegated one-shots.\n- `--timing-json`: final machine-readable timing. Add\n  `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell\n  commands; direct providers and Blacksmith Testbox both report them as\n  `commandPhases`.\n\nLive-provider debug template for direct AWS/Hetzner leases:\n\n```sh\nmkdir -p .crabbox/logs\npnpm crabbox:run -- --provider aws \\\n  --preflight \\\n  --allow-env OPENAI_API_KEY,OPENAI_BASE_URL \\\n  --timing-json \\\n  --capture-stdout .crabbox/logs/live-provider.stdout.log \\\n  --capture-stderr .crabbox/logs/live-provider.stderr.log \\\n  --capture-on-fail \\\n  --shell -- \\\n  \"echo CRABBOX_PHASE:install; pnpm install --frozen-lockfile; echo CRABBOX_PHASE:test; pnpm test:live\"\n```\n\nDo not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or\n`--sync-only` to delegated providers. Also do not pass `--script*`,\n`--fresh-pr`, `--full-resync`, or `--env-helper` there. Crabbox rejects these\nbecause the provider owns sync or command transport. `--keep-on-failure` is OK\nfor delegated one-shots when you need to inspect a failed lease.\n\n## Efficient Bug E2E Verification\n\nUse the smallest Crabbox lane that proves the reported user path, not just the\ntouched code. Aim for one after-fix E2E proof before commenting, closing, or\nopening a PR for a user-visible bug.\n\nWhen the user says \"test in Crabbox\", do not simply copy tests to the remote\nbox and run them there. Crabbox is for remote real-scenario proof: copy or\ninstall OpenClaw as the user would, run the same setup/update/CLI/Gateway/API\ncall that failed, and capture behavior from that entrypoint. For regressions or\nbug reports, prove the broken state first when feasible, then run the same\nscenario after the fix.\n\nPick the lane by symptom:\n\n- Docker/setup/install bug: build a package tarball and run the matching\n  `scripts/e2e/*-docker.sh` or package script. This proves npm packaging,\n  install paths, runtime deps, config writes, and container behavior.\n- Provider/model/auth bug: prefer true live E2E. Use the configured secret\n  workflow, then inject the single needed key into Crabbox if needed. Scrub\n  unrelated provider env vars in the child command so interactive defaults do\n  not drift to another provider. If only a dummy key is used, label the proof\n  narrowly, e.g. \"UI/install path only; live provider auth not exercised.\"\n- Channel delivery bug: use the channel Docker/live lane when available; include\n  setup, config, gateway start, send/receive or agent-turn proof, and redacted\n  logs.\n- Gateway/session/tool bug: prefer an end-to-end CLI or Gateway RPC command that\n  creates real state and inspects the resulting files/API output.\n- Pure parser/config bug: targeted tests may be enough, but still run a\n  Crabbox command when OS, package, Docker, secrets, or service lifecycle could\n  change behavior.\n\nEfficient flow:\n\n1. Reproduce or prove the pre-fix symptom from the real user-facing entrypoint\n   when feasible. If the issue cannot be reproduced, capture the exact command\n   and observed behavior instead.\n2. Patch locally and run narrow local tests for edit speed.\n3. Run one Crabbox E2E command that starts from the user-facing entrypoint:\n   package install, Docker setup, onboarding, channel add, gateway start, or\n   agent turn as appropriate.\n4. Record proof as: Testbox id, command, environment shape, redacted secret\n   source, and copied success/failure output.\n5. If the issue says \"cannot reproduce\", ask for the missing config/log fields\n   that would distinguish the tested path from the reporter's path.\n\nKeep it efficient:\n\n- Reuse existing E2E scripts and helper assertions before writing ad hoc shell.\n- Use `--script \u003cfile\u003e` or `--script-stdin` for multi-line E2E commands instead\n  of quote-heavy `--shell` strings on direct SSH providers.\n- Use `--fresh-pr \u003cpr\u003e` when validating an upstream PR in isolation from the\n  local dirty tree. Add `--apply-local-patch` only when testing a local fixup on\n  top of that PR.\n- Use `--full-resync` before replacing a warmed direct-provider lease when the\n  remote workdir or sync fingerprint appears stale.\n- Use one-shot Crabbox for a single proof; use a reusable Testbox only when\n  several commands must share built images, installed packages, or live state.\n- Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a\n  candidate tarball; prefer the repo's package helper instead of direct source\n  execution when the bug might be packaging/install related.\n- Keep secrets redacted. It is fine to report key presence, source, and length;\n  never print secret values.\n- Include `--timing-json` on broad or flaky runs when command duration or sync\n  behavior matters.\n\nBefore/after PR proof on delegated Testbox:\n\n- For PRs that should prove \"broken before, fixed after\", compare base and PR\n  on the same Testbox when practical. Fetch both refs, create detached temp\n  worktrees under `/tmp`, install in each, then run the same harness twice.\n- Do not checkout base/PR refs in the synced repo root. Delegated Testbox sync\n  may leave the root dirty with local files; `git checkout` can abort or mix\n  proof state.\n- Temp harness files under `/tmp` do not resolve repo packages by default. Put\n  the harness inside the worktree, or in ESM use\n  `createRequire(path.join(process.cwd(), \"package.json\"))` before requiring\n  workspace deps such as `@lydell/node-pty`.\n- For full-screen TUI/CLI bugs, a PTY harness is stronger than helper-only\n  assertions. Use a real PTY, wait for visible lifecycle markers, send input,\n  then send control keys and assert process exit/stuck behavior.\n- When validating a rebased local branch before push, remember delegated sync\n  usually validates synced file content on a detached dirty checkout, not a\n  remote commit object. Record the local head SHA, changed files, Testbox id,\n  and final success markers; after pushing, ensure the pushed SHA has the same\n  file content.\n- If GitHub CI is still queued but the exact changed content passed Testbox\n  `pnpm check:changed`, `pnpm check:test-types`, and the real E2E proof, it is\n  reasonable to merge once required checks allow it. Note any still-running\n  unrelated shards in the proof comment instead of waiting forever.\n\nInteractive CLI/onboarding:\n\n- For full-screen or prompt-heavy CLI flows, run the target command inside tmux\n  on the Crabbox and drive it with `tmux send-keys`; capture proof with\n  `tmux capture-pane`, redacted through `sed`.\n- Prefer deterministic arrow navigation over search typing for Clack-style\n  searchable selects. Raw `send-keys -l openai` may not trigger filtering in a\n  tmux pane; inspect option order locally or on-box and send exact Down/Enter\n  sequences.\n- Isolate mutable state with `OPENCLAW_STATE_DIR=$(mktemp -d)`. Plugin npm\n  installs live under that state dir (`npm/node_modules/...`), not under\n  `OPENCLAW_CONFIG_DIR`. Verify downloads by checking the state dir, package\n  lock, and installed package metadata.\n- To test automatic setup installs against local package artifacts, use\n  `OPENCLAW_ALLOW_PLUGIN_INSTALL_OVERRIDES=1` plus\n  `OPENCLAW_PLUGIN_INSTALL_OVERRIDES='{\"plugin-id\":\"npm-pack:/tmp/plugin.tgz\"}'`.\n  Pack with `npm pack`, set an isolated `OPENCLAW_STATE_DIR`, and verify the\n  package under `npm/node_modules`. Overrides are test-only and must not be\n  treated as official/trusted-source installs.\n- For OpenAI/Codex onboarding proof, the useful markers are the UI line\n  `Installed Codex plugin`, `npm/node_modules/@openclaw/codex`, and the\n  package-lock entry showing the bundled `@openai/codex` dependency. A dummy\n  OpenAI-shaped key can prove only UI/install behavior; it is not live auth.\n\n## Reuse And Keepalive\n\nFor most Crabbox calls, one-shot is enough. Use reuse only when you need\nmultiple manual commands on the same hydrated box.\n\nIf Crabbox returns a reusable id or you intentionally keep a lease:\n\n```sh\npnpm crabbox:run -- --id \u003ccbx_id-or-slug\u003e --no-sync --timing-json --shell -- \"pnpm test \u003cpath\u003e\"\n```\n\nStop boxes you created before handoff:\n\n```sh\npnpm crabbox:stop -- \u003cid-or-slug\u003e\nblacksmith testbox stop --id \u003ctbx_id\u003e\n```\n\n## Interactive Desktop And WebVNC\n\nPrefer WebVNC for human inspection because the browser portal can preload the\nlease VNC password and avoids a native VNC client's copy/paste/password dance.\nUse native `crabbox vnc` only when WebVNC is unavailable, the browser portal is\nbroken, or the user explicitly wants a local VNC client.\n\nCommon desktop flow:\n\n```sh\n../crabbox/bin/crabbox warmup --provider hetzner --desktop --browser --class standard --idle-timeout 60m --ttl 240m\n../crabbox/bin/crabbox desktop launch --provider hetzner --id \u003ccbx_id-or-slug\u003e --browser --url https://example.com --webvnc --open --take-control\n```\n\nUseful WebVNC commands:\n\n```sh\n../crabbox/bin/crabbox webvnc --provider hetzner --id \u003ccbx_id-or-slug\u003e --open --take-control\n../crabbox/bin/crabbox webvnc daemon start --provider hetzner --id \u003ccbx_id-or-slug\u003e --open --take-control\n../crabbox/bin/crabbox webvnc daemon status --provider hetzner --id \u003ccbx_id-or-slug\u003e\n../crabbox/bin/crabbox webvnc daemon stop --provider hetzner --id \u003ccbx_id-or-slug\u003e\n../crabbox/bin/crabbox webvnc status --provider hetzner --id \u003ccbx_id-or-slug\u003e\n../crabbox/bin/crabbox webvnc reset --provider hetzner --id \u003ccbx_id-or-slug\u003e --open --take-control\n../crabbox/bin/crabbox desktop doctor --provider hetzner --id \u003ccbx_id-or-slug\u003e\n../crabbox/bin/crabbox desktop click --provider hetzner --id \u003ccbx_id-or-slug\u003e --x 640 --y 420\n../crabbox/bin/crabbox desktop paste --provider hetzner --id \u003ccbx_id-or-slug\u003e --text \"user@example.com\"\n../crabbox/bin/crabbox desktop key --provider hetzner --id \u003ccbx_id-or-slug\u003e ctrl+l\n../crabbox/bin/crabbox artifacts collect --id \u003ccbx_id-or-slug\u003e --all --output artifacts/\u003cslug\u003e\n../crabbox/bin/crabbox artifacts publish --dir artifacts/\u003cslug\u003e --pr \u003cnumber\u003e\n```\n\n`desktop launch --webvnc --open` is usually the nicest one-shot: it starts the\nbrowser/app inside the visible session, bridges the lease into the authenticated\nWebVNC portal, and opens the portal. Keep browsers windowed for human QA; use\n`--fullscreen` only for capture/video workflows.\nFor human handoff, include `--take-control` so the opened portal viewer gets\nkeyboard/mouse control automatically instead of landing as an observer.\n\nHuman handoff preflight:\n\n- Do not assume a visible desktop or launched browser means the repo CLI/app is\n  installed, built, or on the interactive terminal's `PATH`.\n- Before handing WebVNC to a human tester, prove the expected command from the\n  same kept lease and from a neutral directory such as `~`.\n- If the handoff needs repo-local code, sync/build/link it explicitly on that\n  lease. Source-tree CLIs often need build output before a symlink works.\n- Prefer a real `command -v \u003cexpected-command\u003e \u0026\u0026 \u003cexpected-command\u003e --version`\n  check over a repo-root-only `pnpm ...` command.\n\nGeneric handoff repair pattern:\n\n```sh\n../crabbox/bin/crabbox run --id \u003ccbx_id-or-slug\u003e --full-resync --shell -- \\\n  \"set -euo pipefail\n   pnpm install --frozen-lockfile\n   pnpm build\n   sudo ln -sf \\\"\\$PWD/\u003ccli-entry\u003e\\\" /usr/local/bin/\u003cexpected-command\u003e\n   cd ~\n   command -v \u003cexpected-command\u003e\n   \u003cexpected-command\u003e --version\"\n```\n\n## If Crabbox Fails\n\nKeep the fallback narrow. First decide whether the failure is Crabbox itself,\nthe brokered AWS lease, Blacksmith/Testbox, repo hydration, sync, or the test\ncommand.\n\nFast checks:\n\n```sh\ncommand -v crabbox\n../crabbox/bin/crabbox --version\npnpm crabbox:run -- --help | sed -n '1,140p'\n../crabbox/bin/crabbox doctor\ncommand -v blacksmith\nblacksmith --version\nblacksmith testbox list\n```\n\nCommon Crabbox-only failures:\n\n- Provider missing or old CLI: use `../crabbox/bin/crabbox` from the sibling\n  repo, or update/install Crabbox before retrying.\n- Bad local config: inspect `.crabbox.yaml`, `crabbox config show`, and\n  `crabbox whoami`; normal OpenClaw proof should use brokered AWS without\n  asking for cloud keys.\n- Slug/claim confusion: use the raw `cbx_...` / `tbx_...` id, or run one-shot\n  without `--id`.\n- Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the\n  printed Actions URL. Large sync warnings now include top source directories\n  by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect\n  those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH\n  timeouts now print `next_action=` hints; follow them, usually `--full-resync`\n  first and a fresh lease second.\n- Cleanup uncertainty: run `crabbox list --provider aws`; for explicit\n  Blacksmith runs, use `blacksmith testbox list` and stop only boxes you\n  created.\n- Testbox queued/capacity pressure: do not retry Blacksmith repeatedly. Rerun\n  once without `--provider` so `.crabbox.yaml` routes to brokered AWS, or report\n  the Blacksmith blocker if Testbox itself is the requested proof.\n\nIf brokered AWS cannot dispatch, sync, attach, or stop, retry once with\n`--debug` and `--timing-json`:\n\n```sh\npnpm crabbox:run -- --debug --timing-json -- \\\n  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed\n```\n\nFull suite:\n\n```sh\npnpm crabbox:run -- --debug --timing-json -- \\\n  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test\n```\n\nAuth fallback, only when `blacksmith` says auth is missing:\n\n```sh\nblacksmith auth login --non-interactive --organization openclaw\n```\n\nRaw Blacksmith footguns:\n\n- Run from repo root. The CLI syncs the current directory.\n- Save the returned `tbx_...` id in the session.\n- Reuse that id for focused reruns; stop it before handoff.\n- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.\n- Treat `blacksmith testbox list` as cleanup diagnostics, not a shared reusable\n  queue.\n\nUse Blacksmith only when the task is specifically about Testbox, brokered AWS\nis unavailable, or an explicit comparison is needed. If Blacksmith is down or\nquota-limited, do not keep probing it; stay on brokered AWS and note the\ndelegated-provider outage.\n\n## Blacksmith Backend Notes\n\nCrabbox Blacksmith backend delegates setup to:\n\n- org: `openclaw`\n- workflow: `.github/workflows/ci-check-testbox.yml`\n- job: `check`\n- ref: `main` unless testing a branch/tag intentionally\n\nThe hydration workflow owns checkout, Node/pnpm setup, dependency install,\nsecrets, ready marker, and keepalive. Crabbox owns dispatch, sync, SSH command\nexecution, timing, logs/results, and cleanup.\n\nMinimal Blacksmith-backed Crabbox run, from repo root:\n\n```sh\npnpm crabbox:run -- --provider blacksmith-testbox --timing-json -- \\\n  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:changed\n```\n\nUse direct Blacksmith only when Crabbox is the broken layer and you are\nisolating a Crabbox bug. Prefer direct `blacksmith testbox list` for cleanup\ndiagnostics, not as a reusable work queue.\n\nImportant Blacksmith footguns:\n\n- Always run from repo root. The CLI syncs the current directory.\n- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.\n- If auth is missing and browser auth is acceptable:\n\n```sh\nblacksmith auth login --non-interactive --organization openclaw\n```\n\n## Brokered AWS\n\nUse AWS for normal OpenClaw remote proof. The repo `.crabbox.yaml` already\nselects brokered AWS, so omit `--provider` unless you are testing a different\nprovider deliberately.\n\n```sh\npnpm crabbox:warmup -- --class beast --market on-demand --idle-timeout 90m\npnpm crabbox:hydrate -- --id \u003ccbx_id-or-slug\u003e\npnpm crabbox:run -- --id \u003ccbx_id-or-slug\u003e --timing-json --shell -- \"env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed\"\npnpm crabbox:stop -- \u003ccbx_id-or-slug\u003e\n```\n\nInstall/auth for owned Crabbox if needed:\n\n```sh\nbrew install openclaw/tap/crabbox\ncrabbox login --url https://crabbox.openclaw.ai --provider aws\n```\n\nNew users should self-resolve broker auth before anyone asks for AWS keys:\n\n```sh\ncrabbox config show\ncrabbox doctor\ncrabbox whoami\n```\n\n- If broker auth is missing, run `crabbox login --url https://crabbox.openclaw.ai --provider aws`.\n- If the CLI asks for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or AWS\n  profile setup during normal OpenClaw validation, assume the agent selected\n  the wrong path. Use brokered `crabbox login` or an existing brokered lease\n  before asking the user for cloud credentials.\n- Ask for AWS keys only for explicit direct-provider/account administration,\n  not for normal brokered OpenClaw proof.\n- Trusted automation may still use\n  `printf '%s' \"$CRABBOX_COORDINATOR_TOKEN\" | crabbox login --url https://crabbox.openclaw.ai --provider aws --token-stdin`.\n\nmacOS config lives at:\n\n```text\n~/Library/Application Support/crabbox/config.yaml\n```\n\nIt should include `broker.url`, `broker.token`, and usually `provider: aws`\nfor OpenClaw lanes. Let that config drive normal validation.\n\n### Interactive Desktop / WebVNC\n\nFor human desktop demos, prefer `webvnc` over native `vnc` and keep the remote\ndesktop visible/windowed. Do not fullscreen the remote browser or hide the XFCE\npanel/window chrome unless the explicit goal is video/capture output. After\nlaunch, verify a screenshot shows the desktop panel plus browser title bar. If\nChrome is fullscreen, toggle it back with:\n\n```sh\ncrabbox run --id \u003clease\u003e --shell -- 'DISPLAY=:99 xdotool search --onlyvisible --class google-chrome windowactivate key F11'\n```\n\n## Diagnostics\n\n```sh\ncrabbox status --id \u003cid-or-slug\u003e --wait\ncrabbox inspect --id \u003cid-or-slug\u003e --json\ncrabbox sync-plan\ncrabbox history --limit 20\ncrabbox history --lease \u003cid-or-slug\u003e\ncrabbox attach \u003crun_id\u003e\ncrabbox events \u003crun_id\u003e --json\ncrabbox logs \u003crun_id\u003e\ncrabbox results \u003crun_id\u003e\ncrabbox cache stats --id \u003cid-or-slug\u003e\ncrabbox ssh --id \u003cid-or-slug\u003e\nblacksmith testbox list\n```\n\nUse `--debug` on `run` when measuring sync timing.\nUse `--timing-json` on warmup, hydrate, and run when comparing backends.\nUse `--market spot|on-demand` only on AWS warmup/one-shot runs.\n\n## Failure Triage\n\n- Crabbox cannot find provider: verify `../crabbox/bin/crabbox --help` lists\n  the provider selected by `.crabbox.yaml`; update Crabbox before falling back.\n- Hydration stuck or failed: open the printed GitHub Actions run URL and inspect\n  the hydration step.\n- Sync failed: rerun with `--debug`; check changed-file count and whether the\n  checkout is dirty.\n- Command failed: rerun only the failing shard/file first. Do not rerun a full\n  suite until the focused failure is understood.\n- Cleanup uncertain: `crabbox list --provider aws`; for explicit Blacksmith\n  runs, use `blacksmith testbox list` and stop owned `tbx_...` leases you\n  created.\n- Crabbox broken but Blacksmith works: use the direct Blacksmith fallback above,\n  then file/fix the Crabbox issue.\n\n## Boundary\n\nDo not add OpenClaw-specific setup to Crabbox itself. Put repo setup in the\nhydration workflow and keep Crabbox generic around lease, sync, command\nexecution, logs/results, timing, and cleanup.\n"},"import":{"commit_sha":"424c6d0a5f4665b803ad6768d08b0be7659deaf4","imported_at":"2026-05-18T20:13:36Z","license_text":"MIT License\n\nCopyright (c) 2025 Peter Steinberger\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","owner":"openclaw","repo":"openclaw/openclaw","source_url":"https://github.com/openclaw/openclaw/tree/424c6d0a5f4665b803ad6768d08b0be7659deaf4/.agents/skills/crabbox"}},"content_hash":[143,54,150,55,187,18,34,244,241,30,254,158,176,145,104,34,154,116,194,173,136,43,138,145,86,228,180,215,34,44,254,135],"trust_level":"unsigned","yanked":false}
