Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

primatrix/exec-remote

Name: exec-remote
Author: primatrix

plugins/exec-remote/skills/exec-remote/SKILL.md

npx skillsauth add primatrix/skills exec-remote


Remote Execution Skill

This skill handles running code on remote GPU or TPU clusters via SkyPilot.

Defaults

The following defaults apply unless the user explicitly overrides them:

| Parameter | Default |
|----------------|----------------------------|
| PROJECT_ID | tpu-service-473302 |
| CLUSTER_NAME | sglang-jax-agent-tests |
| ZONE | asia-northeast1-b |
| NUM_SLICES | 1 |

Use these values directly — do NOT ask the user to confirm or re-enter them unless they specify otherwise.

1. Determine Target Device

Identify the target device from the user's request:

| Target | Cluster name file | Env prefix |
|--------|---------------------|------------------------------------|
| GPU | .cluster_name_gpu | export CUDA_VISIBLE_DEVICES=0; |
| TPU | .cluster_name_tpu | (none) |

If the user does not specify a device, ask them which one to use.

2. Prerequisites

The cluster must already be provisioned. Check that the corresponding cluster name file (.cluster_name_gpu or .cluster_name_tpu) exists and is non-empty in the project root.
If the file does not exist or is empty, provision the cluster using the appropriate method (see Section 3).
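The prerequisite check can be sketched as a small shell helper. The function name `read_cluster_name` is hypothetical and not part of the skill; it relies only on the standard `[ -s FILE ]` test (true when the file exists and is non-empty) and additionally treats a whitespace-only file as empty, which is an assumption.

```shell
# Hypothetical helper: print the cluster name stored in a cluster name file.
# Returns 1 if the file is missing, empty, or contains only whitespace,
# which signals that a cluster must be provisioned first.
read_cluster_name() {
  name=""
  if [ -s "$1" ]; then
    # Drop the trailing newline and any stray whitespace.
    name=$(tr -d '[:space:]' < "$1")
  fi
  if [ -z "$name" ]; then
    echo "no cluster recorded in $1; provision one first (see Section 3)" >&2
    return 1
  fi
  printf '%s\n' "$name"
}
```

Run from the project root, `CLUSTER=$(read_cluster_name .cluster_name_gpu)` either yields the recorded name or fails, so a caller can branch to provisioning before attempting `sky exec`.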

3. Cluster Provisioning

GPU (Standalone SkyPilot)

GPU clusters are provisioned using the standalone launch_gpu.sh script. Locate it in the scripts/ directory alongside this skill definition.

# Common accelerator types: H100:1, A100:1, L4:1
bash <absolute_path_to_launch_gpu.sh> <accelerator_type> <experiment_name>

The launch script automatically updates .cluster_name_gpu.

TPU

There are two provisioning paths for TPU:

Path A: GKE-based (via `deploy-cluster` skill) — Recommended

This path provisions TPU on GKE using the full pipeline: apply-resource -> deploy-cluster -> exec-remote.

Each TPU type gets its own SkyPilot cluster named <cluster>-<username>-<tpu_type>, allowing multiple topologies to run in parallel.

Use the deploy-cluster skill which will:
- Use default cluster/project/zone unless user overrides
- Ensure the GKE cluster exists (via apply-resource)
- Configure SkyPilot for GKE
- Launch a per-TPU-type SkyPilot cluster
- Save the cluster name to .cluster_name_tpu

/deploy-cluster

Supported TPU types: v6e-1, v6e-4, v6e-8, v6e-16, v6e-32, v6e-64, v6e-128, v6e-256
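A minimal sketch of the naming scheme and the supported-type list, for illustration only. The helper names `is_supported_tpu_type` and `tpu_cluster_name` are hypothetical, and the username is passed in explicitly because the source does not say how deploy-cluster derives it.

```shell
# Hypothetical helper: succeed only for the TPU types listed as supported.
is_supported_tpu_type() {
  case "$1" in
    v6e-1|v6e-4|v6e-8|v6e-16|v6e-32|v6e-64|v6e-128|v6e-256) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: build <cluster>-<username>-<tpu_type>.
tpu_cluster_name() {
  cluster="$1"; username="$2"; tpu_type="$3"
  if ! is_supported_tpu_type "$tpu_type"; then
    echo "unsupported TPU type: $tpu_type" >&2
    return 1
  fi
  printf '%s-%s-%s\n' "$cluster" "$username" "$tpu_type"
}
```

With the default cluster name and the username used in the examples of this document, `tpu_cluster_name sglang-jax-agent-tests hongmao v6e-1` prints `sglang-jax-agent-tests-hongmao-v6e-1`.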

Path B: Standalone SkyPilot TPU VM

For quick, single-node TPU usage without GKE, use the standalone launch_tpu.sh script:

# Common accelerator types: tpu-v4-8, tpu-v4-16, tpu-v6e-1, tpu-v6e-4
bash <absolute_path_to_launch_tpu.sh> <accelerator_type> <experiment_name>

The launch script automatically updates .cluster_name_tpu.

Teardown

# GPU
sky down $(cat .cluster_name_gpu) -y

# TPU (tear down all per-TPU-type clusters)
sky down <CLUSTER_NAME>-<USERNAME>-v6e-1 -y
sky down <CLUSTER_NAME>-<USERNAME>-v6e-4 -y

For GKE-based TPU, also remove the GKE cluster via /apply-resource delete if no longer needed.

4. Execution Command

GPU

sky exec $(cat .cluster_name_gpu) --workdir . "export CUDA_VISIBLE_DEVICES=0; uv run --extra gpu python <PATH_TO_SCRIPT> [ARGS]"

- `export CUDA_VISIBLE_DEVICES=0;` ensures deterministic single-GPU execution. Adjust for multi-GPU jobs.
- `--extra gpu` activates GPU optional dependencies (e.g. jax[cuda]).

TPU

sky exec <CLUSTER_NAME>-<USERNAME>-<TPU_TYPE> --workdir . "uv run --extra tpu python <PATH_TO_SCRIPT> [ARGS]"

- `--extra tpu` activates TPU optional dependencies (e.g. jax[tpu]).
- Use the per-TPU-type cluster name (e.g. sglang-jax-agent-tests-hongmao-v6e-1).

Common flags

- `--workdir .` syncs the current local directory to the remote instance before running.
- For pytest, use `python -m pytest <test_path>` instead of calling pytest directly.

5. Usage Examples

Run a benchmark on GPU:

sky exec $(cat .cluster_name_gpu) --workdir . "export CUDA_VISIBLE_DEVICES=0; uv run --extra gpu python src/lynx/perf/benchmark_train.py"

Run tests on TPU (single type):

sky exec sglang-jax-agent-tests-hongmao-v6e-4 --workdir . "uv run --extra tpu python -m pytest src/lynx/test/"

Run CI tests on multiple TPU types in parallel:

# Deploy both types (sequential — config.yaml is global)
python <deploy-cluster>/scripts/deploy.py sglang-jax-agent-tests v6e-1 asia-northeast1-b
python <deploy-cluster>/scripts/deploy.py sglang-jax-agent-tests v6e-4 asia-northeast1-b

# Execute in parallel
sky exec sglang-jax-agent-tests-hongmao-v6e-1 --workdir . "python test/srt/run_suite.py --suite unit-test-tpu-v6e-1" &
sky exec sglang-jax-agent-tests-hongmao-v6e-4 --workdir . "python test/srt/run_suite.py --suite e2e-test-tpu-v6e-4" &
wait
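One caveat with the pattern above: a bare `wait` returns 0 even when a background job failed, which can hide a failing suite in CI. A possible variant waits on each PID and reports failure if any job exits non-zero. The function name `run_parallel` is hypothetical.

```shell
# Hypothetical helper: run each argument as a command line in the background,
# then wait on every PID individually so failures are not lost.
# Returns 0 only if all jobs succeeded.
run_parallel() {
  pids=""
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}

# Example (requires both clusters to be UP):
# run_parallel \
#   'sky exec sglang-jax-agent-tests-hongmao-v6e-1 --workdir . "python test/srt/run_suite.py --suite unit-test-tpu-v6e-1"' \
#   'sky exec sglang-jax-agent-tests-hongmao-v6e-4 --workdir . "python test/srt/run_suite.py --suite e2e-test-tpu-v6e-4"'
```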

6. Operational Notes

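- Logs: SkyPilot streams stdout and stderr directly to the terminal.
- Interruption: Ctrl+C stops the local log stream but may not kill the remote process. List jobs with `sky queue <cluster>` and stop one with `sky cancel <cluster> <job_id>` if cleanup is needed.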

7. GKE TPU Full Pipeline Procedure (Path A)

When the user requests to run code on TPU and no .cluster_name_tpu exists (or the user explicitly wants a new cluster), follow this procedure to orchestrate the full pipeline: apply-resource -> deploy-cluster -> exec-remote.

All parameters use defaults unless the user explicitly overrides them — do NOT ask for confirmation.

7.1 Collect Parameters

Only ask the user for parameters they haven't specified. Use defaults for everything else:

| Parameter | Default | Notes |
|----------------|-----------------------------------|---------------------------------|
| PROJECT_ID | tpu-service-473302 | GCP project ID |
| CLUSTER_NAME | sglang-jax-agent-tests | GKE cluster name |
| TPU_TYPE | (must specify) | e.g. v6e-4, v6e-1 |
| NUM_SLICES | 1 | Default to 1 |
| ZONE | asia-northeast1-b | Must support the chosen TPU type |
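In shell, the "defaults unless overridden" rule maps onto `${VAR:-default}` expansion. The sketch below assumes the parameters are passed as environment variables, which the source does not specify; `apply_defaults` is a hypothetical name.

```shell
# Hypothetical helper: fill in the documented defaults for any parameter the
# user has not set, and fail if TPU_TYPE (which has no default) is missing.
apply_defaults() {
  PROJECT_ID="${PROJECT_ID:-tpu-service-473302}"
  CLUSTER_NAME="${CLUSTER_NAME:-sglang-jax-agent-tests}"
  ZONE="${ZONE:-asia-northeast1-b}"
  NUM_SLICES="${NUM_SLICES:-1}"
  if [ -z "${TPU_TYPE:-}" ]; then
    echo "TPU_TYPE is not set (for example v6e-1 or v6e-4)" >&2
    return 1
  fi
}
```

A caller would set only the values that differ, for example `TPU_TYPE=v6e-4`, and then call `apply_defaults` before running the commands in the following steps.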

7.2 Create GKE Cluster (apply-resource)

Check prerequisites, then create the GKE cluster:

which xpk && which gcloud && which kubectl
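The `which` chain stops at the first missing tool. As an optional variant, the sketch below uses the POSIX `command -v` builtin to list every missing tool at once; `missing_tools` is a hypothetical name.

```shell
# Hypothetical helper: print the name of each tool that is not on PATH,
# one per line. Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For this step the call would be `missing_tools xpk gcloud kubectl`; an empty result means the cluster can be created.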

xpk cluster create-pathways \
  --cluster $CLUSTER_NAME \
  --num-slices=$NUM_SLICES \
  --tpu-type=$TPU_TYPE \
  --zone=$ZONE \
  --spot \
  --project=$PROJECT_ID

7.3 Wait for GKE Cluster Ready

Poll until the cluster status becomes RUNNING. Do NOT proceed to deploy SkyPilot while status is PROVISIONING or RECONCILING — it will fail with SSL errors.

gcloud container clusters list --project=$PROJECT_ID \
  --filter="name=$CLUSTER_NAME" --format="table(name,location,status)"
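The polling step can be sketched as a bounded retry loop. The helper `wait_until_running` is hypothetical, and the attempt count and sleep interval are illustrative defaults rather than values from the skill. It takes the status command as a string so the same loop works with any command that prints the status on stdout.

```shell
# Hypothetical helper: poll until a status command prints RUNNING.
# $1 = command line that prints the current status
# $2 = maximum number of attempts (assumed default: 60)
# $3 = seconds to sleep between attempts (assumed default: 10)
wait_until_running() {
  status_cmd="$1"
  max_attempts="${2:-60}"
  interval="${3:-10}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(sh -c "$status_cmd")
    if [ "$status" = "RUNNING" ]; then
      return 0
    fi
    echo "attempt $attempt: status=${status:-unknown}, retrying in ${interval}s" >&2
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "cluster did not reach RUNNING after $max_attempts attempts" >&2
  return 1
}
```

To use it with the command above, switch the output to a bare value with gcloud's `value(status)` format: `wait_until_running "gcloud container clusters list --project=$PROJECT_ID --filter=name=$CLUSTER_NAME --format='value(status)'"`.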

7.4 Deploy SkyPilot on GKE (deploy-cluster)

Run the deploy script for each required TPU type. Each call creates a separate SkyPilot cluster.

# Deploy each TPU type (must be sequential — config.yaml is global)
# Only tpu_type is required; cluster_name and zone use defaults
python <path-to-deploy-cluster>/scripts/deploy.py v6e-1
python <path-to-deploy-cluster>/scripts/deploy.py v6e-4

This creates:

$CLUSTER_NAME-$USERNAME-v6e-1 — SkyPilot cluster for v6e-1 tests
$CLUSTER_NAME-$USERNAME-v6e-4 — SkyPilot cluster for v6e-4 tests

After completion, verify:

sky status                  # Both clusters should show as UP

7.5 Execute User Code (exec-remote)

Determine num_nodes from the TPU type (v6e-N where total_chips = N, num_nodes = N / 4, minimum 1):

| TPU type | num_nodes |
|----------|-----------|
| v6e-1 | 1 |
| v6e-4 | 1 |
| v6e-8 | 2 |
| v6e-16 | 4 |
| v6e-32 | 8 |
| v6e-64 | 16 |
| v6e-128 | 32 |
| v6e-256 | 64 |

For single-node types (v6e-1, v6e-4), omit --num-nodes. For multi-node types, add --num-nodes <N>.

# Single-node (v6e-1, v6e-4) — use per-TPU-type cluster name
sky exec $CLUSTER_NAME-$USERNAME-v6e-1 --workdir . \
  "uv run --extra tpu python <PATH_TO_SCRIPT> [ARGS]"

# Multi-node (v6e-8+)
sky exec $CLUSTER_NAME-$USERNAME-v6e-8 --num-nodes 2 --workdir . \
  "uv run --extra tpu python <PATH_TO_SCRIPT> [ARGS]"

# Parallel execution across multiple TPU types
sky exec $CLUSTER_NAME-$USERNAME-v6e-1 --workdir . "..." &
sky exec $CLUSTER_NAME-$USERNAME-v6e-4 --workdir . "..." &
wait

7.6 Cleanup

When the user requests teardown, remove both layers:

# 1. Remove SkyPilot clusters (one per TPU type)
sky down $CLUSTER_NAME-$USERNAME-v6e-1 -y
sky down $CLUSTER_NAME-$USERNAME-v6e-4 -y

# 2. Remove GKE cluster (only for Path A / GKE-based)
xpk cluster delete \
  --cluster $CLUSTER_NAME \
  --zone=$ZONE \
  --project=$PROJECT_ID
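When more than two TPU types were deployed, the teardown order (SkyPilot clusters first, GKE cluster last) can be generated instead of typed by hand. The sketch below only prints the commands so they can be reviewed before running; `teardown_plan` is a hypothetical name.

```shell
# Hypothetical helper: print the teardown commands in order.
# Arguments: cluster name, username, zone, project ID, then one or more TPU types.
teardown_plan() {
  cluster="$1"; username="$2"; zone="$3"; project="$4"
  shift 4
  # 1. One SkyPilot cluster per TPU type.
  for tpu_type in "$@"; do
    printf 'sky down %s-%s-%s -y\n' "$cluster" "$username" "$tpu_type"
  done
  # 2. The GKE cluster (Path A only).
  printf 'xpk cluster delete --cluster %s --zone=%s --project=%s\n' \
    "$cluster" "$zone" "$project"
}
```

Calling `teardown_plan "$CLUSTER_NAME" "$USERNAME" "$ZONE" "$PROJECT_ID" v6e-1 v6e-4` reproduces the commands listed above.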


Executes Python scripts, tests, or benchmarks on a provisioned remote cluster (GPU or TPU) using SkyPilot. Use this skill when the user asks to run code on GPU, TPU, or any "remote" cluster.

development

Updated Apr 16, 2026

$ install --global

skillsauth

npx skillsauth add primatrix/skills exec-remote

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 16, 2026, 4:50 AM (4.1s, 5 files scanned)

SKILL.md

name: exec-remote
description: Executes Python scripts, tests, or benchmarks on a provisioned remote cluster (GPU or TPU) using SkyPilot. Use this skill when the user asks to run code on GPU, TPU, or any "remote" cluster.
argument-hint: [gpu|tpu] [script-path] [args...]


Related Skills

primatrix/memory-profile (development, updated May 27, 2026)

Use when analyzing TPU pretraining HBM occupancy from a profile directory: locates the static HBM peak (the same number TensorBoard's Memory Viewer shows), enumerates every buffer alive at the peak schedule moment with size / HLO instruction / opcode / op_name, and rolls the alive set up by opcode and op_name. Reads compile-time `*.hlo_proto.pb` (BufferAssignmentProto) as the primary source; runtime `*.xplane.pb` allocator events are a secondary, often-truncated signal.

primatrix/compute-breakdown (testing, updated May 25, 2026)

Use when analyzing TPU pretraining compute efficiency from xplane.pb: produces source-line-aggregated HLO duration tables, layer-scoped breakdowns, non-compute (padding/cast/copy) audits, and v7x roofline shortfall vs theoretical peak. Reads schema documented by profile-anatomy.

primatrix/plugins/tpu-perf/skills/comm-analysis (tools, updated May 25, 2026)

Use when analyzing communication on a TPU pretraining profile: extracts every comm primitive (async + sync, TC + SparseCore), attributes axes via HLO replica_groups, computes per-row NCCL bus BW vs per-axis peak ICI BW (peak_link × k_torus_dims × directions_per_dim; TPUv7x: 200 GB/s bidir per link on a 3D torus; util% requires `--mesh-spec` with topology), and reports per-step compute/comm overlap. Builds on profile-anatomy.

primatrix/profile-anatomy (documentation, updated May 24, 2026)

Use when reading TPU pretraining profiles (xplane.pb, trace.json.gz): describes the on-disk layout, the XSpace/XPlane/XLine/XEvent/XStat hierarchy, and provides reference scripts that future tpu-perf skills can read as schema documentation.

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/primatrix/skills.git

# Copy into Claude Code skills folder (global)
cp -r skills/plugins/exec-remote/skills/exec-remote ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

primatrix/skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
