tianyudu/marlowe-slurm-operator

Name: marlowe-slurm-operator
Author: tianyudu

marlowe-slurm-operator/SKILL.md

npx skillsauth add tianyudu/slurm-hpc-agent-skill marlowe-slurm-operator

Marlowe Slurm operator

This skill adapts the generic Slurm workflow to Stanford's Marlowe Computing Platform. Use Marlowe-specific documented defaults as hints, but verify current cluster state with live commands before acting.

Additional resources

Use references/marlowe-reference.md for Stanford-specific details, policies, and examples.
Use templates/preempt-interactive.sh for an interactive preempt shell.
Use templates/batch-gpu.sbatch for a medium-project GPU batch job.
Use templates/hero-gpu.sbatch for a large-project GPU batch job.

Core rules

Always verify cluster state first with read-only discovery commands.
Do not guess Marlowe account names, suffixes, partition limits, GPU counts, memory defaults, or time limits.
Follow Marlowe account conventions:
- Basic account: marlowe-<project-id>
- Medium-project suffix: marlowe-<project-id>-pmXX
- Large-project suffix: marlowe-<project-id>-plXX
If a job runs on preempt with a medium or large suffix account, assume it consumes GPU hours unless current site docs say otherwise.
Respect documented partition caps, but re-check them with sinfo and scontrol show partition before submission.
Load required modules in Marlowe job scripts: slurm, nvhpc, and cudnn/cuda12/9.3.0.75.
If MPI jobs report network fabric or component errors, try loading gcc/64 after nvhpc.
Do not propose workflows that violate Stanford usage policy, such as login-node inference, resource benchmarks on shared nodes, or bypassing login requirements with web tunnels or userspace VPNs.
Treat Open OnDemand code-server sessions as CPU-only and limited; use srun, salloc, or sbatch for GPU work.
Prefer sbatch --test-only or srun --test-only when the user wants validation rather than immediate execution.
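
As a rough sketch of the module rules above, a job script preamble could be produced by a small helper like the one below. The function name `marlowe_module_lines` and its `mpi-fix` argument are hypothetical. Only the module names come from the rules above, and they should still be checked with `module avail` on the cluster.

```shell
# Hypothetical helper: print the module-load lines for a Marlowe job script.
# Pass "mpi-fix" to add gcc/64 right after nvhpc (the MPI workaround above).
marlowe_module_lines() {
  echo "module load slurm"
  echo "module load nvhpc"
  if [ "${1:-}" = "mpi-fix" ]; then
    echo "module load gcc/64"
  fi
  echo "module load cudnn/cuda12/9.3.0.75"
}

# Default preamble, then the variant with the MPI workaround.
marlowe_module_lines
marlowe_module_lines mpi-fix
```

Keeping the load order in one place makes it harder to load gcc/64 before nvhpc by accident.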

Marlowe defaults to verify

These are documented defaults, not permanent guarantees:

preempt: basic access, up to 8 nodes, up to 12 hours, preemptible with about 15 minutes notice
batch: medium projects, up to 16 nodes, up to 2 days, requires a medium suffix account
hero: large projects, up to 25 nodes, up to 24 hours, requires a large suffix account
Hardware: DGX H100 nodes with 8 H100 80 GB GPUs and 2 TB memory per node

Verify these with:

scontrol version
scontrol show config
sinfo --summarize
sinfo --Node --long
sinfo -o "%#P %.5a %.10l %.10L %.6D %G"
scontrol show partition

Use scontrol show node <node_name> before making assumptions about per-node GPUs, memory, features, or constraints.
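
One way to compare a documented time cap against live output is to convert both to minutes. The sketch below assumes the limit arrives in the form sinfo prints for `%l` (`D-HH:MM:SS`, `HH:MM:SS`, or `MM:SS`). The function names `strip_zeros`, `slurm_time_to_minutes`, and `compare_limit` are hypothetical, and the documented values passed in are the defaults listed above, not verified facts.

```shell
# Strip leading zeros but keep one digit, so "08" is not parsed as octal.
strip_zeros() {
  printf '%s\n' "$1" | sed 's/^0*\([0-9]\)/\1/'
}

# Convert "D-HH:MM:SS", "HH:MM:SS", or "MM:SS" to whole minutes.
# Seconds are ignored. Prints "unknown" for anything else, e.g. "infinite".
slurm_time_to_minutes() {
  t="$1"
  days=0
  case "$t" in
    *-*) days="${t%%-*}"; t="${t#*-}" ;;
  esac
  case "$t" in
    *:*:*) h="${t%%:*}"; rest="${t#*:}"; m="${rest%%:*}" ;;
    *:*)   h=0; m="${t%%:*}" ;;
    *)     echo "unknown"; return 1 ;;
  esac
  # Reject empty or non-numeric fields rather than guessing.
  for field in "$days" "$h" "$m"; do
    case "$field" in
      ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
  done
  days=$(strip_zeros "$days"); h=$(strip_zeros "$h"); m=$(strip_zeros "$m")
  echo $(( $days * 1440 + $h * 60 + $m ))
}

# Compare a documented cap (in minutes) with a live sinfo time string.
compare_limit() {
  live=$(slurm_time_to_minutes "$3") || { echo "$1: live limit unknown ($3)"; return 0; }
  if [ "$live" -eq "$2" ]; then
    echo "$1: matches documented default"
  else
    echo "$1: live limit $live min differs from documented $2 min"
  fi
}

# On the cluster, the live string would come from: sinfo -h -o "%P %l"
compare_limit preempt 720 12:00:00
```

If the two values differ, the live value is the one to trust.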

Account and suffix validation

Before submitting, confirm which account form applies:

sacctmgr show assoc where user=$USER format=cluster,account,user,partition,defaultqos,qos,maxjobs,maxsubmitjobs,maxtresperjob,maxtrespernode,maxwall

Use the result plus current site docs to decide:

preempt without GPU-hour charging intent: prefer the basic marlowe-<project-id> account
batch: require the correct -pmXX suffix
hero: require the correct -plXX suffix
preempt with a suffix account: warn that GPU hours may be charged

If the user forgets a valid account, expect an error similar to "ACCOUNT ERROR: Did you remember to set your account?"

If sacctmgr is restricted, say so clearly and fall back to verified site documentation plus known working account examples.
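
The decision list above can be sketched as a pre-submission check. This is only an illustration: `account_kind` and `check_account` are hypothetical names, the project ID `x000` is a placeholder, and the patterns assume that `XX` stands for exactly two characters. It checks the form of an account string only, so whether the account actually exists still has to come from sacctmgr or site documentation.

```shell
# Classify an account string by the suffix conventions listed earlier.
# Assumption: "XX" is exactly two characters; adjust if the site differs.
account_kind() {
  case "$1" in
    marlowe-?*-pm??) echo medium ;;
    marlowe-?*-pl??) echo large ;;
    marlowe-?*)      echo basic ;;
    *)               echo unknown ;;
  esac
}

# Report whether an account form fits a partition.
check_account() {
  part="$1"
  kind=$(account_kind "$2")
  case "$part:$kind" in
    preempt:basic)                echo "ok" ;;
    preempt:medium|preempt:large) echo "ok, but GPU hours may be charged" ;;
    batch:medium)                 echo "ok" ;;
    hero:large)                   echo "ok" ;;
    batch:*)                      echo "batch requires a -pmXX account" ;;
    hero:*)                       echo "hero requires a -plXX account" ;;
    *)                            echo "unknown partition or account form; verify with sinfo and sacctmgr" ;;
  esac
}

check_account preempt marlowe-x000-pm01
```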

Submission workflow

Interactive shell on `preempt`

Use a command like:

srun -N <nodes> -G <gpus> -A marlowe-<project-id>[-<suffix>] -p <partition> --pty bash

Before running it:

verify the partition exists and is available
verify the account form and whether a suffix is required
warn if a suffix-coded preempt job will consume GPU hours
keep nodes, GPUs, and time inside current partition limits
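
A minimal sketch of the checks above: the helper prints the srun command instead of running it, and refuses when the requested node count exceeds a cap the caller has already verified. The name `build_srun`, the `max_nodes` argument, and the account `marlowe-x000` are hypothetical.

```shell
# Hypothetical helper: print (do not run) an interactive srun command.
# max_nodes must come from live output such as scontrol show partition.
build_srun() {
  nodes="$1"; gpus="$2"; account="$3"; partition="$4"; max_nodes="$5"
  if [ "$nodes" -gt "$max_nodes" ]; then
    echo "refusing: $nodes nodes exceeds verified cap of $max_nodes on $partition" >&2
    return 1
  fi
  echo "srun -N $nodes -G $gpus -A $account -p $partition --pty bash"
}

build_srun 1 8 marlowe-x000 preempt 8
```

Printing the command first gives the user a chance to review the account and partition before anything is submitted.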

Interactive allocation

salloc -N <nodes> -A marlowe-<project-id>[-<suffix>] -p <partition> -t <time>

Batch jobs

Use sbatch with explicit directives for account, partition, nodes, GPUs, time, and output paths.

Minimum checklist:

validate account and suffix choice
validate partition and time cap
validate node and GPU counts against visible inventory
load required modules in the script
dry-run when appropriate:

sbatch --test-only job.sh
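
For illustration, a batch script with explicit directives might be generated as follows. The generator name `make_job_script`, the account `marlowe-x000-pm01`, the output file names, and the `srun nvidia-smi` payload are all placeholders; the bundled templates remain the reference for real jobs.

```shell
# Hypothetical generator: print a minimal batch script with explicit directives.
make_job_script() {
  account="$1"; partition="$2"; nodes="$3"; gpus="$4"; time_limit="$5"
  cat <<EOF
#!/bin/bash
#SBATCH --account=$account
#SBATCH --partition=$partition
#SBATCH --nodes=$nodes
#SBATCH --gpus=$gpus
#SBATCH --time=$time_limit
#SBATCH --output=slurm-%j.out
#SBATCH --error=slurm-%j.err

# Required modules, as listed in the core rules.
module load slurm nvhpc cudnn/cuda12/9.3.0.75

# Placeholder payload; replace with the real workload.
srun nvidia-smi
EOF
}

# Print the script; redirect to a file, then validate with sbatch --test-only.
make_job_script marlowe-x000-pm01 batch 1 8 01:00:00
```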

Monitoring and GPU-hour accounting

Use standard Slurm commands for job state:

squeue -j <job_id> --start
scontrol show jobid=<job_id>
sstat -j <job_id>
sacct -j <job_id>

For medium and large projects, track GPU-hour usage with sreport:

sreport cluster UserUtilizationByAccount -T gres/gpu Start=<YYYY-MM-DD> End=<YYYY-MM-DD> account=marlowe-<project-id>-<suffix> -t hours

Always verify the suffix in the usage query. If the task involves batch, hero, or suffix-coded preempt jobs, mention whether GPU hours are expected to be consumed.
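
To total GPU hours across users, the report can be requested in parsable form and summed. The sketch below assumes sreport is run with `-P -n` and an explicit `format=` list ending in `Used`; that field list is an assumption to check against `man sreport` on the cluster. The function name `sum_gpu_hours` and the sample rows are hypothetical.

```shell
# Sum the last pipe-delimited field of each input line.
# Assumption: Used is the final column of the requested format.
sum_gpu_hours() {
  awk -F'|' 'NF > 1 { total += $NF } END { print total + 0 }'
}

# On the cluster (not run here), something like:
#   sreport -P -n cluster UserUtilizationByAccount -T gres/gpu \
#     Start=<YYYY-MM-DD> End=<YYYY-MM-DD> \
#     account=marlowe-<project-id>-<suffix> -t hours \
#     format=Login,Accounts,TresName,Used | sum_gpu_hours

# Made-up sample rows, for illustration only.
printf '%s\n' 'alice|marlowe-x000-pm01|gres/gpu|12' \
              'bob|marlowe-x000-pm01|gres/gpu|30' | sum_gpu_hours
```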

Permissions and restricted visibility

If sacctmgr, sshare, or sreport output is restricted:

say which command is unavailable
say what certainty was lost
continue with sinfo, scontrol, squeue, and documented Marlowe conventions
do not invent hidden limits, balances, or suffix mappings
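
The first item in that list can be automated with a small probe that names each command that is missing or fails. The function name `probe` and its labels are hypothetical. Note the limitation: a restricted command may still exit with status 0 and print nothing, so an "OK" here is not proof of full visibility.

```shell
# Hypothetical probe: run a command quietly and report its availability.
# usage: probe <label> <command> [args...]
probe() {
  label="$1"; shift
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "UNAVAILABLE: $label"
  elif "$@" >/dev/null 2>&1; then
    echo "OK: $label"
  else
    echo "RESTRICTED OR FAILED: $label"
  fi
}

# Read-only checks; off the cluster these report UNAVAILABLE.
probe "sacctmgr associations" sacctmgr -n show assoc where user="${USER:-unknown}"
probe "sshare" sshare -U
```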

Response checklist

When helping on Marlowe:

summarize verified cluster facts first
list the exact commands used to verify them
show the smallest valid Marlowe command or script
separate verified facts from documented defaults and assumptions
explicitly call out account suffix requirements and possible GPU-hour charging
prefer unknown over a guess

Operates Stanford's Marlowe HPC cluster with Slurm. Use when the task involves Marlowe-specific partitions, account suffixes, GPU-hour tracking, job submission, queue diagnosis, or cancellation while still verifying live cluster state first.

testing

Updated Apr 16, 2026

Install globally

npx skillsauth add tianyudu/slurm-hpc-agent-skill marlowe-slurm-operator

This installs the skill globally with one command. It works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

Trivy: container and dependency vulnerability scanner. Clean (95%)
Semgrep: static code analysis for vulnerabilities. Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
Snyk (dep): open source security scanning. Skipped (50%)
Socket.dev: supply chain security analysis. Skipped (50%)
VirusTotal: multi-engine malware detection. Skipped (50%)
CrowdStrike: advanced threat intelligence. Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 16, 2026, 9:38 PM (4.6 s, 5 files scanned)

SKILL.md

name: marlowe-slurm-operator
description: Operates Stanford's Marlowe HPC cluster with Slurm. Use when the task involves Marlowe-specific partitions, account suffixes, GPU-hour tracking, job submission, queue diagnosis, or cancellation while still verifying live cluster state first.
compatibility: Portable Agent Skills format for OpenCode, Claude Code, and Codex. Requires Slurm CLI tools in PATH on Stanford's Marlowe cluster.
author: TianyuDu
domain: hpc
site: stanford-marlowe

Install manually

# Clone the repo
git clone https://github.com/tianyudu/slurm-hpc-agent-skill.git

# Copy into Claude Code skills folder (global)
cp -r slurm-hpc-agent-skill/marlowe-slurm-operator ~/.claude/skills/

Repository: tianyudu/slurm-hpc-agent-skill

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT
