skills/domains/ai-ml/vmas-simulator-guide/SKILL.md
Vectorized multi-agent reinforcement learning simulator
npx skillsauth add wentorai/research-plugins vmas-simulator-guide
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
VMAS is a vectorized simulator for multi-agent reinforcement learning (MARL) that runs thousands of parallel environments on GPU via PyTorch. It provides a diverse set of 2D cooperative, competitive, and mixed scenarios for benchmarking multi-agent algorithms. Because all environments are stepped together as batched tensor operations, it can be orders of magnitude faster than CPU-based simulators, which enables rapid research iteration on multi-agent coordination problems.
pip install vmas
import vmas

# Create a vectorized environment
env = vmas.make_env(
    scenario="simple_spread",
    num_envs=1024,            # Parallel environments
    device="cuda",            # GPU acceleration (use "cpu" if no GPU is available)
    continuous_actions=True,
    n_agents=3,               # Scenario-specific keyword argument
)

# Environment loop
obs = env.reset()
for step in range(100):
    # Random actions for demonstration: one [num_envs, action_dim] tensor per agent
    actions = [env.get_random_action(agent) for agent in env.agents]
    obs, rewards, dones, infos = env.step(actions)
    # obs: list of [num_envs, obs_dim] tensors
    # rewards: list of [num_envs] tensors
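The loop above returns per-agent lists of batched tensors. A minimal sketch of reducing them to one team return per environment follows. It uses stand-in random tensors so it runs without VMAS or a GPU; `fake_step`, `accumulate_team_return`, and the sizes are illustrative assumptions, not VMAS API.

```python
import torch

NUM_ENVS, N_AGENTS, OBS_DIM = 8, 3, 6  # small illustrative sizes


def fake_step(generator):
    """Stand-in for env.step(): returns tensors shaped like the comments above."""
    obs = [torch.rand(NUM_ENVS, OBS_DIM, generator=generator) for _ in range(N_AGENTS)]
    rewards = [torch.rand(NUM_ENVS, generator=generator) for _ in range(N_AGENTS)]
    dones = torch.zeros(NUM_ENVS, dtype=torch.bool)
    return obs, rewards, dones


def accumulate_team_return(num_steps, seed=0):
    """Sum rewards over agents and steps, giving one return per environment."""
    generator = torch.Generator().manual_seed(seed)
    team_return = torch.zeros(NUM_ENVS)
    for _ in range(num_steps):
        _, rewards, _ = fake_step(generator)
        # Stack the per-agent list into [n_agents, num_envs], then sum over agents
        team_return += torch.stack(rewards, dim=0).sum(dim=0)
    return team_return


returns = accumulate_team_return(num_steps=5)
print(returns.shape)  # torch.Size([8])
```

Keeping the reduction as a single `stack` and `sum` avoids Python loops over environments, which is the point of a batched simulator.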
| Scenario | Type | Agents | Description |
|----------|------|--------|-------------|
| simple_spread | Cooperative | 3 | Cover N landmarks |
| simple_tag | Competitive | 4 | Predator-prey |
| transport | Cooperative | 4 | Move package to goal |
| wheel | Cooperative | 4 | Coordination on wheel |
| flocking | Cooperative | 5+ | Reynolds flocking |
| discovery | Cooperative | 3 | Explore and discover |
| navigation | Mixed | N | Multi-agent navigation |
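When sweeping a benchmark over several scenarios, it can help to keep this table as a small lookup. The sketch below only mirrors the rows above and does not query VMAS; `SCENARIOS` and `by_type` are hypothetical helper names.

```python
# Scenario name -> (type, agent count as listed in the table)
SCENARIOS = {
    "simple_spread": ("Cooperative", "3"),
    "simple_tag": ("Competitive", "4"),
    "transport": ("Cooperative", "4"),
    "wheel": ("Cooperative", "4"),
    "flocking": ("Cooperative", "5+"),
    "discovery": ("Cooperative", "3"),
    "navigation": ("Mixed", "N"),
}


def by_type(kind):
    """Return the scenario names of a given type, sorted alphabetically."""
    return sorted(name for name, (stype, _) in SCENARIOS.items() if stype == kind)


print(by_type("Cooperative"))
# ['discovery', 'flocking', 'simple_spread', 'transport', 'wheel']
```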
# With TorchRL
from torchrl.envs import VmasEnv

env = VmasEnv(
    scenario="simple_spread",
    num_envs=512,
    device="cuda",
)

# With RLlib
from ray.rllib.env import MultiAgentEnv
# VMAS provides an RLlib-compatible wrapper (see the wrapper argument of vmas.make_env)

# With CleanRL / custom training
import torch
import vmas

env = vmas.make_env("transport", num_envs=2048, device="cuda")
obs = env.reset()
# All tensors stay on the GPU, so training needs no CPU transfer
# policy_network is a user-defined torch.nn.Module (not part of VMAS)
policy_output = policy_network(obs[0])  # Agent 0 observations
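`policy_network` is not defined in the snippet above. A minimal sketch of a parameter-shared MLP policy applied to each agent's batched observations follows. The class name `SharedPolicy`, the layer sizes, the `[-1, 1]` action range, and the stand-in observations are all assumptions, and the example runs on CPU so it does not need VMAS.

```python
import torch
from torch import nn


class SharedPolicy(nn.Module):
    """One MLP shared by all agents; maps [num_envs, obs_dim] to [num_envs, action_dim]."""

    def __init__(self, obs_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, action_dim),
            nn.Tanh(),  # bounds outputs to [-1, 1] (assumed action range)
        )

    def forward(self, obs):
        return self.net(obs)


num_envs, n_agents, obs_dim, action_dim = 16, 4, 10, 2

# Stand-in for the list returned by env.reset(): one tensor per agent
obs = [torch.rand(num_envs, obs_dim) for _ in range(n_agents)]

policy_network = SharedPolicy(obs_dim, action_dim)
with torch.no_grad():
    actions = [policy_network(agent_obs) for agent_obs in obs]

print(actions[0].shape)  # torch.Size([16, 2])
```

Sharing parameters across agents is only one possible design; agents with different observation sizes would need separate networks.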
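import torch
import vmas
from vmas.simulator.core import Agent, Landmark, World
from vmas.simulator.scenario import BaseScenario


class MyScenario(BaseScenario):
    def make_world(self, batch_dim, device, **kwargs):
        world = World(batch_dim=batch_dim, device=device)
        world.add_agent(Agent(name="agent_0"))
        world.add_agent(Agent(name="agent_1"))
        world.add_landmark(Landmark(name="goal"))
        return world

    def reset_world_at(self, env_index=None):
        # Randomize agent positions in [-1, 1]^2; env_index=None resets every environment
        n = self.world.batch_dim if env_index is None else 1
        for agent in self.world.agents:
            pos = torch.rand(n, self.world.dim_p, device=self.world.device) * 2 - 1
            agent.set_pos(pos, batch_index=env_index)

    def observation(self, agent):
        # Own position plus offset to the goal: [batch_dim, 4]
        goal = self.world.landmarks[0]
        return torch.cat([agent.state.pos, goal.state.pos - agent.state.pos], dim=-1)

    def reward(self, agent):
        # Negative distance to goal: [batch_dim]
        goal = self.world.landmarks[0]
        return -torch.linalg.norm(agent.state.pos - goal.state.pos, dim=-1)


# Pass the scenario instance directly to make_env
env = vmas.make_env(MyScenario(), num_envs=512)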
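The reward in the scenario is the negative Euclidean distance to the goal, computed for all environments in one call. It can be checked in isolation on plain tensors; `distance_reward` is an illustrative helper, not part of VMAS, and the positions are made-up values.

```python
import torch


def distance_reward(agent_pos, goal_pos):
    """Negative Euclidean distance per environment: [num_envs, 2] -> [num_envs]."""
    return -torch.linalg.norm(agent_pos - goal_pos, dim=-1)


# Three environments, each with a 2D agent position and a goal at the origin
agent_pos = torch.tensor([[3.0, 4.0], [6.0, 8.0], [0.0, 1.0]])
goal_pos = torch.zeros(3, 2)

rewards = distance_reward(agent_pos, goal_pos)
print(rewards.tolist())  # [-5.0, -10.0, -1.0]
```

The reward is largest (closest to zero) when the agent sits on the goal, so maximizing it drives agents toward the landmark.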