/SKILL.md
TRIGGER when the user: writes or reviews ROS 2 nodes (rclcpp/rclpy), creates packages (colcon/ament), edits launch files (.launch.py), configures QoS or DDS, writes URDF/xacro, implements ros2_control hardware interfaces or controllers, sets up Nav2/MoveIt 2 pipelines, processes sensor data (camera/LiDAR/PCL), works with Gazebo/Isaac Sim, configures SROS2 security, develops micro-ROS firmware, manages multi-robot fleets (Open-RMF), debugs with ros2 doctor/rosbag2, deploys via Docker/cross-compilation, or migrates from ROS 1. DO NOT TRIGGER for general C++/Python questions unrelated to ROS 2, non-robotics middleware, or web/mobile development tasks.
Single responsibility: This skill is an API reference & code template guide for ROS 2 development. It tells you how to use ROS 2 APIs correctly and what mistakes to avoid. It does NOT do CI/CD orchestration, incident response, data analysis, or deployment automation — those are separate skill categories.
A progressive-disclosure skill for ROS 2 development — from first workspace to
production fleet deployment. Each section below gives you the essential decision
framework; detailed patterns, code templates, and anti-patterns live in the
references/ directory. Read the relevant reference file before writing code.
Progressive disclosure — do NOT read everything at once. This skill is structured in layers. Only load what you need for the current task:
- references/*.md: load on demand. Use the Decision Router below to pick the 1–2 files relevant to the user's current task. Do NOT read all 20 reference files; that wastes context and causes confusion.
- scripts/: run only when the user needs code generation, QoS checking, or launch validation. These are tools, not reading material.

Steps:
1. If .skill-runs.log exists in the workspace, read the last few lines to understand what was done and what issues occurred in previous sessions.
2. Read the relevant references/*.md file(s) for detailed guidance.

Execution log: The Stop hook automatically appends a session summary to .skill-runs.log in the workspace. This lets you see what was validated last time and what issues were found. Check it to avoid repeating past mistakes.
| User is doing... | Read |
|---------------------------------------------------|-----------------------------------|
| Creating a workspace, package, or build config | references/workspace-build.md |
| Writing nodes, executors, callback groups | references/nodes-executors.md |
| Topics, services, actions, custom interfaces, QoS | references/communication.md |
| Lifecycle nodes, component loading, composition | references/lifecycle-components.md |
| Launch files, conditional logic, event handlers | references/launch-system.md |
| tf2, URDF, xacro, robot_state_publisher | references/tf2-urdf.md |
| ros2_control, hardware interfaces, controllers | references/hardware-interface.md |
| Real-time constraints, PREEMPT_RT, memory, jitter | references/realtime.md |
| Nav2, SLAM, costmaps, behavior trees | references/navigation.md |
| MoveIt 2, planning scene, grasp pipelines | references/manipulation.md |
| Camera, LiDAR, PCL, cv_bridge, depth processing | references/perception.md |
| Unit tests, integration tests, launch_testing, CI | references/testing.md |
| ros2 doctor, tracing, profiling, rosbag2, CLI cheat sheet | references/debugging.md |
| Docker, cross-compile, fleet deployment, OTA | references/deployment.md |
| Gazebo, Isaac Sim, sim-to-real, use_sim_time | references/simulation.md |
| SROS2, DDS security, certificates, supply chain | references/security.md |
| micro-ROS, MCU/RTOS, XRCE-DDS, rclc | references/micro-ros.md |
| Multi-robot fleet, Open-RMF, DDS discovery scale | references/multi-robot.md |
| Message types, units, covariance, frame conventions | references/message-types.md |
| ROS 1 migration, ros1_bridge, hybrid operation | references/migration-ros1.md |
Cross-cutting concerns: Security, error handling, and QoS are not isolated to single reference files — apply them whenever the data path crosses a trust boundary, a node owns hardware, or communication reliability matters. Use your judgment about which cross-cutting concerns apply to the user's specific situation.
These apply to every ROS 2 artifact you produce, regardless of domain.
Staleness warning: The table below was last verified on 2026-03-30. If the current date is more than 6 months past that, re-verify EOL dates and feature support against https://docs.ros.org/en/rolling/Releases.html before relying on this table. When you update it, change both the LAST_UPDATED and NEXT_REVIEW comments above.
Always ask which ROS 2 distribution the user targets. Key differences:
| Feature | Foxy (EOL) | Humble (LTS) | Jazzy (LTS) | Kilted (non-LTS) | Rolling |
|---------------------------|---------------------------|---------------------------|--------------|-----------------------|-----------------------|
| EOL | Jun 2023 (ended) | May 2027 | May 2029 | Dec 2026 | Rolling |
| Ubuntu | 20.04 | 22.04 | 24.04 | 24.04 | Latest |
| Default DDS | Fast DDS | Fast DDS | Fast DDS | Fast DDS | Fast DDS |
| Zenoh support | - | - | - | Tier 1 | Tier 1 |
| Type description support | No | No | Yes | Yes | Yes |
| Service introspection | No | No | Yes | Yes | Yes |
| EventsExecutor | No | No | Experimental | Stable (+ rclpy) | Stable (+ rclpy) |
| Default bag format | sqlite3 | sqlite3 | MCAP | MCAP | MCAP |
| ros2_control interface | N/A (separate) | 2.x | 4.x | 4.x | Latest |
| CMake recommendation | ament_target_dependencies | ament_target_dependencies | either | target_link_libraries | target_link_libraries |
When the user does not specify, default to the latest LTS (Jazzy). Pin the exact distro in Dockerfile, CI, and documentation so builds are reproducible.
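The defaulting rule above can be sketched as a small lookup. This is a minimal illustration, not part of the skill's scripts: the names `FEATURES`, `DEFAULT_DISTRO`, and `resolve_distro` are hypothetical, and the values are copied from three rows of the table above.

```python
# Hypothetical helper: encodes a few rows of the distro table above.
DEFAULT_DISTRO = "jazzy"  # latest LTS, used when the user names no distro

FEATURES = {
    "foxy":   {"ubuntu": "20.04", "bag_format": "sqlite3", "cmake": "ament_target_dependencies"},
    "humble": {"ubuntu": "22.04", "bag_format": "sqlite3", "cmake": "ament_target_dependencies"},
    "jazzy":  {"ubuntu": "24.04", "bag_format": "mcap",    "cmake": "either"},
    "kilted": {"ubuntu": "24.04", "bag_format": "mcap",    "cmake": "target_link_libraries"},
}


def resolve_distro(requested=None):
    """Return a normalized distro name, falling back to the latest LTS."""
    name = (requested or DEFAULT_DISTRO).strip().lower()
    if name not in FEATURES:
        raise ValueError(f"unknown or unsupported ROS 2 distro: {requested!r}")
    return name


if __name__ == "__main__":
    distro = resolve_distro(None)
    print(distro, FEATURES[distro])
```

Resolving the distro once and passing the result everywhere (Dockerfile tag, CI matrix, generated CMake) is one way to keep the pin consistent.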
Choose the language based on the node's role, not personal preference.
Use rclcpp (C++) when:
Use rclpy (Python) when:
Mixed stacks are normal. A typical robot has C++ drivers/controllers and Python
orchestration/monitoring. Note: component_container (composition) only loads
C++ components, via class_loader. Python nodes run as separate processes, but can
share a launch file and communicate with the C++ nodes over DDS on the same host.
Intra-process communication works for any nodes sharing a process — not only
composable components. Any nodes instantiated in the same process with
use_intra_process_comms(true) can use zero-copy transfer.
Every package should follow this layout. Consistency across a workspace reduces onboarding time and makes CI scripts portable.
my_package/
├── CMakeLists.txt # or setup.py for pure Python
├── package.xml # format 3, with <depend> tags
├── config/
│ └── params.yaml # default parameters
├── launch/
│ └── bringup.launch.py # Python launch file
├── include/my_package/ # C++ public headers (if library)
├── src/ # C++ source files
├── my_package/ # Python modules (if ament_python or mixed)
├── test/ # gtest, pytest, launch_testing
├── urdf/ # URDF/xacro (if applicable)
├── msg/ srv/ action/ # custom interfaces (dedicated _interfaces package preferred)
└── README.md
Separate interface definitions into a *_interfaces package so downstream
packages can depend on interfaces without pulling in implementation.
- Use ParameterDescriptor with FloatingPointRange or IntegerRange for numeric bounds. The parameter server rejects out-of-range values at set time.
- Group related parameters under a dot-separated namespace: controller.kp, controller.ki, controller.kd.
- Keep defaults in config/params.yaml; allow launch-time overrides.
- Register a set_parameters_callback and validate new values atomically before accepting.
- FINALIZED and alert the operator).

Start from these profiles and adjust per use case:
| Use case | Reliability | Durability | History | Depth | Deadline | Lifespan |
|-----------------------|---------------|------------------|-----------|-------|----------|----------|
| Sensor stream | BEST_EFFORT | VOLATILE | KEEP_LAST | 5 | - | - |
| Command velocity | RELIABLE | VOLATILE | KEEP_LAST | 1 | 100 ms | 200 ms |
| Map (latched) | RELIABLE | TRANSIENT_LOCAL | KEEP_LAST | 1 | - | - |
| Diagnostics | RELIABLE | VOLATILE | KEEP_LAST | 10 | - | - |
| Parameter events | RELIABLE | VOLATILE | KEEP_LAST | 1000 | - | - |
| Action feedback | RELIABLE | VOLATILE | KEEP_LAST | 1 | - | - |
| Safety heartbeat | RELIABLE | VOLATILE | KEEP_LAST | 1 | 500 ms | 1 s |
QoS mismatches are the #1 cause of "I published but nobody receives."
Always check compatibility with ros2 topic info -v when debugging.
DEADLINE and LIFESPAN are critical for safety-critical systems. DEADLINE fires an
event when no message arrives within the specified period (detect stale data). LIFESPAN
discards messages older than the specified duration before delivery (prevent acting on
stale data). See references/communication.md section 9 for full API and examples.
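To make the mismatch rule concrete, here is a minimal pure-Python sketch of the DDS request-versus-offered check for reliability and durability only. It does not use rclpy, and the names `Reliability`, `Durability`, and `qos_problems` are hypothetical. Deadline, lifespan, and liveliness are deliberately left out.

```python
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


def qos_problems(pub_rel, pub_dur, sub_rel, sub_dur):
    """Return a list of mismatches; an empty list means the pair can connect.

    Rule modeled here: the publisher must offer at least the level that
    the subscriber requests, for each policy independently.
    """
    problems = []
    if pub_rel < sub_rel:
        problems.append("reliability: publisher is BEST_EFFORT, subscriber wants RELIABLE")
    if pub_dur < sub_dur:
        problems.append("durability: publisher is VOLATILE, subscriber wants TRANSIENT_LOCAL")
    return problems


if __name__ == "__main__":
    # Sensor-stream publisher from the table, paired with a RELIABLE subscriber.
    print(qos_problems(Reliability.BEST_EFFORT, Durability.VOLATILE,
                       Reliability.RELIABLE, Durability.VOLATILE))
```

On a live system, ros2 topic info -v shows the actual offered and requested profiles; this sketch only illustrates why a stricter subscriber silently receives nothing.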
| Entity | Convention | Example |
|-------------|-----------------------------|--------------------------------|
| Package | snake_case | arm_controller |
| Node | snake_case | joint_state_broadcaster |
| Topic | /snake_case with ns | /arm/joint_states |
| Service | /snake_case | /arm/set_mode |
| Action | /snake_case | /arm/follow_joint_trajectory |
| Parameter | snake_case with dot ns | controller.publish_rate |
| Frame | snake_case | base_link, camera_optical |
| Interface | PascalCase.msg/srv/action | JointState.msg |
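The conventions in the table can be checked mechanically. The following is a rough sketch, assuming fully qualified topic names and ignoring special cases such as private (~) names; the function names are hypothetical and not part of any ROS 2 tool.

```python
import re

# Hypothetical validators for the naming table above.
_SNAKE = re.compile(r"[a-z][a-z0-9_]*")
_PASCAL = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_snake_case(name):
    """Packages, nodes, frames: lowercase words joined by underscores."""
    return _SNAKE.fullmatch(name) is not None


def is_valid_topic(name):
    """Topics, services, actions: leading slash, snake_case segments."""
    if not name.startswith("/"):
        return False
    return all(is_snake_case(seg) for seg in name[1:].split("/"))


def is_valid_parameter(name):
    """Parameters: snake_case segments separated by dots."""
    return all(is_snake_case(seg) for seg in name.split("."))


def is_valid_interface_file(filename):
    """Interfaces: PascalCase stem with a .msg, .srv, or .action extension."""
    stem, _, ext = filename.rpartition(".")
    return ext in {"msg", "srv", "action"} and _PASCAL.fullmatch(stem) is not None


if __name__ == "__main__":
    for topic in ("/arm/joint_states", "/Arm/JointStates"):
        print(topic, is_valid_topic(topic))
```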
- MutuallyExclusiveCallbackGroup serializes its callbacks: safe for shared state without locks, but limits throughput.
- ReentrantCallbackGroup allows parallel execution: you must protect shared state with std::mutex (C++) or threading.Lock (Python).
- Put a service client in a separate MutuallyExclusiveCallbackGroup from the calling callback. Otherwise the executor deadlocks: the callback waits for the response while the executor cannot deliver it. Always use async_send_request with a response callback; never use spin_until_future_complete inside an executor callback.
- Never block (e.g. sleep) inside a timer or subscription callback on the default executor. Offload to a dedicated thread or use a MultiThreadedExecutor with a reentrant group.
- Use std::shared_ptr<const MessageT> in subscription callbacks to avoid unnecessary copies and enable zero-copy intra-process.

Default to lifecycle (managed) nodes for anything that owns resources: hardware drivers, sensor pipelines, planners, controllers.
┌──────────────┐
create() ──► │ Unconfigured │
└──────┬───────┘
on_configure │
┌──────▼───────┐
│ Inactive │
└──────┬───────┘
on_activate │
┌──────▼───────┐
│ Active │
└──────┬───────┘
on_deactivate │
┌──────▼───────┐
│ Inactive │
└──────┬───────┘
on_cleanup │
┌──────▼───────┐
│ Unconfigured │
└──────┬───────┘
on_shutdown │
┌──────▼───────┐
│ Finalized │
             └──────────────┘
This gives the system manager (launch file, orchestrator, or operator) explicit control over when resources are allocated, when the node starts processing, and how it shuts down. It also makes error recovery predictable.
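The diagram can be read as a transition table. The sketch below models only the transitions drawn above, in plain Python without rclpy; `LifecycleModel` and `TRANSITIONS` are hypothetical names. Real managed nodes also permit shutdown from the Inactive and Active states and have intermediate and error-processing states, which are omitted here.

```python
# Hypothetical model of the diagram above: (current state, transition) -> next state.
TRANSITIONS = {
    ("unconfigured", "configure"): "inactive",
    ("inactive", "activate"): "active",
    ("active", "deactivate"): "inactive",
    ("inactive", "cleanup"): "unconfigured",
    ("unconfigured", "shutdown"): "finalized",
}


class LifecycleModel:
    """Tracks the primary state and rejects transitions not in the table."""

    def __init__(self):
        self.state = "unconfigured"

    def trigger(self, transition):
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot '{transition}' from state '{self.state}'")
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    node = LifecycleModel()
    for step in ("configure", "activate", "deactivate", "cleanup", "shutdown"):
        print(step, "->", node.trigger(step))
```

Rejecting out-of-order transitions is the property that makes recovery predictable: an orchestrator always knows which resources a node holds from its state alone.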
- Use colcon build --cmake-args -DCMAKE_BUILD_TYPE=RelWithDebInfo for development; Release for deployment.
- Enable -Wall -Wextra -Wpedantic and treat warnings as errors in CI.
- Run colcon test with --event-handlers console_cohesion+ so test output groups by package.
- Pin dependencies in rosdep.yaml for reproducible dependency resolution.
- Cache /opt/ros/, .ccache/, build/, and install/ in CI to cut build times by 60–80%.

| Anti-pattern | Why it hurts | Fix |
|---|---|---|
| Global variables for node state | Breaks composition, untestable | Store state as class members |
| spin() in main() for multi-node processes | Starves other nodes | Use MultiThreadedExecutor or component composition |
| Hardcoded topic names | Breaks reuse across robots | Use relative names + namespace remapping |
| KEEP_ALL history with no bound | Memory grows unbounded on slow subscribers | Use KEEP_LAST with explicit depth |
| Using time.sleep() / std::this_thread::sleep_for | Blocks the executor thread | Use create_wall_timer or a dedicated thread |
| Monolithic launch file for everything | Unmanageable past 10 nodes | Compose launch files with IncludeLaunchDescription |
| Skipping package.xml dependencies | Builds locally, breaks CI and Docker | Declare every dependency explicitly |
| Publishing in constructor | Subscribers may not be ready, messages lost | Publish in on_activate or after a short timer |
| Ignoring QoS compatibility | Silent communication failure | Match publisher/subscriber QoS or check with ros2 topic info -v |
| Creating timers/subs in callbacks | Resource leak, unpredictable behavior | Create all entities in constructor or on_configure |
| Synchronous service call in callback | Deadlocks the executor thread | Use async_send_request with a callback or dedicated thread |
| Service client in same callback group as caller | Deadlocks even with async in MultiThreadedExecutor | Put service client in a separate MutuallyExclusiveCallbackGroup |
| No safe command on shutdown | Motors hold last velocity after node exits | Send zero-velocity in on_deactivate AND destructor (see references/hardware-interface.md) |
| Dynamic subscriptions with StaticSingleThreadedExecutor | New subs are never picked up after spin() | Use SingleThreadedExecutor or MultiThreadedExecutor for dynamic entities |
| CPU frequency governor left on powersave/ondemand | 10-100 ms latency spikes in RT path | Set performance governor, disable turbo boost (see references/realtime.md) |
These are mistakes AI agents repeatedly make when generating ROS 2 code. Add a new line here every time a failure is discovered in practice.
| # | Pitfall | What goes wrong | Correct approach |
|---|---------|----------------|-----------------|
| 1 | Using spin_until_future_complete inside a callback | Deadlocks the executor — the callback blocks waiting for a response that can never be delivered | Use async_send_request with a response callback; put the service client in a separate MutuallyExclusiveCallbackGroup |
| 2 | Generating Foxy-era API for Jazzy/Kilted | node_executable is deprecated, export_state_interfaces() signature changed in ros2_control 4.x | Always check the distro feature matrix above before generating code |
| 3 | Omitting QoS in publisher/subscriber creation | Defaults silently mismatch — publisher sends but subscriber receives nothing | Always specify QoS explicitly; use the QoS defaults table in Principle 6 |
| 4 | Creating a msg/ directory inside a non-interfaces package | Builds locally but fails in CI — interface packages need rosidl_generate_interfaces | Put messages in a dedicated *_interfaces package |
| 5 | Hardcoding /opt/ros/humble/ paths in launch files | Breaks on any other distro or install prefix | Use FindPackageShare, PathJoinSubstitution, or environment substitutions |
| 6 | Forgetting <depend> tags in package.xml | colcon build works in overlay but rosdep install and Docker builds fail | Declare every find_package() / import as <depend> in package.xml |
| 7 | Using time.sleep() for rate control in rclpy | Blocks the executor thread; timers and subscriptions stop firing | Use create_timer() or Rate with a MultiThreadedExecutor |
| 8 | Not sending zero-velocity on deactivate/shutdown | Robot holds last commanded velocity when the node crashes | Send zero-command in both on_deactivate and the destructor |
| 9 | Mixing ament_target_dependencies() and target_link_libraries() | Kilted deprecated ament_target_dependencies — mixing causes link errors | Use target_link_libraries() with modern CMake targets for Kilted+; ament_target_dependencies() for Humble/Jazzy |
| 10 | Generating rospy / roscpp code instead of rclpy / rclcpp | ROS 1 patterns in a ROS 2 context — nothing compiles | This skill is ROS 2 only — always use rclpy/rclcpp APIs |
| 11 | Ignoring use_sim_time parameter in simulation | Real clock diverges from Gazebo clock — tf lookups fail, controllers drift | Set use_sim_time:=true in launch and pass --clock to ros2 bag play |
| 12 | Publishing before subscribers connect (no TRANSIENT_LOCAL) | First N messages lost — map, URDF, or initial config never received | Use TRANSIENT_LOCAL durability for latched-style data, or publish in on_activate with a startup delay |
Maintenance rule: When you encounter a new AI failure pattern while using this skill, append it to this table with the next sequential number. The pitfall list is the single most valuable section for preventing repeated mistakes.
When upgrading between distributions, check these breaking changes first:
Foxy → Humble:
- ros2_control was not bundled in Foxy; it must be built separately.

Humble → Jazzy:
- ros2_control API changed from 2.x to 4.x: export_state_interfaces() and export_command_interfaces() are now auto-generated by the framework. Manual overrides use on_export_state_interfaces(). See references/hardware-interface.md.
- get_value() deprecated → use get_optional<T>() on LoanedStateInterface / LoanedCommandInterface (controller side). Hardware interfaces use set_state() / get_state() / set_command() / get_command() helpers with fully qualified names.
- <ros2_control> tag must exist in the URDF.
- --param-file with spawner.
- storage_id='mcap'.
- nav2_params.yaml schema changes: recoveries_server renamed to behavior_server.
- ROS_AUTOMATIC_DISCOVERY_RANGE replaces ROS_LOCALHOST_ONLY (values: LOCALHOST, SUBNET, OFF, SYSTEM_DEFAULT).
- launch_ros actions have new parameter handling; test launch files explicitly.

Jazzy → Kilted (non-LTS):
- rmw_zenoh is production-ready. Install: sudo apt install ros-kilted-rmw-zenoh-cpp, set RMW_IMPLEMENTATION=rmw_zenoh_cpp. Supports router/peer/client modes.
- EventsExecutor: rclcpp::executors (no experimental namespace). Also ported to rclpy.
- ament_target_dependencies() deprecated: use target_link_libraries() with modern CMake targets (e.g. rclcpp::rclcpp, std_msgs::std_msgs__rosidl_typesupport_cpp).
- ros2 bag play.

ROS 1 → ROS 2:
- See references/migration-ros1.md for a step-by-step strategy.

See references/debugging.md §10 "Quick CLI reference" for the full command cheat sheet (workspace, introspection, ros2_control, debugging, lifecycle). Kept out of this always-loaded file to preserve context budget.