skills/docker-ros2-development/SKILL.md
Best practices for Docker-based ROS2 development including multi-stage Dockerfiles, docker-compose for multi-container robotic systems, DDS discovery across containers, GPU passthrough for perception, and dev-vs-deploy container patterns. Use this skill when containerizing ROS2 workspaces, setting up docker-compose for robot software stacks, debugging DDS communication between containers, configuring NVIDIA Container Toolkit for GPU workloads, forwarding X11/Wayland for rviz2 and GUI tools, or managing USB device passthrough for cameras and serial devices. Trigger whenever the user mentions Docker with ROS2, docker-compose for robots, Dockerfile for colcon workspaces, container networking for DDS, GPU containers for perception, devcontainer for ROS2, multi-stage builds for ROS2, or deploying ROS2 in containers. Also trigger for CI/CD with Docker-based ROS2 builds, CycloneDDS or FastDDS configuration in containers, shared memory in Docker, or X11 forwarding for rviz2. Covers Humble, Iron, Jazzy, and Rolling distributions across Ubuntu 22.04 and 24.04 base images.
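Official ROS2 images follow a layered hierarchy in which each variant adds packages on top of the one below it. The Docker Official ros repository publishes the ros-core, ros-base, and perception variants; the desktop and desktop-full variants are published by OSRF as osrf/ros:<distro>-desktop and osrf/ros:<distro>-desktop-full. Always choose the smallest base that satisfies your dependencies.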
┌──────────────────────────────────────────────────────────────────┐
│ ros:<distro>-desktop-full (~3.5 GB) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ ros:<distro>-desktop (~2.8 GB) │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ ros:<distro>-perception (~2.2 GB) │ │ │
│ │ │ ┌────────────────────────────────────────────────┐ │ │ │
│ │ │ │ ros:<distro>-ros-base (~1.1 GB) │ │ │ │
│ │ │ │ ┌──────────────────────────────────────────┐ │ │ │ │
│ │ │ │ │ ros:<distro>-ros-core (~700 MB) │ │ │ │ │
│ │ │ │ └──────────────────────────────────────────┘ │ │ │ │
│ │ │ └────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
| Image Tag | Base OS | Size | Contents | Use Case |
|--------------------------|----------------|---------|---------------------------------------------|-------------------------------------|
| ros:humble-ros-core | Ubuntu 22.04 | ~700 MB | rclcpp, rclpy, rosout, launch | Minimal runtime for single nodes |
| ros:humble-ros-base | Ubuntu 22.04 | ~1.1 GB | ros-core + common_interfaces, rosbag2 | Most production deployments |
| ros:humble-perception | Ubuntu 22.04 | ~2.2 GB | ros-base + image_transport, cv_bridge, PCL | Camera/lidar perception pipelines |
| osrf/ros:humble-desktop | Ubuntu 22.04 | ~2.8 GB | perception + rviz2, rqt, demos | Development with GUI tools |
| ros:jazzy-ros-core | Ubuntu 24.04 | ~750 MB | rclcpp, rclpy, rosout, launch | Minimal runtime (Jazzy/Noble) |
| ros:jazzy-ros-base | Ubuntu 24.04 | ~1.2 GB | ros-core + common_interfaces, rosbag2 | Production deployments (Jazzy) |
The development stage includes build tools, debuggers, and editor support for interactive use.
FROM osrf/ros:humble-desktop AS dev
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential cmake gdb python3-pip \
python3-colcon-common-extensions python3-rosdep \
ros-humble-ament-lint-auto ros-humble-ament-cmake-pytest \
ccache \
&& rm -rf /var/lib/apt/lists/*
ENV CCACHE_DIR=/ccache
ENV CC="ccache gcc"
ENV CXX="ccache g++"
The build stage copies the package.xml manifests first, so the rosdep layer is rebuilt only when dependencies change, and copies the full src/ tree afterwards for compilation.
FROM ros:humble-ros-base AS build
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-colcon-common-extensions python3-rosdep \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /ros2_ws
# Copy package manifests first for dependency caching
COPY src/my_pkg/package.xml src/my_pkg/package.xml
RUN . /opt/ros/humble/setup.sh && apt-get update && \
rosdep install --from-paths src --ignore-src -r -y && \
rm -rf /var/lib/apt/lists/*
# Source changes invalidate only this layer and below
COPY src/ src/
RUN . /opt/ros/humble/setup.sh && \
colcon build --cmake-args -DCMAKE_BUILD_TYPE=Release \
--event-handlers console_direct+
Contains only the built install space and runtime dependencies. No compilers, no source code.
FROM ros:humble-ros-core AS runtime
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-yaml ros-humble-rmw-cyclonedds-cpp \
&& rm -rf /var/lib/apt/lists/*
COPY --from=build /ros2_ws/install /ros2_ws/install
RUN groupadd -r rosuser && useradd -r -g rosuser -m rosuser
USER rosuser
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
# syntax=docker/dockerfile:1
# Usage:
# docker build --target dev -t my_robot:dev .
# docker build --target runtime -t my_robot:latest .
ARG ROS_DISTRO=humble
ARG BASE_IMAGE=ros:${ROS_DISTRO}-ros-base
# Stage 1: Dependency base — install apt and rosdep deps
FROM ${BASE_IMAGE} AS deps
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-colcon-common-extensions python3-rosdep \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /ros2_ws
# Copy only package.xml files for rosdep resolution (maximizes cache reuse)
COPY src/my_robot_bringup/package.xml src/my_robot_bringup/package.xml
COPY src/my_robot_perception/package.xml src/my_robot_perception/package.xml
COPY src/my_robot_msgs/package.xml src/my_robot_msgs/package.xml
COPY src/my_robot_navigation/package.xml src/my_robot_navigation/package.xml
RUN . /opt/ros/${ROS_DISTRO}/setup.sh && \
apt-get update && \
rosdep install --from-paths src --ignore-src -r -y && \
rm -rf /var/lib/apt/lists/*
# Stage 2: Development — full dev environment
FROM deps AS dev
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential gdb valgrind ccache python3-pip python3-pytest \
ros-${ROS_DISTRO}-ament-lint-auto \
ros-${ROS_DISTRO}-launch-testing-ament-cmake \
ros-${ROS_DISTRO}-rviz2 ros-${ROS_DISTRO}-rqt-graph \
&& rm -rf /var/lib/apt/lists/*
ENV CCACHE_DIR=/ccache CC="ccache gcc" CXX="ccache g++"
COPY src/ src/
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["bash"]
# Stage 3: Build — compile workspace
FROM deps AS build
COPY src/ src/
RUN . /opt/ros/${ROS_DISTRO}/setup.sh && \
colcon build \
--cmake-args -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF \
--event-handlers console_direct+ \
--parallel-workers $(nproc)
# Stage 4: Runtime — minimal production image
FROM ros:${ROS_DISTRO}-ros-core AS runtime
ARG ROS_DISTRO=humble
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-yaml ros-${ROS_DISTRO}-rmw-cyclonedds-cpp \
&& rm -rf /var/lib/apt/lists/*
COPY --from=build /ros2_ws/install /ros2_ws/install
RUN groupadd -r rosuser && useradd -r -g rosuser -m -s /bin/bash rosuser
USER rosuser
ENV RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
COPY ros_entrypoint.sh /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_robot_bringup", "robot.launch.py"]
Both the dev and runtime stages use this entrypoint script:
#!/bin/bash
set -e
source /opt/ros/${ROS_DISTRO}/setup.bash
if [ -f /ros2_ws/install/setup.bash ]; then
source /ros2_ws/install/setup.bash
fi
exec "$@"
Each ROS2 subsystem runs in its own container with process isolation, independent scaling, and per-service resource limits.
# docker-compose.yml
version: "3.8"

x-ros-common: &ros-common
  environment:
    - ROS_DOMAIN_ID=${ROS_DOMAIN_ID:-0}
    - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
    - CYCLONEDDS_URI=file:///cyclonedds.xml
  volumes:
    - ./config/cyclonedds.xml:/cyclonedds.xml:ro
    - /dev/shm:/dev/shm
  network_mode: host
  restart: unless-stopped

services:
  rosbridge:
    <<: *ros-common
    image: my_robot:latest
    command: ros2 launch rosbridge_server rosbridge_websocket_launch.xml port:=9090

  perception:
    <<: *ros-common
    image: my_robot_perception:latest
    command: ros2 launch my_robot_perception perception.launch.py
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    devices:
      - /dev/video0:/dev/video0  # USB camera passthrough

  navigation:
    <<: *ros-common
    image: my_robot_navigation:latest
    command: >
      ros2 launch my_robot_navigation navigation.launch.py
      use_sim_time:=false map:=/maps/warehouse.yaml
    # A service-level volumes key replaces the anchor's list (YAML merge is shallow),
    # so the shared mounts are repeated here.
    volumes:
      - ./config/cyclonedds.xml:/cyclonedds.xml:ro
      - /dev/shm:/dev/shm
      - ./maps:/maps:ro

  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0  # Serial motor controller
      - /dev/ttyACM0:/dev/ttyACM0  # IMU over USB-serial
    group_add:
      - dialout
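A service that declares its own environment or volumes key next to `<<: *ros-common` replaces the anchor's list rather than extending it, because YAML merge keys are shallow. The sketch below models that behaviour with plain dictionaries; it is an illustration of the merge semantics rather than a YAML parser, and the helper names `yaml_merge` and `merge_extending_lists` are hypothetical.

```python
def yaml_merge(anchor, overrides):
    """Model YAML '<<' merge: top-level keys in `overrides` replace the anchor's."""
    merged = dict(anchor)
    merged.update(overrides)
    return merged


def merge_extending_lists(anchor, overrides, list_keys=("environment", "volumes")):
    """Like yaml_merge, but append to the anchor's lists instead of replacing them."""
    merged = yaml_merge(anchor, overrides)
    for key in list_keys:
        if key in anchor and key in overrides:
            merged[key] = list(anchor[key]) + list(overrides[key])
    return merged


ros_common = {
    "volumes": ["./config/cyclonedds.xml:/cyclonedds.xml:ro", "/dev/shm:/dev/shm"],
    "network_mode": "host",
}
navigation = {"image": "my_robot_navigation:latest", "volumes": ["./maps:/maps:ro"]}

shallow = yaml_merge(ros_common, navigation)
extended = merge_extending_lists(ros_common, navigation)
print(shallow["volumes"])   # only the maps mount survives the shallow merge
print(extended["volumes"])  # shared mounts plus the maps mount
```

In a compose file the second behaviour has to be written out by hand: any service that adds a mount or variable must also list the shared entries.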
services:
  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    healthcheck:
      test: ["CMD", "bash", "-c",
             "source /opt/ros/humble/setup.bash && ros2 topic list | grep -q /joint_states"]
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 15s

  navigation:
    <<: *ros-common
    image: my_robot_navigation:latest
    depends_on:
      driver:
        condition: service_healthy  # Wait for driver topics

  perception:
    <<: *ros-common
    image: my_robot_perception:latest
    depends_on:
      driver:
        condition: service_healthy  # Camera driver must be ready
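The healthcheck above relies on grep -q /joint_states, which also succeeds for any topic whose name merely contains that string (for example a hypothetical /arm/joint_states_raw). A stricter check compares whole topic names. The following is a minimal sketch of that comparison, assuming the output of ros2 topic list is passed in as text; `missing_topics` is an illustrative helper, not part of ROS2.

```python
def missing_topics(topic_list_output, required):
    """Return the required topics that do not appear as exact lines in the output."""
    available = {
        line.strip() for line in topic_list_output.splitlines() if line.strip()
    }
    return sorted(set(required) - available)


sample_output = "/arm/joint_states_raw\n/parameter_events\n/rosout\n"
print(missing_topics(sample_output, ["/joint_states", "/rosout"]))
```

A healthcheck script built on this idea would exit non-zero whenever the returned list is non-empty.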
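services:
  driver:
    <<: *ros-common
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py

  rviz:
    <<: *ros-common
    profiles: ["dev"]
    image: my_robot:dev
    command: ros2 run rviz2 rviz2 -d /rviz/config.rviz
    # Service-level environment and volumes replace the anchor's lists (YAML merge
    # is shallow), so the shared entries are repeated here.
    environment:
      - ROS_DOMAIN_ID=${ROS_DOMAIN_ID:-0}
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
      - CYCLONEDDS_URI=file:///cyclonedds.xml
      - DISPLAY=${DISPLAY}
      - QT_X11_NO_MITSHM=1
    volumes:
      - ./config/cyclonedds.xml:/cyclonedds.xml:ro
      - /dev/shm:/dev/shm
      - /tmp/.X11-unix:/tmp/.X11-unix:rw

  rosbag_record:
    <<: *ros-common
    profiles: ["dev"]
    image: my_robot:dev
    command: ros2 bag record -a --storage sqlite3 --max-bag-duration 300 -o /bags/session
    volumes:
      - ./config/cyclonedds.xml:/cyclonedds.xml:ro
      - /dev/shm:/dev/shm
      - ./bags:/bags

  watchdog:
    <<: *ros-common
    profiles: ["deploy"]
    image: my_robot:latest
    command: ros2 launch my_robot_bringup watchdog.launch.py
    restart: always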
docker compose --profile dev up # Dev tools (rviz, rosbag)
docker compose --profile deploy up -d # Production (watchdog, no GUI)
When containers use bridge networking (no multicast), configure explicit unicast peer lists.
<?xml version="1.0" encoding="UTF-8"?>
<!-- cyclonedds.xml (the XML declaration must be the first thing in the file) -->
<CycloneDDS xmlns="https://cdds.io/config">
<Domain>
<General>
<Interfaces>
<NetworkInterface autodetermine="true" priority="default"/>
</Interfaces>
<AllowMulticast>false</AllowMulticast>
</General>
<Discovery>
<!-- Peer list uses docker-compose service names as hostnames -->
<Peers>
<Peer address="perception"/>
<Peer address="navigation"/>
<Peer address="driver"/>
<Peer address="rosbridge"/>
</Peers>
<ParticipantIndex>auto</ParticipantIndex>
<MaxAutoParticipantIndex>120</MaxAutoParticipantIndex>
</Discovery>
<Internal>
<SocketReceiveBufferSize min="10MB"/>
</Internal>
</Domain>
</CycloneDDS>
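When the set of services changes often, the peer list can be generated instead of edited by hand. The following is a minimal sketch using only the Python standard library; it emits the same General and Discovery elements as the file above from a list of compose service names. The function name `build_cyclonedds_config` is hypothetical, and the output should be checked against the CycloneDDS version in use.

```python
import xml.etree.ElementTree as ET


def build_cyclonedds_config(peers):
    """Return a CycloneDDS XML config with multicast disabled and one Peer per host."""
    root = ET.Element("CycloneDDS", {"xmlns": "https://cdds.io/config"})
    domain = ET.SubElement(root, "Domain")

    general = ET.SubElement(domain, "General")
    ET.SubElement(general, "AllowMulticast").text = "false"

    discovery = ET.SubElement(domain, "Discovery")
    peers_element = ET.SubElement(discovery, "Peers")
    for host in peers:
        ET.SubElement(peers_element, "Peer", {"address": host})
    ET.SubElement(discovery, "ParticipantIndex").text = "auto"
    ET.SubElement(discovery, "MaxAutoParticipantIndex").text = "120"

    ET.indent(root)  # pretty-print; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


config_text = build_cyclonedds_config(["perception", "navigation", "driver"])
print(config_text)
```

The resulting text can be written to ./config/cyclonedds.xml and mounted exactly like the hand-written file.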
<?xml version="1.0" encoding="UTF-8"?>
<!-- fastdds.xml (the XML declaration must be the first thing in the file) -->
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
<profiles>
<participant profile_name="docker_participant" is_default_profile="true">
<rtps>
<builtin>
<discovery_config>
<discoveryProtocol>SIMPLE</discoveryProtocol>
<leaseDuration><sec>10</sec></leaseDuration>
</discovery_config>
<initialPeersList>
<locator>
<udpv4><address>perception</address><port>7412</port></udpv4>
</locator>
<locator>
<udpv4><address>navigation</address><port>7412</port></udpv4>
</locator>
<locator>
<udpv4><address>driver</address><port>7412</port></udpv4>
</locator>
</initialPeersList>
</builtin>
</rtps>
</participant>
</profiles>
</dds>
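The port in each initial-peer locator is not arbitrary: RTPS derives discovery ports from the domain ID and the participant index. The sketch below assumes the default port parameters from the RTPS specification (PB=7400, DG=250, PG=2, d0=0, d1=10, d2=1, d3=11); a profile that overrides them needs different numbers, and the function names are illustrative.

```python
# Default RTPS port-mapping parameters (assumed, not read from any profile).
PORT_BASE = 7400          # PB
DOMAIN_GAIN = 250         # DG
PARTICIPANT_GAIN = 2      # PG
OFFSET_D0, OFFSET_D1, OFFSET_D2, OFFSET_D3 = 0, 10, 1, 11


def discovery_multicast_port(domain_id):
    """Port on which participants in a domain listen for multicast discovery."""
    return PORT_BASE + DOMAIN_GAIN * domain_id + OFFSET_D0


def discovery_unicast_port(domain_id, participant_id):
    """Port on which one participant listens for unicast discovery traffic."""
    return PORT_BASE + DOMAIN_GAIN * domain_id + OFFSET_D1 + PARTICIPANT_GAIN * participant_id


def user_unicast_port(domain_id, participant_id):
    """Port on which one participant receives unicast user data."""
    return PORT_BASE + DOMAIN_GAIN * domain_id + OFFSET_D3 + PARTICIPANT_GAIN * participant_id


for participant_id in range(3):
    print(participant_id, discovery_unicast_port(0, participant_id))
```

Under these defaults, 7412 is the unicast discovery port of the participant with index 1 in domain 0, while the first participant in a container listens on 7410. If ROS_DOMAIN_ID changes or a container starts several participants, the locator ports need to be recomputed accordingly.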
Mount and activate in compose:
# CycloneDDS
environment:
  - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
  - CYCLONEDDS_URI=file:///cyclonedds.xml
volumes:
  - ./config/cyclonedds.xml:/cyclonedds.xml:ro

# FastDDS
environment:
  - RMW_IMPLEMENTATION=rmw_fastrtps_cpp
  - FASTRTPS_DEFAULT_PROFILES_FILE=/fastdds.xml
volumes:
  - ./config/fastdds.xml:/fastdds.xml:ro
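DDS shared-memory (zero-copy) transport requires the communicating containers to share /dev/shm. It gives the highest throughput for large messages such as images and point clouds. In CycloneDDS this transport is provided by iceoryx, so an iceoryx RouDi daemon must also be running and reachable by every participant.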
services:
  perception:
    shm_size: "512m"  # Default 64 MB is too small for image topics
    volumes:
      - /dev/shm:/dev/shm  # Share host shm for inter-container zero-copy
<!-- Enable shared memory in CycloneDDS -->
<CycloneDDS xmlns="https://cdds.io/config">
<Domain>
<SharedMemory>
<Enable>true</Enable>
</SharedMemory>
</Domain>
</CycloneDDS>
Constraints: all communicating containers must share /dev/shm or use ipc: host. Use --ipc=shareable on one container and --ipc=container:<name> on others for scoped sharing.
# Host networking
services:
  my_node:
    network_mode: host  # Shares host network namespace; DDS multicast works natively

# Bridge networking
services:
  my_node:
    networks: [ros_net]
networks:
  ros_net:
    driver: bridge  # DDS multicast blocked; requires unicast peer config

# Macvlan networking
networks:
  ros_macvlan:
    driver: macvlan
    driver_opts:
      parent: eth0
    ipam:
      config:
        - subnet: 192.168.1.0/24
          gateway: 192.168.1.1
services:
  my_node:
    networks:
      ros_macvlan:
        ipv4_address: 192.168.1.50  # Real LAN IP; DDS multicast works natively
| Factor             | Host             | Bridge                | Macvlan             |
|--------------------|------------------|-----------------------|---------------------|
| DDS discovery      | Works natively   | Needs unicast peers   | Works natively      |
| Network isolation  | None             | Full isolation        | LAN-level isolation |
| Port conflicts     | Yes (host ports) | No (mapped ports)     | No (unique IPs)     |
| Performance        | Native           | Slight overhead       | Near-native         |
| Multi-host support | No               | With overlay networks | Yes (same LAN)      |
| When to use        | Dev, single host | CI/CD, multi-tenant   | Multi-robot on LAN  |
# Install NVIDIA Container Toolkit on the host
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
services:
  perception:
    image: my_robot_perception:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1  # Number of GPUs (or "all")
              capabilities: [gpu]
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility,video
    shm_size: "1g"  # Large shm for GPU<->CPU transfers
For Dockerfiles that need CUDA, start from NVIDIA base and install ROS2 on top:
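FROM nvidia/cuda:12.2.0-cudnn8-runtime-ubuntu22.04 AS perception-base
# Build-time only: keeps apt (tzdata and similar) from prompting during the ROS2 install
ARG DEBIAN_FRONTEND=noninteractive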
RUN apt-get update && apt-get install -y --no-install-recommends \
curl gnupg2 lsb-release \
&& curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key \
-o /usr/share/keyrings/ros-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) \
signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] \
http://packages.ros.org/ros2/ubuntu $(lsb_release -cs) main" \
> /etc/apt/sources.list.d/ros2.list \
&& apt-get update && apt-get install -y --no-install-recommends \
ros-humble-ros-base ros-humble-cv-bridge ros-humble-image-transport \
&& rm -rf /var/lib/apt/lists/*
docker compose exec perception bash -c '
nvidia-smi
python3 -c "import torch; print(f\"CUDA available: {torch.cuda.is_available()}\")"
'
services:
  rviz:
    image: my_robot:dev
    command: ros2 run rviz2 rviz2
    environment:
      - DISPLAY=${DISPLAY:-:0}   # Forward host display
      - QT_X11_NO_MITSHM=1       # Disable MIT-SHM (crashes in Docker)
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:rw          # X11 socket
      - ${HOME}/.Xauthority:/root/.Xauthority:ro  # Auth cookie
    network_mode: host
# Allow local Docker containers to access the X server
xhost +local:docker
# More secure variant:
xhost +SI:localuser:$(whoami)
services:
  rviz:
    image: my_robot:dev
    command: ros2 run rviz2 rviz2
    environment:
      - WAYLAND_DISPLAY=${WAYLAND_DISPLAY:-wayland-0}
      - XDG_RUNTIME_DIR=/run/user/1000
      - QT_QPA_PLATFORM=wayland
    volumes:
      - ${XDG_RUNTIME_DIR}/${WAYLAND_DISPLAY}:/run/user/1000/${WAYLAND_DISPLAY}:rw
For CI/CD or remote machines without a physical display:
# Run rviz2 headless with Xvfb for screenshot capture or testing
docker run --rm my_robot:dev bash -c '
apt-get update && apt-get install -y xvfb mesa-utils
# Start Xvfb on its own line so that only the X server is backgrounded, not the install
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99
source /opt/ros/humble/setup.bash
ros2 run rviz2 rviz2 -d /config/test.rviz --screenshot /output/frame.png
'
Mount only src/ during development. Let colcon write build/, install/, and log/ inside named volumes to avoid bind mount performance issues.
# BAD: mounting the entire workspace; build artifacts on a bind mount are slow
# volumes:
#   - ./my_ros2_ws:/ros2_ws

# GOOD: mount only source, use named volumes for build artifacts
services:
  dev:
    image: my_robot:dev
    volumes:
      - ./src:/ros2_ws/src:rw         # Source code (bind mount)
      - build_vol:/ros2_ws/build      # Build artifacts (named volume)
      - install_vol:/ros2_ws/install  # Install space (named volume)
      - log_vol:/ros2_ws/log          # Log output (named volume)
    working_dir: /ros2_ws
volumes:
  build_vol:
  install_vol:
  log_vol:
Persist ccache across container rebuilds for faster C++ compilation:
services:
  dev:
    volumes:
      - ccache_vol:/ccache
    environment:
      - CCACHE_DIR=/ccache
      - CCACHE_MAXSIZE=2G
      - CC=ccache gcc
      - CXX=ccache g++
volumes:
  ccache_vol:
Keep upstream packages cached and only rebuild custom packages:
# Stage 1: upstream dependencies (rarely changes)
FROM ros:humble-ros-base AS upstream
RUN apt-get update && apt-get install -y --no-install-recommends \
ros-humble-nav2-bringup ros-humble-slam-toolbox \
ros-humble-robot-localization \
&& rm -rf /var/lib/apt/lists/*
# Stage 2: custom packages overlay on top
FROM upstream AS workspace
WORKDIR /ros2_ws
COPY src/ src/
RUN . /opt/ros/humble/setup.sh && colcon build --symlink-install
# install/setup.bash automatically sources /opt/ros/humble as underlay
services:
  camera_driver:
    image: my_robot_driver:latest
    devices:
      - /dev/video0:/dev/video0  # USB camera (V4L2)
      - /dev/video1:/dev/video1
    group_add:
      - video  # Access /dev/videoN without root

  motor_driver:
    image: my_robot_driver:latest
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0  # USB-serial motor controller
      - /dev/ttyACM0:/dev/ttyACM0  # Arduino/Teensy
    group_add:
      - dialout  # Access serial ports without root
Create stable device symlinks on the host so container paths remain consistent regardless of USB enumeration order.
# /etc/udev/rules.d/99-robot-devices.rules (host-side)
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", SYMLINK+="robot/motor_controller"
SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", SYMLINK+="robot/lidar"
SUBSYSTEM=="video4linux", ATTRS{idVendor}=="046d", ATTRS{idProduct}=="0825", SYMLINK+="robot/camera"
sudo udevadm control --reload-rules && sudo udevadm trigger
services:
  driver:
    devices:
      - /dev/robot/motor_controller:/dev/ttyMOTOR  # Stable symlink
      - /dev/robot/lidar:/dev/ttyLIDAR
      - /dev/robot/camera:/dev/video0
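Because the udev rules and the compose device mappings must agree on the symlink names, it can help to derive both from one table. The following is a small sketch under that assumption; the table layout and the helper name `udev_symlink_rule` are hypothetical, and the vendor and product IDs are the ones used in the rules above.

```python
def udev_symlink_rule(subsystem, vendor_id, product_id, symlink):
    """Format one udev rule that adds a stable /dev symlink for a USB device."""
    return (
        f'SUBSYSTEM=="{subsystem}", '
        f'ATTRS{{idVendor}}=="{vendor_id}", '
        f'ATTRS{{idProduct}}=="{product_id}", '
        f'SYMLINK+="{symlink}"'
    )


# (subsystem, vendor ID, product ID, host symlink under /dev, path inside the container)
DEVICES = [
    ("tty", "0403", "6001", "robot/motor_controller", "/dev/ttyMOTOR"),
    ("tty", "10c4", "ea60", "robot/lidar", "/dev/ttyLIDAR"),
    ("video4linux", "046d", "0825", "robot/camera", "/dev/video0"),
]

rules = [udev_symlink_rule(s, v, p, link) for s, v, p, link, _ in DEVICES]
compose_devices = [f"/dev/{link}:{target}" for _, _, _, link, target in DEVICES]

print("\n".join(rules))
print("\n".join(compose_devices))
```

The first list reproduces the rules file and the second reproduces the devices entries, so a renamed symlink only has to be changed in one place.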
For devices plugged in after the container starts:
services:
  driver:
    # Option 1: privileged (use only when necessary)
    privileged: true
    volumes:
      - /dev:/dev
    # Option 2: cgroup device rules (more secure)
    # device_cgroup_rules:
    #   - 'c 188:* rmw'  # USB-serial (major 188)
    #   - 'c 81:* rmw'   # Video devices (major 81)
// .devcontainer/devcontainer.json
{
"name": "ROS2 Humble Dev",
"build": {
"dockerfile": "../Dockerfile",
"target": "dev",
"args": { "ROS_DISTRO": "humble" }
},
"runArgs": [
"--network=host", "--ipc=host", "--pid=host",
"--privileged", "--gpus", "all",
"-e", "DISPLAY=${localEnv:DISPLAY}",
"-e", "QT_X11_NO_MITSHM=1",
"-v", "/tmp/.X11-unix:/tmp/.X11-unix:rw",
"-v", "/dev:/dev"
],
"workspaceMount": "source=${localWorkspaceFolder},target=/ros2_ws/src,type=bind",
"workspaceFolder": "/ros2_ws",
"mounts": [
"source=ros2-build-vol,target=/ros2_ws/build,type=volume",
"source=ros2-install-vol,target=/ros2_ws/install,type=volume",
"source=ros2-log-vol,target=/ros2_ws/log,type=volume",
"source=ros2-ccache-vol,target=/ccache,type=volume"
],
"containerEnv": {
"ROS_DISTRO": "humble",
"RMW_IMPLEMENTATION": "rmw_cyclonedds_cpp",
"CCACHE_DIR": "/ccache",
"RCUTILS_COLORIZED_OUTPUT": "1"
},
"customizations": {
"vscode": {
"extensions": [
"ms-iot.vscode-ros",
"ms-vscode.cpptools",
"ms-python.python",
"ms-vscode.cmake-tools",
"smilerobotics.urdf",
"redhat.vscode-xml",
"redhat.vscode-yaml"
],
"settings": {
"ros.distro": "humble",
"python.defaultInterpreterPath": "/usr/bin/python3",
"C_Cpp.default.compileCommands": "/ros2_ws/build/compile_commands.json",
"cmake.configureOnOpen": false
}
}
},
// The dev stage of the Dockerfile defines no rosuser and installs no sudo, so the container runs as root
"postCreateCommand": "apt-get update && rosdep update && rosdep install --from-paths src --ignore-src -r -y",
"remoteUser": "root"
}
# .github/workflows/ros2-docker-ci.yml
name: ROS2 Docker CI

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-test:
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      packages: write  # Needed to push to ghcr.io with GITHUB_TOKEN
    strategy:
      fail-fast: false
      matrix:
        ros_distro: [humble, jazzy]
        include:
          - ros_distro: humble
            ubuntu: "22.04"
          - ros_distro: jazzy
            ubuntu: "24.04"
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ matrix.ros_distro }}-${{ hashFiles('src/**/package.xml') }}
          restore-keys: ${{ runner.os }}-buildx-${{ matrix.ros_distro }}-
      - name: Build and test
        run: |
          docker build --target dev \
            --build-arg ROS_DISTRO=${{ matrix.ros_distro }} \
            -t test-image:${{ matrix.ros_distro }} .
          docker run --rm test-image:${{ matrix.ros_distro }} bash -c '
            source /opt/ros/${{ matrix.ros_distro }}/setup.bash &&
            cd /ros2_ws &&
            colcon build --cmake-args -DBUILD_TESTING=ON &&
            colcon test --event-handlers console_direct+ &&
            colcon test-result --verbose'
      - name: Push runtime image
        if: github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v5
        with:
          context: .
          target: runtime
          build-args: ROS_DISTRO=${{ matrix.ros_distro }}
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ matrix.ros_distro }}-latest
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ matrix.ros_distro }}-${{ github.sha }}
          push: true
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max
      - name: Rotate cache
        # Only the push step writes /tmp/.buildx-cache-new, so rotate under the same condition
        if: github.ref == 'refs/heads/main'
        run: rm -rf /tmp/.buildx-cache && mv /tmp/.buildx-cache-new /tmp/.buildx-cache
Order Dockerfile instructions from least-frequently-changed to most-frequently-changed:
1. Base image (ros:humble-ros-base) — changes on distro upgrade
2. System apt packages — changes on new dependency
3. rosdep install (from package.xml) — changes on new ROS dep
4. COPY src/ src/ — changes on every code edit
5. colcon build — rebuilds on source change
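The reason step 3 sits above step 4 is that the dependency layer should depend only on the package manifests. The sketch below illustrates that idea outside Docker: it hashes only package.xml files, so editing source code leaves the key unchanged while editing a manifest changes it. It is an analogy for how the layer cache behaves rather than Docker's own algorithm, and `manifest_digest` is a hypothetical helper.

```python
import hashlib
import tempfile
from pathlib import Path


def manifest_digest(src_root):
    """Hash every package.xml under src_root (path and contents), ignoring other files."""
    root = Path(src_root)
    digest = hashlib.sha256()
    for manifest in sorted(root.rglob("package.xml")):
        digest.update(manifest.relative_to(root).as_posix().encode())
        digest.update(manifest.read_bytes())
    return digest.hexdigest()


def demo():
    """Return (source_edit_changes_key, manifest_edit_changes_key) for a throwaway workspace."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "my_pkg"
        pkg.mkdir()
        (pkg / "package.xml").write_text("<package><depend>rclcpp</depend></package>")
        (pkg / "node.cpp").write_text("int main() { return 0; }")
        before = manifest_digest(tmp)

        (pkg / "node.cpp").write_text("int main() { return 1; }")
        after_source_edit = manifest_digest(tmp)

        (pkg / "package.xml").write_text("<package><depend>rclpy</depend></package>")
        after_manifest_edit = manifest_digest(tmp)
    return before != after_source_edit, before != after_manifest_edit


print(demo())  # (False, True)
```

The same principle is behind keying the CI cache on hashFiles('src/**/package.xml').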
strategy:
  matrix:
    ros_distro: [humble, iron, jazzy, rolling]
    rmw: [rmw_cyclonedds_cpp, rmw_fastrtps_cpp]
    exclude:
      - ros_distro: iron  # Iron is EOL: skip its Fast DDS job (omit the rmw key to skip every Iron job)
        rmw: rmw_fastrtps_cpp
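To see which jobs this matrix actually produces, the expansion can be reproduced in a few lines. This is a simplified model of matrix expansion (cross product, then removal of combinations that match every key of an exclude rule); `expand_matrix` is an illustrative name and the model ignores include entries.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Return the cross product of all axes, minus combinations matching an exclude rule."""
    keys = list(axes)
    combos = [dict(zip(keys, values)) for values in product(*(axes[k] for k in keys))]

    def is_excluded(combo):
        # A rule matches when every key it names has the same value in the combination.
        return any(all(combo.get(k) == v for k, v in rule.items()) for rule in exclude)

    return [combo for combo in combos if not is_excluded(combo)]


axes = {
    "ros_distro": ["humble", "iron", "jazzy", "rolling"],
    "rmw": ["rmw_cyclonedds_cpp", "rmw_fastrtps_cpp"],
}
jobs = expand_matrix(axes, exclude=[{"ros_distro": "iron", "rmw": "rmw_fastrtps_cpp"}])
print(len(jobs))  # 7
```

Only the Iron job with Fast DDS is removed; the Iron job with CycloneDDS still runs. An exclude rule that names only ros_distro: iron removes both.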
Problem: Putting perception, navigation, planning, and drivers in a single container defeats the purpose of containerization. A crash in one subsystem takes down everything.
Fix: Split into one service per subsystem. Use docker-compose to orchestrate.
# BAD: monolithic container
services:
  robot:
    image: my_robot:latest
    command: ros2 launch my_robot everything.launch.py

# GOOD: one container per subsystem
services:
  perception:
    image: my_robot_perception:latest
    command: ros2 launch my_robot_perception perception.launch.py
  navigation:
    image: my_robot_navigation:latest
    command: ros2 launch my_robot_navigation navigation.launch.py
  driver:
    image: my_robot_driver:latest
    command: ros2 launch my_robot_driver driver.launch.py
Problem: DDS uses multicast for discovery by default. Docker bridge networks do not forward multicast. Nodes in different containers will not discover each other.
Fix: Use network_mode: host or configure DDS unicast peers explicitly.
# BAD: bridge network with no DDS config
services:
  node_a:
    networks: [ros_net]
  node_b:
    networks: [ros_net]

# GOOD: host networking (simplest)
services:
  node_a:
    network_mode: host
  node_b:
    network_mode: host

# GOOD: bridge with CycloneDDS unicast peers
services:
  node_a:
    networks: [ros_net]
    environment:
      - RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
      - CYCLONEDDS_URI=file:///cyclonedds.xml
    volumes:
      - ./cyclonedds.xml:/cyclonedds.xml:ro
Problem: Installing compilers and build tools in the runtime image bloats it by 1-2 GB and increases attack surface.
Fix: Use multi-stage builds. Compile in a build stage, copy only the install space to runtime.
# BAD: build tools in runtime image (2.5 GB)
FROM ros:humble-ros-base
RUN apt-get update && apt-get install -y build-essential python3-colcon-common-extensions
COPY src/ /ros2_ws/src/
RUN cd /ros2_ws && colcon build
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
# GOOD: multi-stage build (800 MB)
FROM ros:humble-ros-base AS build
RUN apt-get update && apt-get install -y python3-colcon-common-extensions
COPY src/ /ros2_ws/src/
RUN cd /ros2_ws && . /opt/ros/humble/setup.sh && colcon build
FROM ros:humble-ros-core AS runtime
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
Problem: Mounting the full workspace means colcon writes build/, install/, and log/ to a bind mount. On macOS/Windows Docker Desktop, bind mount I/O is 10-50x slower. Builds that take 2 minutes take 20+ minutes.
Fix: Mount only src/ as a bind mount. Use named volumes for build artifacts.
# BAD:
volumes:
  - ./my_ros2_ws:/ros2_ws

# GOOD:
volumes:
  - ./my_ros2_ws/src:/ros2_ws/src
  - build_vol:/ros2_ws/build
  - install_vol:/ros2_ws/install
  - log_vol:/ros2_ws/log
Problem: Running as root inside containers is a security risk. If compromised, the attacker has root access to mounted volumes and devices.
Fix: Create a non-root user with appropriate group membership.
# BAD:
FROM ros:humble-ros-base
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
# GOOD:
FROM ros:humble-ros-base
RUN groupadd -r rosuser && \
useradd -r -g rosuser -G video,dialout -m -s /bin/bash rosuser
COPY --from=build --chown=rosuser:rosuser /ros2_ws/install /ros2_ws/install
USER rosuser
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
Problem: Placing COPY src/ . before rosdep install means every source change invalidates the dependency cache. All apt packages are re-downloaded on every build.
Fix: Copy only package.xml files first, install dependencies, then copy source.
# BAD: source copy before rosdep
COPY src/ /ros2_ws/src/
RUN rosdep install --from-paths src --ignore-src -r -y
RUN colcon build
# GOOD: package.xml first, then rosdep, then source
COPY src/my_pkg/package.xml /ros2_ws/src/my_pkg/package.xml
RUN . /opt/ros/humble/setup.sh && rosdep install --from-paths src --ignore-src -r -y
COPY src/ /ros2_ws/src/
RUN . /opt/ros/humble/setup.sh && colcon build
Problem: Hardcoding ROS_DOMAIN_ID=42 causes conflicts when multiple robots or developers share a network. Two robots with the same domain ID will cross-talk.
Fix: Use environment variables with defaults. Set domain ID at deploy time.
# BAD:
environment:
  - ROS_DOMAIN_ID=42

# GOOD:
environment:
  - ROS_DOMAIN_ID=${ROS_DOMAIN_ID:-0}
ROS_DOMAIN_ID=1 docker compose up -d # Robot 1
ROS_DOMAIN_ID=2 docker compose up -d # Robot 2
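A deploy script can validate the domain ID before starting the stack. The sketch below mirrors the ${ROS_DOMAIN_ID:-0} default (unset or empty falls back to 0) and rejects values outside 0 to 101, the range the ROS2 documentation describes as safe on all platforms. The function name `resolve_domain_id` is hypothetical.

```python
import os

SAFE_DOMAIN_IDS = range(0, 102)  # 0..101 inclusive


def resolve_domain_id(env=None):
    """Return ROS_DOMAIN_ID from the environment, defaulting to 0 like ${ROS_DOMAIN_ID:-0}."""
    env = os.environ if env is None else env
    raw = env.get("ROS_DOMAIN_ID") or "0"  # unset or empty falls back to the default
    try:
        domain_id = int(raw)
    except ValueError:
        raise ValueError(f"ROS_DOMAIN_ID must be an integer, got {raw!r}") from None
    if domain_id not in SAFE_DOMAIN_IDS:
        raise ValueError(f"ROS_DOMAIN_ID {domain_id} is outside the safe range 0-101")
    return domain_id


print(resolve_domain_id({"ROS_DOMAIN_ID": "2"}))  # 2
```

Calling it without arguments reads the real process environment, so it can run as a pre-flight check ahead of docker compose up.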
Problem: The CMD runs ros2 launch ... but the shell has not sourced the ROS2 setup files. Fails with ros2: command not found.
Fix: Use an entrypoint script that sources the underlay and overlay before executing the command.
# BAD: no sourcing
FROM ros:humble-ros-core
COPY --from=build /ros2_ws/install /ros2_ws/install
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
# GOOD: entrypoint script handles sourcing
FROM ros:humble-ros-core
COPY --from=build /ros2_ws/install /ros2_ws/install
COPY ros_entrypoint.sh /ros_entrypoint.sh
RUN chmod +x /ros_entrypoint.sh
ENTRYPOINT ["/ros_entrypoint.sh"]
CMD ["ros2", "launch", "my_pkg", "bringup.launch.py"]
#!/bin/bash
# ros_entrypoint.sh
set -e
source /opt/ros/${ROS_DISTRO}/setup.bash
if [ -f /ros2_ws/install/setup.bash ]; then
source /ros2_ws/install/setup.bash
fi
exec "$@"
- Run as non-root rosuser with only the group memberships needed (video, dialout) instead of privileged mode
- Ship cyclonedds.xml or fastdds.xml with explicit peer lists if not using host networking
- Set mem_limit, cpus, and GPU reservations for each service to prevent resource starvation
- Use restart: unless-stopped for all services and restart: always for critical drivers and watchdogs
- Use .env files or orchestrator secrets for ROS_DOMAIN_ID, DDS config paths, and device mappings rather than hardcoding
- Set shm_size to at least 256 MB (512 MB for image topics) and configure DDS shared memory transport for high-bandwidth topics
- Create /dev/robot/* symlinks and reference those in compose device mappings
- Tag images with both latest and a commit SHA or semantic version; never deploy unversioned latest to production