skills/testing-dags/SKILL.md
Complex DAG testing workflows with debugging and fixing cycles. Use for multi-step testing requests like "test this dag and fix it if it fails", "test and debug", "run the pipeline and troubleshoot issues". For simple test requests ("test dag", "run dag"), the airflow entrypoint skill handles it directly. This skill is for iterative test-debug-fix cycles.
Use af commands to test, debug, and fix DAGs in iterative cycles.
These commands assume af is on PATH. Run via astro otto to get it automatically, or install standalone with uv tool install astro-airflow-mcp.
If the user has the Astro CLI available, these commands provide fast feedback without needing a running Airflow instance:
# Parse DAGs to catch import errors, syntax issues, and DAG-level problems
astro dev parse
# Run pytest against DAGs (runs tests in tests/ directory)
astro dev pytest
Use these for quick validation during development. For full end-to-end testing against a live Airflow instance, continue to the trigger-and-wait workflow below.
When the user asks to test a DAG, your FIRST AND ONLY action should be:
af runs trigger-wait <dag_id>
DO NOT:
- Run af dags list first
- Run af dags get first
- Run af dags errors first
- Use grep or ls or any other bash command

Just trigger the DAG. If it fails, THEN debug.
┌─────────────────────────────────────┐
│ 1. TRIGGER AND WAIT                 │
│    Run DAG, wait for completion     │
└─────────────────────────────────────┘
                   ↓
           ┌───────┴───────┐
           ↓               ↓
      ┌─────────┐    ┌──────────┐
      │ SUCCESS │    │ FAILED   │
      │ Done!   │    │ Debug... │
      └─────────┘    └──────────┘
                           ↓
┌─────────────────────────────────────┐
│ 2. DEBUG (only if failed)           │
│    Get logs, identify root cause    │
└─────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────┐
│ 3. FIX AND RETEST                   │
│    Apply fix, restart from step 1   │
└─────────────────────────────────────┘
Philosophy: Try first, debug on failure. Don't waste time on pre-flight checks — just run the DAG and diagnose if something goes wrong.
Use af runs trigger-wait to test the DAG:
af runs trigger-wait <dag_id> --timeout 300
Example:
af runs trigger-wait my_dag --timeout 300
Why this is the preferred method:
Success:
{
"dag_run": {
"dag_id": "my_dag",
"dag_run_id": "manual__2025-01-14T...",
"state": "success",
"start_date": "...",
"end_date": "..."
},
"timed_out": false,
"elapsed_seconds": 45.2
}
Failure:
{
"dag_run": {
"state": "failed"
},
"timed_out": false,
"elapsed_seconds": 30.1,
"failed_tasks": [
{
"task_id": "extract_data",
"state": "failed",
"try_number": 2
}
]
}
Timeout:
{
"dag_id": "my_dag",
"dag_run_id": "manual__...",
"state": "running",
"timed_out": true,
"elapsed_seconds": 300.0,
"message": "Timed out after 300 seconds. DAG run is still running."
}
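The three result shapes above can be told apart mechanically. The following is a minimal sketch, assuming the command's output has been captured as a JSON string with exactly the fields shown in the examples. The function names `classify_trigger_wait` and `failed_task_ids` are hypothetical helpers, not part of af.

```python
import json


def classify_trigger_wait(raw: str) -> str:
    """Return 'timeout', 'success', 'failed', or 'unknown' for one JSON result."""
    result = json.loads(raw)
    # The timeout example is the only one with timed_out set to true.
    if result.get("timed_out"):
        return "timeout"
    # Success and failure examples nest the state under "dag_run";
    # fall back to a top-level "state" in case the payload is flat.
    state = result.get("dag_run", {}).get("state") or result.get("state")
    if state in ("success", "failed"):
        return state
    return "unknown"


def failed_task_ids(raw: str) -> list:
    """List task_id values from "failed_tasks" (empty if the key is absent)."""
    return [task["task_id"] for task in json.loads(raw).get("failed_tasks", [])]
```

A "timeout" result means the run may still be in progress, so it should not be treated as a failure. Only a "failed" result calls for the debug phase.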
Alternatively, trigger the run and check its status in separate steps. Use this only when you need more control:
# Step 1: Trigger
af runs trigger my_dag
# Returns: {"dag_run_id": "manual__...", "state": "queued"}
# Step 2: Check status
af runs get my_dag manual__2025-01-14T...
# Returns current state
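When triggering and checking separately, something has to poll until the run finishes. Here is a rough sketch of such a loop. `get_state` is a hypothetical callable that would wrap af runs get and return the run's state string; the exact output shape of that command is not shown here, so the wrapper is left to the caller. The 5-second interval is an arbitrary choice.

```python
import time
from typing import Callable

# DAG run states after which polling can stop.
TERMINAL_STATES = {"success", "failed"}


def wait_for_run(
    get_state: Callable[[], str],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state until a terminal state or the timeout; return the last state seen."""
    deadline = time.monotonic() + timeout
    state = get_state()
    while state not in TERMINAL_STATES and time.monotonic() < deadline:
        sleep(interval)
        state = get_state()
    return state
```

If the returned state is neither "success" nor "failed", the loop hit the timeout and the run is still in progress.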
The DAG ran successfully. Summarize for the user:
You're done!
The DAG is still running. Options:
- Check the current state: af runs get <dag_id> <dag_run_id>

If the run failed instead, move to Phase 2 (Debug) to identify the root cause.
When a DAG run fails, use these commands to diagnose:
af runs diagnose <dag_id> <dag_run_id>
Returns in one call:
af tasks logs <dag_id> <dag_run_id> <task_id>
Example:
af tasks logs my_dag manual__2025-01-14T... extract_data
For specific retry attempt:
af tasks logs my_dag manual__2025-01-14T... extract_data --try 2
Look for:
If a task shows upstream_failed, the root cause is in an upstream task. Use af runs diagnose to find which task actually failed.
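The upstream_failed rule can be sketched as a small filter. This assumes each task is available as a dict with "task_id" and "state" keys, like the failed_tasks entries in the failure output shown earlier. Whether af runs diagnose returns that shape is an assumption, and `split_failures` is a hypothetical helper name.

```python
def split_failures(tasks: list) -> tuple:
    """Separate tasks that failed themselves from tasks skipped due to an upstream failure."""
    # "failed" tasks raised an error of their own: these are the root causes.
    root_causes = [t["task_id"] for t in tasks if t.get("state") == "failed"]
    # "upstream_failed" tasks never ran; their logs will not explain the failure.
    downstream = [t["task_id"] for t in tasks if t.get("state") == "upstream_failed"]
    return root_causes, downstream
```

Fetch logs only for the task IDs in the first list. The second list is a consequence of the first and needs no separate fix.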
If the trigger failed because the DAG doesn't exist:
af dags errors
This reveals syntax errors or missing dependencies that prevented the DAG from loading.
Once you identify the issue:
| Issue | Fix |
|-------|-----|
| Missing import | Add to DAG file |
| Missing package | Add to requirements.txt |
| Connection error | Check af config connections, verify credentials |
| Variable missing | Check af config variables, create if needed |
| Timeout | Increase task timeout or optimize query |
| Permission error | Check credentials in connection |
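The table above can be approximated as a lookup from an error message to a suggested fix. This is only a sketch: the regular expressions are illustrative guesses at how each issue might appear in a log, not patterns documented by Airflow or af, and `suggest_fix` is a hypothetical helper. The fix texts are taken from the table.

```python
import re
from typing import Optional

# (pattern, suggested fix) pairs; first match wins. Patterns are assumptions.
FIX_HINTS = [
    (r"ModuleNotFoundError", "Missing package: add to requirements.txt"),
    (r"NameError", "Missing import: add to DAG file"),
    (r"connection refused", "Connection error: check af config connections, verify credentials"),
    (r"PermissionError|permission denied", "Permission error: check credentials in connection"),
    (r"variable .* does not exist", "Variable missing: check af config variables, create if needed"),
    (r"timeout|timed out", "Timeout: increase task timeout or optimize query"),
]


def suggest_fix(error_line: str) -> Optional[str]:
    """Return the first matching fix hint, or None when nothing matches."""
    for pattern, hint in FIX_HINTS:
        if re.search(pattern, error_line, flags=re.IGNORECASE):
            return hint
    return None
```

A None result means the error falls outside the table and needs to be read in full rather than pattern-matched.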
After applying a fix, retest:
af runs trigger-wait <dag_id>
Repeat the test → debug → fix loop until the DAG succeeds.
| Phase | Command | Purpose |
|-------|---------|---------|
| Test | af runs trigger-wait <dag_id> | Primary test method — start here |
| Test | af runs trigger <dag_id> | Start run (alternative) |
| Test | af runs get <dag_id> <run_id> | Check run status |
| Debug | af runs diagnose <dag_id> <run_id> | Comprehensive failure diagnosis |
| Debug | af tasks logs <dag_id> <run_id> <task_id> | Get task output/errors |
| Debug | af dags errors | Check for parse errors (if DAG won't load) |
| Debug | af dags get <dag_id> | Verify DAG config |
| Debug | af dags explore <dag_id> | Full DAG inspection |
| Config | af config connections | List connections |
| Config | af config variables | List variables |
af runs trigger-wait my_dag
# Success! Done.
# 1. Run and wait
af runs trigger-wait my_dag
# Failed...
# 2. Find failed tasks
af runs diagnose my_dag manual__2025-01-14T...
# 3. Get error details
af tasks logs my_dag manual__2025-01-14T... extract_data
# 4. [Fix the issue in DAG code]
# 5. Retest
af runs trigger-wait my_dag
# 1. Trigger fails - DAG not found
af runs trigger-wait my_dag
# Error: DAG not found
# 2. Find parse error
af dags errors
# 3. [Fix the issue in DAG code]
# 4. Retest
af runs trigger-wait my_dag
# 1. Get failure summary
af runs diagnose my_dag scheduled__2025-01-14T...
# 2. Get error from failed task
af tasks logs my_dag scheduled__2025-01-14T... failed_task_id
# 3. [Fix the issue]
# 4. Retest
af runs trigger-wait my_dag
af runs trigger-wait my_dag --conf '{"env": "staging", "batch_size": 100}' --timeout 600
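Hand-quoting JSON for --conf is easy to get wrong. One way to avoid it, sketched below, is to build the argument list from a dict. Only the --conf and --timeout flags shown in the command above are used; `trigger_wait_argv` is a hypothetical helper name.

```python
import json
from typing import Optional


def trigger_wait_argv(
    dag_id: str,
    conf: Optional[dict] = None,
    timeout: Optional[int] = None,
) -> list:
    """Build the argument list for an af runs trigger-wait call."""
    argv = ["af", "runs", "trigger-wait", dag_id]
    if conf is not None:
        # json.dumps produces the JSON string that --conf expects.
        argv += ["--conf", json.dumps(conf)]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    return argv
```

Passing the resulting list to `subprocess.run(argv, capture_output=True, text=True)` bypasses the shell entirely, so the JSON needs no extra quoting.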
# Wait up to 1 hour
af runs trigger-wait my_dag --timeout 3600
# If timed out, check current state
af runs get my_dag manual__2025-01-14T...
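Connection Refused / Timeout:
- Check af config connections for correct host/port

ModuleNotFoundError:
- Add the missing package to requirements.txt

PermissionError:
- Check credentials in the connection

Task Timeout:
- Increase the task timeout or optimize the query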
Task logs typically show:
Focus on the exception at the bottom of failed task logs.
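Finding "the exception at the bottom" can be sketched as a backwards scan over the log text. This assumes the log contains a standard Python traceback whose final line looks like `SomeError: message`. The regular expression is a heuristic and `last_exception_line` is a hypothetical helper, not an af feature.

```python
import re
from typing import Optional

# Matches e.g. "ModuleNotFoundError: No module named 'x'" or a bare "KeyError"
# at the end of a line. It does not match "raise ValueError('x')" source lines.
_EXCEPTION = re.compile(r"\b[A-Za-z_][\w.]*(?:Error|Exception)\b(?::.*)?$")


def last_exception_line(log_text: str) -> Optional[str]:
    """Return the last exception-looking fragment in the log, or None."""
    for line in reversed(log_text.splitlines()):
        match = _EXCEPTION.search(line.strip())
        if match:
            return match.group(0)
    return None
```

Scanning from the end matters because chained exceptions and retries can leave several tracebacks in one log, and the last one is usually the one that ended the task.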
Astro deployments support environment promotion, which helps structure your testing workflow:
- Use astro deploy --dags for fast iteration