.claude/skills/railway-troubleshooting/SKILL.md
Railway debugging and issue resolution. Use when deployments fail, builds error, services crash, performance degrades, or networking issues occur.
Systematic debugging and issue resolution for Railway.com deployments.
This skill provides decision trees, diagnostic workflows, and recovery procedures for Railway platform issues. It covers build failures, runtime crashes, networking problems, database issues, and performance degradation.
Use this decision tree to diagnose and resolve Railway issues:
Railway Issue?
│
├── Deployment Failed?
│ ├── Build Error → Operation 4: Fix Build Errors
│ ├── Deploy Error → Operation 1: Diagnose Deployment Failures
│ ├── Health Check Failed → Check service health endpoint
│ └── Timeout → Check build/deploy timeouts in settings
│
├── Service Crashing?
│ ├── Immediate crash → Operation 2: Debug Runtime Crashes
│ ├── Crash after time → Check memory limits, memory leaks
│ ├── Restart loop → Check startup command, dependencies
│ └── Exit code errors → Check application logs for specifics
│
├── Networking Issues?
│ ├── Service unreachable → Operation 3: Troubleshoot Networking
│ ├── Intermittent connectivity → Check DNS, service discovery
│ ├── SSL errors → Check domain configuration, certificates
│ └── Timeout errors → Check port configuration, firewalls
│
├── Build Issues?
│ ├── Nixpacks detection wrong → Operation 4: Fix Build Errors
│ ├── Dependencies failing → Check package.json, requirements.txt
│ ├── Build commands failing → Verify build scripts
│ └── Cache issues → Clear build cache, force rebuild
│
└── Database Problems?
  ├── Connection refused → Operation 5: Resolve Database Issues
  ├── Timeout errors → Check connection pools, query performance
  ├── Performance slow → Check indices, query optimization
  └── Data corruption → Check backups, recovery procedures
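The top level of the tree can be sketched as a small dispatcher. This is only an illustration: the function name `route_issue` and the symptom keywords are hypothetical, and only the operation names come from the tree above.

```shell
# Hypothetical helper: map a top-level symptom keyword to the operation
# named in the decision tree. The keywords are illustrative assumptions.
route_issue() {
  case "$1" in
    deploy)   echo "Operation 1: Diagnose Deployment Failures" ;;
    crash)    echo "Operation 2: Debug Runtime Crashes" ;;
    network)  echo "Operation 3: Troubleshoot Networking" ;;
    build)    echo "Operation 4: Fix Build Errors" ;;
    database) echo "Operation 5: Resolve Database Issues" ;;
    *)        echo "Unknown symptom: $1" >&2; return 1 ;;
  esac
}
```

For example, `route_issue crash` prints the name of the runtime-crash operation, and an unrecognized keyword returns a non-zero status.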
Operation 1: Diagnose Deployment Failures
Identify and resolve deployment failures through systematic log analysis.
When to use: Deployment status shows failed, builds succeed but deploys fail, health checks failing.
Workflow:
Check Deployment Status
# CLI approach
railway status
railway logs --deployment
# API approach (see references/debug-workflow.md for GraphQL)
# Query deployment status and recent deploys
Analyze Deploy Logs
Common Deploy Failures
process.env.PORT
Fix and Redeploy
See: references/common-errors.md for specific error messages and solutions.
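The `process.env.PORT` entry above refers to the port the platform passes to the service through the `PORT` environment variable. As a rough sketch, a start script could validate that value before launching the application. The function name `resolve_port` and the local fallback of 3000 are assumptions for illustration, not part of this skill.

```shell
# Resolve the port a start script should bind to.
# Assumption: the platform injects PORT; 3000 is a hypothetical local fallback.
resolve_port() {
  local port="${PORT:-3000}"
  case "$port" in
    # Reject empty, non-numeric, or implausibly long values.
    ''|*[!0-9]*|??????*)
      echo "PORT must be a number between 1 and 65535, got: $port" >&2
      return 1 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "PORT out of range: $port" >&2
    return 1
  fi
  echo "$port"
}
```

A start script might call `port="$(resolve_port)" || exit 1` and pass `$port` to the server, so a bad value fails fast with a readable message instead of a failed health check.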
Operation 2: Debug Runtime Crashes
Investigate and resolve service crashes and restart loops.
When to use: Service shows restarting, exit codes in logs, OOM errors, crash reports.
Workflow:
Gather Crash Information
# Get runtime logs
railway logs --tail 500
# Check service metrics
railway metrics
# Use diagnostic script
./scripts/diagnose.sh [service-id] --verbose
Identify Crash Pattern
Check Resource Limits
Common Crash Causes
See: references/debug-workflow.md for systematic debugging steps.
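When exit codes appear in the logs, it can help to translate them before digging further. The sketch below relies on the general shell convention that a status above 128 means the process was ended by signal `status - 128`. The function name `describe_exit_code` and the hint wording are illustrative assumptions.

```shell
# Translate a process exit status into a short hint.
# Convention: status > 128 means "terminated by signal (status - 128)".
describe_exit_code() {
  local code="$1"
  case "$code" in
    0)   echo "clean exit" ;;
    137) echo "SIGKILL (128+9): often the out-of-memory killer" ;;
    139) echo "SIGSEGV (128+11): segmentation fault" ;;
    143) echo "SIGTERM (128+15): terminated, e.g. during a restart" ;;
    *)
      # Non-numeric input makes the test fail and falls through to the else branch.
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application error: check the logs"
      fi ;;
  esac
}
```

A status of 137 combined with steadily rising memory is a reasonable prompt to look at memory limits first. A status of 1 usually points back to the application logs.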
Operation 3: Troubleshoot Networking
Resolve networking issues including service discovery, DNS, and connectivity.
When to use: Services can't reach each other, DNS resolution fails, external access issues, SSL errors.
Workflow:
Verify Service Discovery
# Check private networking enabled
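# Services use: [service-name].railway.internal
# Test DNS resolution from a shell inside a deployed service.
# (`railway run` executes the command on your local machine, where
# railway.internal names do not resolve.)
nslookup [service-name].railway.internal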
Check Network Configuration
Debug External Access
Common Network Issues
See: references/common-errors.md Network Errors section.
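Once a name resolves, the next question is whether the target port accepts connections. A minimal sketch, assuming `bash` and the coreutils `timeout` command are available in the environment where it runs. The function name `check_tcp` and the 3-second limit are illustrative choices.

```shell
# Report whether a TCP connection to host:port can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so the probe itself runs under bash.
check_tcp() {
  local host="$1" port="$2"
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
    return 1
  fi
}
```

For private networking, the check only means something when run from inside another service in the same project, for example `check_tcp [service-name].railway.internal [port]`. An "unreachable" result for a name that does resolve usually points at the port configuration rather than DNS.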
Operation 4: Fix Build Errors
Resolve build failures, nixpacks configuration issues, and dependency problems.
When to use: Build fails, wrong builder detected, dependencies not installing, build commands fail.
Workflow:
Check Build Logs
railway logs --build
# Identify build phase failure:
# - Detection phase: Nixpacks provider detection
# - Install phase: Dependencies installation
# - Build phase: Build commands execution
Verify Builder Configuration
Fix Dependency Issues
Force Rebuild if Needed
# Clear cache and rebuild
./scripts/force-rebuild.sh [service-id] --no-cache
# Or via CLI
railway up --detach
Common Build Errors:
See: references/common-errors.md Build Errors section.
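When the wrong builder is detected, a quick first check is which dependency manifests actually sit in the directory being built, since several competing manifests can make detection ambiguous. The sketch below is illustrative only: `list_manifests` is a hypothetical helper, `package.json` and `requirements.txt` come from the decision tree, and `go.mod` and `Cargo.toml` are added as assumed examples.

```shell
# Print the dependency manifests found in a directory (default: current one).
# Returns non-zero when none of the known manifests is present.
list_manifests() {
  local dir="${1:-.}" f found=1
  for f in package.json requirements.txt go.mod Cargo.toml; do
    if [ -f "$dir/$f" ]; then
      echo "$f"
      found=0
    fi
  done
  return "$found"
}
```

If the function prints more than one file, or none, that is worth resolving before forcing a rebuild.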
Operation 5: Resolve Database Issues
Debug database connection problems, timeouts, and performance issues.
When to use: Connection refused, database timeouts, slow queries, connection pool exhausted.
Workflow:
Verify Database Connection
# Check database service status
railway status
# Test connection with database URL
# (single quotes defer $DATABASE_URL expansion until `railway run` has injected it)
railway run sh -c 'psql "$DATABASE_URL" -c "SELECT 1"'
Check Connection Configuration
Debug Connection Issues
Performance Troubleshooting
Emergency Recovery:
railway restart [service-id]
See: references/recovery-procedures.md for emergency procedures.
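After a restart, a database often needs a few seconds before it accepts connections, so a single failed probe may be misleading. One way to handle that is a small retry loop with exponential backoff. This is a sketch: the helper name `retry` and its argument order are assumptions, and any probe command can be passed to it.

```shell
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, doubling DELAY (seconds) after each failure.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
}

# Example (assumes the PostgreSQL client tools are installed):
#   retry 5 1 pg_isready -d "$DATABASE_URL"
```

Doubling the delay keeps the early probes fast while avoiding a tight loop against a database that is still starting.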
railway-auth: Authentication setup for Railway CLI/API
railway-logs: Advanced log querying and analysis
railway-deployment: Deployment workflows and strategies
railway-api: GraphQL API queries and operations
Use railway-troubleshooting when you encounter:
Run the diagnostic script for automated issue detection:
cd .claude/skills/railway-troubleshooting/scripts
./diagnose.sh [service-id] --verbose
The script will:
references/common-errors.md - 20+ documented errors with solutions
references/debug-workflow.md - Systematic debugging methodology
references/recovery-procedures.md - Emergency recovery steps
scripts/diagnose.sh - Automated diagnostics
scripts/force-rebuild.sh - Clear cache and rebuild
For issues not covered by this skill: