skills/microsoft-foundry/finetuning/SKILL.md
Fine-tune models on Azure AI Foundry using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset preparation, training job submission, deployment, and evaluation. USE FOR: fine-tune, SFT, DPO, RFT, training data, grader, distillation, fine-tuned model, training job, large file upload, calibrate grader, deploy fine-tuned model, evaluate fine-tuned model. DO NOT USE FOR: general model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer).
npx skillsauth add microsoft/azure-skills finetuning
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Fine-tune models using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset prep, training, deployment, and evaluation.
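Use this sub-skill when the user asks about: fine-tuning (SFT, DPO, RFT), training data, graders and grader calibration, distillation, training jobs, large file uploads, or deploying and evaluating a fine-tuned model.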
Do NOT use for: General model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer).

| Stage | Guide |
|-------|-------|
| Quick start | workflows/quickstart.md |
| Full pipeline | workflows/full-pipeline.md |
| Create data | workflows/dataset-creation.md |
| Iterate | workflows/iterative-training.md |
| Diagnose | workflows/diagnose-poor-results.md |

| Topic | File |
|-------|------|
| SFT vs DPO vs RFT | references/training-types.md |
| Hyperparameters | references/hyperparameters.md |
| Data formats | references/dataset-formats.md |
| Grader design (RFT) | references/grader-design.md |
| Reward hacking | references/reward-hacking.md |
| Agentic RFT (tools) | references/agentic-rft.md |
| Deployment | references/deployment.md |
| Training curves | references/training-curves.md |
| Evaluation | references/evaluation.md |
| Vision fine-tuning | references/vision-fine-tuning.md |
| Large file uploads | references/large-file-uploads.md |
| Platform gotchas | references/platform-gotchas.md |
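
The data-format reference is not reproduced here, so as a rough illustration only: the sketch below checks one SFT record, assuming the chat-style JSONL layout (a `messages` list of `role`/`content` objects) commonly used for supervised fine-tuning. The function name `validate_sft_line` and the `ALLOWED_ROLES` set are hypothetical and are not taken from scripts/validate/validate_sft.py, which may check more or different things.

```python
import json
from typing import List

# Assumption: roles accepted in a chat-format SFT record.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_sft_line(line: str) -> List[str]:
    """Return the problems found in one JSONL record (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not an object")
            continue
        role = msg.get("role")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"message {i} has unexpected role {role!r}")

    # SFT needs at least one assistant turn to serve as the training target.
    has_assistant = any(
        isinstance(m, dict) and m.get("role") == "assistant" for m in messages
    )
    if not has_assistant:
        problems.append("no assistant message to learn from")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easy to report every issue in a file in a single pass.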
| Script | Purpose |
|--------|---------|
| scripts/submit_training.py | Submit SFT/DPO/RFT jobs |
| scripts/monitor_training.py | Poll job until completion |
| scripts/calibrate_grader.py | Find optimal RFT pass_threshold |
| scripts/check_training.py | Analyze curves, list checkpoints |
| scripts/deploy_model.py | Deploy via ARM REST API |
| scripts/evaluate_model.py | LLM judge evaluation |
| scripts/convert_dataset.py | Convert between SFT/DPO/RFT formats |
| scripts/generate_distillation_data.py | Generate synthetic training data |
| scripts/score_dataset.py | Quality scoring on training data |
| scripts/cleanup.py | Delete old files and deployments |
| scripts/validate/ | Data validators (SFT, DPO, RFT) + stats |
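
The "poll job until completion" step can be sketched as a small loop. This is a minimal illustration, not the contents of scripts/monitor_training.py: `wait_for_job` and its parameters are hypothetical, and the terminal status names assume the OpenAI-style fine-tuning job lifecycle (`succeeded`, `failed`, `cancelled`).

```python
import time
from typing import Callable

# Assumption: statuses after which a fine-tuning job will not change again.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns a terminal status, then return it."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_seconds}s")
        sleep(poll_seconds)
```

With the `openai` Python SDK, `get_status` could be something like `lambda: client.fine_tuning.jobs.retrieve(job_id).status`, where `client` and `job_id` are supplied by the caller. Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.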

| Task | Command |
|------|---------|
| Validate SFT data | python scripts/validate/validate_sft.py data.jsonl |
| Submit SFT job | python scripts/submit_training.py --model gpt-4.1-mini --training-file train.jsonl --validation-file val.jsonl --type sft |
| Monitor job | python scripts/monitor_training.py --job-id ftjob-xxx |
| Analyze curves | python scripts/check_training.py --job-id ftjob-xxx |
| Deploy model | python scripts/deploy_model.py --model-id ft:gpt-4.1-mini:... --name my-eval |
| Evaluate model | python scripts/evaluate_model.py --deployment-name my-eval --test-file test.jsonl |
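
When these commands are driven from another Python program, building the argument list explicitly avoids shell-quoting mistakes. The sketch below assembles the SFT submit command from the table above. The helper `build_submit_command` is hypothetical, and treating `--validation-file` as optional is an assumption; only the flags shown in the table are used.

```python
import shlex
import sys
from typing import List, Optional


def build_submit_command(
    model: str,
    training_file: str,
    validation_file: Optional[str] = None,
    training_type: str = "sft",
    python: str = sys.executable,
) -> List[str]:
    """Build the argv list for scripts/submit_training.py."""
    cmd = [
        python,
        "scripts/submit_training.py",
        "--model", model,
        "--training-file", training_file,
    ]
    # Assumption: the validation file may be omitted.
    if validation_file:
        cmd += ["--validation-file", validation_file]
    cmd += ["--type", training_type]
    return cmd


if __name__ == "__main__":
    argv = build_submit_command("gpt-4.1-mini", "train.jsonl", "val.jsonl")
    # Print the command for review; pass argv to subprocess.run(argv, check=True) to run it.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run` without `shell=True` means file names containing spaces need no extra quoting.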
| Error | Cause | Fix |
|-------|-------|-----|
| "API version not supported" | Older openai SDK on /v1/ endpoint | Upgrade to openai>=1.0 |
| "does not support fine-tuning with Standard TrainingType" | OSS model needs globalStandard | Use --use-rest flag or script auto-falls back |
| Job stuck in post-training eval | Under-provisioned tool endpoint (RFT) | Scale to S2+, enable Always On |
| "DeploymentNotReady" after ARM succeeds | ARM/data-plane race condition | Delete and recreate deployment, wait 5 min |
| Content safety block at deployment | PII-dense training data | Remove problematic document types |
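
The rows above that quote a literal error message lend themselves to a simple lookup, for example to print a hint when a script fails. This is a hypothetical helper (`suggest_fix` is not part of the skill's scripts). It covers only the three quoted messages, since the other two rows describe symptoms rather than error text.

```python
from typing import List, Optional, Tuple

# (substring of the error message, suggested fix), taken from the table above.
KNOWN_ERRORS: List[Tuple[str, str]] = [
    ("API version not supported", "Upgrade to openai>=1.0"),
    (
        "does not support fine-tuning with Standard TrainingType",
        "Use the --use-rest flag (the script may also fall back automatically)",
    ),
    ("DeploymentNotReady", "Delete and recreate the deployment, then wait 5 minutes"),
]


def suggest_fix(error_message: str) -> Optional[str]:
    """Return the suggested fix for a known error message, or None."""
    lowered = error_message.lower()
    for needle, fix in KNOWN_ERRORS:
        if needle.lower() in lowered:
            return fix
    return None
```

Matching is case-insensitive and substring-based, so the hint still appears when the service wraps the message in extra context.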