skills/hugging-face-model-trainer/SKILL.md
Train or fine-tune TRL language models on Hugging Face Jobs, including SFT, DPO, GRPO, and GGUF export.
npx skillsauth add ranbot-ai/awesome-skills hugging-face-model-trainer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Train language models using TRL (Transformer Reinforcement Learning) on fully managed Hugging Face infrastructure. No local GPU setup required—models train on cloud GPUs and results are automatically saved to the Hugging Face Hub.
TRL provides multiple training methods:
For detailed TRL method documentation:
hf_doc_search("your query", product="trl")
hf_doc_fetch("https://huggingface.co/docs/trl/sft_trainer") # SFT
hf_doc_fetch("https://huggingface.co/docs/trl/dpo_trainer") # DPO
# etc.
See also: references/training_methods.md for method overviews and selection guidance
Use this skill when users want to:
Use Unsloth (references/unsloth.md) instead of standard TRL when:
FastVisionModel support
See references/unsloth.md for complete Unsloth documentation and scripts/unsloth_sft_example.py for a production-ready training script.
When assisting with training jobs:
ALWAYS use the hf_jobs() MCP tool - Submit jobs using hf_jobs("uv", {...}), NOT bash trl-jobs commands. The script parameter accepts Python code directly, so pass the script content as a string to hf_jobs(). Do NOT save to local files unless the user explicitly requests it. If the user asks to "train a model", "fine-tune", or makes a similar request, you MUST create the training script AND submit the job immediately using hf_jobs().
Always include Trackio - Every training script should include Trackio for real-time monitoring. Use example scripts in scripts/ as templates.
Provide job details after submission - After submitting, provide job ID, monitoring URL, estimated time, and note that the user can request status checks later.
Use example scripts as templates - Reference scripts/train_sft_example.py, scripts/train_dpo_example.py, etc. as starting points.
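The directives above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the skill's own template: the training script is held as a string, declares PEP 723 inline dependencies, and reports to Trackio, and the resulting dictionary is what an agent would pass to hf_jobs("uv", {...}). Only the `script` and `secrets` keys are named in the text above; the `flavor` and `timeout` keys and their values, the model and dataset names, and the helper `build_job_payload` are assumptions for illustration.

```python
# Training script kept as a string so it can be passed inline, not saved to disk.
# Model and dataset names are placeholders chosen for illustration.
TRAIN_SCRIPT = '''\
# /// script
# dependencies = ["trl", "trackio"]
# ///
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="sft-output",
        push_to_hub=True,
        hub_model_id="username/model-name",
        report_to="trackio",  # real-time monitoring via Trackio
    ),
)
trainer.train()
trainer.push_to_hub()
'''


def build_job_payload(script: str, flavor: str = "a10g-large", timeout: str = "2h") -> dict:
    """Hypothetical helper: assemble the config dict for hf_jobs("uv", {...}).

    `flavor` and `timeout` are assumed keys; check the Jobs documentation
    for the options your account actually supports.
    """
    return {
        "script": script,
        "flavor": flavor,
        "timeout": timeout,
        # "$HF_TOKEN" is substituted with the real token by the job runner.
        "secrets": {"HF_TOKEN": "$HF_TOKEN"},
    }


payload = build_job_payload(TRAIN_SCRIPT)
# An agent would now call the MCP tool: hf_jobs("uv", payload)
```

After submission, the job ID and monitoring URL returned by the tool are what should be reported back to the user.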
Repository scripts use PEP 723 inline dependencies. Run them with uv run:
uv run scripts/estimate_cost.py --help
uv run scripts/dataset_inspector.py --help
Before starting any training job, verify:
hf_whoami()
secrets={"HF_TOKEN": "$HF_TOKEN"} in job config to make token available (the $HF_TOKEN syntax references your actual token value)
datasets.load_dataset()
push_to_hub=True, hub_model_id="username/model-name"; Job: secrets={"HF_TOKEN": "$HF_TOKEN"}
⚠️ IMPORTANT: Training jobs run asynchronously and can take hours
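Part of this checklist can be checked mechanically before a job is submitted. The sketch below is one possible pre-flight check, assuming the job config is a dictionary with `script` and `secrets` keys as described above. The function `preflight_issues` is hypothetical, and its substring checks are deliberately crude: they catch a forgotten setting but do not prove that the script is correct, that the token is valid, or that the dataset loads.

```python
def preflight_issues(payload: dict) -> list[str]:
    """Hypothetical helper: list checklist items the job config fails."""
    issues = []
    script = payload.get("script", "")

    # The token must be forwarded to the job, or the Hub push cannot authenticate.
    if payload.get("secrets", {}).get("HF_TOKEN") != "$HF_TOKEN":
        issues.append('job config is missing secrets={"HF_TOKEN": "$HF_TOKEN"}')

    # Without these settings the trained model is not saved to the Hub.
    if "push_to_hub=True" not in script:
        issues.append("script does not set push_to_hub=True")
    if "hub_model_id=" not in script:
        issues.append("script does not set hub_model_id")

    return issues


good = {
    "script": 'SFTConfig(push_to_hub=True, hub_model_id="username/model-name")',
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
}
bad = {"script": "SFTConfig()", "secrets": {}}

print(preflight_issues(good))       # []
print(len(preflight_issues(bad)))   # 3
```

Authentication itself (hf_whoami()) and dataset availability still need to be verified separately, since neither is visible in the job config.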
When user requests training:
scripts/train_sft_example.py as template)
hf_jobs() MCP tool with script content inline - don't save to file unless user requests