skills/fine-tuning-basics/SKILL.md
Adapt a pre-trained LLM to a specific task, tone, or domain by training it on a specialized dataset, so that it follows a required format or style more consistently. Use when: prompt engineering alone fails to produce the desired format consistently; you need to mimic a very specific brand voice or writing style; or you want to reduce latency and cost by using a smaller model (e.g., GPT-4o-mini) that performs like a larger one on a specific task.
npx skillsauth add jyjeanne/ai-setup-forge fine-tuning-basics
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
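Create a dataset in the format the provider requires. The example below uses OpenAI's chat format. Save it as training_data.jsonl with one JSON object per line and no comment lines, since JSONL does not allow comments.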
{"messages": [{"role": "system", "content": "You are a code reviewer."}, {"role": "user", "content": "Review this: const x = 1;"}, {"role": "assistant", "content": "LGTM. Consider using 'let' if reassigning."}]}
{"messages": [{"role": "system", "content": "You are a code reviewer."}, {"role": "user", "content": "Review this: alert('hi');"}, {"role": "assistant", "content": "Avoid 'alert' in production code."}]}
Always validate your JSONL before uploading to avoid costly training failures.
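import json

def validate_data(file_path):
    """Report lines that are not valid JSON or that lack a 'messages' key."""
    with open(file_path, "r", encoding="utf-8") as f:
        # Number lines from 1 so each message points at the offending line.
        for line_number, line in enumerate(f, start=1):
            try:
                data = json.loads(line)
            except json.JSONDecodeError as e:
                print(f"Line {line_number}: error parsing line: {e}")
                continue
            if not isinstance(data, dict) or "messages" not in data:
                print(f"Line {line_number}: missing 'messages' key")

validate_data("training_data.jsonl")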
Upload the file and start the training process.
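# Install the OpenAI Python package, which also provides the `openai` CLI
pip install openai
# Set the API key
export OPENAI_API_KEY="your-key"
# Upload the training file (in the v1 CLI, API commands sit under the `api` subcommand)
openai api files.create -f training_data.jsonl -p fine-tune
# Start training, passing the file ID printed by the upload step
# (flag names can differ between CLI versions; confirm with `openai api fine_tuning.jobs.create --help`)
openai api fine_tuning.jobs.create -F "file-id-from-upload" -m "gpt-4o-mini-2024-07-18"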
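The same two steps can also be driven from Python instead of the CLI. A minimal sketch, assuming the v1 `openai` Python package, where `client.files.create` and `client.fine_tuning.jobs.create` are the relevant calls. `start_fine_tune` is a hypothetical wrapper, and the client is passed in so the function does not depend on how credentials are configured:

```python
def start_fine_tune(client, file_path, base_model="gpt-4o-mini-2024-07-18"):
    """Upload a JSONL training file and create a fine-tuning job for it."""
    with open(file_path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="fine-tune")
    job = client.fine_tuning.jobs.create(
        training_file=uploaded.id,  # ID returned by the upload
        model=base_model,
    )
    return uploaded.id, job.id

# Usage (needs `pip install openai` and OPENAI_API_KEY in the environment):
#   from openai import OpenAI
#   file_id, job_id = start_fine_tune(OpenAI(), "training_data.jsonl")
#   print(file_id, job_id)
```

Returning both IDs keeps the job ID available for the status checks that follow.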
Check the status and loss metrics.
# List jobs
openai api fine_tuning.jobs.list
# Retrieve the status of one job (job IDs start with "ftjob-")
openai api fine_tuning.jobs.retrieve -i "ftjob-your-job-id"
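Re-running the status command by hand gets tedious for longer jobs. A minimal polling sketch, assuming the v1 `openai` Python package and that a finished job reports one of the statuses `succeeded`, `failed`, or `cancelled`. `wait_for_job`, the polling interval, and the poll limit are hypothetical choices:

```python
import time

TERMINAL_STATUSES = ("succeeded", "failed", "cancelled")

def wait_for_job(client, job_id, poll_seconds=30, max_polls=240):
    """Poll a fine-tuning job until it finishes and return the final job object."""
    for _ in range(max_polls):
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} still not finished after {max_polls} polls")

# Usage:
#   from openai import OpenAI
#   job = wait_for_job(OpenAI(), "ftjob-your-job-id")
#   print(job.status, job.fine_tuned_model)  # fine_tuned_model is set on success
```

The poll limit keeps a stuck job from blocking a script indefinitely.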
Once completed, use the new model ID in your application.
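import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const completion = await openai.chat.completions.create({
  model: "ft:gpt-4o-mini:your-org:custom-name:id",
  messages: [{ role: "user", content: "Review this: console.log(1);" }],
});
console.log(completion.choices[0].message.content);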
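A fine-tuned model tends to behave most predictably when requests mirror the training examples, including the system prompt. A Python sketch of the same call, assuming the v1 `openai` package and the "You are a code reviewer." system prompt from the training data; `review_code` is a hypothetical helper and the model ID is a placeholder:

```python
SYSTEM_PROMPT = "You are a code reviewer."  # same system prompt as the training examples

def review_code(client, model_id, snippet):
    """Ask the fine-tuned model to review one snippet, mirroring the training format."""
    completion = client.chat.completions.create(
        model=model_id,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Review this: {snippet}"},
        ],
    )
    return completion.choices[0].message.content

# Usage:
#   from openai import OpenAI
#   print(review_code(OpenAI(), "ft:gpt-4o-mini:your-org:custom-name:id", "console.log(1);"))
```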
Expected result: a fine-tuned model ID that can be used in place of the base model and is specialized for one narrow task.