home/dot_config/exact_agents/skills/convert-to-transformers/SKILL.md
Use this skill when converting custom PyTorch models to Hugging Face Transformers format. Helps with: (1) Creating PretrainedConfig and PreTrainedModel classes, (2) Writing an ImageProcessor or Tokenizer, (3) Compatibility testing, (4) Hub upload preparation. Use when the user wants to make their model compatible with the transformers library.
Convert custom PyTorch models to Hugging Face Transformers format while maintaining exact compatibility with the original implementation.
This skill provides a systematic workflow for transformers conversion:
Important: Use validation mode (parallel implementations) first to verify equivalence, then replace the original.
Ask the user to specify:
Then identify:
Key principle: Extract ALL hardcoded values from the model as configurable parameters.
Template:
```python
from transformers import PretrainedConfig
from typing import List, Optional


class {ModelName}Config(PretrainedConfig):
    model_type = "{model_name}"

    def __init__(
        self,
        # Core architecture parameters
        hidden_dim: int = 128,
        num_layers: int = 4,
        # Input/output parameters
        image_size: int = 1024,
        num_channels: int = 3,
        num_labels: int = 1,
        # Component-specific parameters (extract from original)
        component_param: List[int] | None = None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.image_size = image_size
        self.num_channels = num_channels
        self.num_labels = num_labels
        # Use default if not specified
        self.component_param = (
            component_param if component_param is not None else [1, 2, 4]
        )
```
Critical: Ensure default values match what the pretrained weights expect!
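One way to catch a default that disagrees with the checkpoint is to compare parameter names and shapes before loading any weights. The helper below is only a sketch and not part of Transformers: `diff_state_dict_shapes` is a hypothetical name, the parameter names and shapes in the demo are invented, and the function works on plain `{name: shape}` mappings, which could be built with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_state_dict_shapes(
    checkpoint: Mapping[str, Shape],
    model: Mapping[str, Shape],
) -> Dict[str, List[str]]:
    """Compare parameter names and shapes of a checkpoint and a freshly built model."""
    report: Dict[str, List[str]] = {"missing": [], "unexpected": [], "mismatched": []}
    for name, shape in model.items():
        if name not in checkpoint:
            # The model expects a parameter the checkpoint does not provide.
            report["missing"].append(name)
        elif tuple(checkpoint[name]) != tuple(shape):
            # Same name, different shape: usually a config default is off.
            report["mismatched"].append(
                f"{name}: checkpoint {tuple(checkpoint[name])} vs model {tuple(shape)}"
            )
    for name in checkpoint:
        if name not in model:
            # The checkpoint carries a parameter the model never created.
            report["unexpected"].append(name)
    return report


# Hypothetical shapes: the checkpoint was trained with hidden_dim=256,
# but the config default builds the model with hidden_dim=128.
checkpoint_shapes = {"encoder.proj.weight": (256, 3), "encoder.proj.bias": (256,)}
model_shapes = {"encoder.proj.weight": (128, 3), "encoder.proj.bias": (128,)}
report = diff_state_dict_shapes(checkpoint_shapes, model_shapes)
print(report["mismatched"])
```

An empty report for a model built from `{ModelName}Config()` with no arguments suggests the defaults line up with the pretrained weights, at least as far as shapes can tell.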
Template:
```python
from typing import Optional, Tuple, Union

import torch
from transformers import PreTrainedModel
from transformers.modeling_outputs import SemanticSegmenterOutput


class {ModelName}ForTask(PreTrainedModel):
    config_class = {ModelName}Config

    def __init__(self, config: {ModelName}Config):
        super().__init__(config)
        self.config = config
        # Initialize layers using config parameters (no hardcoded values!)
        # Encoder is a placeholder for the original model's own module.
        self.encoder = Encoder(
            hidden_dim=config.hidden_dim,
            num_layers=config.num_layers,
        )

    def forward(
        self,
        pixel_values: torch.FloatTensor,
        labels: Optional[torch.LongTensor] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple, SemanticSegmenterOutput]:
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        # Forward pass
        logits = self.encoder(pixel_values)

        # Calculate loss if needed
        loss = None
        if labels is not None:
            # Compute loss
            pass

        if not return_dict:
            output = (logits,)
            return ((loss,) + output) if loss is not None else output

        return SemanticSegmenterOutput(
            loss=loss,
            logits=logits,
            hidden_states=None,
            attentions=None,
        )
```
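The template leaves the loss computation as a placeholder. As a hedged illustration of what could go there for a segmentation head, the sketch below upsamples the logits to the label resolution and applies cross-entropy, falling back to binary cross-entropy when there is a single output channel. The function name `segmentation_loss`, the `ignore_index=255` default, and the bilinear upsampling are assumptions; the original model's training code is the authority on which loss to use.

```python
import torch
import torch.nn.functional as F


def segmentation_loss(
    logits: torch.Tensor,
    labels: torch.Tensor,
    num_labels: int,
    ignore_index: int = 255,  # assumed convention for "void" pixels
) -> torch.Tensor:
    """Loss for logits of shape (B, C, h, w) and integer labels of shape (B, H, W)."""
    # Upsample logits to the label resolution before comparing.
    upsampled = F.interpolate(
        logits, size=labels.shape[-2:], mode="bilinear", align_corners=False
    )
    if num_labels > 1:
        # Multi-class case: per-pixel cross-entropy, skipping ignored pixels.
        return F.cross_entropy(upsampled, labels, ignore_index=ignore_index)
    # Single-channel case: binary cross-entropy averaged over valid pixels only.
    valid = labels != ignore_index
    targets = torch.where(valid, labels, torch.zeros_like(labels)).float()
    per_pixel = F.binary_cross_entropy_with_logits(
        upsampled[:, 0], targets, reduction="none"
    )
    return (per_pixel * valid).sum() / valid.sum().clamp(min=1)


# Dummy tensors: 3-class logits at 8x8, labels at 32x32.
logits = torch.randn(2, 3, 8, 8)
labels = torch.randint(0, 3, (2, 32, 32))
loss = segmentation_loss(logits, labels, num_labels=3)
```

Inside `forward`, a call such as `loss = segmentation_loss(logits, labels, self.config.num_labels)` would replace the `pass` statement.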
For vision models - Create ImageProcessor:
```python
from typing import List, Optional, Union

from transformers import BaseImageProcessor
from transformers.image_processing_utils import BatchFeature
from transformers.image_utils import ImageInput
from transformers.utils import TensorType


class {ModelName}ImageProcessor(BaseImageProcessor):
    model_input_names = ["pixel_values"]

    def __init__(
        self,
        size: int = 1024,
        resample: str = "bilinear",
        do_normalize: bool = True,
        image_mean: List[float] | None = None,
        image_std: List[float] | None = None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.size = size
        self.resample = resample
        self.do_normalize = do_normalize
        self.image_mean = image_mean if image_mean is not None else [0.485, 0.456, 0.406]
        self.image_std = image_std if image_std is not None else [0.229, 0.224, 0.225]

    def preprocess(
        self,
        images: ImageInput,
        return_tensors: Optional[Union[str, TensorType]] = None,
        **kwargs,
    ) -> BatchFeature:
        # Implement preprocessing matching original
        # Return BatchFeature with pixel_values
        raise NotImplementedError("Port the original preprocessing here")
```
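As a rough illustration of what the body of `preprocess` might compute, the sketch below rescales and normalizes a single image with NumPy and moves it to channels-first layout. It is an assumption that the original pipeline divides by 255 and uses these mean and standard deviation values, and resizing is left out entirely; `normalize_to_chw` is a hypothetical helper name. Whatever the original preprocessing does takes precedence.

```python
from typing import Sequence

import numpy as np


def normalize_to_chw(
    image: np.ndarray,
    image_mean: Sequence[float] = (0.485, 0.456, 0.406),
    image_std: Sequence[float] = (0.229, 0.224, 0.225),
) -> np.ndarray:
    """Convert an HWC uint8 image to a normalized CHW float32 array."""
    # Rescale 0..255 integers to 0..1 floats.
    scaled = image.astype(np.float32) / 255.0
    # Per-channel normalization; mean/std broadcast over the last (channel) axis.
    mean = np.asarray(image_mean, dtype=np.float32)
    std = np.asarray(image_std, dtype=np.float32)
    normalized = (scaled - mean) / std
    # Channels-first layout, as used for pixel_values.
    return normalized.transpose(2, 0, 1)


image = np.full((4, 6, 3), 255, dtype=np.uint8)  # dummy white image
pixel_values = normalize_to_chw(image)
print(pixel_values.shape)  # (3, 4, 6)
```

A stack of such arrays could then be wrapped as `BatchFeature(data={"pixel_values": batch}, tensor_type=return_tensors)` inside the processor.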
For text models - Create tokenizer configuration.
Critical: Always use the SAME preprocessed tensor for both models when comparing outputs.
```python
import pytest
import torch

# image, old_preprocessing, preprocess, processor, OldModel, NewModel, and
# NewConfig are placeholders for objects from your own project.


def test_preprocessing_matches():
    """Test that preprocessing is equivalent."""
    old_tensor = old_preprocessing(image)
    new_tensor = processor(image, return_tensors="pt")["pixel_values"][0]
    assert torch.allclose(old_tensor, new_tensor, atol=1e-6)


def test_single_image_output_matches():
    """Test that model outputs match."""
    # Load models in eval mode so dropout and batch norm behave deterministically
    old_model = OldModel().eval()
    new_model = NewModel(NewConfig()).eval()
    new_model.load_state_dict(old_model.state_dict())

    # Prepare SAME input
    preprocessed = preprocess(image)

    with torch.no_grad():
        old_output = old_model(preprocessed)
        new_output = new_model(pixel_values=preprocessed)

    # Allow small numerical differences (absolute 5e-3, relative 1e-2)
    assert torch.allclose(old_output, new_output.logits, atol=5e-3, rtol=1e-2)


def test_batch_output_matches():
    """Test batch processing."""
    # Test with batch of images
    pytest.skip("TODO: compare outputs for a batch of images")


def test_state_dict_compatible():
    """Test that weights can be loaded."""
    old_model = OldModel()
    new_model = NewModel(NewConfig())
    new_model.load_state_dict(old_model.state_dict())
```
Generate comprehensive model card following Hugging Face standards. Include:
See Hugging Face model card documentation for template.
Critical: Register classes with register_for_auto_class() before pushing.
```python
#!/usr/bin/env python3
from huggingface_hub import HfApi

from {module}.transformers import {ModelName}Config, {ModelName}ForTask, {ModelName}ImageProcessor

# OriginalModel, local_dir, repo_id, and token are placeholders to define for your project.


def main():
    # CRITICAL: Register for Auto* support
    {ModelName}Config.register_for_auto_class()
    {ModelName}ForTask.register_for_auto_class("AutoModel")
    {ModelName}ImageProcessor.register_for_auto_class("AutoImageProcessor")

    # Load original model
    original_model = OriginalModel()

    # Create transformers-compatible model
    config = {ModelName}Config()
    model = {ModelName}ForTask(config)
    model.load_state_dict(original_model.state_dict())

    # Save and push
    model.save_pretrained(local_dir)
    config.save_pretrained(local_dir)
    processor = {ModelName}ImageProcessor()
    processor.save_pretrained(local_dir)

    api = HfApi(token=token)
    api.create_repo(repo_id=repo_id, exist_ok=True)
    api.upload_folder(repo_id=repo_id, folder_path=local_dir)


if __name__ == "__main__":
    main()
```
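Before uploading, it may be worth confirming that registration actually took effect in the saved files. The sketch below assumes that classes registered with `register_for_auto_class()` leave an `auto_map` entry in `config.json` (for `AutoConfig` and `AutoModel`) and in `preprocessor_config.json` (for `AutoImageProcessor`); the helper name `missing_auto_map_entries` and the file contents in the demo are hypothetical stand-ins for real `save_pretrained()` output.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def missing_auto_map_entries(save_dir: Path, expected: Dict[str, List[str]]) -> List[str]:
    """Return "file:AutoClass" entries whose auto_map key is absent in save_dir."""
    missing = []
    for filename, auto_classes in expected.items():
        path = save_dir / filename
        # A missing file is treated like a file without any auto_map.
        data = json.loads(path.read_text()) if path.exists() else {}
        auto_map = data.get("auto_map", {})
        for auto_class in auto_classes:
            if auto_class not in auto_map:
                missing.append(f"{filename}:{auto_class}")
    return missing


# Demonstration with a hand-written file standing in for save_pretrained() output.
with tempfile.TemporaryDirectory() as tmp:
    save_dir = Path(tmp)
    (save_dir / "config.json").write_text(json.dumps({
        "model_type": "my_model",
        "auto_map": {"AutoConfig": "configuration_my_model.MyModelConfig"},
    }))
    missing = missing_auto_map_entries(
        save_dir,
        {
            "config.json": ["AutoConfig", "AutoModel"],
            "preprocessor_config.json": ["AutoImageProcessor"],
        },
    )
print(missing)
# ['config.json:AutoModel', 'preprocessor_config.json:AutoImageProcessor']
```

An empty list for the real `local_dir` would indicate that all three classes were registered before saving.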
Create parallel implementations:
```
{project}/
├── {module}/
│   ├── original_model.py          # Existing
│   └── transformers/              # NEW - for validation
│       ├── __init__.py
│       ├── configuration_{model}.py
│       ├── modeling_{model}.py
│       └── processing_{model}.py
└── tests/
    └── test_transformers_compatibility.py
```
Workflow:
Once equivalence is verified:
```
{project}/
└── {module}/
    ├── configuration_{model}.py   # Replaces original
    ├── modeling_{model}.py
    └── processing_{model}.py
```
Workflow:
For detailed troubleshooting, see references/common-pitfalls.md.
Quick reference:
self.device from PreTrainedModel
post_process_semantic_segmentation() expects (width, height)
If outputs don't match:
See references/common-pitfalls.md for detailed debugging steps.
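When final outputs disagree, one generic way to narrow the search is to compare intermediate activations and find the first layer that diverges. The sketch below uses PyTorch forward hooks for this; it is not taken from references/common-pitfalls.md, the helper names `capture_outputs` and `first_divergence` are hypothetical, and it assumes both models take the input as their first positional argument and share submodule names.

```python
from typing import Dict, Optional

import torch
from torch import nn


def capture_outputs(model: nn.Module, inputs: torch.Tensor) -> Dict[str, torch.Tensor]:
    """Run one forward pass and record each leaf module's tensor output by name."""
    captured: Dict[str, torch.Tensor] = {}
    handles = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            def hook(_module, _inputs, output, name=name):
                if isinstance(output, torch.Tensor):
                    captured[name] = output.detach()
            handles.append(module.register_forward_hook(hook))
    with torch.no_grad():
        model(inputs)
    for handle in handles:
        handle.remove()  # leave the model without hooks afterwards
    return captured


def first_divergence(
    old: nn.Module, new: nn.Module, inputs: torch.Tensor, atol: float = 1e-5
) -> Optional[str]:
    """Return the name of the first shared leaf module whose outputs differ, if any."""
    old_out = capture_outputs(old, inputs)
    new_out = capture_outputs(new, inputs)
    for name, value in old_out.items():  # dicts keep execution order
        if name not in new_out:
            continue
        other = new_out[name]
        if value.shape != other.shape or not torch.allclose(value, other, atol=atol):
            return name
    return None


# Toy models standing in for the original and converted implementations.
torch.manual_seed(0)
old = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2)).eval()
new = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2)).eval()
new.load_state_dict(old.state_dict())
with torch.no_grad():
    new[2].bias += 1.0  # simulate a bug in the last layer
print(first_divergence(old, new, torch.randn(1, 4)))  # 2
```

If submodule names differ between the two implementations, a name mapping would be needed before the comparison is meaningful.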
After completing a conversion, add learnings to references/learnings.md.
This accumulates knowledge from each project to avoid repeating mistakes.