skills/AI/AI-llm-architecture/7.1.-fine-tuning-for-classification/SKILL.md
How to fine-tune a pre-trained LLM (like GPT2) for text classification tasks. Use this skill whenever the user wants to adapt a language model for classification (spam detection, sentiment analysis, topic categorization, intent classification, or any binary/multi-class text classification). Trigger this skill when users mention fine-tuning, classification, adapting models, or need to convert a generative model into a classifier.
This skill guides you through fine-tuning a pre-trained language model for text classification tasks. Instead of generating text, the model will output probabilities for each class (e.g., spam vs. not spam).
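Use this skill when you need to adapt a generative language model to a binary or multi-class text classification task such as spam detection, sentiment analysis, topic categorization, or intent classification.
Fine-tuning for classification involves preparing a balanced, padded dataset; loading the pre-trained model; replacing its output head with a classification head; freezing most layers; and training and evaluating on the last token's logits.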
If your dataset has class imbalance, balance it by sampling equally from each class:
import random
# Count examples per class
spam_count = len(spam_examples)
not_spam_count = len(not_spam_examples)
# Randomly sample the same number from each class (not just the first N entries)
min_count = min(spam_count, not_spam_count)
random.seed(42)
balanced_spam = random.sample(spam_examples, min_count)
balanced_not_spam = random.sample(not_spam_examples, min_count)
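The same idea extends to more than two classes. The sketch below is a minimal, hypothetical helper (balance_classes is not part of the original skill) that downsamples every class to the size of the smallest one, assuming the dataset is a list of (text, label) pairs:

```python
import random
from collections import defaultdict

def balance_classes(examples, seed=42):
    """Downsample every class to the size of the smallest class.

    `examples` is assumed to be a list of (text, label) pairs.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    min_count = min(len(items) for items in by_label.values())
    rng = random.Random(seed)  # local RNG so the result is reproducible

    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, min_count))
    rng.shuffle(balanced)  # avoid long runs of a single class
    return balanced

# Made-up data: 3 spam messages and 10 non-spam messages
data = [("win cash now", "spam")] * 3 + [("see you at 5", "ham")] * 10
balanced = balance_classes(data)
print(len(balanced))  # 6
```

Downsampling discards data from the larger classes; that trade-off is usually acceptable when the smallest class still has enough examples to learn from.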
Use a 70/10/20 train/validation/test split:
from sklearn.model_selection import train_test_split
# First split: separate test set (20%)
train_val, test = train_test_split(data, test_size=0.2, random_state=42)
# Second split: separate validation from training (10% of total = 12.5% of remaining)
train, val = train_test_split(train_val, test_size=0.125, random_state=42)
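To make the arithmetic of the two-stage split concrete, here is a dependency-free sketch that produces the same 70/10/20 proportions by shuffling once and slicing. The function name split_70_10_20 is hypothetical and the input is assumed to be any finite sequence:

```python
import random

def split_70_10_20(data, seed=42):
    """Shuffle and split a sequence into 70% train, 10% validation, 20% test."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10  # integer arithmetic avoids float rounding
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder, roughly 20%
    return train, val, test

train, val, test = split_70_10_20(range(1000))
print(len(train), len(val), len(test))  # 700 100 200
```

Neither this sketch nor the unstratified call above guarantees that each split keeps the class ratio; if that matters, a stratified split is worth considering.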
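Sequences in a batch must share the same length (and fit within the model's context length), so pad shorter entries with a special token:
# Find the maximum length in tokens, not characters
# (tokenizer is assumed to be a GPT-2 tokenizer, e.g. tiktoken.get_encoding("gpt2"))
max_length = max(len(tokenizer.encode(text)) for text in all_texts)
# Pad (or truncate) every entry to max_len
def pad_sequence(tokens, max_len, pad_token_id):
    if len(tokens) < max_len:
        return tokens + [pad_token_id] * (max_len - len(tokens))
    return tokens[:max_len]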
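A short usage sketch shows both behaviours of pad_sequence: right-padding short entries and truncating long ones. The token IDs are made up, and using 50256 (GPT-2's end-of-text ID) as the padding value is an assumption, since GPT-2 has no dedicated padding token:

```python
PAD_TOKEN_ID = 50256  # GPT-2's <|endoftext|> ID, reused here as padding (assumption)

def pad_sequence(tokens, max_len, pad_token_id):
    """Right-pad to max_len, or truncate if the sequence is longer."""
    if len(tokens) < max_len:
        return tokens + [pad_token_id] * (max_len - len(tokens))
    return tokens[:max_len]

# Made-up token IDs standing in for three tokenized messages
token_lists = [[11, 22, 33], [44], [55, 66, 77, 88, 99]]
max_len = 4  # deliberately shorter than the longest entry to show truncation

padded = [pad_sequence(t, max_len, PAD_TOKEN_ID) for t in token_lists]
for row in padded:
    print(row)
# [11, 22, 33, 50256]
# [44, 50256, 50256, 50256]
# [55, 66, 77, 88]
```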
Load a pre-trained model (e.g., GPT2) with its weights:
import torch
from your_model_lib import GPTModel, GPTConfig  # placeholder module: substitute your own GPT implementation
# GPT-2 small (124M) configuration
BASE_CONFIG = GPTConfig(vocab_size=50257, emb_dim=768, n_layers=12, n_heads=12, max_seq_len=1024)
model = GPTModel(BASE_CONFIG)
# Load pre-trained weights
model.load_state_dict(torch.load("gpt2-pretrained.pth"))
Replace the vocabulary-sized output layer with a classification head:
import torch.nn as nn
num_classes = 2  # For binary classification (spam/not spam)
# For multi-class, set to your number of classes
# Replace the output head
# (BASE_CONFIG was built as a GPTConfig object, so emb_dim is read as an attribute)
model.out_head = nn.Linear(
    in_features=BASE_CONFIG.emb_dim,
    out_features=num_classes
)
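The effect of swapping the head is easiest to see in the output shapes. This sketch uses a random tensor as a stand-in for the transformer's final hidden states (an assumption; no real model is loaded) and compares a vocabulary-sized head with a two-class head:

```python
import torch
import torch.nn as nn

emb_dim, vocab_size, num_classes = 768, 50257, 2

lm_head = nn.Linear(emb_dim, vocab_size, bias=False)  # language-model style head
cls_head = nn.Linear(emb_dim, num_classes)            # replacement classification head

# Stand-in for transformer output with shape (batch, tokens, emb_dim)
hidden = torch.randn(4, 16, emb_dim)

print(lm_head(hidden).shape)   # torch.Size([4, 16, 50257])
print(cls_head(hidden).shape)  # torch.Size([4, 16, 2])
```

The new head is randomly initialized, which is why it has to be trained even when the rest of the model stays frozen.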
Only fine-tune the final layers for efficiency:
# Freeze all parameters first
for param in model.parameters():
    param.requires_grad = False
# Unfreeze the last transformer block
for param in model.trf_blocks[-1].parameters():
    param.requires_grad = True
# Unfreeze the final layer normalization
for param in model.final_norm.parameters():
    param.requires_grad = True
# Unfreeze the new classification head
for param in model.out_head.parameters():
    param.requires_grad = True
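A quick way to confirm the freezing worked is to count trainable parameters. The helper below, count_parameters, is a hypothetical name, and the two-layer network is only a tiny stand-in for the real model:

```python
import torch.nn as nn

def count_parameters(model):
    """Return (trainable, total) parameter counts."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable, total

# Tiny stand-in network; with the real model, call count_parameters(model) directly
toy = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
for param in toy.parameters():
    param.requires_grad = False
for param in toy[-1].parameters():  # unfreeze only the last layer
    param.requires_grad = True

trainable, total = count_parameters(toy)
print(f"{trainable} of {total} parameters are trainable")  # 18 of 90 parameters are trainable
```

If the trainable count comes back as zero, the optimizer has nothing to update and the loss will not move.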
For classification, only the last token's output is used: with causal attention, it is the only position that has seen every token in the input. Modify your loss calculation accordingly:
def calc_loss_batch(input_batch, target_batch, model, device):
    """Calculate loss for classification (only last token)."""
    input_batch, target_batch = input_batch.to(device), target_batch.to(device)
    # Get logits for the last token only
    logits = model(input_batch)[:, -1, :]
    # Cross-entropy loss for classification
    loss = nn.functional.cross_entropy(logits, target_batch)
    return loss
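To show what the cross-entropy call computes for a single example, the following dependency-free sketch applies softmax to a pair of made-up last-token logits and takes the negative log-probability of the correct class. The class order [not spam, spam] is an assumption for illustration:

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target])

last_token_logits = [2.0, 0.5]  # made-up logits for [not spam, spam]
probs = softmax(last_token_logits)
print(probs)
print(cross_entropy(last_token_logits, target=0))  # smaller: the correct class is likely
print(cross_entropy(last_token_logits, target=1))  # larger: the correct class is unlikely
```

Over a batch, the loss is by default the mean of these per-example values.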
def calc_accuracy_loader(data_loader, model, device, num_batches=None):
    """Calculate accuracy on a data loader."""
    model.eval()
    correct_predictions, num_examples = 0, 0
    if num_batches is None:
        num_batches = len(data_loader)
    else:
        num_batches = min(num_batches, len(data_loader))
    for i, (input_batch, target_batch) in enumerate(data_loader):
        if i < num_batches:
            input_batch, target_batch = input_batch.to(device), target_batch.to(device)
            with torch.no_grad():
                logits = model(input_batch)[:, -1, :]
            predicted_labels = torch.argmax(logits, dim=-1)
            num_examples += predicted_labels.shape[0]
            correct_predictions += (predicted_labels == target_batch).sum().item()
        else:
            break
    return correct_predictions / num_examples
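The same last-token argmax can classify a single new text. This is a sketch under stated assumptions: classify_tokens is a hypothetical helper, the input is already tokenized, padding uses 50256, and ToyModel is an untrained stand-in that only mimics the (batch, tokens, classes) output shape:

```python
import torch
import torch.nn as nn

def classify_tokens(token_ids, model, max_len, pad_token_id=50256, device="cpu"):
    """Return the predicted class index for one tokenized text."""
    ids = token_ids[:max_len]
    ids = ids + [pad_token_id] * (max_len - len(ids))
    batch = torch.tensor(ids, device=device).unsqueeze(0)  # shape: (1, max_len)
    model.eval()
    with torch.no_grad():
        logits = model(batch)[:, -1, :]  # last-token logits, shape: (1, num_classes)
    return torch.argmax(logits, dim=-1).item()

class ToyModel(nn.Module):
    """Stand-in returning (batch, tokens, 2) logits; replace with the fine-tuned model."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(50257, 4)
        self.out_head = nn.Linear(4, 2)

    def forward(self, x):
        return self.out_head(self.emb(x))

label = classify_tokens([11, 22, 33], ToyModel(), max_len=8)
print(label)  # 0 or 1; meaningless here because ToyModel is untrained
```

At inference time, max_len should match the length used during training so the last position means the same thing in both settings.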
def train_classifier(model, train_loader, val_loader, device, epochs=10, lr=1e-5):
    model.to(device)  # the batches are moved to device, so the model must be there too
    # Optimize only the parameters left unfrozen
    optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    for epoch in range(epochs):
        model.train()
        total_loss = 0
        for input_batch, target_batch in train_loader:
            optimizer.zero_grad()
            loss = calc_loss_batch(input_batch, target_batch, model, device)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # Evaluate on validation set
        val_accuracy = calc_accuracy_loader(val_loader, model, device)
        avg_loss = total_loss / len(train_loader)
        print(f"Epoch {epoch+1}: Loss={avg_loss:.4f}, Val Accuracy={val_accuracy:.4f}")
    return model
Model not learning: Check that parameters are unfrozen (requires_grad=True)
Overfitting: Use validation accuracy to detect; try more regularization or early stopping
Poor accuracy: Ensure data is balanced and properly labeled; try more training epochs
Slow training: Freeze more layers; shorten the padded sequence length; use a GPU (reduce batch size if memory is the bottleneck)
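Early stopping, mentioned above as a remedy for overfitting, can be sketched as a small counter over validation accuracy. The class name EarlyStopping, its patience and min_delta parameters, and the accuracy values are all illustrative assumptions rather than part of the original skill:

```python
class EarlyStopping:
    """Signal a stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_accuracy):
        """Record one epoch's accuracy; return True if training should stop."""
        if val_accuracy > self.best + self.min_delta:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation accuracies, one per epoch
history = [0.80, 0.88, 0.91, 0.90, 0.91, 0.89]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, acc in enumerate(history, start=1):
    if stopper.step(acc):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)  # 5 0.91
```

Inside a training loop, the call to step would follow the per-epoch validation accuracy computation, and saving the model weights whenever best improves keeps the strongest checkpoint.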