abelrguezr/llm-classification-finetuning

Name: llm-classification-finetuning
Author: abelrguezr

skills/AI/AI-llm-architecture/7.1.-fine-tuning-for-classification/SKILL.md

npx skillsauth add abelrguezr/hacktricks-skills llm-classification-finetuning

LLM Classification Fine-Tuning

This skill guides you through fine-tuning a pre-trained language model for text classification tasks. Instead of generating text, the model outputs one score (logit) per class, which softmax converts into class probabilities (e.g., spam vs. not spam).

When to Use This Skill

Use this skill when you need to:

Convert a pre-trained LLM into a text classifier
Perform binary classification (spam/not spam, positive/negative)
Perform multi-class classification (topic categorization, intent detection)
Adapt a model to domain-specific classification tasks
Work with labeled text data for supervised learning

Overview

Fine-tuning for classification involves:

Preparing balanced training data with proper train/validation/test splits
Modifying the model's output layer to match your number of classes
Freezing most parameters and only tuning the final layers
Adjusting the loss function to use only the output at the last token position

Step 1: Prepare Your Dataset

Balance Your Classes

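If one class dominates the dataset, downsample so that every class contributes the same number of examples. Sample at random rather than taking the first entries, which can bias the subset when the data is ordered:

import random

random.seed(42)  # make the sample reproducible

# Count examples per class
spam_count = len(spam_examples)
not_spam_count = len(not_spam_examples)

# Draw the same number of examples from each class at random
min_count = min(spam_count, not_spam_count)
balanced_spam = random.sample(spam_examples, min_count)
balanced_not_spam = random.sample(not_spam_examples, min_count)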
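
The same idea extends to more than two classes. The following is a minimal sketch, assuming the data is a list of (text, label) pairs; the function name `balance_by_downsampling` and the toy labels are illustrative and not part of the original skill.

```python
import random
from collections import defaultdict


def balance_by_downsampling(examples, seed=42):
    """Downsample every class to the size of the smallest class.

    `examples` is a list of (text, label) pairs (an assumed format).
    """
    rng = random.Random(seed)

    # Group the examples by their label
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    # The smallest class decides how many examples each class keeps
    min_count = min(len(items) for items in by_label.values())

    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, min_count))

    # Shuffle so the classes are interleaved rather than grouped
    rng.shuffle(balanced)
    return balanced


# Toy data: 3 spam, 10 ham, 5 work messages
toy = (
    [("win cash now", "spam")] * 3
    + [("see you at 5", "ham")] * 10
    + [("invoice attached", "work")] * 5
)
balanced = balance_by_downsampling(toy)

counts = {}
for _, label in balanced:
    counts[label] = counts.get(label, 0) + 1
print(counts)  # every label appears 3 times
```

Downsampling discards data from the larger classes, so it suits cases where the smallest class is still reasonably large.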

Split into Train/Validation/Test

Use a 70/10/20 split:

Training (70%): Used to update model weights
Validation (10%): Used to tune hyperparameters and detect overfitting
Test (20%): Used only after training for final unbiased evaluation

from sklearn.model_selection import train_test_split

# First split: separate test set (20%)
train_val, test = train_test_split(data, test_size=0.2, random_state=42)

# Second split: separate validation from training (10% of total = 12.5% of remaining)
train, val = train_test_split(train_val, test_size=0.125, random_state=42)
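
The second split uses 0.125 because the remaining data is 80% of the total, and 0.125 × 0.8 = 0.1. As a cross-check, here is a sketch of the same 70/10/20 split using only the standard library; `split_70_10_20` is a hypothetical helper, not something the skill or scikit-learn provides.

```python
import random


def split_70_10_20(items, seed=42):
    """Shuffle a copy of `items` and cut it into train/val/test (70/10/20)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries
    train_end = n * 70 // 100
    val_end = train_end + n * 10 // 100

    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


data = list(range(1000))
train, val, test = split_70_10_20(data)
print(len(train), len(val), len(test))  # 700 100 200
```

Unlike a stratified split, this sketch does not preserve class proportions in each subset, so it is best applied to data that has already been balanced.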

Pad Entries to Same Length

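Every sequence in a batch must have the same length so that the batch can be stacked into a single tensor. Pad shorter entries with a special token and truncate longer ones:

# Find maximum length (measured in tokens, so all_texts must already be tokenized)
max_length = max(len(text) for text in all_texts)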

# Pad all entries
def pad_sequence(tokens, max_len, pad_token_id):
    if len(tokens) < max_len:
        return tokens + [pad_token_id] * (max_len - len(tokens))
    return tokens[:max_len]
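
To illustrate the effect on a whole batch, the sketch below pads three made-up token lists to a common length. It assumes the pad token is GPT-2's `<|endoftext|>` id (50256), which is a common choice but not something the skill specifies; `pad_batch` is a hypothetical helper.

```python
PAD_TOKEN_ID = 50256  # GPT-2's <|endoftext|> id, assumed here as the pad token


def pad_batch(token_lists, pad_token_id=PAD_TOKEN_ID, max_len=None):
    """Right-pad (or truncate) every token list to a common length."""
    if max_len is None:
        max_len = max(len(tokens) for tokens in token_lists)

    padded = []
    for tokens in token_lists:
        clipped = tokens[:max_len]
        padded.append(clipped + [pad_token_id] * (max_len - len(clipped)))
    return padded


# Made-up token ids of different lengths
batch = [[10, 11, 12], [20], [30, 31, 32, 33, 34]]
for row in pad_batch(batch):
    print(row)
# [10, 11, 12, 50256, 50256]
# [20, 50256, 50256, 50256, 50256]
# [30, 31, 32, 33, 34]
```

Passing an explicit `max_len` caps memory use when a few entries are much longer than the rest.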

Step 2: Initialize the Pre-trained Model

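Load a pre-trained model (e.g., GPT-2) with its weights. Here your_model_lib is a placeholder for whichever module defines your GPTModel and GPTConfig:

import torch
from your_model_lib import GPTModel, GPTConfig  # placeholder module name

# Build the model, then load the pre-trained weights into it
BASE_CONFIG = GPTConfig(vocab_size=50257, emb_dim=768, n_layers=12, n_heads=12, max_seq_len=1024)
model = GPTModel(BASE_CONFIG)
model.load_state_dict(torch.load("gpt2-pretrained.pth", map_location="cpu"))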

Step 3: Replace the Output Head

Replace the vocabulary-sized output layer with a classification head:

import torch.nn as nn

num_classes = 2  # For binary classification (spam/not spam)
# For multi-class, set to your number of classes

# Replace the output head
model.out_head = nn.Linear(
    in_features=BASE_CONFIG.emb_dim,  # attribute access assumed: BASE_CONFIG is a GPTConfig object, not a dict
    out_features=num_classes
)

Step 4: Freeze Most Parameters

Only fine-tune the final layers for efficiency:

# Freeze all parameters first
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the last transformer block
for param in model.trf_blocks[-1].parameters():
    param.requires_grad = True

# Unfreeze the final layer normalization
for param in model.final_norm.parameters():
    param.requires_grad = True

# Unfreeze the new classification head
for param in model.out_head.parameters():
    param.requires_grad = True
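
A quick way to confirm the freeze worked is to count trainable parameters. The sketch below uses a tiny stand-in module (`TinyGPT`, hypothetical) that only mimics the attribute names `trf_blocks`, `final_norm`, and `out_head` used above; it is not a real GPT, and the layer sizes are arbitrary.

```python
import torch.nn as nn


class TinyGPT(nn.Module):
    """Stand-in with the attribute names used above; not a real GPT."""

    def __init__(self, emb_dim=8, n_layers=3, num_classes=2):
        super().__init__()
        self.trf_blocks = nn.ModuleList(
            [nn.Linear(emb_dim, emb_dim) for _ in range(n_layers)]
        )
        self.final_norm = nn.LayerNorm(emb_dim)
        self.out_head = nn.Linear(emb_dim, num_classes)


def count_parameters(model):
    """Return (trainable, total) parameter counts."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total


model = TinyGPT()

# Freeze everything, then unfreeze the last block, final norm, and head
for param in model.parameters():
    param.requires_grad = False
for module in (model.trf_blocks[-1], model.final_norm, model.out_head):
    for param in module.parameters():
        param.requires_grad = True

trainable, total = count_parameters(model)
print(f"trainable: {trainable} / {total}")  # trainable: 106 / 250
```

On a real model, a trainable count of zero or one equal to the total usually means the freeze or unfreeze loops did not run as intended.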

Step 5: Modify Loss Function for Classification

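For classification, use only the logits at the last token position: with causal attention it is the only position that has attended to the entire input. Modify your loss calculation:

import torch
import torch.nn as nn

def calc_loss_batch(input_batch, target_batch, model, device):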
    """Calculate loss for classification (only last token)."""
    input_batch, target_batch = input_batch.to(device), target_batch.to(device)
    
    # Get logits for the last token only
    logits = model(input_batch)[:, -1, :]
    
    # Cross-entropy loss for classification
    loss = nn.functional.cross_entropy(logits, target_batch)
    return loss


def calc_accuracy_loader(data_loader, model, device, num_batches=None):
    """Calculate accuracy on a data loader."""
    model.eval()
    correct_predictions, num_examples = 0, 0
    
    if num_batches is None:
        num_batches = len(data_loader)
    else:
        num_batches = min(num_batches, len(data_loader))
    
    for i, (input_batch, target_batch) in enumerate(data_loader):
        if i < num_batches:
            input_batch, target_batch = input_batch.to(device), target_batch.to(device)
            
            with torch.no_grad():
                logits = model(input_batch)[:, -1, :]
                predicted_labels = torch.argmax(logits, dim=-1)
            
            num_examples += predicted_labels.shape[0]
            correct_predictions += (predicted_labels == target_batch).sum().item()
        else:
            break
    
    # Guard against an empty loader to avoid dividing by zero
    return correct_predictions / num_examples if num_examples else 0.0
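
To make the last-token loss concrete, here is a small worked calculation in plain Python. The logits are made up for illustration, and the `softmax` and `cross_entropy` helpers are hand-written stand-ins for what `nn.functional.cross_entropy` computes for a single example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target])


# Made-up logits for one example: one row per token position,
# columns are [not spam, spam]
logits_per_position = [[0.2, -0.1], [1.5, 0.3], [-0.4, 2.0]]

# Only the last position is used for classification
last_token_logits = logits_per_position[-1]

probs = softmax(last_token_logits)
predicted = probs.index(max(probs))
loss_if_spam = cross_entropy(last_token_logits, target=1)
loss_if_not_spam = cross_entropy(last_token_logits, target=0)

print(f"predicted class: {predicted}")
print(f"loss if label is spam:     {loss_if_spam:.4f}")
print(f"loss if label is not spam: {loss_if_not_spam:.4f}")
```

The loss is small when the label agrees with the confident prediction and large when it does not, which is the signal the training loop minimizes.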

Step 6: Training Loop

def train_classifier(model, train_loader, val_loader, device, epochs=10, lr=1e-5):
    optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    
    for epoch in range(epochs):
        model.train()
        total_loss = 0
        
        for input_batch, target_batch in train_loader:
            optimizer.zero_grad()
            loss = calc_loss_batch(input_batch, target_batch, model, device)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        
        # Evaluate on validation set
        val_accuracy = calc_accuracy_loader(val_loader, model, device)
        avg_loss = total_loss / len(train_loader)
        
        print(f"Epoch {epoch+1}: Loss={avg_loss:.4f}, Val Accuracy={val_accuracy:.4f}")
    
    return model

Best Practices

Start with a pre-trained model - Training from scratch is expensive and unnecessary for most tasks
Balance your classes - Imbalanced data leads to biased models
Use validation data - Monitor for overfitting during training
Freeze early layers - They capture general language patterns; only tune final layers
Use small learning rates - Fine-tuning requires gentle updates (1e-5 to 1e-4)
Monitor both loss and accuracy - Loss can decrease while accuracy plateaus
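
The last point can be seen in a small numeric sketch. The predicted probabilities below are invented for illustration, and `mean_loss_and_accuracy` is a hypothetical helper: between the two epochs the model becomes more confident on the examples it already gets right, so the loss drops while accuracy stays the same.

```python
import math


def mean_loss_and_accuracy(spam_probs, labels):
    """`spam_probs` are predicted P(spam); `labels` are 1 for spam, 0 otherwise."""
    losses, correct = [], 0
    for p, y in zip(spam_probs, labels):
        # Cross-entropy: negative log-probability of the true class
        losses.append(-math.log(p if y == 1 else 1.0 - p))
        # A prediction is correct when thresholding at 0.5 matches the label
        correct += int((p >= 0.5) == (y == 1))
    return sum(losses) / len(losses), correct / len(labels)


labels = [1, 1, 0, 0]
epoch_a = [0.60, 0.40, 0.40, 0.45]  # three of four correct, low confidence
epoch_b = [0.90, 0.45, 0.10, 0.20]  # same three correct, higher confidence

loss_a, acc_a = mean_loss_and_accuracy(epoch_a, labels)
loss_b, acc_b = mean_loss_and_accuracy(epoch_b, labels)
print(f"A: loss={loss_a:.3f} acc={acc_a:.2f}")
print(f"B: loss={loss_b:.3f} acc={acc_b:.2f}")
```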

Common Classification Tasks

Spam detection: Binary (spam/not spam)
Sentiment analysis: Binary or multi-class (positive/neutral/negative)
Intent classification: Multi-class (book_flight, check_balance, etc.)
Topic categorization: Multi-class (sports, politics, technology, etc.)
Toxicity detection: Binary (toxic/not toxic)

Troubleshooting

Model not learning: Check that parameters are unfrozen (requires_grad=True)

Overfitting: Use validation accuracy to detect; try more regularization or early stopping

Poor accuracy: Ensure data is balanced and properly labeled; try more training epochs

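Slow training: Freeze more layers; truncate inputs to a shorter maximum length; use a GPU. Reduce the batch size only if you run out of memory, since smaller batches rarely make an epoch faster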
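
Early stopping, mentioned under overfitting, can be sketched as a small helper that watches validation accuracy. The class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the accuracy values are all illustrative assumptions, not part of the original skill.

```python
class EarlyStopping:
    """Signal a stop once validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_accuracy):
        """Record one epoch's accuracy; return True when training should stop."""
        if val_accuracy > self.best + self.min_delta:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
history = [0.81, 0.86, 0.88, 0.87, 0.88, 0.90]  # made-up validation accuracies

for epoch, acc in enumerate(history, start=1):
    if stopper.step(acc):
        print(f"stopping after epoch {epoch}, best={stopper.best:.2f}")
        break
# stopping after epoch 5, best=0.88
```

In a training loop like the one in Step 6, `stopper.step(val_accuracy)` would be called once per epoch after evaluation. Note that with a short patience the run above stops before the later improvement to 0.90, which is the usual trade-off of this technique.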

References

LLMs from Scratch - Chapter 6
Build a Large Language Model from Scratch

How to fine-tune a pre-trained LLM (like GPT2) for text classification tasks. Use this skill whenever the user wants to adapt a language model for classification (spam detection, sentiment analysis, topic categorization, intent classification, or any binary/multi-class text classification). Trigger this skill when users mention fine-tuning, classification, adapting models, or need to convert a generative model into a classifier.

5 stars

development

Updated Apr 16, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add abelrguezr/hacktricks-skills llm-classification-finetuning

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy - Container and dependency vulnerability scanner: Clean (95%)
Semgrep - Static code analysis for vulnerabilities: Clean (95%)
mcp-scan (Snyk) - Model Context Protocol security validation: Clean (95%)
Snyk (dep) - Open source security scanning: Skipped (50%)
Socket.dev - Supply chain security analysis: Skipped (50%)
VirusTotal - Multi-engine malware detection: Skipped (50%)
CrowdStrike - Advanced threat intelligence: Skipped (50%)
OSV-Scanner - Open Source Vulnerability database check: Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 16, 2026, 2:05 AM (23.7s, 2 files scanned)

Install manually

# Clone the repo
git clone https://github.com/abelrguezr/hacktricks-skills.git

# Copy into Claude Code skills folder (global)
cp -r hacktricks-skills/skills/AI/AI-llm-architecture/7.1.-fine-tuning-for-classification ~/.claude/skills/

Repository: abelrguezr/hacktricks-skills (5 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT