abelrguezr/llm-architecture

Name: llm-architecture
Author: abelrguezr

skills/AI/AI-llm-architecture/5.-llm-architecture/SKILL.md

npx skillsauth add abelrguezr/hacktricks-skills llm-architecture

LLM Architecture Builder

A skill for building and understanding Large Language Model architecture from scratch, following the GPT-style transformer design.

When to Use This Skill

Use this skill when the user needs to:

Build a GPT model from scratch
Implement transformer components (attention, feedforward, layer normalization)
Calculate the number of parameters in an LLM
Generate text using a trained model
Understand how token and positional embeddings work
Create or modify LLM architecture configurations

Core Components

1. GELU Activation Function

GELU (Gaussian Error Linear Unit) introduces non-linearity into the model. Unlike ReLU, which zeroes out every negative input, GELU is smooth and produces small non-zero outputs for negative inputs close to zero.

Use the bundled script: scripts/gelu.py
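The contrast with ReLU can be checked numerically. The following is a minimal standard-library sketch, not the contents of scripts/gelu.py; the function names are illustrative, and the tanh form is the common approximation used in GPT-2 style code.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x: float) -> float:
    """ReLU for comparison: negative inputs become exactly zero."""
    return max(0.0, x)


# Compare the three functions on a few inputs around zero.
for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(value, relu(value), round(gelu_exact(value), 4), round(gelu_tanh(value), 4))
```

For negative inputs the ReLU column is zero while both GELU columns are small negative numbers.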

2. FeedForward Network

A position-wise feedforward network that applies a two-layer fully connected network to each position:

First linear layer: expands dimensionality from emb_dim to 4 * emb_dim
GELU activation: applies non-linearity
Second linear layer: reduces dimensionality back to emb_dim

Use the bundled script: scripts/feedforward.py
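The expand, activate, project-back pattern can be sketched in plain Python on a tiny embedding size. This is an illustration under assumptions, not scripts/feedforward.py: the helper names, the random initialization, and emb_dim = 8 are all chosen here for the example.

```python
import math
import random


def gelu(x):
    """Tanh approximation of GELU."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def init_linear(in_dim, out_dim, rng):
    """Hypothetical initializer: small random weights (out_dim x in_dim), zero bias."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.0] * out_dim


def linear(x, weight, bias):
    """Compute W x + b for one token vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def feed_forward(x, params):
    """Expand to 4 * emb_dim, apply GELU, project back to emb_dim."""
    (w1, b1), (w2, b2) = params
    hidden = [gelu(h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


emb_dim = 8  # illustrative size; the 124M config uses 768
rng = random.Random(0)
params = (init_linear(emb_dim, 4 * emb_dim, rng), init_linear(4 * emb_dim, emb_dim, rng))

# "Position-wise" means the same weights are applied to every token independently.
tokens = [[rng.uniform(-1.0, 1.0) for _ in range(emb_dim)] for _ in range(3)]
outputs = [feed_forward(token, params) for token in tokens]
print(len(outputs), len(outputs[0]))
```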

3. Multi-Head Attention

Allows the model to focus on different positions within the input sequence:

Queries, Keys, Values: Linear projections of the input
Heads: Multiple attention mechanisms running in parallel
Causal Mask: Prevents attending to future tokens (autoregressive)
Dropout: Regularizes training to reduce overfitting

Use the bundled script: scripts/multihead_attention.py
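A single attention head with a causal mask can be sketched as follows. This is a simplified illustration, not scripts/multihead_attention.py: it skips the learned Q/K/V projections, the split into multiple heads, and dropout, and it applies the mask by scoring only keys at positions 0..i, which gives the same weights as filling future scores with negative infinity before the softmax.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention for one head.

    Each argument is a list of per-token vectors (seq_len x head_dim).
    Returns the context vectors and the attention weights per token.
    """
    head_dim = len(keys[0])
    seq_len = len(keys)
    contexts, all_weights = [], []
    for i, query in enumerate(queries):
        # Causal mask: token i may only look at tokens 0..i.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(head_dim)
            for key in keys[: i + 1]
        ]
        weights = softmax(scores) + [0.0] * (seq_len - i - 1)
        contexts.append(
            [sum(w * value[d] for w, value in zip(weights, values)) for d in range(len(values[0]))]
        )
        all_weights.append(weights)
    return contexts, all_weights


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three tokens, head_dim = 2
contexts, weights = causal_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

The printed weight matrix is lower triangular: every entry above the diagonal is zero, and each row sums to 1.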

4. Layer Normalization

Normalizes each token's activations across the embedding dimension, independently of the other examples in the batch:

Computes mean and variance across embedding dimension
Normalizes to mean=0, variance=1
Applies learnable scale and shift parameters
Stabilizes training of deep networks

Use the bundled script: scripts/layernorm.py
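The four steps above can be sketched for a single token vector. This is a standard-library illustration rather than scripts/layernorm.py; the eps value and the use of the biased variance (dividing by n) are assumptions that match common practice.

```python
import math


def layer_norm(x, scale, shift, eps=1e-5):
    """Normalize one vector to mean 0 and variance 1, then scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)  # biased variance (divide by n)
    return [s * (v - mean) / math.sqrt(var + eps) + b for v, s, b in zip(x, scale, shift)]


x = [1.0, 2.0, 3.0, 4.0]
scale = [1.0] * len(x)  # learnable in a real model; ones at initialization
shift = [0.0] * len(x)  # learnable in a real model; zeros at initialization
normed = layer_norm(x, scale, shift)

out_mean = sum(normed) / len(normed)
out_var = sum((v - out_mean) ** 2 for v in normed) / len(normed)
print(round(out_mean, 6), round(out_var, 4))
```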

5. Transformer Block

Combines all components with residual connections:

First residual path: LayerNorm → Multi-Head Attention → Dropout → Add residual
Second residual path: LayerNorm → FeedForward → Dropout → Add residual

Use the bundled script: scripts/transformer_block.py
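The two residual paths can be sketched as a function that takes the sub-layers as callables. This shows the wiring only and is not scripts/transformer_block.py; the parameter names are illustrative, and dropout defaults to the identity, as it would at inference time.

```python
def transformer_block(x, norm1, attention, norm2, feed_forward, dropout=lambda t: t):
    """Pre-LayerNorm block: x is a list of token vectors (seq_len x emb_dim)."""

    def add(a, b):
        # Element-wise sum of two (seq_len x emb_dim) nested lists.
        return [[u + v for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

    # First residual path: LayerNorm -> attention -> dropout -> add residual.
    shortcut = x
    x = add(shortcut, dropout(attention(norm1(x))))

    # Second residual path: LayerNorm -> feedforward -> dropout -> add residual.
    shortcut = x
    x = add(shortcut, dropout(feed_forward(norm2(x))))
    return x


def identity(t):
    """Stand-in sub-layer used only to make the wiring visible."""
    return t


# With identity sub-layers each residual path doubles its input.
result = transformer_block([[1.0, 2.0]], identity, identity, identity, identity)
print(result)
```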

6. GPTModel

The complete model that:

Converts token indices to embeddings
Adds positional embeddings
Passes through multiple transformer blocks
Applies final normalization
Projects to vocabulary size for token prediction

Use the bundled script: scripts/gpt_model.py

Standard Configuration

The default 124M-parameter configuration:

GPT_CONFIG_124M = {
    "vocab_size": 50257,    # Vocabulary size
    "context_length": 1024, # Context length
    "emb_dim": 768,         # Embedding dimension
    "n_heads": 12,          # Number of attention heads
    "n_layers": 12,         # Number of layers
    "drop_rate": 0.1,       # Dropout rate
    "qkv_bias": False       # Query-Key-Value bias
}
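One constraint worth checking on any config: multi-head attention splits emb_dim evenly across heads, so emb_dim must be divisible by n_heads. The helper below is a hypothetical sanity check, not part of the bundled scripts.

```python
GPT_CONFIG_124M = {
    "vocab_size": 50257,
    "context_length": 1024,
    "emb_dim": 768,
    "n_heads": 12,
    "n_layers": 12,
    "drop_rate": 0.1,
    "qkv_bias": False,
}


def head_dim(cfg):
    """Return the per-head dimension, or raise if the split is uneven."""
    if cfg["emb_dim"] % cfg["n_heads"] != 0:
        raise ValueError("emb_dim must be divisible by n_heads")
    return cfg["emb_dim"] // cfg["n_heads"]


print(head_dim(GPT_CONFIG_124M))
```

For the 124M config each head works on 768 / 12 = 64 dimensions.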

Parameter Calculation

To calculate the number of parameters in your model:

Use the bundled script: scripts/calculate_params.py

This script breaks down parameters by component:

Token embeddings: vocab_size * emb_dim
Position embeddings: context_length * emb_dim
Multi-head attention per block: Q, K, V projections + output projection
Feedforward per block: two linear layers
Layer normalizations: scale and shift parameters
Output projection: emb_dim * vocab_size
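The breakdown above can be reproduced with plain arithmetic. This is a sketch, not scripts/calculate_params.py, and it assumes a common GPT-2 style layout: Q/K/V biases only when qkv_bias is set, biases on the attention output projection and both feedforward layers, two LayerNorms per block plus a final one, and an output projection without bias.

```python
def count_params(cfg):
    """Count parameters per component for a GPT-style config (layout assumed above)."""
    d = cfg["emb_dim"]
    qkv_bias = d if cfg["qkv_bias"] else 0

    attention = 3 * (d * d + qkv_bias) + (d * d + d)    # Q, K, V + output projection
    feed_forward = (d * 4 * d + 4 * d) + (4 * d * d + d)  # expand + project back
    layer_norms = 2 * (2 * d)                           # scale and shift, twice per block

    counts = {
        "token_embeddings": cfg["vocab_size"] * d,
        "position_embeddings": cfg["context_length"] * d,
        "transformer_blocks": cfg["n_layers"] * (attention + feed_forward + layer_norms),
        "final_layer_norm": 2 * d,
        "output_projection": d * cfg["vocab_size"],
    }
    counts["total"] = sum(counts.values())
    return counts


config = {
    "vocab_size": 50257, "context_length": 1024, "emb_dim": 768,
    "n_heads": 12, "n_layers": 12, "drop_rate": 0.1, "qkv_bias": False,
}
counts = count_params(config)
for name, value in counts.items():
    print(f"{name}: {value:,}")

# If the output projection reuses the token embedding matrix (weight tying),
# those weights are counted once instead of twice.
tied_total = counts["total"] - counts["output_projection"]
print(f"total with weight tying: {tied_total:,}")
```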

Text Generation

To generate text with a trained model:

Use the bundled script: scripts/generate_text.py

The generation process:

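Encode the starting text to token indices
Pass the most recent context_length tokens through the model to get logits
Apply softmax to the logits at the last position to get probabilities
Select the token with the highest probability (greedy decoding)
Append it to the sequence and repeat until the requested number of tokens is generated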
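The loop can be sketched without a trained model. The code below is an illustration, not scripts/generate_text.py: it starts from token indices rather than text (no tokenizer), and toy_model is a hypothetical stand-in that returns next-token logits for a five-token vocabulary.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate_greedy(model, token_ids, max_new_tokens, context_length):
    """Repeatedly append the most probable next token."""
    token_ids = list(token_ids)
    for _ in range(max_new_tokens):
        context = token_ids[-context_length:]  # crop to the supported context
        logits = model(context)                # next-token logits, one per vocab entry
        probs = softmax(logits)
        next_id = max(range(len(probs)), key=probs.__getitem__)
        token_ids.append(next_id)
    return token_ids


def toy_model(context, vocab_size=5):
    """Hypothetical model: always favors (last token + 1) modulo vocab_size."""
    target = (context[-1] + 1) % vocab_size
    return [5.0 if i == target else 0.0 for i in range(vocab_size)]


generated = generate_greedy(toy_model, [0], max_new_tokens=6, context_length=4)
print(generated)  # → [0, 1, 2, 3, 4, 0, 1]
```

Because softmax preserves ordering, taking the argmax of the logits directly selects the same token; the softmax step matters once sampling replaces greedy selection.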

Workflow

Building a Complete Model

Define configuration - Set vocab_size, context_length, emb_dim, n_heads, n_layers
Create model - Use scripts/create_gpt_model.py with your config
Calculate parameters - Use scripts/calculate_params.py to verify
Test generation - Use scripts/generate_text.py with sample input

Understanding Components

Read component documentation - Each script has docstrings explaining its purpose
Run with sample data - Scripts include example usage
Inspect shapes - Comments show tensor shapes at each step

Examples

Example 1: Create a Small Model

python scripts/create_gpt_model.py --emb-dim 256 --n-layers 4 --n-heads 4

This creates a smaller model for testing/learning.

Example 2: Calculate Parameters

python scripts/calculate_params.py --config GPT_CONFIG_124M

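Output shows a breakdown by component and the total: 163,009,536 for the 124M config when the output projection has its own weights. If the output projection shares its weights with the token embedding (weight tying, as in the original GPT-2), the count is 124,412,160, which is where the "124M" name comes from.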

Example 3: Generate Text

python scripts/generate_text.py --model checkpoint.pt --prompt "Hello, I am" --max-tokens 10

Key Concepts

Token Embeddings

Convert token indices to dense vectors
Shape: (vocab_size, emb_dim)
Learnable parameters that represent each token

Positional Embeddings

Add position information to token embeddings
Shape: (context_length, emb_dim)
Critical for understanding word order in sequences
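How the two embedding tables combine can be sketched with plain lists. This is an illustration with tiny, randomly filled tables; the helper names and sizes are assumptions, and in a real model both tables are learned.

```python
import random


def make_table(rows, cols, rng):
    """Hypothetical stand-in for a learned embedding table (rows x cols)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def embed(token_ids, tok_table, pos_table):
    """Look up each token's vector and add the vector for its position."""
    if len(token_ids) > len(pos_table):
        raise ValueError("sequence is longer than context_length")
    return [
        [t + p for t, p in zip(tok_table[token_id], pos_table[position])]
        for position, token_id in enumerate(token_ids)
    ]


rng = random.Random(0)
vocab_size, context_length, emb_dim = 10, 4, 3  # illustrative sizes
tok_table = make_table(vocab_size, emb_dim, rng)      # shape (vocab_size, emb_dim)
pos_table = make_table(context_length, emb_dim, rng)  # shape (context_length, emb_dim)

# The same token (id 3) at positions 0 and 1 gets two different input vectors.
embedded = embed([3, 3, 7], tok_table, pos_table)
print(embedded[0] != embedded[1])
```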

Residual Connections

Add input to output of each sub-layer
Mitigate vanishing gradients by giving gradients a direct path around each sub-layer
Enable training of many transformer blocks

Causal Masking

Masks future tokens so each position attends only to itself and earlier positions, during both training and generation
Ensures autoregressive property (can't see future)
Applied in multi-head attention

Best Practices

Start small - Use smaller configs for testing before scaling up
Check shapes - Verify tensor shapes match expected dimensions
Use dropout - Helps reduce overfitting during training; turn it off (evaluation mode) when generating text
Pre-LayerNorm - Apply normalization before attention/feedforward, inside each residual path
Seed for reproducibility - Set random seed for consistent results

References

LLMs from Scratch by Sebastian Raschka
Build a Large Language Model from Scratch (Manning)

Build and understand LLM architecture from scratch. Use this skill whenever the user needs to create GPT models, implement transformer components (attention, feedforward, layer norm), calculate model parameters, or generate text with a trained model. Trigger for any request about LLM architecture, transformer blocks, GPT implementation, token embeddings, positional embeddings, or building neural networks for language modeling.

Stars: 5
Category: development
Updated: Apr 16, 2026

npx skillsauth add abelrguezr/hacktricks-skills llm-architecture

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

Trivy - Container and dependency vulnerability scanner: Clean, 95%
Semgrep - Static code analysis for vulnerabilities: Clean, 95%
mcp-scan (Snyk) - Model Context Protocol security validation: Clean, 95%
Snyk (dep) - Open source security scanning: Skipped, 50%
Socket.dev - Supply chain security analysis: Skipped, 50%
VirusTotal - Multi-engine malware detection: Skipped, 50%
CrowdStrike - Advanced threat intelligence: Skipped, 50%
OSV-Scanner - Open Source Vulnerability database check: Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 16, 2026, 2:08 AM (236.3s, 10 files scanned)

Install manually

# Clone the repo
git clone https://github.com/abelrguezr/hacktricks-skills.git

# Copy into Claude Code skills folder (global)
cp -r hacktricks-skills/skills/AI/AI-llm-architecture/5.-llm-architecture ~/.claude/skills/

Repository: abelrguezr/hacktricks-skills (5 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT