
abelrguezr/deep-learning-helper

Name: deep-learning-helper
Author: abelrguezr

skills/AI/AI-Deep-Learning/SKILL.md

npx skillsauth add abelrguezr/hacktricks-skills deep-learning-helper


Deep Learning Helper

A comprehensive guide to deep learning concepts and PyTorch implementation.

Core Concepts

Neural Networks

Neural networks are the foundation of deep learning. They consist of interconnected neurons organized in layers:

Input Layer: Receives raw data
Hidden Layers: Perform transformations (can have multiple layers)
Output Layer: Produces final predictions

Each neuron computes a weighted sum of its inputs, z = w * x + b, then applies an activation function to z.
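
As a minimal sketch of that computation in plain Python (the helper names `neuron`, `relu`, and `sigmoid` and the sample values are illustrative assumptions, not part of the skill's scripts):

```python
import math


def relu(z):
    """ReLU activation: passes positive values, clamps negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


x = [1.0, 2.0]      # hypothetical input features
w = [0.5, -0.25]    # hypothetical weights
b = 0.1             # hypothetical bias

# z = 0.5 * 1.0 + (-0.25) * 2.0 + 0.1 = 0.1
print(neuron(x, w, b, relu))
print(neuron(x, w, b, sigmoid))
```

The same pre-activation value z gives different outputs depending on the activation chosen, which is why the choice of activation matters for each layer.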

Activation Functions

Activation functions introduce non-linearity, enabling networks to learn complex patterns:

| Function | Range | Use Case |
|----------|-------|----------|
| Sigmoid | 0 to 1 | Binary classification output |
| ReLU | 0 to ∞ | Hidden layers (most common) |
| Tanh | -1 to 1 | Hidden layers |
| Softmax | 0 to 1 (sums to 1) | Multi-class classification output |

Key insight: Without activation functions, a neural network is just a linear transformation regardless of depth.
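
To make the key insight concrete, here is a small sketch using scalar layers (the weights and the helper `linear` are made-up illustrations). Two stacked linear layers give exactly the same outputs as one linear layer, while inserting a ReLU between them breaks that equivalence:

```python
def linear(w, b):
    """Return a one-dimensional linear layer x -> w * x + b."""
    return lambda x: w * x + b


layer1 = linear(2.0, 1.0)
layer2 = linear(-3.0, 0.5)


def stacked(x):
    """Two linear layers with no activation in between."""
    return layer2(layer1(x))


# Algebraically: -3.0 * (2.0 * x + 1.0) + 0.5 = -6.0 * x - 2.5
collapsed = linear(-6.0, -2.5)


def stacked_relu(x):
    """The same two layers with a ReLU between them."""
    return layer2(max(0.0, layer1(x)))


for x in (-2.0, 0.0, 1.5):
    print(x, stacked(x), collapsed(x), stacked_relu(x))
```

For every input, `stacked` and `collapsed` agree, so the extra layer adds no expressive power. `stacked_relu` differs wherever the first layer's output is negative.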

Backpropagation

The training algorithm that adjusts weights to minimize loss:

Forward Pass: Compute output through the network
Loss Calculation: Compare prediction to target
Backward Pass: Compute gradients using chain rule
Weight Update: Adjust weights in opposite direction of gradient
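
The four steps can be sketched by hand for a single linear neuron with a squared-error loss. This is an illustrative assumption: in PyTorch, autograd performs the backward pass for you, and the function `train_step` and the toy data are hypothetical.

```python
def train_step(w, b, x, y, lr):
    """One backpropagation step for pred = w * x + b with squared-error loss."""
    # 1. Forward pass
    pred = w * x + b
    # 2. Loss calculation
    loss = (pred - y) ** 2
    # 3. Backward pass: chain rule, dL/dw = dL/dpred * dpred/dw
    dloss_dpred = 2.0 * (pred - y)
    grad_w = dloss_dpred * x
    grad_b = dloss_dpred
    # 4. Weight update: step against the gradient
    w -= lr * grad_w
    b -= lr * grad_b
    return w, b, loss


# Toy data generated from y = 2x + 1
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = 0.0, 0.0
for epoch in range(500):
    for x, y in data:
        w, b, loss = train_step(w, b, x, y, lr=0.05)

print(w, b)  # should approach w = 2, b = 1
```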

Convolutional Neural Networks (CNNs)

CNNs excel at processing grid-like data (images) by learning spatial hierarchies of features.

CNN Components

Convolutional Layers: Apply learnable filters to extract features

Initial layers detect edges and textures
Intermediate layers detect shapes and patterns
Final layers detect complex objects

Pooling Layers: Downsample feature maps

Max pooling: keeps strongest activations
Reduces parameters and computational cost
Provides a degree of invariance to small translations

Fully Connected Layers: Final classification

Connects all neurons between layers
Typically at the end of the network

CNN Design Pattern

# Standard pattern: Conv → ReLU → Conv → ReLU → Pool
# Repeat, then flatten → FC → Output

Parameter Calculation

For a convolutional layer:

Parameters = (kernel_height × kernel_width × in_channels + 1) × out_channels

The +1 is for the bias term per output channel.

For a fully connected layer:

Parameters = (input_features + 1) × output_features
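
Both formulas translate directly into code. The sketch below is not the contents of scripts/parameter_calculator.py; the function names and the example layer sizes are assumptions chosen for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Learnable parameters in a 2D convolutional layer."""
    return (kernel_h * kernel_w * in_channels + int(bias)) * out_channels


def linear_params(in_features, out_features, bias=True):
    """Learnable parameters in a fully connected layer."""
    return (in_features + int(bias)) * out_features


# 3x3 convolution from 3 input channels (RGB) to 32 filters:
# (3 * 3 * 3 + 1) * 32 = 896
print(conv2d_params(3, 3, 3, 32))

# Fully connected layer from 512 features to 10 classes:
# (512 + 1) * 10 = 5130
print(linear_params(512, 10))
```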

CNN Implementation Template

See scripts/cnn_template.py for a complete CNN implementation.

Key considerations:

Start with 32-64 filters, double every 2-3 layers
Use 3×3 kernels with padding=1 to preserve spatial dimensions
Apply max pooling (2×2, stride=2) after every 1-2 conv layers
Add dropout (0.5) before fully connected layers to prevent overfitting
Flatten after final pooling, before FC layers
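
These guidelines determine the tensor shape at every stage, which is worth tracing before writing the model. The sketch below assumes a 32×32 RGB input and two Conv-Conv-Pool blocks with 32 and 64 filters; the helpers `conv_out` and `pool_out` are hypothetical and use the standard output-size formula floor((n + 2p - k) / s) + 1.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max pooling along one dimension."""
    return (size - kernel) // stride + 1


size, channels = 32, 3  # assumed input: 32x32 RGB image
for out_channels in (32, 64):
    size = conv_out(size)  # 3x3 kernel, padding=1: size unchanged
    size = conv_out(size)
    size = pool_out(size)  # 2x2 pool, stride=2: size halved
    channels = out_channels
    print(channels, size, size)

# Number of input features for the first fully connected layer
flat = channels * size * size
print(flat)  # 64 * 8 * 8 = 4096
```

The final `flat` value is what the first fully connected layer must accept as its input size; a mismatch here is a common source of shape errors.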

Recurrent Neural Networks (RNNs)

RNNs process sequential data by maintaining a hidden state across time steps.

RNN Components

Recurrent Layers: Process sequences one step at a time
Hidden State: Vector summarizing past information
Output Layer: Produces predictions from hidden state

LSTM and GRU

Standard RNNs struggle with long-range dependencies because gradients shrink as they are propagated back through many time steps (vanishing gradients). LSTMs and GRUs mitigate this with gating mechanisms:

LSTM (Long Short-Term Memory):

Input gate: controls new information
Forget gate: controls what to discard
Output gate: controls what to output
Cell state: carries information across time steps

GRU (Gated Recurrent Unit):

Simpler than LSTM (merges the input and forget gates into a single update gate and has no separate cell state)
Update gate: controls state updates
Reset gate: controls how much past to forget
More computationally efficient
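
One way to see the efficiency difference is to count parameters. The sketch below assumes the single-layer parameterization used by PyTorch's nn.LSTM and nn.GRU, where each weight block has an input-to-hidden matrix, a hidden-to-hidden matrix, and two bias vectors. The LSTM has four such blocks (three gates plus the candidate cell update) and the GRU has three. The function name and layer sizes are illustrative.

```python
def recurrent_params(num_blocks, input_size, hidden_size):
    """Parameter count for one recurrent layer with num_blocks weight blocks."""
    per_block = (
        hidden_size * input_size     # input-to-hidden weights
        + hidden_size * hidden_size  # hidden-to-hidden weights
        + 2 * hidden_size            # two bias vectors
    )
    return num_blocks * per_block


lstm_count = recurrent_params(4, input_size=128, hidden_size=256)
gru_count = recurrent_params(3, input_size=128, hidden_size=256)

print(lstm_count)              # 395264
print(gru_count)               # 296448
print(gru_count / lstm_count)  # 0.75
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size.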

Large Language Models (LLMs)

LLMs use transformer architecture for natural language tasks.

Transformer Architecture

Self-Attention: Weighs importance of different words in context

Computes attention scores between all word pairs
Allows model to focus on relevant context

Multi-Head Attention: Multiple attention mechanisms in parallel

Each head captures different relationships
Combined for richer representations

Positional Encoding: Adds position information

Self-attention has no built-in notion of token order
Encoding provides sequence position context
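
The self-attention step can be sketched as scaled dot-product attention in plain Python. This simplification is an assumption made for brevity: a real transformer first maps each token through learned query, key, and value projections, which are omitted here so the token vectors are used directly. The toy token vectors are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(q . k / sqrt(d_k)) applied to values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Attention score between this token and every token in the sequence
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how much one token attends to every other token. Multi-head attention runs several such computations in parallel on different projections and concatenates the results.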

Diffusion Models

Generative models that create data by reversing a noise-adding process.

How Diffusion Works

Forward Process: Gradually add noise to data

Transforms data into simple noise distribution
Defined by noise schedule

Reverse Process: Learn to denoise

Trained to reconstruct data from noisy samples
Generates new samples by starting from noise

Image Generation Pipeline:

Encode text prompt to latent representation
Sample random noise from Gaussian distribution
Apply diffusion steps to transform noise into image
Each step denoises based on text conditioning
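
The forward process can be sketched with the closed-form noising rule from DDPM-style models, x_t = sqrt(ᾱ_t) · x_0 + sqrt(1 − ᾱ_t) · ε, where ᾱ_t is the running product of (1 − β). The linear schedule from 1e-4 to 0.02 over 1000 steps is an assumed, commonly used default rather than something this guide prescribes, and all function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise schedule: variance added at each forward step."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta): fraction of signal kept up to step t."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0 using the closed-form forward process."""
    keep = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [keep * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, -1.0, 0.5]                            # toy "data" sample
x_early = add_noise(x0, 10, alpha_bars, rng)     # still close to x0
x_late = add_noise(x0, 999, alpha_bars, rng)     # almost pure Gaussian noise
print(alpha_bars[10], alpha_bars[999])
```

The reverse process trains a network to undo these steps; that part requires a learned model and is not shown here.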

Training Best Practices

Hyperparameters

| Parameter | Typical Range | Notes |
|-----------|---------------|-------|
| Learning Rate | 1e-4 to 1e-3 | Adam optimizer |
| Batch Size | 32 to 256 | Depends on GPU memory |
| Epochs | 5 to 100 | Monitor for overfitting |
| Weight Decay | 1e-5 to 1e-4 | L2 regularization |
| Dropout | 0.2 to 0.5 | Before FC layers |

Training Loop Pattern

See scripts/training_loop_template.py for a complete training implementation.

Essential steps:

Set model to train mode (model.train())
Zero gradients (optimizer.zero_grad())
Forward pass to get predictions
Compute loss
Backward pass (loss.backward())
Update weights (optimizer.step())

For evaluation:

Set model to eval mode (model.eval())
Use torch.no_grad() to disable gradient computation
Compute metrics without updating weights
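
The training and evaluation steps above can be put together in a minimal sketch. This is not the contents of scripts/training_loop_template.py: the toy regression data, the single-layer model, and the function names `train_one_epoch` and `evaluate` are assumptions for illustration, and a real loop would iterate over mini-batches from a DataLoader.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy regression data generated from y = 2x + 1 (illustrative only)
inputs = torch.linspace(-1, 1, 64).unsqueeze(1)
targets = 2 * inputs + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


def train_one_epoch():
    """One full-batch training step following the essential steps."""
    model.train()                           # train mode
    optimizer.zero_grad()                   # clear accumulated gradients
    predictions = model(inputs)             # forward pass
    loss = criterion(predictions, targets)  # compute loss
    loss.backward()                         # backward pass
    optimizer.step()                        # update weights
    return loss.item()


def evaluate():
    """Compute the loss without tracking gradients or updating weights."""
    model.eval()
    with torch.no_grad():
        return criterion(model(inputs), targets).item()


initial_loss = evaluate()
for epoch in range(200):
    train_one_epoch()
final_loss = evaluate()
print(initial_loss, final_loss)
```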

Loss Functions

| Task | Loss Function |
|------|---------------|
| Multi-class classification | nn.CrossEntropyLoss() |
| Binary classification | nn.BCEWithLogitsLoss() |
| Regression | nn.MSELoss() |
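
Both classification losses in the table take raw logits rather than probabilities: nn.CrossEntropyLoss applies log-softmax internally and nn.BCEWithLogitsLoss applies a sigmoid internally. The plain-Python sketch below shows the per-sample math under that assumption; the function names are hypothetical and the PyTorch classes additionally average over a batch by default.

```python
import math


def cross_entropy_from_logits(logits, target_index):
    """Negative log-softmax of the target class, computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target_index]


def bce_from_logit(logit, target):
    """Binary cross-entropy on a raw logit: max(z, 0) - z*y + log(1 + e^-|z|)."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mse(predictions, targets):
    """Mean squared error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Uniform logits over 3 classes give a loss of log(3)
print(cross_entropy_from_logits([0.0, 0.0, 0.0], 1))

# A logit of 0 means probability 0.5, so the loss is log(2)
print(bce_from_logit(0.0, 1.0))

print(mse([1.0, 2.0], [1.0, 4.0]))  # (0 + 4) / 2 = 2.0
```

Applying softmax or sigmoid yourself before passing values to these PyTorch losses would apply the squashing twice.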

Optimizers

Adam: Adaptive learning rates, good default choice
SGD: Stochastic gradient descent, can work well with momentum
RMSprop: Scales each parameter's step by a running average of squared gradients; often used for RNNs

Common Pitfalls

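Forgetting to zero gradients: PyTorch accumulates gradients across backward() calls by default, so call optimizer.zero_grad() every step
Not setting train/eval mode: Dropout and batch norm behave differently in training and evaluation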
Mismatched input/output shapes: Verify tensor dimensions at each layer
Overfitting: Use dropout, data augmentation, weight decay
Vanishing gradients: Use ReLU, batch norm, or LSTM/GRU for sequences

When to Use Each Architecture

| Task | Recommended Architecture |
|------|-------------------------|
| Image classification | CNN |
| Object detection | CNN + additional heads |
| Image segmentation | CNN with skip connections |
| Time series | RNN, LSTM, or GRU |
| Text generation | Transformer (LLM) |
| Machine translation | Transformer encoder-decoder |
| Image generation | Diffusion model |
| Text-to-image | Diffusion + text encoder |

Quick Reference

PyTorch Layer Instantiation

# Convolutional layer
nn.Conv2d(in_channels, out_channels, kernel_size, padding=0)

# Max pooling
nn.MaxPool2d(kernel_size=2, stride=2)

# Fully connected
nn.Linear(in_features, out_features)

# Dropout
nn.Dropout(p=0.5)

# RNN variants
nn.LSTM(input_size, hidden_size, num_layers)
nn.GRU(input_size, hidden_size, num_layers)

Common Transformations

# Resize images
transforms.Resize((height, width))

# Convert to tensor
transforms.ToTensor()

# Normalize
transforms.Normalize(mean, std)

# Data augmentation
transforms.RandomRotation(degrees)
transforms.ColorJitter(brightness, contrast)

Next Steps

For implementation help:

Use scripts/cnn_template.py for image tasks
Use scripts/training_loop_template.py for training
Use scripts/parameter_calculator.py to estimate model size

For concept questions, refer to the relevant section above.


Help users understand and implement deep learning concepts including neural networks, CNNs, RNNs, LLMs, and diffusion models. Use this skill whenever the user asks about deep learning architectures, wants to build neural networks in PyTorch, needs help with training loops, or wants to understand concepts like backpropagation, activation functions, attention mechanisms, or generative models. Make sure to use this skill for any deep learning related questions, code reviews, architecture design, or implementation help.

5 stars

tools

Updated Apr 16, 2026


npx skillsauth add abelrguezr/hacktricks-skills deep-learning-helper

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

| Scanner | Description | Status | Score |
|---------|-------------|--------|-------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 16, 2026, 2:33 AM · 37.4s · 2 files scanned




Install manually


# Clone the repo
git clone https://github.com/abelrguezr/hacktricks-skills.git

# Copy into Claude Code skills folder (global)
cp -r hacktricks-skills/skills/AI/AI-Deep-Learning ~/.claude/skills/


Repository: abelrguezr/hacktricks-skills (5 stars)
Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT