miaodi/computational-learning-notes

Name: computational-learning-notes
Author: miaodi

skills/computational-learning-notes/SKILL.md

npx skillsauth add miaodi/llm_config computational-learning-notes

Computational Learning Notes Skill

Purpose

Teach low-level computational and numerical concepts by turning them into small, demonstrative C++ examples with clear learning notes.

When To Use

Use when the user wants to understand a software/hardware concept through a minimal example, especially in a learning repository such as ~/repo/cxx_learn.

Typical topics include false sharing, floating-point behavior, registers, dependency chains, cache hierarchy, cache lines, locality, prefetching, branch prediction, SIMD, FMA, alignment, memory ordering, atomics, synchronization costs, compiler optimization, generated assembly, numerical accuracy, numerical stability, and benchmarking pitfalls.

Priorities

Teach one concept clearly.
Default to C++20 unless the user asks for another language.
Prefer the smallest example that exposes the mechanism.
Make the expected observation explicit.
Separate deterministic correctness behavior from hardware-sensitive performance trends.
State compiler, OS, CPU, build-mode, and benchmark caveats honestly.

Concept Coverage

CPU caches, cache lines, spatial locality, temporal locality, prefetching, TLBs, and memory latency.
Cache coherence, false sharing, true sharing, thread placement, and synchronization costs.
Registers, dependency chains, instruction latency, throughput, out-of-order execution, register pressure, and memory-level parallelism.
Branch prediction, control-flow predictability, branchless transforms, and pipeline effects.
SIMD, auto-vectorization, alignment, FMA, reductions, and data layout for vector-friendly code.
Floating-point bit layout, signed zero, subnormals, NaN, infinity, rounding, associativity failure, cancellation, and accumulation error.
Numerical stability, conditioning, summation order, Kahan summation, pairwise summation, and reproducibility tradeoffs.
C++ object layout, padding, alignment, copying, allocation, virtual dispatch, atomics, memory orders, data races, and happens-before relationships.
Compiler behavior such as inlining, dead-code elimination, constant folding, loop unrolling, aliasing assumptions, vectorization reports, and generated assembly.
Benchmark design issues such as warmup, timer resolution, CPU frequency scaling, turbo, thermal throttling, OS noise, small inputs, dead-code elimination, and hidden setup costs.
CUDA or GPU concepts only when requested or already present in the task, such as coalescing, shared memory, bank conflicts, occupancy, register pressure, warp divergence, and host-device transfer scope.
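As one illustration of the floating-point items above, here is a minimal sketch of associativity failure. The function names are hypothetical and chosen only for this example; it assumes IEEE 754 binary64 doubles and a build without -ffast-math or similar value-changing flags.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>

// Hypothetical helpers: the two functions differ only in grouping.
double sum_left_to_right(double a, double b, double c) { return (a + b) + c; }
double sum_right_to_left(double a, double b, double c) { return a + (b + c); }

// Prints both groupings with enough digits to round-trip a double.
void print_associativity_demo() {
    const double a = 0.1, b = 0.2, c = 0.3;
    const int digits = std::numeric_limits<double>::max_digits10;
    std::printf("(a + b) + c = %.*g\n", digits, sum_left_to_right(a, b, c));
    std::printf("a + (b + c) = %.*g\n", digits, sum_right_to_left(a, b, c));
}

// Deterministic correctness behavior: each addition rounds separately,
// so the two groupings give different doubles for these inputs.
void check_groupings_differ() {
    assert(sum_left_to_right(0.1, 0.2, 0.3) != sum_right_to_left(0.1, 0.2, 0.3));
}
```

Calling print_associativity_demo() from main shows the two results side by side. This is a correctness observation rather than a performance trend, so it should not vary across machines that use the same floating-point format and build flags.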

Workflow

Identify the exact concept the example should teach.
State the mechanism in plain language before writing code.
Choose the smallest C++ example that makes the mechanism visible.
Prefer one source file and one local README.md unless the existing project structure expects tests, benchmarks, or CMake targets.
Add a contrast when it improves learning, such as false sharing versus padded counters, strided versus contiguous access, unpredictable versus predictable branches, scalar versus vectorizable loops, naive summation versus compensated summation, or relaxed atomics versus stronger ordering.
Keep unrelated abstractions out of the example.
Preserve a simple correctness check when the example has optimized, parallel, or numerically approximate variants.
Control obvious confounders: optimization level, dead-code elimination, data size, alignment, warmup, thread scheduling, CPU frequency scaling, compiler version, and measurement overhead.
Explain what the learner should observe and why the result may vary across machines.
Write the note in the repository's learning-note style.
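The contrast step can be sketched for the false sharing versus padded counters case. This is a layout-only sketch: the type names, the helper, and the 64-byte line size are assumptions made for illustration, and the code shows only where the counters land in memory, not how fast they are.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (common on x86-64, not universal).
inline constexpr std::size_t kAssumedCacheLine = 64;

struct NaiveCounter {
    std::atomic<long> value{0};
};

// alignas rounds both the alignment and the size up to one assumed line.
struct alignas(kAssumedCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kAssumedCacheLine);
static_assert(alignof(PaddedCounter) == kAssumedCacheLine);

// True when both objects start inside the same assumed cache line.
template <class T>
bool share_assumed_line(const T& a, const T& b) {
    const auto line = [](const T& x) {
        return reinterpret_cast<std::uintptr_t>(&x) / kAssumedCacheLine;
    };
    return line(a) == line(b);
}

// The naive pair starts on a line boundary so its layout is deterministic.
struct alignas(kAssumedCacheLine) NaivePair { NaiveCounter c[2]; };
struct PaddedPair { PaddedCounter c[2]; };

// Simple correctness check that the layouts match the stated assumption.
void check_layouts() {
    NaivePair naive;
    PaddedPair padded;
    assert(share_assumed_line(naive.c[0], naive.c[1]));    // false-sharing candidate
    assert(!share_assumed_line(padded.c[0], padded.c[1])); // one line per counter
}
```

To turn this into a performance experiment, one thread per counter would increment its own value in a loop. Whether the padded variant is measurably faster depends on the CPU, the actual line size, and thread placement.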

Demonstrative Code Heuristics

Use C++20 and the existing build style of the target repository.
Keep the code close enough to the mechanism that a reader can map source lines to cache behavior, instructions, synchronization, or floating-point operations.
Use explicit names such as naive, padded, strided, contiguous, branchy, branchless, scalar, vectorizable, relaxed, acquire_release, kahan, or pairwise.
Use deterministic inputs when possible.
Use std::atomic, std::barrier, std::thread, alignas, std::chrono, std::numeric_limits, and small standard-library facilities when they make the concept clearer.
Use intrinsics, inline assembly, compiler-specific attributes, or disassembly only when the concept cannot be demonstrated cleanly with portable C++.
For timing examples, keep setup outside the measured region and protect results from dead-code elimination.
For floating-point examples, print enough digits and explain that algebraic identities may fail under finite precision.
For concurrency examples, state whether the code is data-race-free and identify the intended happens-before relationship.
For compiler examples, explain which optimization is being invited or prevented.
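The timing heuristic above might look like the following sketch. TimedSum, g_sink, make_input, and time_contiguous_sum are hypothetical names introduced here; the volatile sink is one common way to keep a result observable, not the only one, and the sketch omits warmup and repeated runs.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <vector>

// Hypothetical result type for this sketch.
struct TimedSum {
    std::uint64_t sum;
    std::chrono::nanoseconds elapsed;
};

// A store to a volatile object is an observable side effect, so the
// compiler cannot discard the computation that feeds it.
volatile std::uint64_t g_sink = 0;

// Setup: deterministic input 1, 2, ..., n. Called outside the timed region.
std::vector<std::uint64_t> make_input(std::size_t n) {
    std::vector<std::uint64_t> data(n);
    std::iota(data.begin(), data.end(), std::uint64_t{1});
    return data;
}

// Measured region: only the loop and the sink store sit between the clock reads.
TimedSum time_contiguous_sum(const std::vector<std::uint64_t>& data) {
    using clock = std::chrono::steady_clock;  // monotonic clock
    const auto start = clock::now();
    std::uint64_t sum = 0;
    for (std::uint64_t x : data) sum += x;
    g_sink = sum;
    const auto stop = clock::now();
    return {sum, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}

void run_timing_demo() {
    const auto data = make_input(1'000'000);       // setup, not timed
    const TimedSum r = time_contiguous_sum(data);
    assert(r.sum == 500000500000ULL);              // n(n+1)/2 for n = 1'000'000
    std::printf("sum=%llu elapsed=%lld ns\n",
                static_cast<unsigned long long>(r.sum),
                static_cast<long long>(r.elapsed.count()));
}
```

The elapsed value is a hardware-sensitive trend, so a note built on this sketch should report comparisons between variants rather than absolute numbers.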

Learning Note Shape

Prefer this structure for new or revised notes:

# Example Name

## Concept
What hardware, compiler, C++, or numerical concept this demonstrates.

## Minimal Example
The smallest relevant code shape, or a pointer to the source file if the code is long.

## What To Run
The configure, build, test, benchmark, or executable command.

## What To Look For
The expected output, trend, comparison, or failure mode.

## Why It Happens
The CPU, memory-system, compiler, C++, CUDA, or numerical mechanism behind the result.

## Caveats
What depends on hardware, compiler, optimization level, OS scheduling, input size, or benchmark setup.

## Extensions
Small follow-up experiments the learner can try.

The note should be precise rather than long. A short explanation that names the mechanism and limitation is better than a broad tutorial.

Review Checklist

Does the example isolate one concept?
Is C++ the default language unless another language is clearly better?
Is there a visible contrast or observation?
Can the compiler optimize away the behavior being demonstrated?
Is the measured region free of avoidable setup work?
Are results protected from dead-code elimination?
Are correctness checks present for optimized, parallel, or approximate variants?
Are cache-line, alignment, thread-scheduling, and memory-order assumptions stated when relevant?
Are floating-point rounding, overflow, underflow, NaN, infinity, and associativity caveats stated when relevant?
Are exact numbers avoided when only hardware-sensitive trends are justified?
Does the note include what to run, what to look for, why it happens, and caveats?
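For the checklist item on correctness checks for approximate variants, a minimal sketch could pair naive and Kahan summation with an assertion. The function names and the choice of summing 0.1 ten million times are illustrative assumptions. The sketch assumes a build without -ffast-math, which may let the compiler simplify the compensation term away.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Adds the same term count times; rounding error accumulates in sum.
double naive_sum(double term, std::size_t count) {
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) sum += term;
    return sum;
}

// Kahan (compensated) summation of the same sequence.
double kahan_sum(double term, std::size_t count) {
    double sum = 0.0;
    double compensation = 0.0;  // running estimate of the lost low-order bits
    for (std::size_t i = 0; i < count; ++i) {
        const double y = term - compensation;
        const double t = sum + y;
        compensation = (t - sum) - y;  // algebraically zero, numerically not
        sum = t;
    }
    return sum;
}

// Correctness check: the compensated variant must not be worse than the naive one.
void check_kahan_not_worse() {
    const std::size_t n = 10'000'000;
    const double reference = 1'000'000.0;  // n * 0.1 in exact decimal arithmetic
    const double naive_error = std::fabs(naive_sum(0.1, n) - reference);
    const double kahan_error = std::fabs(kahan_sum(0.1, n) - reference);
    assert(kahan_error <= naive_error);
}
```

The check compares errors instead of asserting exact values, which keeps it meaningful even though 0.1 itself is not exactly representable in binary floating point.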

Constraints

Do not turn a learning example into a reusable framework unless the user explicitly asks for one.
Do not optimize production code under this skill; use cpp-performance for real hot-path optimization.
Do not claim universal benchmark numbers.
Do not hide the important mechanism behind generic abstractions, complicated templates, or large helper libraries.
Do not introduce non-portable code unless it is necessary for the lesson and clearly labeled.
Do not present undefined behavior, data races, or numerically unstable code as acceptable outside the demonstration.

Output

Provide:

the concept being taught
the minimal C++ example or changed files
how to build or run it
what observation to expect
the mechanism behind the observation
caveats and machine-specific dependencies
one or two small follow-up experiments when useful

Use when creating C++ learning notes or minimal experiments for low-level computational, numerical, CPU/GPU, compiler, and hardware concepts such as false sharing, floating point, registers, caches, SIMD, atomics, numerical stability, and benchmarking pitfalls.

development

Updated Jun 2, 2026

Install this skill globally with one command:

npx skillsauth add miaodi/llm_config computational-learning-notes

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy, container and dependency vulnerability scanner: Clean, 95%
Semgrep, static code analysis for vulnerabilities: Clean, 95%
mcp-scan (Snyk), Model Context Protocol security validation: Clean, 95%
Snyk (dep), open source security scanning: Skipped, 50%
Socket.dev, supply chain security analysis: Skipped, 50%
VirusTotal, multi-engine malware detection: Skipped, 50%
CrowdStrike, advanced threat intelligence: Skipped, 50%
OSV-Scanner, Open Source Vulnerability database check: Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Jun 2, 2026, 3:01 AM (37.8s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/miaodi/llm_config.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r llm_config/skills/computational-learning-notes ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository

miaodi/llm_config

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT