Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

genomewalker/hint-corpus

Name: hint-corpus
Author: genomewalker

codex-plugin/skills/hint-corpus/SKILL.md

npx skillsauth add genomewalker/cc-soul hint-corpus

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

hint-corpus

Full pipeline to produce chitta-hint-tuned (Qwen3-0.6B Q4_K_M) from scratch.

Quick Start

# 1. Generate corpus (requires Ollama + gemma4:26b or any capable model)
python3 $PLUGIN_DIR/scripts/generate_hint_corpus.py \
    --out /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_raw.jsonl \
    --model gemma4:26b \
    --target 3000

# 2. Convert to Qwen3 ChatML for unsloth
python3 $PLUGIN_DIR/scripts/convert_to_chatml.py \
    --in  /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_raw.jsonl \
    --out /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_chatml.jsonl \
    --split 0.1

# 3. Fine-tune Qwen3-0.6B + export GGUF
bash $PLUGIN_DIR/scripts/finetune_hint_qwen.sh \
    --data /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_chatml.jsonl \
    --steps 300

# 4. Register with Ollama
bash $PLUGIN_DIR/chitta-mcp/enrichers/setup_hint_model.sh

Where $PLUGIN_DIR = /maps/projects/fernandezguerra/apps/repos/cc-soul (or installed plugin path).

Stage 1 — Corpus Generation

generate_hint_corpus.py builds diverse synthetic conversation excerpts and labels them via a teacher LLM. It covers:

| Axis | Examples | |------|---------| | Profession | bioinformatician, nurse, teacher, architect, chef... | | Location | city, country, living situation | | Language background | native/non-native/bilingual | | Relationships | partner, children, pets | | Health | dietary restrictions, exercise habits | | Hobbies | sports, arts, gaming, gardening... | | Preferences | dark mode, editors, morning/evening person | | Education | PhD, self-taught, vocational |

35% hard negatives (questions, debugging requests, factual queries — output: -).

Key flags:

--target N        # examples to generate (default: 1500; recommend 3000)
--model MODEL     # teacher model (default: llama3.3:70b; gemma4:26b works well)
--neg-ratio 0.35  # fraction of negatives
--dry-run         # preview templates, no LLM calls

Expected runtime: ~2h for 3000 examples with gemma4:26b on a single GPU node.

Stage 2 — ChatML Conversion

convert_to_chatml.py wraps each {"input", "output"} row in a ShareGPT conversation with the system prompt baked in.

System prompt (fixed, version-controlled):

You extract personal facts from conversation excerpts. Given a message or conversation, output a single concise third-person sentence about the user (e.g. "User lives in Copenhagen.", "User has two cats."). If no stable personal fact is present, output exactly: -

--split 0.1 writes a 10% eval holdout to <out>_eval.jsonl.

Stage 3 — Fine-tuning

finetune_hint_qwen.sh runs QLoRA via unsloth:

| Hyperparameter | Default | |----------------|---------| | Base model | Qwen/Qwen3-0.6B | | LoRA rank | 16 | | LoRA alpha | 32 | | Max steps | 200 | | Batch size | 4 × grad_accum 4 = 16 effective | | Learning rate | 2e-4 | | Quantisation | 4-bit QLoRA (bitsandbytes) |

Requirements:

pip install "unsloth[colab-new]" xformers trl peft accelerate bitsandbytes

GPU note: Qwen3-0.6B fits in ~4 GB VRAM at 4-bit. CPU training is possible but slow (~30 min/100 steps).

After training, the script:

Merges LoRA → fp16 safetensors ($OUT_DIR)
Converts to F16 GGUF via convert_hf_to_gguf.py (needs llama.cpp)
Quantises to Q4_K_M via llama-quantize (~480 MB)

Override paths via environment:

CHITTA_HINT_DATA=/path/to/corpus.jsonl
CHITTA_HINT_MODEL_DIR=/path/to/merged_output
CHITTA_HINT_GGUF_DIR=/path/to/gguf_output
LLAMA_CONVERT=/path/to/llama.cpp/convert_hf_to_gguf.py
LLAMA_QUANTIZE=/path/to/llama-quantize

Stage 4 — Ollama Registration

setup_hint_model.sh registers the Q4_K_M GGUF with Ollama as chitta-hint-tuned.

It checks $CHITTA_HINT_GGUF_DIR for the GGUF, falls back to F16, then safetensors.

After registration, test with:

chitta hint_enrich --dry-run
# or via MCP:
chitta run_hint_enricher --dry_run true --limit 10

Embedding Quality Check

After registration, run the embedding benchmark:

python3 /maps/projects/caeg/scratch/kbd606/tmp/test_embeddings.py

Target metrics vs Qwen2.5-0.5B baseline: | Metric | Baseline | Target | |--------|----------|--------| | Personal↔Personal cosine | 0.76 | >0.85 | | Separation ratio (pp−pn) | 0.28 | >0.40 | | NN accuracy | 5/8 | 7/8+ |

Qwen3-0.6B shares its architecture with Qwen3-Embedding-0.6B (MTEB STS 86.57) — use --pooling last and L2-normalize embeddings.

Notes

Single GGUF, dual use: same checkpoint serves generation (personal fact extraction) and embedding (last-token pooling + L2 norm). Append <|endoftext|> as final token for embedding mode.
Corpus is general-purpose — not specific to any user. Covers 10+ diversity axes so the model generalises across professions, cultures, and relationship types.
Iterative improvement: run /hint-corpus again after accumulating new session data. Use --target 5000 if separation metrics plateau at 3k.

genomewalker/hint-corpus

codex-plugin/skills/hint-corpus/SKILL.md

Build, convert, and fine-tune the Qwen3-0.6B hint model for personal fact extraction. Covers corpus generation, ChatML conversion, LoRA fine-tuning with unsloth, GGUF export, and Ollama registration.

2 stars

development

Updated May 24, 2026

$ install --global

skillsauth

npx skillsauth add genomewalker/cc-soul hint-corpus

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 24, 2026, 6:45 AM21.6s1 file scanned

SKILL.md

name:: hint-corpus
description:: Build, convert, and fine-tune the Qwen3-0.6B hint model for personal fact extraction. Covers corpus generation, ChatML conversion, LoRA fine-tuning with unsloth, GGUF export, and Ollama registration.
execution:: direct
aliases:: [hint-finetune, corpus-gen, build-hint-model]

hint-corpus

Full pipeline to produce chitta-hint-tuned (Qwen3-0.6B Q4_K_M) from scratch.

Quick Start

# 1. Generate corpus (requires Ollama + gemma4:26b or any capable model)
python3 $PLUGIN_DIR/scripts/generate_hint_corpus.py \
    --out /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_raw.jsonl \
    --model gemma4:26b \
    --target 3000

# 2. Convert to Qwen3 ChatML for unsloth
python3 $PLUGIN_DIR/scripts/convert_to_chatml.py \
    --in  /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_raw.jsonl \
    --out /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_chatml.jsonl \
    --split 0.1

# 3. Fine-tune Qwen3-0.6B + export GGUF
bash $PLUGIN_DIR/scripts/finetune_hint_qwen.sh \
    --data /maps/projects/caeg/scratch/kbd606/tmp/hint_corpus_chatml.jsonl \
    --steps 300

# 4. Register with Ollama
bash $PLUGIN_DIR/chitta-mcp/enrichers/setup_hint_model.sh

Where $PLUGIN_DIR = /maps/projects/fernandezguerra/apps/repos/cc-soul (or installed plugin path).

Stage 1 — Corpus Generation

generate_hint_corpus.py builds diverse synthetic conversation excerpts and labels them via a teacher LLM. It covers:

35% hard negatives (questions, debugging requests, factual queries — output: -).

Key flags:

--target N        # examples to generate (default: 1500; recommend 3000)
--model MODEL     # teacher model (default: llama3.3:70b; gemma4:26b works well)
--neg-ratio 0.35  # fraction of negatives
--dry-run         # preview templates, no LLM calls

Expected runtime: ~2h for 3000 examples with gemma4:26b on a single GPU node.

Stage 2 — ChatML Conversion

convert_to_chatml.py wraps each {"input", "output"} row in a ShareGPT conversation with the system prompt baked in.

System prompt (fixed, version-controlled):

You extract personal facts from conversation excerpts. Given a message or conversation, output a single concise third-person sentence about the user (e.g. "User lives in Copenhagen.", "User has two cats."). If no stable personal fact is present, output exactly: -

--split 0.1 writes a 10% eval holdout to <out>_eval.jsonl.

Stage 3 — Fine-tuning

finetune_hint_qwen.sh runs QLoRA via unsloth:

Requirements:

pip install "unsloth[colab-new]" xformers trl peft accelerate bitsandbytes

GPU note: Qwen3-0.6B fits in ~4 GB VRAM at 4-bit. CPU training is possible but slow (~30 min/100 steps).

After training, the script:

Merges LoRA → fp16 safetensors ($OUT_DIR)
Converts to F16 GGUF via convert_hf_to_gguf.py (needs llama.cpp)
Quantises to Q4_K_M via llama-quantize (~480 MB)

Override paths via environment:

CHITTA_HINT_DATA=/path/to/corpus.jsonl
CHITTA_HINT_MODEL_DIR=/path/to/merged_output
CHITTA_HINT_GGUF_DIR=/path/to/gguf_output
LLAMA_CONVERT=/path/to/llama.cpp/convert_hf_to_gguf.py
LLAMA_QUANTIZE=/path/to/llama-quantize

Stage 4 — Ollama Registration

setup_hint_model.sh registers the Q4_K_M GGUF with Ollama as chitta-hint-tuned.

It checks $CHITTA_HINT_GGUF_DIR for the GGUF, falls back to F16, then safetensors.

After registration, test with:

chitta hint_enrich --dry-run
# or via MCP:
chitta run_hint_enricher --dry_run true --limit 10

Embedding Quality Check

After registration, run the embedding benchmark:

python3 /maps/projects/caeg/scratch/kbd606/tmp/test_embeddings.py

Qwen3-0.6B shares its architecture with Qwen3-Embedding-0.6B (MTEB STS 86.57) — use --pooling last and L2-normalize embeddings.

Notes

Single GGUF, dual use: same checkpoint serves generation (personal fact extraction) and embedding (last-token pooling + L2 norm). Append <|endoftext|> as final token for embedding mode.
Corpus is general-purpose — not specific to any user. Covers 10+ diversity axes so the model generalises across professions, cultures, and relationship types.
Iterative improvement: run /hint-corpus again after accumulating new session data. Use --target 5000 if separation metrics plateau at 3k.

Related Skills