plugins/vector-db/skills/vector-db-init/SKILL.md
Interactively initializes the Vector DB plugin. Guided discovery asks which folders to index, confirms the manifest, then scaffolds vector_profiles.json for high-performance In-Process or Native Server connections. Mandatory first step before ingestion or search.
This skill requires Python 3.8+ and only the standard library for initialization. Performance operations require chromadb and langchain, as defined in the plugin root requirements.
To install this skill's dependencies:
python -m piptools compile requirements.in --output-file requirements.txt
python -m pip install -r requirements.txt
The vector-db-init skill is an interactive setup routine that prepares the environment for the Vector database. It follows the same pattern as rlm-init and wiki-init for a consistent experience across all three retrieval plugins.
All operational settings live in .agent/learning/vector_profiles.json. These control performance and connection mode.
| Parameter | Default | Purpose |
|:-----|:--------|:--------|
| chroma_host | "" | Empty = In-Process (Direct Disk); IP = Server mode. |
| batch_size | 1000 | Files processed per embedding batch. |
| embedding_model | nomic-ai/nomic-embed-text-v1.5 | Semantic model for indexing. |
| device | cpu | Hardware: cpu or cuda (NVIDIA GPU). |
| parent_chunk_size | 2000 | Parent chunk granularity. |
| child_chunk_size | 400 | Child chunk granularity. |
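As a rough illustration of how the defaults in this table might be applied, the sketch below overlays a profile's values on the table defaults and derives the connection mode from chroma_host. The helper names (resolve_settings, connection_mode) are hypothetical, and it assumes these keys sit directly inside a profile object; the plugin's own loader may work differently.

```python
# Illustrative only: defaults copied from the parameter table above.
DEFAULTS = {
    "chroma_host": "",
    "batch_size": 1000,
    "embedding_model": "nomic-ai/nomic-embed-text-v1.5",
    "device": "cpu",
    "parent_chunk_size": 2000,
    "child_chunk_size": 400,
}


def resolve_settings(profile):
    """Return the table defaults overridden by any known keys set in `profile`."""
    settings = dict(DEFAULTS)
    settings.update({k: v for k, v in profile.items() if k in DEFAULTS})
    return settings


def connection_mode(settings):
    """Empty chroma_host means In-Process; anything else means Server mode."""
    return "server" if settings["chroma_host"].strip() else "in-process"


if __name__ == "__main__":
    s = resolve_settings({"device": "cuda", "batch_size": 250})
    print(connection_mode(s), s["device"], s["batch_size"])
```

Keys that are not in the table are ignored here, so a profile can carry other fields (such as manifest) without affecting the result.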
Vector-db runs In-Process by default: ChromaDB persists directly to a local directory
(configured as chroma_data_path in vector_profiles.json). No server process is needed.
When running ingest.py or query.py you will see:
[WARN] Failed to connect to remote ChromaDB ... Falling back to local.
[DIR] Connecting to local persistent ChromaDB at .agent/learning/vector_wiki_db...
This is expected and correct. The remote-server check (127.0.0.1:8110) happens
automatically in case a server IS running, but falls back gracefully. Only switch to
server mode (vector-db-launch skill) if you need multiple concurrent writers.
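The probe-then-fall-back behaviour described above can be sketched with the standard library alone. This is not the code in ingest.py or query.py, which is not shown here; pick_backend and server_reachable are hypothetical names, and a plain TCP connect stands in for whatever check the scripts actually perform. The host and port are the 127.0.0.1:8110 values mentioned above.

```python
import socket


def server_reachable(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def pick_backend(host="127.0.0.1", port=8110, timeout=0.5):
    """Prefer a running server; otherwise fall back to local persistence."""
    if server_reachable(host, port, timeout):
        return "server"
    return "local"


if __name__ == "__main__":
    print(pick_backend())
```

With no server listening, pick_backend() returns "local", which corresponds to the [WARN] then [DIR] sequence in the log lines above.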
Before anything else, install the plugin's Python dependencies from the lockfile.
Run from the project root:
# Regenerate the lockfile from the intent file (only needed when requirements.in changes):
python -m piptools compile plugins/vector-db/requirements.in \
--output-file plugins/vector-db/requirements.txt
# Install all dependencies (always run this on first setup):
python -m pip install -r plugins/vector-db/requirements.txt
Note: pip-tools itself must be installed first if not present: python -m pip install pip-tools
Known gotcha: The system pip command may not be available on macOS. Always use python -m pip install ... rather than bare pip install ....
Verify the critical packages are installed:
python -c "import chromadb; print('chromadb:', chromadb.__version__)"
python -c "import einops; print('einops: OK')"
python -c "from sentence_transformers import SentenceTransformer; print('sentence-transformers: OK')"
If any check fails, the install step above will fix it.
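The three checks can also be run in one pass. The sketch below is a lighter variant, not a replacement: it uses importlib.util.find_spec, which only confirms that each top-level module can be located and does not import it, so it will not catch a package that is present but broken. The function name missing_packages is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Module names taken from the verification commands above.
    required = ["chromadb", "einops", "sentence_transformers"]
    missing = missing_packages(required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All critical packages found")
```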
Ask this after dependencies are installed.
First, check what other plugins are installed:
ls .agents/skills/rlm-init/ 2>/dev/null && echo "rlm-factory: INSTALLED" || echo "rlm-factory: NOT FOUND"
ls .agents/skills/obsidian-wiki-builder/ 2>/dev/null && echo "obsidian-wiki-engine: INSTALLED" || echo "obsidian-wiki-engine: NOT FOUND"
Then ask:
Vector DB works standalone with zero external dependencies. You can also combine it with
other plugins for a more powerful retrieval stack. What setup would you like?
A) Vector DB only (standalone)
- Semantic search over any indexed folders
- No other plugins needed — works right now
B) Vector DB + RLM Phase 1 pre-filter [requires: rlm-factory in .agents/]
- RLM keyword pre-filter -> vector semantic search
- Reduces noise, improves precision for large corpora
C) Vector DB as wiki Phase 2 search [requires: obsidian-wiki-engine in .agents/]
- Adds vector semantic search to /wiki-query
- /wiki-query: RLM keyword (O(1)) -> vector (O(log N)) -> grep exact
D) Full Super-RAG [requires: rlm-factory + obsidian-wiki-engine]
- All three phases: RLM keyword -> vector semantic -> wiki concept nodes
Enter A, B, C, or D (default: A):
If required plugins are NOT installed for the chosen mode:
[plugin-name] is not installed in .agents/.
To install it:
# Recommended (uvx -- works on Mac, Linux, Windows)
uvx --from git+https://github.com/richfrem/agent-plugins-skills plugin-add richfrem/agent-plugins-skills
# See full install guide
cat INSTALL.md
After installing, re-run /vector-db:init and choose your desired mode.
Continue with Mode A (standalone) for now? (y) or abort and install first? (n)
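The mode-to-plugin requirements in the menu above can be expressed as a small lookup. This is a minimal sketch, assuming the same skill directories that the ls commands check (.agents/skills/rlm-init/ and .agents/skills/obsidian-wiki-builder/); missing_plugins and both dictionaries are illustrative names, not part of the plugin.

```python
from pathlib import Path

# Skill directory checked for each plugin, as in the ls commands above.
PLUGIN_DIRS = {
    "rlm-factory": "rlm-init",
    "obsidian-wiki-engine": "obsidian-wiki-builder",
}

# Plugins each setup mode requires, per the menu above.
MODE_REQUIREMENTS = {
    "A": [],
    "B": ["rlm-factory"],
    "C": ["obsidian-wiki-engine"],
    "D": ["rlm-factory", "obsidian-wiki-engine"],
}


def missing_plugins(mode, skills_root=".agents/skills"):
    """Return required plugins whose skill directory is absent for `mode`."""
    root = Path(skills_root)
    # An empty answer falls back to the default mode, A.
    key = (mode or "A").strip().upper() or "A"
    required = MODE_REQUIREMENTS[key]
    return [p for p in required if not (root / PLUGIN_DIRS[p]).is_dir()]
```

A non-empty result is the cue to show the "not installed" message and offer Mode A; an unrecognised letter raises KeyError in this sketch rather than guessing a mode.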
Scan the project root and present a numbered table of candidate directories:
find . -maxdepth 1 -type d | grep -v '^\.$' | grep -v -E '\.(git|venv|vscode|windsurf|claude|agents|agent|knowledge_vector_data|wiki|vector_data)$' | sort
Present results as a numbered table with a one-line description of each folder. Then ask:
Which folders should be treated as raw content sources for vector indexing?
Enter numbers separated by commas (e.g. 1, 3, 5)
or type custom paths (relative or absolute)
or both (e.g. 1, 2, /path/to/other/dir)
You can specify all sources now in one go.
Resolve all selected paths to their relative form from the project root (e.g. plugins/, plugin-research/).
Validate each path exists. Warn if a path does not exist -- ask the user to confirm or skip it.
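One way to handle a mixed answer such as "1, 2, /path/to/other/dir" is sketched below. It maps numbers to rows of the candidate table, treats everything else as a path, and separates entries that exist from those that need a confirm-or-skip prompt. The function name resolve_selection and its return shape are assumptions made for illustration.

```python
import os


def resolve_selection(answer, candidates, project_root="."):
    """Turn '1, 3, some/path' into (existing relative paths, unresolved entries)."""
    root = os.path.abspath(project_root)
    found, missing = [], []
    for token in (t.strip() for t in answer.split(",")):
        if not token:
            continue
        if token.isdigit():
            idx = int(token)
            # Numbers refer to 1-based rows of the candidate table.
            if not 1 <= idx <= len(candidates):
                missing.append(token)
                continue
            path = candidates[idx - 1]
        else:
            path = token
        absolute = path if os.path.isabs(path) else os.path.join(root, path)
        if os.path.isdir(absolute):
            relative = os.path.relpath(absolute, root).replace(os.sep, "/")
            found.append(relative + "/")
        else:
            missing.append(token)
    return found, missing
```

Entries returned in the second list are the ones to warn about; the first list is already in the relative, trailing-slash form used in the manifest's include array.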
Then ask once, globally:
Any subdirectory patterns or file types to exclude beyond the defaults?
Defaults: .git/, node_modules/, .venv/, __pycache__/, requirements.in, requirements.txt
Press Enter to accept defaults, or type additions (e.g. temp/, *.tmp):
Display the complete manifest before writing, using the same flat schema as rlm-factory and obsidian-wiki-engine:
{
"description": "Globs tracking project documentation and knowledge records.",
"include": [
"<folder_1>/",
"<folder_2>/"
],
"exclude": [
"/.git/",
"/node_modules/",
"/.venv/",
"/__pycache__/",
"requirements.in",
"requirements.txt"
]
}
Ask: "Does this look correct? (y to write, e to edit, q to abort)"
If .agent/learning/vector_knowledge_manifest.json already exists, merge into its include array; never remove existing entries.
Write to: .agent/learning/vector_knowledge_manifest.json
Create parent directories if needed.
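The write step, including the rule that existing include entries are never removed, might look like the sketch below. It assumes the flat manifest schema shown above; write_manifest is a hypothetical helper, and leaving the existing file's other fields (such as exclude) untouched is a design choice of this sketch rather than documented behaviour.

```python
import json
from pathlib import Path


def write_manifest(path, manifest):
    """Write `manifest`, merging its include list into any existing file."""
    target = Path(path)
    if target.exists():
        existing = json.loads(target.read_text(encoding="utf-8"))
        merged = list(existing.get("include", []))
        # Append only new sources; existing entries are never removed.
        for entry in manifest.get("include", []):
            if entry not in merged:
                merged.append(entry)
        manifest = {**existing, "include": merged}
    # Create parent directories if needed, then write the result.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest
```

Calling it with path ".agent/learning/vector_knowledge_manifest.json" on a fresh project simply writes the confirmed manifest; on a re-run it extends the include array in place.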
Note on manifest naming:
vector_profiles.json may reference vector_wiki_manifest.json (legacy name). The canonical filename going forward is vector_knowledge_manifest.json. If the profile still points to the old name, update the manifest field in vector_profiles.json to match.
After the manifest is confirmed, run the init script which handles profile scaffolding and dependency installation:
python ./scripts/init.py
The script will:
- Install dependencies (requirements.txt)
- Scaffold .agent/learning/vector_profiles.json with the wiki profile
- Set chroma_host: "" (In-Process mode by default; no server needed)
After the script runs, verify the profile's manifest field points to vector_knowledge_manifest.json. If it still shows vector_wiki_manifest.json, update it:
{
"version": 2,
"profiles": {
"wiki": {
"manifest": ".agent/learning/vector_knowledge_manifest.json"
}
},
"default_profile": "wiki"
}
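The same correction can be applied programmatically once the profile file has been loaded with json.load. The sketch below assumes the version 2 layout shown above, with a profiles object keyed by profile name; fix_manifest_fields is a hypothetical helper, not something the init script is known to provide.

```python
LEGACY = "vector_wiki_manifest.json"
CANONICAL = "vector_knowledge_manifest.json"


def fix_manifest_fields(config):
    """Rewrite legacy manifest filenames in place; return changed profile names."""
    changed = []
    for name, profile in config.get("profiles", {}).items():
        manifest = profile.get("manifest", "")
        if manifest.endswith(LEGACY):
            # Keep the directory part and swap only the filename.
            profile["manifest"] = manifest[: -len(LEGACY)] + CANONICAL
            changed.append(name)
    return changed
```

An empty return value means every profile already points at the canonical filename and the file does not need rewriting.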
Confirm the files written, then print:
=== Vector DB Setup Complete (Mode <X>) ===
Files written:
- .agent/learning/vector_knowledge_manifest.json (<N> sources)
- .agent/learning/vector_profiles.json (wiki profile ready)
Next steps:
/vector-db:ingest <- build the semantic index from your sources
/vector-db:search <- run semantic queries
/vector-db:audit <- check index coverage
[Mode B/C/D] To activate the full retrieval stack:
/rlm-factory:init <- set up RLM Phase 1 keyword pre-filter
/wiki-init <- set up wiki concept node layer