Dependencies

Initialization requires only Python 3.8+ and the standard library. Performance operations also require chromadb and langchain, as defined in the plugin root requirements.

To install this skill's dependencies:

python -m piptools compile requirements.in --output-file requirements.txt
python -m pip install -r requirements.txt

Vector DB Initialization

The vector-db-init skill is an interactive setup routine that prepares the environment for the Vector database. It follows the same pattern as rlm-init and wiki-init for a consistent experience across all three retrieval plugins.

Profile Configuration Reference

All operational settings live in .agent/learning/vector_profiles.json. These control performance and connection mode.

| Parameter | Default | Purpose |
|:-----|:--------|:--------|
| chroma_host | "" | Empty = In-Process (Direct Disk); IP = Server mode. |
| batch_size | 1000 | Files processed per embedding batch. |
| embedding_model | nomic-ai/nomic-embed-text-v1.5 | Semantic model for indexing. |
| device | cpu | Hardware: cpu or cuda (NVIDIA GPU). |
| parent_chunk_size | 2000 | Parent chunk granularity. |
| child_chunk_size | 400 | Child chunk granularity. |
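
As a rough illustration of how these settings might be consumed, the sketch below loads one profile and fills in the defaults from the table. The helper names `load_profile` and `connection_mode` are hypothetical, and the assumption that each setting lives inside a named entry under `profiles` is inferred from the profile example in Step 3 rather than confirmed by the plugin's code.

```python
import json
from pathlib import Path

# Defaults copied from the parameter table above.
DEFAULTS = {
    "chroma_host": "",
    "batch_size": 1000,
    "embedding_model": "nomic-ai/nomic-embed-text-v1.5",
    "device": "cpu",
    "parent_chunk_size": 2000,
    "child_chunk_size": 400,
}


def load_profile(path, name=None):
    """Return one profile's settings with table defaults filled in.

    Hypothetical helper: assumes settings are stored per profile under
    the "profiles" key, with "default_profile" naming the fallback.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    name = name or data.get("default_profile", "wiki")
    profile = data.get("profiles", {}).get(name, {})
    return {**DEFAULTS, **profile}


def connection_mode(settings):
    """Empty chroma_host means In-Process; any other value means server mode."""
    return "server" if settings["chroma_host"] else "in-process"
```

Calling `load_profile(".agent/learning/vector_profiles.json")` would then return a dict in which any key missing from the file falls back to its table default.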

When to Use This

When a user first installs the vector-db plugin.
If the Vector DB profile is missing from .agent/learning/vector_profiles.json.
If you need to add a new manifest profile or update which folders are indexed.
If you need to re-scaffold a clean configuration after a plugin upgrade.

Default: In-Process (Filesystem) Mode

Vector-db runs In-Process by default — ChromaDB persists directly to a local directory (configured as chroma_data_path in vector_profiles.json). No server process is needed.

When running ingest.py or query.py you will see:

[WARN] Failed to connect to remote ChromaDB ... Falling back to local.
[DIR] Connecting to local persistent ChromaDB at .agent/learning/vector_wiki_db...

These messages are expected. The scripts first probe for a remote server at 127.0.0.1:8110 in case one is running, then fall back to the local store when the probe fails. Switch to server mode (the vector-db-launch skill) only if you need multiple concurrent writers.
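
The probe-then-fall-back behaviour can be approximated with a plain TCP check. This is only a sketch: ingest.py and query.py may detect the server differently, and `server_reachable` and `choose_mode` are hypothetical names. The default port 8110 is taken from the text above.

```python
import socket


def server_reachable(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_mode(chroma_host="", port=8110):
    """Probe the configured host (or localhost when empty) and pick a mode.

    Assumption: an empty chroma_host still triggers a localhost probe,
    matching the log output shown above.
    """
    host = chroma_host or "127.0.0.1"
    return "server" if server_reachable(host, port) else "in-process"
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the listener is a ChromaDB server.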

Step 0: Install Dependencies (MANDATORY — do this first)

Before anything else, install the plugin's Python dependencies from the lockfile.

Run from the project root:

# Regenerate the lockfile from the intent file (only needed when requirements.in changes):
python -m piptools compile plugins/vector-db/requirements.in \
    --output-file plugins/vector-db/requirements.txt

# Install all dependencies (always run this on first setup):
python -m pip install -r plugins/vector-db/requirements.txt

Note: pip-tools itself must be installed first if not present:

python -m pip install pip-tools

Known gotcha: The system pip command may not be available on macOS. Always use python -m pip install ... rather than bare pip install ....

Verify the critical packages are installed:

python -c "import chromadb; print('chromadb:', chromadb.__version__)"
python -c "import einops; print('einops: OK')"
python -c "from sentence_transformers import SentenceTransformer; print('sentence-transformers: OK')"

If any check fails with an ImportError, re-run the install step above and repeat the check.
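
The three checks can also be run in one pass. The sketch below uses `importlib.util.find_spec` from the standard library, which locates a package without importing it; the helper name `missing_packages` is hypothetical. Unlike the commands above, it does not report versions or catch packages that are present but fail on import.

```python
import importlib.util

# Import names of the packages verified above.
REQUIRED = ["chromadb", "einops", "sentence_transformers"]


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` means all three packages were found.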

Step 1: Setup Mode Selection

Ask this after dependencies are installed.

First, check what other plugins are installed:

[ -d .agents/skills/rlm-init ]              && echo "rlm-factory: INSTALLED"          || echo "rlm-factory: NOT FOUND"
[ -d .agents/skills/obsidian-wiki-builder ] && echo "obsidian-wiki-engine: INSTALLED" || echo "obsidian-wiki-engine: NOT FOUND"
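
If the check needs to run from Python rather than the shell, a minimal equivalent might look like the following. The directory names come from the commands above; `PLUGIN_MARKERS` and `detect_plugins` are hypothetical names.

```python
from pathlib import Path

# Skill directories that the shell checks above look for, keyed by plugin name.
PLUGIN_MARKERS = {
    "rlm-factory": ".agents/skills/rlm-init",
    "obsidian-wiki-engine": ".agents/skills/obsidian-wiki-builder",
}


def detect_plugins(root="."):
    """Map each plugin name to True if its marker directory exists under root."""
    root = Path(root)
    return {name: (root / rel).is_dir() for name, rel in PLUGIN_MARKERS.items()}
```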

Then ask:

Vector DB works standalone with zero external dependencies. You can also combine it with
other plugins for a more powerful retrieval stack. What setup would you like?

  A) Vector DB only (standalone)
     - Semantic search over any indexed folders
     - No other plugins needed — works right now

  B) Vector DB + RLM Phase 1 pre-filter                [requires: rlm-factory in .agents/]
     - RLM keyword pre-filter -> vector semantic search
     - Reduces noise, improves precision for large corpora

  C) Vector DB as wiki Phase 2 search                  [requires: obsidian-wiki-engine in .agents/]
     - Adds vector semantic search to /wiki-query
     - /wiki-query: RLM keyword (O(1)) -> vector (O(log N)) -> grep exact

  D) Full Super-RAG                                    [requires: rlm-factory + obsidian-wiki-engine]
     - All three phases: RLM keyword -> vector semantic -> wiki concept nodes

Enter A, B, C, or D (default: A):

If required plugins are NOT installed for the chosen mode:

[plugin-name] is not installed in .agents/.

To install it:

  # Recommended (uvx -- works on Mac, Linux, Windows)
  uvx --from git+https://github.com/richfrem/agent-plugins-skills plugin-add richfrem/agent-plugins-skills

  # See full install guide
  cat INSTALL.md

After installing, re-run /vector-db:init and choose your desired mode.

Continue with Mode A (standalone) for now? (y) or abort and install first? (n)
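
The mode requirements listed in the prompt reduce to a small lookup. The sketch below normalizes the user's answer, applies the default of A, and reports which required plugins are missing. `MODE_REQUIRES` and `resolve_mode` are hypothetical names, and `installed` is assumed to be a dict of plugin name to bool.

```python
# Plugins each setup mode depends on, as listed in the prompt above.
MODE_REQUIRES = {
    "A": [],
    "B": ["rlm-factory"],
    "C": ["obsidian-wiki-engine"],
    "D": ["rlm-factory", "obsidian-wiki-engine"],
}


def resolve_mode(answer, installed):
    """Return (mode, missing_plugins); an empty answer selects the default, A."""
    mode = (answer or "").strip().upper() or "A"
    if mode not in MODE_REQUIRES:
        raise ValueError(f"Unknown mode: {answer!r}")
    missing = [p for p in MODE_REQUIRES[mode] if not installed.get(p, False)]
    return mode, missing
```

A non-empty `missing` list is the cue to show the install instructions and offer Mode A as the fallback.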

Step 1: Guided Source Discovery

Scan the project root and present a numbered table of candidate directories:

find . -maxdepth 1 -type d | grep -v '^\.$' | grep -v -E '\.(git|venv|vscode|windsurf|claude|agents|agent|knowledge_vector_data|wiki|vector_data)$' | sort
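
For environments without find and grep, a rough Python approximation of the scan is sketched below. It skips exact directory names, whereas the grep pattern above excludes any path ending in one of those suffixes, so results can differ for names such as `foo.git`. `SKIP`, `candidate_dirs`, and `numbered_table` are hypothetical names.

```python
from pathlib import Path

# Directory names skipped by the scan; derived from the grep pattern above.
SKIP = {
    ".git", ".venv", ".vscode", ".windsurf", ".claude", ".agents", ".agent",
    ".knowledge_vector_data", ".wiki", ".vector_data",
}


def candidate_dirs(root="."):
    """Return sorted top-level directory names under root, excluding SKIP."""
    return sorted(
        p.name for p in Path(root).iterdir() if p.is_dir() and p.name not in SKIP
    )


def numbered_table(names):
    """Format directory names as a simple numbered list, one per line."""
    return "\n".join(f"{i:>2}  {name}/" for i, name in enumerate(names, start=1))
```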

Present results as a numbered table with a one-line description of each folder. Then ask:

Which folders should be treated as raw content sources for vector indexing?

  Enter numbers separated by commas (e.g. 1, 3, 5)
  or type custom paths (relative or absolute)
  or both (e.g. 1, 2, /path/to/other/dir)

  You can specify all sources now in one go.

Resolve all selected paths to their relative form from the project root (e.g. plugins/, plugin-research/). Validate each path exists. Warn if a path does not exist -- ask the user to confirm or skip it.
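
One way to handle a mixed reply of numbers and paths is sketched below. It splits the answer on commas, maps numbers to the candidate list, converts everything to a path relative to the project root, and separates entries that do not exist so the user can confirm or skip them. `parse_selection` is a hypothetical helper, not part of the plugin.

```python
import os


def parse_selection(answer, candidates, root="."):
    """Split a reply such as '1, 3, /abs/dir' into (resolved, missing).

    Both lists hold paths relative to root, except out-of-range numbers,
    which are reported in missing as typed.
    """
    root = os.path.abspath(root)
    resolved, missing = [], []
    for token in (t.strip() for t in answer.split(",")):
        if not token:
            continue
        if token.isdigit():
            index = int(token) - 1  # the table shown to the user is 1-based
            if not 0 <= index < len(candidates):
                missing.append(token)
                continue
            path = os.path.join(root, candidates[index])
        else:
            # os.path.join keeps an absolute token unchanged
            path = os.path.abspath(os.path.join(root, token))
        rel = os.path.relpath(path, root)
        (resolved if os.path.isdir(path) else missing).append(rel)
    return resolved, missing
```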

Then ask once, globally:

Any subdirectory patterns or file types to exclude beyond the defaults?
Defaults: .git/, node_modules/, .venv/, __pycache__/, requirements.in, requirements.txt

Press Enter to accept defaults, or type additions (e.g. temp/, *.tmp):

Step 2: Confirm and Write Manifest

Display the complete manifest before writing, using the same flat schema as rlm-factory and obsidian-wiki-engine:

{
  "description": "Globs tracking project documentation and knowledge records.",
  "include": [
    "<folder_1>/",
    "<folder_2>/"
  ],
  "exclude": [
    "/.git/",
    "/node_modules/",
    "/.venv/",
    "/__pycache__/",
    "requirements.in",
    "requirements.txt"
  ]
}

Ask: "Does this look correct? (y to write, e to edit, q to abort)"

If .agent/learning/vector_knowledge_manifest.json already exists:

Ask: "A manifest already exists. Overwrite, merge (add new includes only), or abort? (o/m/q)"
Merge: append new paths to the existing include array; never remove existing entries
Overwrite: replace entirely with the new manifest

Write to: .agent/learning/vector_knowledge_manifest.json

Create parent directories if needed.
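
The merge and write rules can be sketched as follows. Merge keeps every existing entry and appends only include paths not already present; writing creates the parent directory first. `merge_manifest` and `write_manifest` are hypothetical helpers, and leaving the existing `exclude` and `description` untouched during a merge is an assumption drawn from the "add new includes only" wording.

```python
import json
from pathlib import Path


def merge_manifest(existing, new):
    """Merge mode: keep existing entries, append only unseen include paths."""
    merged = dict(existing)
    includes = list(existing.get("include", []))
    for path in new.get("include", []):
        if path not in includes:
            includes.append(path)
    merged["include"] = includes
    return merged


def write_manifest(manifest, path):
    """Write the manifest as JSON, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
```

Overwrite mode needs no merge step: pass the new manifest straight to `write_manifest`.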

Note on manifest naming: vector_profiles.json may reference vector_wiki_manifest.json (legacy name). The canonical filename going forward is vector_knowledge_manifest.json. If the profile still points to the old name, update the manifest field in vector_profiles.json to match.

Step 3: Scaffold Profile and Install Dependencies

After the manifest is confirmed, run the init script which handles profile scaffolding and dependency installation:

python ./scripts/init.py

The script will:

Install Python dependencies from the lockfile (requirements.txt)
Scaffold or update .agent/learning/vector_profiles.json with the wiki profile
Set chroma_host: "" (In-Process mode by default — no server needed)

After the script runs, verify the profile's manifest field points to vector_knowledge_manifest.json. If it still shows vector_wiki_manifest.json, update it:

{
  "version": 2,
  "profiles": {
    "wiki": {
      "manifest": ".agent/learning/vector_knowledge_manifest.json"
    }
  },
  "default_profile": "wiki"
}
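
If several profiles still carry the legacy name, the rename can be applied to the loaded JSON in one pass. The sketch below is an illustration only: `fix_manifest_field` is a hypothetical helper, and it assumes every profile stores its manifest path under a `manifest` key as in the example above.

```python
import copy

LEGACY = "vector_wiki_manifest.json"
CANONICAL = "vector_knowledge_manifest.json"


def fix_manifest_field(config):
    """Return a copy of the profiles config with legacy manifest names replaced."""
    updated = copy.deepcopy(config)
    for profile in updated.get("profiles", {}).values():
        manifest = profile.get("manifest", "")
        if manifest.endswith(LEGACY):
            # Keep the directory part and swap only the filename.
            profile["manifest"] = manifest[: -len(LEGACY)] + CANONICAL
    return updated
```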

Step 4: Verify and Show Next Steps

Confirm the files written, then print:

=== Vector DB Setup Complete (Mode <X>) ===

Files written:
  - .agent/learning/vector_knowledge_manifest.json  (<N> sources)
  - .agent/learning/vector_profiles.json            (wiki profile ready)

Next steps:
  /vector-db:ingest   <- build the semantic index from your sources
  /vector-db:search   <- run semantic queries
  /vector-db:audit    <- check index coverage

[Mode B/C/D] To activate the full retrieval stack:
  /rlm-factory:init   <- set up RLM Phase 1 keyword pre-filter
  /wiki-init          <- set up wiki concept node layer
