Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

synalinks/synalinks-knowledge

Name: synalinks-knowledge
Author: synalinks

synalinks-knowledge/SKILL.md

npx skillsauth add synalinks/synalinks-skills synalinks-knowledge

Synalinks Knowledge & RAG

Build retrieval-augmented programs over a unified DuckDB knowledge base. Supports BM25 full-text, vector similarity, and hybrid (RRF) search.

Quick Start

import synalinks
import asyncio

class Document(synalinks.DataModel):
    id: str = synalinks.Field(description="Document ID")
    title: str = synalinks.Field(description="Document title")
    content: str = synalinks.Field(description="Document content")

class Query(synalinks.DataModel):
    query: str = synalinks.Field(description="User query")

class Answer(synalinks.DataModel):
    answer: str = synalinks.Field(description="Answer based on retrieved context")

async def main():
    lm = synalinks.LanguageModel(model="openai/gpt-4o-mini")
    em = synalinks.EmbeddingModel(model="openai/text-embedding-3-small")

    knowledge_base = synalinks.KnowledgeBase(
        uri="duckdb://./docs.db",
        data_models=[Document],
        embedding_model=em,
        metric="cosine",
    )

    inputs = synalinks.Input(data_model=Query)
    context = await synalinks.RetrieveKnowledge(
        knowledge_base=knowledge_base,
        language_model=lm,
        search_type="hybrid",
        k=5,
        return_inputs=True,
    )(inputs)

    answer = await synalinks.Generator(
        data_model=Answer,
        language_model=lm,
        instructions="Answer based on the retrieved context only. If irrelevant, say you don't know.",
    )(context)

    rag = synalinks.Program(inputs=inputs, outputs=answer, name="rag_qa")
    result = await rag(Query(query="What is Python?"))
    print(result.prettify_json())

asyncio.run(main())

KnowledgeBase

knowledge_base = synalinks.KnowledgeBase(
    uri="duckdb://./my_database.db",     # or duckdb://:memory:
    data_models=[Document, Invoice],     # one table per DataModel
    embedding_model=em,                  # optional, required for similarity/hybrid
    metric="cosine",                     # "cosine" | "l2seq" | "ip"
    wipe_on_start=False,                 # set to True to clear the DB on init
)

The first field of each DataModel is the primary key.

Knowledge Modules

EmbedKnowledge

Generate embeddings for a DataModel so it can be searched by similarity.

embedded = await synalinks.EmbedKnowledge(
    embedding_model=em,
    in_mask=["content"],   # keep only fields to embed
    # out_mask=["id"],     # OR exclude fields
)(inputs)

After masking, exactly one field should remain — the field that gets embedded.
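
The masking rule can be illustrated without the framework. The sketch below is plain Python, not Synalinks code: `apply_mask` and `field_to_embed` are hypothetical helpers. They assume that `in_mask` keeps only the named fields, that `out_mask` drops the named fields, and that the two are alternatives.

```python
def apply_mask(record, in_mask=None, out_mask=None):
    """Hypothetical helper: keep fields in in_mask, or drop fields in out_mask."""
    if in_mask is not None and out_mask is not None:
        raise ValueError("pass in_mask or out_mask, not both")
    if in_mask is not None:
        return {k: v for k, v in record.items() if k in in_mask}
    if out_mask is not None:
        return {k: v for k, v in record.items() if k not in out_mask}
    return dict(record)


def field_to_embed(record, in_mask=None, out_mask=None):
    """Return the (name, value) pair left after masking; insist on exactly one."""
    remaining = apply_mask(record, in_mask=in_mask, out_mask=out_mask)
    if len(remaining) != 1:
        raise ValueError(
            f"expected exactly one field after masking, got {sorted(remaining)}"
        )
    return next(iter(remaining.items()))


doc = {"id": "doc-1", "title": "Intro", "content": "Python is a programming language."}
name, text = field_to_embed(doc, in_mask=["content"])
```

With `out_mask=["id"]` alone, both `title` and `content` would remain and this helper would raise, which is why the `Document` example uses `in_mask=["content"]`.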

UpdateKnowledge

Upsert a record into the knowledge base. Uses the DataModel's first field as the primary key.

stored = await synalinks.UpdateKnowledge(
    knowledge_base=knowledge_base,
)(extracted_data)
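
Upsert means insert when the key is new and overwrite when it already exists. The toy illustration below runs in memory and assumes the key is the first declared field. `ToyStore` and the `Invoice` dataclass with its `invoice_id` field are hypothetical stand-ins, not the DuckDB-backed implementation.

```python
from dataclasses import dataclass, fields, asdict


@dataclass
class Invoice:
    invoice_id: str  # first declared field, treated as the primary key
    total: float


class ToyStore:
    """Hypothetical in-memory stand-in for a table keyed by the first field."""

    def __init__(self):
        self.rows = {}

    def upsert(self, record):
        key_name = fields(record)[0].name  # fields() preserves declaration order
        key = getattr(record, key_name)
        existed = key in self.rows
        self.rows[key] = asdict(record)  # insert, or overwrite the existing row
        return "updated" if existed else "inserted"


store = ToyStore()
first = store.upsert(Invoice("INV-1", 90.0))
second = store.upsert(Invoice("INV-1", 120.0))  # same key: replaces the row
```

The practical consequence is that reordering fields in a DataModel changes which field acts as the key, so the key field is best declared first and left there.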

RetrieveKnowledge

LM-driven retrieval — the LM generates a search query, then the knowledge base is queried.

results = await synalinks.RetrieveKnowledge(
    knowledge_base=knowledge_base,
    language_model=lm,
    search_type="hybrid",   # "similarity" | "fulltext" | "hybrid"
    k=10,
    return_inputs=True,     # forward inputs alongside retrieved context
    return_query=True,      # include the LM-generated query in output
)(inputs)

Search Types

| Type | Backend | When to use |
|------|---------|-------------|
| fulltext | DuckDB BM25 | Exact terms, codes, IDs, named entities |
| similarity | Vector (cosine/l2seq/ip) | Semantic matches, paraphrases |
| hybrid | Reciprocal Rank Fusion of both | Default; best general-purpose choice |
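
Reciprocal Rank Fusion scores each document by summing 1 / (k_rank + rank) over every ranking it appears in. The sketch below is a plain-Python illustration of that standard formula, not the Synalinks or DuckDB implementation. The function name `rrf_fuse` and the sample IDs are hypothetical, and it assumes `k_rank` is the smoothing constant of the formula.

```python
from collections import defaultdict


def rrf_fuse(rankings, k_rank=60, k=10):
    """Fuse ranked ID lists with Reciprocal Rank Fusion; return the top k IDs."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k_rank + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:k]


fulltext_hits = ["a", "b", "c"]    # e.g. BM25 order
similarity_hits = ["c", "a", "d"]  # e.g. vector-similarity order
fused = rrf_fuse([fulltext_hits, similarity_hits])
```

Here `a` comes first because it ranks highly in both lists, while `b` and `d` each appear in only one list and land at the bottom. A larger `k_rank` flattens the difference between adjacent ranks.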

Direct Search Methods

# Full-text (BM25)
results = await knowledge_base.fulltext_search("query", k=10)

# Vector similarity
results = await knowledge_base.similarity_search("query", k=10)

# Hybrid (RRF)
results = await knowledge_base.hybrid_search("query", k=10, k_rank=60)

# Lookup by primary key
record = await knowledge_base.get("id_value")

# Paginated scan
records = await knowledge_base.getall(
    Document.to_symbolic_data_model(),
    limit=50,
    offset=0,
)

# Raw SQL (params is a list bound to ? placeholders, in order)
results = await knowledge_base.query(
    "SELECT * FROM Invoice WHERE total > ?",
    params=[100.0],
)
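
The `limit` / `offset` pair lends itself to a paging loop. The helper below is a sketch and not part of Synalinks: `iter_pages` and `collect` are hypothetical names, and `fetch_page` stands for any coroutine that accepts `limit` and `offset`. It assumes that a short or empty page signals the end of the data. A fake fetcher is used so the block runs without a database.

```python
import asyncio


async def iter_pages(fetch_page, limit=50):
    """Yield pages from fetch_page(limit=..., offset=...) until a short page."""
    offset = 0
    while True:
        page = await fetch_page(limit=limit, offset=offset)
        if page:
            yield page
        if len(page) < limit:
            break  # short or empty page: nothing left to read
        offset += limit


async def collect(fetch_page, limit):
    """Flatten all pages into one list."""
    rows = []
    async for page in iter_pages(fetch_page, limit=limit):
        rows.extend(page)
    return rows


# Stand-in data source so the sketch runs without a knowledge base.
FAKE_ROWS = [f"doc-{i}" for i in range(7)]


async def fake_fetch(limit, offset):
    return FAKE_ROWS[offset:offset + limit]


all_rows = asyncio.run(collect(fake_fetch, limit=3))
```

Against a real knowledge base, `fetch_page` could presumably wrap the `getall` call shown above, forwarding `limit` and `offset` unchanged.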

Extraction → Storage Pipeline

Pipe a Generator into UpdateKnowledge to extract structured data and persist it:

class DocumentText(synalinks.DataModel):
    text: str = synalinks.Field(description="Raw document text")

inputs = synalinks.Input(data_model=DocumentText)

extracted = await synalinks.Generator(
    data_model=Invoice,
    language_model=lm,
    instructions="Extract invoice information from the document.",
)(inputs)

stored = await synalinks.UpdateKnowledge(knowledge_base=knowledge_base)(extracted)

ingest = synalinks.Program(inputs=inputs, outputs=stored, name="invoice_ingest")

Default EmbeddingModel

When embedding_model=None is passed to EmbedKnowledge / RetrieveKnowledge / KnowledgeBase (or ops.embedding), the framework resolves the default at call time:

synalinks.set_default_embedding_model("openai/text-embedding-3-small")
# String identifiers persist into the on-disk config; instances do not.

# Later, anywhere in the program:
em = synalinks.default_embedding_model()  # returns the configured instance, or None

EmbeddingModel accepts **default_kwargs forwarded to every call, and fallback= accepts a string, dict, or EmbeddingModel instance:

em = synalinks.EmbeddingModel(
    model="openai/text-embedding-3-small",
    dimensions=512,                          # forwarded to litellm.aembedding
    fallback="ollama/mxbai-embed-large",     # str / dict / EmbeddingModel
)

Keyword-only Arguments

The knowledge modules (EmbedKnowledge, UpdateKnowledge, RetrieveKnowledge, StampKnowledge) all use keyword-only constructors (def __init__(self, *, ...)). Always pass arguments by name:

# Correct
synalinks.UpdateKnowledge(knowledge_base=kb)

# Wrong — TypeError
synalinks.UpdateKnowledge(kb)
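
The bare `*` in a signature is what makes every following parameter keyword-only. A minimal stand-alone illustration (the `Module` class here is hypothetical, not a Synalinks class):

```python
class Module:
    """Hypothetical class with a keyword-only constructor."""

    def __init__(self, *, knowledge_base=None):
        self.knowledge_base = knowledge_base


ok = Module(knowledge_base="kb")  # accepted: passed by name

try:
    Module("kb")  # rejected: positional argument
    error = None
except TypeError as exc:
    error = exc
```

Python raises the `TypeError` at call time, before any constructor body runs, so the mistake surfaces as soon as the module is instantiated.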

Best Practices

- Write specific field descriptions: they shape what the LM extracts and how RetrieveKnowledge generates queries.
- First field = primary key: design DataModels accordingly.
- Use hybrid search by default: RRF fuses the BM25 and vector rankings, so a document that ranks well under either signal can still surface.
- Batch ingestion with program.predict(...): see synalinks-training.
- Combine with agents: pass tools=[synalinks.Tool(knowledge_base.fulltext_search), ...] to a FunctionCallingAgent for tool-driven retrieval (see synalinks-agents).

References

references/knowledge-base.md — Complete API, all search methods, full RAG example

Use when working with Synalinks KnowledgeBase (DuckDB-backed), EmbedKnowledge, UpdateKnowledge, RetrieveKnowledge, StampKnowledge, RAG pipelines, hybrid / fulltext / similarity search, default-EmbeddingModel configuration, or document extraction-and-storage flows.

895 stars

devops

Updated May 9, 2026

npx skillsauth add synalinks/synalinks-skills synalinks-knowledge

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Clean (95%): Trivy, container and dependency vulnerability scanner
- Clean (95%): Semgrep, static code analysis for vulnerabilities
- Clean (95%): mcp-scan (Snyk), Model Context Protocol security validation
- Skipped (50%): Snyk (dep), open source security scanning
- Skipped (50%): Socket.dev, supply chain security analysis
- Skipped (50%): VirusTotal, multi-engine malware detection
- Skipped (50%): CrowdStrike, advanced threat intelligence
- Skipped (50%): OSV-Scanner, Open Source Vulnerability database check
- Skipped (50%): OWASP Dep-Check

Last scanned: May 9, 2026, 5:43 AM (118.7s, 3 files scanned)

Related Skills

synalinks/synalinks-training

development

Use when training Synalinks programs: program.compile() / fit() / evaluate() / predict(), validation_split, validation_data, batch_size, epochs, callbacks (ProgramCheckpoint, custom Callback subclasses), History, in-context reinforcement learning workflow. For reward functions see synalinks-rewards; for optimizer internals see synalinks-optimizers.

895 stars, SKILL.md, updated May 9, 2026

synalinks/synalinks-rewards

development

Use when configuring or writing Synalinks reward functions and metrics: ExactMatch, CosineSimilarity, LMAsJudge, ProgramAsJudge, RewardFunctionWrapper, custom reward functions (async, register_synalinks_serializable), in_mask / out_mask filtering, F1Score / FBetaScore / BinaryF1Score / ListF1Score metrics, MeanMetricWrapper, or whenever you're shaping the signal that drives optimization.

895 stars, SKILL.md, updated May 9, 2026

synalinks/synalinks-providers

tools

Use when integrating Synalinks with LM providers: picking the right model prefix (openai/, anthropic/, ollama/, groq/, cohere/, openrouter/, bedrock/, deepseek/, together_ai/, doubleword/, hosted_vllm/ (alias vllm/), gemini/, xai/, mistral/, azure/), env vars per provider, structured-output dispatch (constrained json_schema vs tool-call), local OpenAI-compatible servers (LMStudio, vLLM) requiring litellm.register_model and a dummy OPENAI_API_KEY, and OpenRouter embeddings (LiteLLM doesn't support them; use OpenRouterEmbeddingModel).

895 stars, SKILL.md, updated May 9, 2026

synalinks/synalinks-programs

development

Use when building or composing a Synalinks Program: the four building APIs (Functional, Sequential, Subclassing, Mixed), Input nodes, multi-input/multi-output graphs, the call/build lifecycle, training=True/False semantics, summary, get_module, plot_program, save/load, get_state_tree/set_state_tree, get_config/from_config and custom serialization. For DataModel/Field, JSON operators (+ & | ^ ~), and LanguageModel/EmbeddingModel basics see synalinks-core. For inner modules see synalinks-modules; for compile/fit/evaluate/predict see synalinks-training.

895 stars, SKILL.md, updated May 9, 2026

Install manually

# Clone the repo
git clone https://github.com/synalinks/synalinks-skills.git

# Copy into Claude Code skills folder (global)
cp -r synalinks-skills/synalinks-knowledge ~/.claude/skills/

Repository

synalinks/synalinks-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
