.claude/skills/do-search/SKILL.md
Implement search indexing for a domain using OpenSearch
Implement OpenSearch indexing for a domain. This PR is auditable as: "Is the search indexing correct?"
Depends on: do-domain agent PR (internal/domain/<domain>/ must exist with event constants).
All file paths are relative to the chosen project: connect-rpc-backend/ or grpc-backend/.
The user will specify which project. All make commands must be run from the project root.
The user will specify:
- The domain to index (e.g., space, content)

pkg/embed/embed.go: Generic embedder interface
If this is the first domain being indexed, create the embedder package. If it already exists, skip this step.
This package is purely generic — provider-agnostic, no domain knowledge.
package embed

import "context"

// Embedder generates vector embeddings from text.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}
The provider is chosen at wire time in cmd/server/. Implementations live in pkg/embed/:
- pkg/embed/opensearch.go: uses OpenSearch's built-in ML plugin (_plugins/_ml/models/<model_id>/_predict). The model ID is configured via env var.
- pkg/embed/<provider>.go: each returns Embedder from a constructor.
// OpenSearch ML plugin implementation
func NewOpenSearch(address string, modelID string) (Embedder, error) {
	// Creates its own opensearchapi.Client for the ML predict API
	client, err := opensearchapi.NewClient(opensearchapi.Config{Client: opensearch.Config{Addresses: []string{address}}})
	if err != nil {
		return nil, err
	}
	return &openSearchEmbedder{client: client, modelID: modelID}, nil
}
Conventions:
- Embedder interface is public, implementations are private
- Constructors return Embedder (the interface)
- pkg/embed/ must NOT import gen/, internal/, or any domain-specific code

pkg/search/search.go: Generic search client interface and constructor
If this is the first domain being indexed, create the shared search package. If it already exists, skip this step.
This package is purely generic — no domain-specific types, no imports from gen/ or internal/.
It is extractable as a shared module, consistent with the pkg/ layer rule.
package search

import (
	"context"
	"encoding/json"

	"github.com/gofrs/uuid/v5"
)

// Filter represents an exact-match constraint on a keyword or integer field.
type Filter struct {
	Field string
	Value any
}

// Match represents a full-text search on a text field.
type Match struct {
	Field string
	Query string
}

// Vector represents a k-NN vector search on a knn_vector field.
type Vector struct {
	Field  string
	Values []float32
	K      int
}

// Criteria defines a typed search query. The implementation translates this
// into an OpenSearch hybrid query internally:
//   - Filters become term clauses (exact match, no scoring)
//   - Matches become match clauses (full-text, scored)
//   - Vector becomes a k-NN clause (semantic similarity, scored)
//
// Filters, matches, and vector are combined for hybrid search.
type Criteria struct {
	Filters   []Filter
	Matches   []Match
	Vector    *Vector
	PageSize  int32
	PageToken string
}

// Page represents a paginated set of search results.
type Page struct {
	Hits          []Hit
	NextPageToken string
}

// Hit represents a single search result with its raw JSON source.
type Hit struct {
	ID     uuid.UUID
	Score  float32
	Source json.RawMessage
}

// Search defines the interface for indexing, deleting, and querying documents.
type Search interface {
	// Index indexes or updates a document in the given index.
	Index(ctx context.Context, index string, id uuid.UUID, document any) error
	// Delete removes a document from the given index.
	Delete(ctx context.Context, index string, id uuid.UUID) error
	// Find searches an index using typed criteria and returns a paginated result.
	Find(ctx context.Context, index string, criteria Criteria) (*Page, error)
	// CreateIndexIfNotExists ensures an index exists with the given mapping.
	CreateIndexIfNotExists(ctx context.Context, index string, mapping []byte) error
}
Backed by github.com/opensearch-project/opensearch-go/v4/opensearchapi.
The implementation translates Criteria into an OpenSearch query:
- Filter becomes a term clause in filter (exact match, no scoring)
- Match becomes a match clause in must (full-text, scored)
- Vector becomes a knn clause (k-NN similarity search)
- An empty Criteria (no filters, matches, or vector) matches all documents
- PageSize and PageToken map to OpenSearch size and search_after. The token is an opaque encoding of the sort values (consistent with the gRPC list RPC pagination pattern using pkg/pagination)
Constructor:
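func New(address string) (Search, error) {
	// Create the opensearch client with the given address
	client, err := opensearchapi.NewClient(opensearchapi.Config{Client: opensearch.Config{Addresses: []string{address}}})
	if err != nil {
		return nil, err
	}
	// Return the interface (not the struct); the private struct name here is illustrative
	return &openSearch{client: client}, nil
}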
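The opaque page token described above can be sketched as a base64 encoding of the last hit's sort values. `encodePageToken` and `decodePageToken` are hypothetical names; the project's pkg/pagination may already provide an equivalent, in which case that should be preferred.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// encodePageToken (hypothetical) turns the sort values of the last hit
// into an opaque token. No sort values means there is no next page.
func encodePageToken(sortValues []any) (string, error) {
	if len(sortValues) == 0 {
		return "", nil
	}
	raw, err := json.Marshal(sortValues)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodePageToken (hypothetical) recovers the values to pass as
// search_after. An empty token means "start from the first page".
func decodePageToken(token string) ([]any, error) {
	if token == "" {
		return nil, nil
	}
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return nil, fmt.Errorf("invalid page token: %w", err)
	}
	var values []any
	if err := json.Unmarshal(raw, &values); err != nil {
		return nil, fmt.Errorf("invalid page token: %w", err)
	}
	return values, nil
}

func main() {
	token, err := encodePageToken([]any{1.5, "id-1"})
	if err != nil {
		panic(err)
	}
	values, err := decodePageToken(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(token, values)
}
```

Note that JSON numbers decode as float64, which is acceptable for search_after values but worth keeping in mind if the sort includes large integer fields.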
Conventions:
- Search interface is public, implementation struct is private
- Callers build queries with Criteria (Filter, Match, and Vector), never raw JSON. The implementation owns the OpenSearch query DSL translation.
- Pagination follows the PageSize/PageToken pattern, consistent with gRPC list RPCs. The implementation uses OpenSearch search_after for efficient deep pagination. The token is an opaque base64-encoded sort value.
- When Vector is set alongside Matches, the implementation uses OpenSearch's hybrid query to blend lexical and semantic scores.
- The constructor returns (Search, error); the error covers connection/config issues
- Use the opensearchapi client (v4), not the legacy v2 client
- CreateIndexIfNotExists accepts []byte (raw embedded JSON), not string
- Use the zerolog context logger on errors
- pkg/search/ must NOT import gen/, internal/, or any domain-specific code

internal/outbox/<domain>/mappings/: Embedded JSON mapping files
Mappings are standalone .json files loaded via //go:embed, following the same pattern as
sql/migrations/migrations.go. This keeps mappings reviewable, lintable, and out of Go code.
Domain-specific mappings live under the outbox domain package — not in pkg/search/ — because
they are tied to a specific domain's schema and belong in the internal/ layer.
internal/outbox/<domain>/mappings/mappings.go
package mappings
import "embed"
//go:embed *.json
var FS embed.FS
internal/outbox/<domain>/mappings/<domain>.json: One JSON file per index
Plain JSON, one file per domain. The file name matches the domain name (not the index name).
The mapping must distinguish between three field categories:
- keyword: unique identifiers, foreign keys, enum-like values, tags. These are fields users filter or look up by exact value.
- text with analyzer: human-readable text users search within.
- knn_vector: embedding vectors for semantic search. Only include when the entity has text fields worth embedding (e.g., body, description).
Denormalization for cross-domain search: when an entity references a parent via FK,
include the parent's reference fields in the child's mapping so a single query can filter by both
entity and parent criteria. Name denormalized fields with the parent prefix (e.g., <parent>_<field>).
This avoids multi-index fan-out queries.
Cross-reference the SQL schema and proto definitions to identify which fields are references (unique indexes, foreign keys, enums, arrays of labels) vs. searchable (names, titles, descriptions, bodies).
Example for a root entity (no parent FK):
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "<unique_key>": { "type": "keyword" },
      "<name_field>": { "type": "text", "analyzer": "standard" },
      "<text_field>": { "type": "text", "analyzer": "standard" },
      "<enum_field>": { "type": "integer" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "lucene"
        }
      }
    }
  }
}
Example for a child entity (has parent FK). Note denormalized parent fields for cross-domain search:
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "<parent>_id": { "type": "keyword" },
      "<parent>_<ref_field>": { "type": "keyword" },
      "<parent>_<text_field>": { "type": "text", "analyzer": "standard" },
      "<parent>_<enum_field>": { "type": "integer" },
      "<name_field>": { "type": "text", "analyzer": "standard" },
      "<text_field>": { "type": "text", "analyzer": "standard" },
      "<enum_field>": { "type": "integer" },
      "<array_field>": { "type": "keyword" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "lucene"
        }
      }
    }
  }
}
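The `dimension` in the mapping has to agree with the length of the vectors the embedder returns. One way to catch a mismatch early is to read the dimension out of the mapping bytes and compare it with a sample embedding. The sketch below assumes the mapping layout shown in the examples; `embeddingDimension` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingDimension (hypothetical) reads the declared dimension of a
// knn_vector field from a raw index mapping.
func embeddingDimension(mapping []byte, field string) (int, error) {
	var m struct {
		Mappings struct {
			Properties map[string]struct {
				Type      string `json:"type"`
				Dimension int    `json:"dimension"`
			} `json:"properties"`
		} `json:"mappings"`
	}
	if err := json.Unmarshal(mapping, &m); err != nil {
		return 0, err
	}
	prop, ok := m.Mappings.Properties[field]
	if !ok {
		return 0, fmt.Errorf("field %q not found in mapping", field)
	}
	if prop.Type != "knn_vector" {
		return 0, fmt.Errorf("field %q has type %q, want knn_vector", field, prop.Type)
	}
	return prop.Dimension, nil
}

func main() {
	mapping := []byte(`{
	  "settings": {"index": {"knn": true}},
	  "mappings": {"properties": {
	    "title": {"type": "text", "analyzer": "standard"},
	    "embedding": {"type": "knn_vector", "dimension": 1536}
	  }}
	}`)
	dim, err := embeddingDimension(mapping, "embedding")
	if err != nil {
		panic(err)
	}
	sample := make([]float32, 1536) // stands in for an embedder result
	fmt.Println("dimensions match:", dim == len(sample))
}
```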
Conventions:
- File name: <domain>.json
- Do not map id: OpenSearch uses _id (the document ID) natively for lookups by ID.
- Do not map created_at / updated_at unless the domain requires time-range search.
- Do not map deleted_at: soft-deleted entities are removed from the index on delete events.
- Denormalized parent fields use the parent prefix (<parent>_<field>).
- Embedding field: knn_vector with dimension matching the embedder's output size, hnsw method, cosinesimil space type, lucene engine. Include "index": { "knn": true } in settings.
- Field types:
  - Foreign keys: keyword (exact-match filter)
  - Unique keys: keyword (exact-match lookup)
  - Enums: integer (exact-match filter)
  - Arrays of labels: keyword (OpenSearch handles arrays natively)
  - Searchable text: text with standard analyzer
  - Embeddings: knn_vector
  - Booleans: boolean

internal/outbox/<domain>/index.go: Index name, mapping, and document struct
All domain-specific search concerns live in the outbox domain package: the index name constant, the embedded mapping loader, and the document struct with its mapper from sqlc models.
For entities with parent references, the document struct includes denormalized parent fields and the mapper accepts both the entity and its parent model.
package <domain>

import (
	db<schema> "<module>/gen/db/<schema>"
	"<module>/internal/outbox/<domain>/mappings"
)

// Index name: plural lowercase
const IndexName = "<domain>s"

// Mapping loaded from embedded JSON
var IndexMapping = must(mappings.FS.ReadFile("<domain>.json"))

func must(data []byte, err error) []byte {
	if err != nil {
		panic(err)
	}
	return data
}

// EmbeddingField is the mapping field name for the vector embedding.
const EmbeddingField = "embedding"

// <Domain>Document represents the search document for a <domain>.
// Fields match the mapping properties in mappings/<domain>.json exactly.
type <Domain>Document struct {
	// Denormalized parent fields (only when entity has a parent FK)
	<Parent>ID      string `json:"<parent>_id"`
	<Parent><Field> string `json:"<parent>_<field>"`
	// Entity's own fields
	<Field>   string    `json:"<field>"`
	<Enum>    int32     `json:"<enum>"`
	<Array>   []string  `json:"<array>"`
	Embedding []float32 `json:"embedding,omitempty"`
}

// New<Domain>Document maps sqlc models to a search document.
// When the entity has a parent FK, accepts both the entity and its parent.
func New<Domain>Document(
	entity *db<schema>.<Entity>,
	parent *db<schema>.<Parent>, // omit if no parent FK
) <Domain>Document {
	return <Domain>Document{
		<Parent>ID:      entity.<Parent>ID.String(),
		<Parent><Field>: parent.<Field>,
		<Field>:         entity.<Field>,
		<Enum>:          entity.<Enum>,
		<Array>:         entity.<Array>,
	}
}

// EmbeddingText returns the text to embed for this document.
// Concatenate the entity's searchable text fields.
func (d <Domain>Document) EmbeddingText() string {
	return d.<TextField1> + "\n" + d.<TextField2>
}
Conventions:
- Index name is plural lowercase: <domain>s
- IndexMapping is a var (not const) because []byte cannot be a const
- Document struct fields match the <domain>.json mapping file exactly
- Denormalized parent fields use the parent prefix (<Parent><Field> → "<parent>_<field>").
- Embedding is []float32 with omitempty, set by the index worker after calling the embedder. The EmbeddingText() method returns the text to embed (concatenation of the entity's searchable text fields).

internal/outbox/<domain>/event_index.go: Wire index workers to OpenSearch
Update the existing index worker to actually index/delete documents via the search client. The worker needs dependencies for search, queries, and the embedder.
Before (current placeholder):
type IndexWorker struct {
	river.WorkerDefaults[IndexArgs]
}
After:
type IndexDependencies struct {
	Search   search.Search
	Embedder embed.Embedder
	Queries  *db<domain>.Queries
}

type IndexWorker struct {
	river.WorkerDefaults[IndexArgs]
	search   search.Search
	embedder embed.Embedder
	queries  *db<domain>.Queries
}

func NewIndexWorker(deps IndexDependencies) *IndexWorker {
	return &IndexWorker{
		search:   deps.Search,
		embedder: deps.Embedder,
		queries:  deps.Queries,
	}
}
The Work method references constants and types from the same package (index.go):
func (w *IndexWorker) Work(ctx context.Context, job *river.Job[IndexArgs]) error {
	id, err := uuid.FromString(job.Args.<Domain>ID)
	if err != nil {
		return err
	}

	switch job.Args.EventType {
	case <domain>domain.EventCreated, <domain>domain.EventUpdated:
		entity, err := w.queries.Get<Entity>(ctx, id)
		if err != nil {
			return err
		}
		// For entities with parent references, also fetch the parent
		parent, err := w.queries.Get<Parent>(ctx, entity.<Parent>ID)
		if err != nil {
			return err
		}
		doc := New<Domain>Document(&entity, &parent)
		// Generate embedding from searchable text fields
		embedding, err := w.embedder.Embed(ctx, doc.EmbeddingText())
		if err != nil {
			return err
		}
		doc.Embedding = embedding
		return w.search.Index(ctx, IndexName, id, doc)
	case <domain>domain.EventDeleted:
		return w.search.Delete(ctx, IndexName, id)
	default:
		log.Ctx(ctx).Warn().Str("event_type", job.Args.EventType).Msg("unknown event type")
		return nil
	}
}
Key patterns:
- The worker calls embedder.Embed() with the document's searchable text, then sets the embedding before indexing. This keeps embedding off the request path.
- Use the domain event constants (e.g., <domain>domain.EventCreated); do NOT hardcode event type strings.
- IndexName, New<Domain>Document come from index.go in the same package

cmd/server/setup_connections.go: Add search client, embedder, and wire dependencies
Add the search client and embedder to the Connections struct and initialize them:
type Connections struct {
	Pool         *pgxpool.Pool
	RiverClient  *river.Client[pgx.Tx]
	SearchClient search.Search
	Embedder     embed.Embedder
}
Add EmbedModelID to pkg/config/config.go if not already present:
type Config struct {
	DatabaseURL   string
	OpenSearchURL string
	EmbedModelID  string
	ServerAddr    string
}
With EmbedModelID: getEnv("EMBED_MODEL_ID", "") in Load() and EMBED_MODEL_ID in .env.
In setupConnections:
- Create the search client: searchClient, err := search.New(cfg.OpenSearchURL). This returns the Search interface backed by an opensearchapi.Client.
- Create the embedder: embedder, err := embed.NewOpenSearch(cfg.OpenSearchURL, cfg.EmbedModelID). It creates its own OpenSearch HTTP client for the ML predict API.
- Ensure the index exists: searchClient.CreateIndexIfNotExists(ctx, <domain>events.IndexName, <domain>events.IndexMapping)
- Use NewIndexWorker when registering workers
Worker registration changes from:
river.AddWorker(workers, &<domain>events.IndexWorker{})
To:
river.AddWorker(workers, <domain>events.NewIndexWorker(<domain>events.IndexDependencies{
	Search:   searchClient,
	Embedder: embedder,
	Queries:  db<domain>.New(pool),
}))

Do NOT change:
- internal/domain/: the domain layer does not know about search or embeddings
- internal/api/: search is triggered asynchronously via outbox, not synchronously in handlers
- pkg/outbox/: outbox interface is unchanged
- internal/outbox/river.go: event mapping is unchanged (index jobs already created)

Conventions:
- Search and Embedder interfaces are public, implementations are private
- pkg/search/ and pkg/embed/ contain only interfaces and clients, with zero domain knowledge
- IndexDependencies is exported and includes Search, Embedder, and Queries
- Mappings are .json files in internal/outbox/<domain>/mappings/, loaded via //go:embed; never inline JSON in Go code
- Document structs live in internal/outbox/<domain>/ alongside the index worker
- Criteria supports Filters, Matches, and Vector for combined keyword + semantic search
- Find returns *Page with NextPageToken, consistent with gRPC list RPCs
- File layout: index.go for index name + mapping + document struct, event_index.go for the worker, mappings/<domain>.json for mapping definitions
- Indexes are created via CreateIndexIfNotExists during server boot

Dependencies:
- pkg/search/ depends on nothing domain-specific: purely generic, extractable as a shared module
- pkg/embed/ depends on nothing domain-specific: purely generic, provider implementations may depend on opensearch-go
- internal/outbox/<domain>/ depends on: pkg/search/, pkg/embed/, pkg/outbox, gen/db/<domain>, internal/domain/<domain> (for event constants only), river
- internal/outbox/<domain>/mappings/ depends on nothing: pure embedded data
- internal/domain/ must NOT depend on pkg/search/ or pkg/embed/
- internal/api/ must NOT depend on pkg/search/ or pkg/embed/ (search queries will be a separate concern)

Verification:
1. Run go get github.com/opensearch-project/opensearch-go/v4 from the project root
2. Validate the mapping JSON: for f in internal/outbox/<domain>/mappings/*.json; do jq . "$f" > /dev/null || echo "INVALID: $f"; done
3. make vet: fix all compilation errors
4. make build: confirm Docker build works
5. make infra: start infrastructure (OpenSearch must be healthy)
6. make start: create an entity via gRPC/Connect, verify the index worker logs show successful indexing
7. Check the indexed documents: curl http://localhost:9200/<domain>s/_search?pretty
8. make teardown

Checklist:
- pkg/embed/embed.go with Embedder interface
- pkg/embed/opensearch.go with OpenSearch ML plugin implementation
- pkg/embed/ has zero imports from gen/, internal/, or any domain-specific code
- pkg/search/search.go with Search interface, Criteria, Filter, Match, Vector, Page, Hit
- Criteria includes PageSize and PageToken for pagination
- Find returns *Page with NextPageToken (consistent with gRPC list RPCs)
- Criteria.Vector supports optional k-NN search
- pkg/search/ has zero imports from gen/, internal/, or any domain-specific code
- Uses the opensearch-go/v4 client (not legacy v2)
- internal/outbox/<domain>/mappings/mappings.go with //go:embed *.json and exported FS
- internal/outbox/<domain>/mappings/<domain>.json with valid JSON mapping
- Mapping includes a knn_vector field with dimension, hnsw method, cosinesimil, lucene engine
- Mapping settings include "settings": { "index": { "knn": true } }
- Mapping JSON validated with jq
- internal/outbox/<domain>/index.go with IndexName, IndexMapping, EmbeddingField, document struct, mapper, and EmbeddingText()
- Document struct has Embedding []float32 with omitempty
- EmbeddingText() concatenates searchable text fields
- Document struct fields match <domain>.json exactly
- internal/outbox/<domain>/event_index.go updated with IndexDependencies including Embedder
- Worker calls embedder.Embed() before indexing on create/update
- Create/update events → search.Index(), delete events → search.Delete()
- setup_connections.go creates search client, embedder, and passes both to index workers
- setup_connections.go calls CreateIndexIfNotExists on startup
- No imports of pkg/search/ or pkg/embed/ in internal/domain/ or internal/api/
- make vet passes
- make build succeeds