src/skills/api-vector-db-pinecone/SKILL.md
Pinecone serverless vector database -- index management, vector operations, metadata filtering, namespaces, hybrid search, inference API
npx skillsauth add agents-inc/skills api-vector-db-pineconeInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Quick Guide: Use
@pinecone-database/pinecone(v7.x) for serverless vector database operations. Target indexes by host (pc.index({ host })), not by name. Use namespaces for multi-tenant isolation (physically separate, cheaper queries). Batch upserts at 200 records (max 1,000 or 2 MB). Metadata is limited to 40 KB per record with flat key-value pairs only (no nested objects). Pinecone is eventually consistent -- vectors may not appear in queries immediately after upsert. UsedescribeIndexStats()to verify indexing progress. For hybrid search, usedotproductmetric with sparse+dense vectors in a single index.
<critical_requirements>
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering,
import type, named constants)
(You MUST target indexes by host URL, not by name -- pc.index({ host }) is the v7 API; pc.index('name') is deprecated)
(You MUST batch upserts to max 1,000 records or 2 MB per request -- exceeding either limit causes a 400 error)
(You MUST use flat key-value metadata only -- nested objects, null values, and keys starting with $ are rejected by Pinecone)
(You MUST handle eventual consistency -- vectors are not queryable immediately after upsert; use describeIndexStats() or retry logic for freshness-critical flows)
</critical_requirements>
Additional resources:
Auto-detection: Pinecone, @pinecone-database/pinecone, createIndex, createIndexForModel, upsert, query, topK, includeMetadata, sparseValues, namespace, describeIndexStats, vector database, similarity search, embedding, cosine, dotproduct, euclidean, RAG retrieval, semantic search, pinecone-sparse-english, rerank, searchRecords, upsertRecords, fetchByMetadata
When to use:
Key patterns covered:
When NOT to use:
Pinecone is a managed serverless vector database purpose-built for similarity search at scale. The core principle: store embeddings and metadata, query by vector similarity, filter by metadata.
Core principles:
describeIndexStats() before querying.Create a Pinecone client from an API key. See examples/core.md for full examples.
// Good Example
import { Pinecone } from "@pinecone-database/pinecone";
function createPineconeClient(): Pinecone {
const apiKey = process.env.PINECONE_API_KEY;
if (!apiKey) {
throw new Error("PINECONE_API_KEY environment variable is required");
}
return new Pinecone({ apiKey });
}
export { createPineconeClient };
Why good: API key from environment variable, validation before construction, named export
// Bad Example
import { Pinecone } from "@pinecone-database/pinecone";
const pc = new Pinecone({ apiKey: "sk-abc123..." });
// Hardcoded key leaks in version control
Why bad: Hardcoded API key is a security risk, no validation
Always target an index by its host URL, not its name. See examples/core.md.
// Good Example -- target by host
const indexModel = await pc.createIndex({
name: "products",
dimension: EMBEDDING_DIMENSION,
metric: "cosine",
spec: { serverless: { cloud: "aws", region: "us-east-1" } },
});
const index = pc.index({ host: indexModel.host });
Why good: pc.index({ host }) is the v7 API, avoids an extra API call to resolve the name to a host
// Bad Example -- target by name (deprecated)
const index = pc.index("products");
// Triggers an extra describeIndex call to resolve the host URL
Why bad: Targeting by name requires an extra network call and is deprecated in v7
Upsert vectors with flat metadata for filtering. See examples/core.md for typed metadata.
// Good Example
interface DocumentMetadata {
title: string;
category: string;
createdAt: number; // Unix timestamp (numbers only, no Date objects)
}
const NAMESPACE = "articles";
await index.namespace(NAMESPACE).upsert({
records: [
{
id: "doc-1",
values: embedding, // number[] matching index dimension
metadata: { title: "Guide", category: "tutorial", createdAt: 1710000000 },
},
],
});
Why good: Typed metadata interface, flat key-value pairs, numeric timestamp (not Date), namespace isolation
Query for similar vectors with metadata filtering. See examples/metadata-filtering.md for all operators.
// Good Example
const TOP_K = 10;
const results = await index.namespace(NAMESPACE).query({
vector: queryEmbedding,
topK: TOP_K,
includeMetadata: true,
filter: {
$and: [
{ category: { $eq: "tutorial" } },
{ createdAt: { $gte: 1700000000 } },
],
},
});
for (const match of results.matches) {
console.log(match.id, match.score, match.metadata);
}
Why good: Named constant for topK, structured filter with $and, includes metadata in response
// Bad Example
const results = await index.query({
vector: queryEmbedding,
topK: 100,
includeMetadata: true,
filter: { tags: ["a", "b"] }, // INVALID: arrays are not valid filter values
});
Why bad: Missing namespace (queries default namespace), array filter syntax is invalid (use $in), no named constant for topK
Use namespaces for tenant isolation. See examples/namespaces.md.
// Good Example -- physically isolated tenant data
function getTenantIndex(pc: Pinecone, host: string, tenantId: string) {
return pc.index({ host }).namespace(`tenant-${tenantId}`);
}
// Each tenant's queries scan only their namespace
const tenantIndex = getTenantIndex(pc, INDEX_HOST, "acme-corp");
const results = await tenantIndex.query({ vector: embedding, topK: TOP_K });
Why good: Physical isolation per tenant, queries scan only the target namespace (lower cost and latency)
// Bad Example -- metadata filtering for multi-tenancy
await index.query({
vector: embedding,
topK: 10,
filter: { tenantId: { $eq: "acme-corp" } },
// Scans ENTIRE index, filters after -- expensive at scale
});
Why bad: Metadata filtering scans the full namespace regardless of filter selectivity, cost scales with total data not tenant data
Generate embeddings and rerank results. See examples/inference.md.
// Good Example -- embed text
const embedResult = await pc.inference.embed({
model: "multilingual-e5-large",
inputs: [{ text: "What is machine learning?" }],
parameters: { inputType: "query", truncate: "END" },
});
const queryVector = embedResult.data[0].values;
Why good: Specifies inputType (query vs passage), handles truncation for long inputs
// Good Example -- rerank results
const rerankResult = await pc.inference.rerank({
model: "pinecone-rerank-v0",
query: "machine learning basics",
documents: results.matches.map((m) => ({
id: m.id,
text: m.metadata?.content as string,
})),
topN: 5,
returnDocuments: true,
});
Why good: Reranks query results for better relevance, limits output with topN
<decision_framework>
Which Pinecone index type should I use?
|-- Serverless? (recommended for most use cases)
| |-- Variable or unpredictable traffic? -> Serverless (auto-scales, pay-per-use)
| |-- Starting a new project? -> Serverless (simpler, no capacity planning)
| '-- Need hybrid sparse-dense search? -> Serverless with dotproduct metric
|
'-- Pod-based? (legacy, specific needs)
|-- Need guaranteed low latency SLAs? -> Pod-based (dedicated compute)
'-- Using collections for snapshots? -> Pod-based (collections are pod-only)
Which distance metric should I use?
|-- Using embeddings from a language model? -> cosine (normalized, most common)
|-- Need hybrid search (sparse + dense)? -> dotproduct (REQUIRED for hybrid)
|-- Comparing raw feature vectors? -> euclidean (absolute distance matters)
'-- Unsure? -> cosine (safe default for most embedding models)
How should I isolate tenant data?
|-- Strict data isolation required? -> Namespaces (physical separation)
|-- Need to query across tenants? -> Metadata filtering (logical separation)
|-- Cost-sensitive at scale? -> Namespaces (query cost = tenant size, not total)
|-- Few tenants (< 10)? -> Either approach works
'-- Many tenants (100+)? -> Namespaces (metadata filtering scans everything)
How should I generate embeddings?
|-- Want simplest architecture? -> Integrated inference (createIndexForModel)
| (Pinecone handles embedding automatically on upsert/query)
|
|-- Need a specific embedding model not hosted by Pinecone? -> External
| (Generate embeddings yourself, upsert raw vectors)
|
|-- Need hybrid search with sparse vectors? -> External sparse model
| (Use pinecone-sparse-english-v0 via inference API + your dense model)
|
'-- Need full control over embedding pipeline? -> External
(Custom preprocessing, chunking, model selection)
</decision_framework>
<red_flags>
High Priority Issues:
pc.index("name") is deprecated in v7; use pc.index({ host }) to avoid an extra API callMedium Priority Issues:
includeMetadata: true in queries -- metadata is NOT included by default; omitting this returns only IDs and scoresDate objects in metadata -- Pinecone metadata supports strings, numbers, booleans, and string arrays only; convert dates to Unix timestampscreateIndex() readiness -- index creation is async; the index is not ready for operations immediately after createIndex() returnsCommon Mistakes:
topK > 1,000 with includeMetadata: true -- the max topK is 1,000 when including metadata or values; without them, max is 10,000{ tags: ["a", "b"] }) -- use $in operator instead: { tags: { $in: ["a", "b"] } }deleteAll() without a namespace deletes from the default namespace only, not the entire indexGotchas & Edge Cases:
describeIndexStats() returns approximate counts -- record counts are not exact in real-time, especially after recent upserts or deletes$eq: "42" does not match numeric 42listPaginated() returns vector IDs only (no values or metadata) -- use fetch() to get full vector data$in and $nin operators accept a maximum of 10,000 values each$ (reserved for operators)upsert is an upsert, not an insert -- upserting with an existing ID overwrites the previous vector and metadata entirely (no partial merge)update() merges metadata by default -- updating metadata replaces only the fields you specify, not the entire metadata objectupsertRecords (integrated inference) accepts a direct array, not { records: [...] } -- this differs from the regular upsert method which uses { records: [...] }</red_flags>
<critical_reminders>
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering,
import type, named constants)
(You MUST target indexes by host URL, not by name -- pc.index({ host }) is the v7 API; pc.index('name') is deprecated)
(You MUST batch upserts to max 1,000 records or 2 MB per request -- exceeding either limit causes a 400 error)
(You MUST use flat key-value metadata only -- nested objects, null values, and keys starting with $ are rejected by Pinecone)
(You MUST handle eventual consistency -- vectors are not queryable immediately after upsert; use describeIndexStats() or retry logic for freshness-critical flows)
Failure to follow these rules will cause index creation failures, rejected upserts, empty query results, and degraded multi-tenant performance.
</critical_reminders>
development
Material Design component library for Vue 3
development
VitePress 1.x — Vue-powered static site generator for documentation sites, built on Vite
tools
Docusaurus 3.x documentation framework — site configuration, docs/blog plugins, sidebars, versioning, MDX, swizzling, and deployment
development
TanStack Form patterns - useForm, form.Field, validators, arrays, linked fields, createFormHook, type safety