.agent/skills/ground-truth-design/SKILL.md
Design ground truth queries for the Oak semantic search service using known-answer-first methodology. Use when creating new ground truths, redesigning existing queries, or working with files in src/lib/search-quality/ground-truth/.
Design ground truth queries that accurately measure search quality for the Oak curriculum search service.
We know Elasticsearch works. We test whether our search service with our data delivers value to teachers.
| We Test | We Don't Test (ES Handles) |
|---------|---------------------------|
| Does search help teachers find content? | Stemming / morphological variation |
| Natural teacher queries returning relevant lessons | Disambiguation (filtering handles) |
| Typo recovery (a handful of proofs) | Phrase matching internals |
Teachers can search for anything. We don't judge what's "appropriate".
ALL search works on metadata. Transcripts are supplementary. Don't create "metadata-only" as a special category.
When filtered to French, don't include "French" in the query. The filter provides context.
ALL ground truths are from the perspective of a PROFESSIONAL TEACHER in the UK.
| Dimension | Current Scope | NOT (Future Work) |
|-----------|---------------|-------------------|
| Content Type | Lessons only | Units, sequences, threads |
| User Persona | Professional UK teachers | Pupils, students, learners |
| Search Intent | Finding curriculum content to teach | Self-directed learning |
Teachers search for topics to teach, not for personal help or advice.
Teachers type topics directly. Any prefix like "how to teach", "lessons on", "teaching about" is redundant noise.
IMPORTANT: The bulk-downloads/ directory is gitignored. File search tools will NOT see these files.
Use shell commands to explore bulk data:
ls bulk-downloads/ # List available files
cat bulk-downloads/maths-primary.json | jq '.lessons | length' # Count lessons
Ground truths are tests where we know the correct answer before running search.
Mine bulk data to identify lessons:
cd apps/oak-search-cli
# List units and lessons
jq '.sequence[] | {unit: .unitTitle, lessons: [.unitLessons[].lessonTitle]}' \
bulk-downloads/SUBJECT-PHASE.json
# Search for keywords
jq '.lessons[] | select(.lessonKeywords | test("TERM"; "i")) | {slug: .lessonSlug, title: .lessonTitle}' \
bulk-downloads/SUBJECT-PHASE.json
Create a query a teacher would realistically type:
| Rule | Requirement | Example |
|------|-------------|---------|
| Length | 3-7 words | "cell structure and function" |
| Realistic | Would a teacher type this? | Yes: "fractions unlike denominators" |
| Pedagogy aware | Professional UK teacher queries | Yes: curriculum-aligned vocabulary |
| No meta-phrases | No "teaching about", "lessons on" | Not: "lessons on fractions" |
| Topic-focused | Topics, not advice | Not: "how to teach fractions" |
| No redundant subject | Don't repeat filter context | Not: "French negation" when filtered to French |
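Some of the rules in the table above can be checked mechanically. The following is a minimal sketch, not code from the Oak repository: `checkQuery`, `META_PREFIXES`, and the `filterSubject` parameter are hypothetical names, the prefix list covers only the examples given in this document, and the subject check assumes a single-word subject. The "Realistic" and "Pedagogy aware" rules still need human judgement.

```typescript
// Hypothetical helper: flags queries that break the mechanical rules above.
// Meta-phrases are taken from the examples in this document; extend as needed.
const META_PREFIXES = ['how to teach', 'lessons on', 'teaching about'];

function checkQuery(query: string, filterSubject?: string): string[] {
  const problems: string[] = [];
  const normalised = query.trim().toLowerCase();
  const words = normalised.split(/\s+/).filter((w) => w.length > 0);

  // Length rule: 3-7 words.
  if (words.length < 3 || words.length > 7) {
    problems.push(`length: ${words.length} words (expected 3-7)`);
  }

  // No meta-phrases such as "lessons on" or "how to teach".
  for (const prefix of META_PREFIXES) {
    if (normalised.startsWith(prefix)) {
      problems.push(`meta-phrase: starts with "${prefix}"`);
    }
  }

  // No redundant subject: the filter already supplies that context.
  if (filterSubject !== undefined && words.includes(filterSubject.toLowerCase())) {
    problems.push(`redundant subject: "${filterSubject}" is already the filter`);
  }

  return problems;
}
```

For example, `checkQuery('cell structure and function')` returns an empty array, while `checkQuery('lessons on fractions')` returns one problem for the meta-phrase.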
cd apps/oak-search-cli
pnpm oaksearch search lessons "your query" --subject subject --key-stage keyStage
| Score | Meaning |
|-------|---------|
| 3 | Direct match: teaches exactly what query asks |
| 2 | Related: covers topic but not directly |
| 1 | Tangential: mentions concept peripherally |
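One way these scores might be used is sketched below, under assumptions: given the ranked slugs a search returned and a map of expected slugs to scores, report where each known answer landed. `rankExpected` and the `RankedHit` shape are hypothetical and are not taken from the Oak codebase; the real evaluation may compute different metrics.

```typescript
// Graded relevance as defined in the table above.
type RelevanceScore = 1 | 2 | 3;

interface RankedHit {
  slug: string;
  score: RelevanceScore;
  rank: number | null; // 1-based position in the results, or null if absent
}

// Hypothetical helper: locate each expected slug in the ranked results.
function rankExpected(
  resultSlugs: readonly string[],
  expectedRelevance: Readonly<Record<string, RelevanceScore>>,
): RankedHit[] {
  return Object.entries(expectedRelevance)
    .map(([slug, score]) => {
      const index = resultSlugs.indexOf(slug);
      return { slug, score, rank: index === -1 ? null : index + 1 };
    })
    // Highest relevance first, so a missing direct match stands out.
    .sort((a, b) => b.score - a.score);
}
```

A score-3 slug with `rank: null` means the known best answer did not appear in the results at all, which is the case most worth investigating.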
src/lib/search-quality/ground-truth/
├── types.ts # LessonGroundTruth type definition
├── index.ts # Exports and accessors
├── README.md # Overview
└── entries/ # Individual ground truths
    ├── maths-secondary.ts
    ├── maths-primary.ts
    └── ...
/**
 * Subject Phase ground truth entry.
 *
 * @see ground-truth-protocol.md for the process
 * @packageDocumentation
 */
import type { LessonGroundTruth } from '../types';

/**
 * Subject Phase ground truth: Topic description.
 */
export const SUBJECT_PHASE: LessonGroundTruth = {
  subject: 'subject-slug',
  phase: 'primary', // or 'secondary'
  keyStage: 'ks2', // one of 'ks1', 'ks2', 'ks3', 'ks4'
  query: 'realistic teacher query here',
  // Lesson slug mapped to relevance score (3 = direct, 2 = related, 1 = tangential)
  expectedRelevance: {
    'best-match-slug': 3,
    'good-match-slug': 2,
    'related-slug': 2,
  },
  description: 'Brief description of what the lesson teaches and why it matches.',
} as const;
pnpm type-check # TypeScript compilation
pnpm test # Unit tests
WRONG:
CORRECT:
Query says "verbs" but expected slugs have "avoir" without "verb" in metadata. Search cannot match them.
Verify that expected slugs contain terms semantically connected to the query.
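A crude lexical proxy for this check can be sketched as follows. It only tests whether each query word appears literally in a lesson's metadata text, so it does not recognise genuine semantic links or stemmed forms. `missingQueryTerms` and the `metadataText` argument are hypothetical, and the sketch assumes you have already concatenated the lesson title and keywords from the bulk data.

```typescript
// Hypothetical helper: which query words never appear in the lesson metadata?
// Uses plain substring matching, so it is a rough screen, not a semantic check.
function missingQueryTerms(query: string, metadataText: string): string[] {
  const haystack = metadataText.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0 && !haystack.includes(word));
}
```

With an invented metadata string, `missingQueryTerms('present tense verbs', 'Avoir in the present tense')` returns `['verbs']`, flagging the kind of mismatch described above for manual review.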
For the full evaluation protocol (COMMIT methodology), lessons learned from 25+ review sessions, and troubleshooting, see:
apps/oak-search-cli/src/lib/search-quality/ground-truth/GROUND-TRUTH-GUIDE.md