data-fair/upgrade-scripts

Name: upgrade-scripts
Author: data-fair

skills/upgrade-scripts/SKILL.md

npx skillsauth add data-fair/lib upgrade-scripts

Upgrade Scripts in data-fair Services

@data-fair/lib-node/upgrade-scripts is the migration runner used by data-fair services. At service startup it scans the upgrade/ directory for version-named subfolders, compares them against the version recorded in the services Mongo collection, and runs the scripts whose folder version is >= the recorded version. After running, it stores the current package.json version back in the services collection.

The one rule about the folder name

Name the folder after the version of the last release of the service at the time you write the script. Never after an anticipated future version.

Why this is the rule

When you are working on a branch, you do not know what version the change will eventually ship as. The same branch's content can end up in a minor release, a major release, or be backported to several lines at once. The forward version is genuinely unknown at authoring time, so the only stable reference is the last version that has already been released.

The runner uses semver.gte(folder, dbVersion), so the folder name effectively encodes the claim "this migration applies whenever the database was previously at this version or older". The last released version is the correct answer to that claim — it is the most recent state from which a user's database could still be coming.

What "expected" behavior looks like

With folder = last-released-version:

Production upgrade. Before the release, prod package.json is at the last release (= folder name) and the DB matches. When the new release deploys, package.json bumps past the folder; the runner sees dbVersion = <last-release>, folder >= dbVersion → runs once. After the run, the DB is updated to the new package.json version. Subsequent restarts: folder < dbVersion → never runs again.
Staging. Staging usually runs the development branch with package.json still at the last release. So folder = package.json version = DB version, and the script re-runs on every staging deploy until the next release ships. This is expected and is exactly why scripts must be idempotent.
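
The two timelines above can be traced with a small simulation. This is only a sketch: gte is a simplified stand-in for semver.gte that handles plain X.Y.Z versions, and simulateStart is a hypothetical helper that models one service start; neither is part of the library.

```typescript
// Simplified stand-in for semver.gte: plain X.Y.Z only (no pre-release tags).
function gte (a: string, b: string): boolean {
  const pa = a.split('.').map(Number)
  const pb = b.split('.').map(Number)
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] > pb[i]
  }
  return true
}

// Hypothetical model of one service start: decide whether the folder's scripts
// run, then record the package.json version as the new DB version.
function simulateStart (folder: string, dbVersion: string, pjsonVersion: string) {
  return { ran: gte(folder, dbVersion), dbVersion: pjsonVersion }
}

// Production: release 6.5.0 deploys over a DB recorded at 6.4.2.
const firstStart = simulateStart('6.4.2', '6.4.2', '6.5.0')
const restart = simulateStart('6.4.2', firstStart.dbVersion, '6.5.0')
console.log(firstStart.ran, restart.ran) // true false

// Staging: package.json stays at 6.4.2, so every deploy re-runs the script.
const stagingDeploy = simulateStart('6.4.2', '6.4.2', '6.4.2')
console.log(stagingDeploy.ran) // true
```

A folder named after an anticipated version such as 6.5.0 would still satisfy folder >= dbVersion after the 6.5.0 release was recorded, which is one way to see why the rule insists on the last released version.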

Finding the last released version

Read it straight from the service's package.json. In data-fair services the version is bumped on release, not at branch start, so on any working branch the version field is exactly the last released version.

jq -r .version package.json
# → 6.4.2
# → create upgrade/6.4.2/your-script.ts

Take the value as it stands; do not invent or anticipate the next bump.

Multiple scripts in one folder

If several scripts target the same last-released version, put them all in the same folder — they execute in lexicographic order. Prefix with 01-, 02- if order matters.
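
A quick illustration of why zero-padded prefixes matter, assuming the runner's lexicographic order matches JavaScript's default string sort (an assumption; the file names are hypothetical):

```typescript
// Hypothetical script names in one upgrade folder.
const unpadded = ['2-reindex.ts', '10-cleanup.ts', '1-backfill.ts']
const padded = ['02-reindex.ts', '10-cleanup.ts', '01-backfill.ts']

// Array.prototype.sort() without a comparator orders strings by UTF-16 code units.
const unpaddedOrder = [...unpadded].sort()
const paddedOrder = [...padded].sort()

console.log(unpaddedOrder) // [ '1-backfill.ts', '10-cleanup.ts', '2-reindex.ts' ]
console.log(paddedOrder)   // [ '01-backfill.ts', '02-reindex.ts', '10-cleanup.ts' ]
```

Without padding, script 10 sorts before script 2; with two-digit prefixes the string order and the intended numeric order agree.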

Script structure

Scripts are TypeScript modules with a default export that satisfies the UpgradeScript interface:

// upgrade/6.4.2/backfill-modified.ts
import type { UpgradeScript } from '@data-fair/lib-node/upgrade-scripts.js'

const upgradeScript: UpgradeScript = {
  description: 'Backfill _modified field on existing datasets',
  async exec (db, debug) {
    debug('backfilling _modified on datasets without it')
    let count = 0
    const cursor = db.collection('datasets').find({ _modified: { $exists: false } })
    for await (const dataset of cursor) {
      const _modified = dataset.dataUpdatedAt ?? dataset.updatedAt
      if (_modified) {
        await db.collection('datasets').updateOne(
          { _id: dataset._id },
          { $set: { _modified } }
        )
        count++
      }
    }
    debug(`backfilled ${count} datasets`)
  }
}

export default upgradeScript

Key points:

description is logged at run time; make it a single short sentence in the imperative or descriptive mood.
exec(db, debug) receives a live Db connection from the same Mongo client the service uses, and a namespaced debug logger (upgrade:<folder>:<filename>).
The script must be idempotent. Use guards such as $exists: false or { field: { $ne: newValue } }. A script that has already done its work must be a no-op, not a failure.

Idempotency patterns

Scripts must be safe to re-run. Re-runs happen on every staging deploy until the next release ships (see above), and also when two pods start concurrently, when one crashes mid-loop, or on manual re-runs.

Make the body trivially safe to re-run:

// ✓ Filter out already-migrated documents
await db.collection('x').updateMany(
  { newField: { $exists: false } },
  { $set: { newField: defaultValue } }
)

// ✓ Use $rename only if source still exists
await db.collection('x').updateMany(
  { oldName: { $exists: true } },
  { $rename: { oldName: 'newName' } }
)

// ✗ Anything that changes the data again on a second run (here the counter is incremented on every run)
await db.collection('x').updateMany({}, { $inc: { counter: 1 } })

For destructive migrations (dropping a field, deleting documents), pair the write with a precondition check so a re-run is a no-op.
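
As a sketch of that pairing, the precondition can live in the update filter itself. The field name legacyField and the helper dropLegacyField are hypothetical, and the minimal DbLike types are declared only to keep the snippet self-contained; in a real script db is the Mongo Db passed to exec.

```typescript
// Minimal structural types standing in for the MongoDB driver's Db and
// Collection (an assumption made so the sketch compiles on its own).
interface UpdateResultLike { modifiedCount: number }
interface CollectionLike {
  updateMany (filter: object, update: object): Promise<UpdateResultLike>
}
interface DbLike { collection (name: string): CollectionLike }

// Drop a field only where it still exists, so a second run matches nothing.
async function dropLegacyField (db: DbLike, debug: (msg: string) => void): Promise<number> {
  const res = await db.collection('datasets').updateMany(
    { legacyField: { $exists: true } }, // precondition: skip already-migrated documents
    { $unset: { legacyField: '' } }
  )
  debug(`removed legacyField from ${res.modifiedCount} datasets`)
  return res.modifiedCount
}
```

On a re-run the filter matches no documents, so the update is a no-op and the script logs a count of 0 instead of failing.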

Where to wire it in

The runner is normally called once at service startup, before the HTTP server accepts traffic, alongside the lock manager:

import upgradeScripts from '@data-fair/lib-node/upgrade-scripts.js'
import locks from '@data-fair/lib-node/locks.js'
import db from './db.js'

await locks.init(db)
await upgradeScripts(db, locks)

If your service uses workspaces, the runner reads name and version from the parent package.json first (../package.json), falling back to the current one. The name is the key under which the version is stored in the services collection, so don't rename a service without a manual data migration.

Fresh installs

Pass isFresh so the runner can skip historical scripts on a brand-new database:

await upgradeScripts(db, locks, './', async () => {
  const count = await db.collection('datasets').estimatedDocumentCount()
  return count === 0
})

When isFresh returns true, no scripts run; the runner just records the current version. When false, all scripts with folder name init run, then normal semver-gated scripts run as usual.

Debugging

The runner uses the debug package:

DEBUG=upgrade,upgrade:* npm start

You will see:

the resolved service name and version
the version found in the database
each script as it runs, with its description
per-script logs from inside exec

If a script does not seem to run, double-check:

The folder name parses as semver (semver.coerce is not used on folder names — 1.0 will fail to compare; use 1.0.0).
The folder version is >= the DB-stored version (db.services.findOne({ id: '<service-name>' })).
The script file's default export matches the UpgradeScript shape.
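
A pre-merge check for the first point could look like the sketch below. The regular expression only approximates strict semver (plain X.Y.Z, no pre-release or build metadata) and is an assumption, not the runner's own validation; isValidUpgradeFolder is a hypothetical helper. The folder name init is accepted because the Fresh installs section mentions it.

```typescript
// Approximation of a strict X.Y.Z version: three numeric parts, no leading zeros.
const STRICT_XYZ = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/

// Hypothetical helper: true for "init" or a full three-part version.
function isValidUpgradeFolder (name: string): boolean {
  return name === 'init' || STRICT_XYZ.test(name)
}

console.log(isValidUpgradeFolder('1.0.0')) // true
console.log(isValidUpgradeFolder('1.0'))   // false
console.log(isValidUpgradeFolder('init'))  // true
```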

Checklist before merging an upgrade script

[ ] Folder is named after the last released version of the service — the version field of package.json on your working branch. Never an anticipated future version.
[ ] description is one short sentence.
[ ] exec is idempotent (safe to run twice).
[ ] No reliance on collection/field names that newer code has removed — legacy code may not exist when this script eventually runs in an old install upgrading several versions at once.
[ ] If multiple scripts in the same folder must run in order, prefix the filenames with 01-, 02-, etc.
[ ] Tested against a database snapshot of the old shape (or at minimum manually walked through with a sample document).
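
For the last item, a manual walk-through can be turned into a small check: express the per-document change as a pure function and confirm that applying it twice gives the same result as applying it once. The sketch below mirrors the backfill example from Script structure; backfillModified and the sample documents are hypothetical and run in memory, not against Mongo.

```typescript
interface Dataset {
  _id: string
  dataUpdatedAt?: string
  updatedAt?: string
  _modified?: string
}

// Pure version of the backfill: only touches documents that lack _modified.
function backfillModified (doc: Dataset): Dataset {
  if (doc._modified !== undefined) return doc
  const _modified = doc.dataUpdatedAt ?? doc.updatedAt
  return _modified ? { ...doc, _modified } : doc
}

// Sample documents covering each branch of the old shape.
const samples: Dataset[] = [
  { _id: 'a', dataUpdatedAt: '2024-01-02', updatedAt: '2024-01-01' },
  { _id: 'b', updatedAt: '2024-03-01' },
  { _id: 'c' },
  { _id: 'd', updatedAt: '2024-05-01', _modified: '2024-04-01' }
]

const once = samples.map(backfillModified)
const twice = once.map(backfillModified)

// Idempotency: the second pass changes nothing.
console.log(JSON.stringify(once) === JSON.stringify(twice)) // true
console.log(once.map(d => d._modified)) // [ '2024-01-02', '2024-03-01', undefined, '2024-04-01' ]
```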

How to add a database migration / upgrade script in a data-fair service that uses @data-fair/lib-node/upgrade-scripts (data-fair, processings, events, catalogs, etc.). Covers the gotcha that trips up most agents: which version goes in the folder name. Use this skill whenever the user asks to add an upgrade script, write a migration, backfill a field on existing documents, reshape a Mongo collection on deploy, or anything described as "needs to run once on production after deploy". Also use it when reading or modifying an existing `upgrade/X.Y.Z/` directory.

1 star

development

Updated May 28, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add data-fair/lib upgrade-scripts

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 28, 2026, 2:42 AM. Duration: 137.4s. 1 file scanned.

Install manually

# Clone the repo
git clone https://github.com/data-fair/lib.git

# Copy into Claude Code skills folder (global)
cp -r lib/skills/upgrade-scripts ~/.claude/skills/

Repository

data-fair/lib

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT