skills/data-engineering/data-catalog/SKILL.md
Manages metadata for data assets to enable discovery, governance, and lineage tracking in data engineering.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add alphaonedev/openclaw-graph data-catalog
This skill manages metadata for data assets, enabling discovery, governance, and lineage tracking in data engineering workflows. It catalogs datasets, schemas, and dependencies to support data-driven projects.
Use this skill when you need to track data assets in a project, such as during ETL processes, data governance audits, or when building data pipelines. Apply it in scenarios involving large-scale data repositories, compliance requirements, or collaborative data teams.
Example asset metadata: {"name": "sales_data", "schema": {"columns": ["id", "date"]}}.
Example lineage record: {"source": "raw_logs", "target": "processed_reports"}.
Authentication key: $DATA_CATALOG_API_KEY.
To use this skill, first authenticate by setting an environment variable, e.g. export DATA_CATALOG_API_KEY=your_key. Then follow this pattern: initialize the catalog, register assets, query as needed, and handle updates. In pipelines, embed the skill in scripts so that outputs are registered automatically. Always validate metadata before operations to avoid conflicts.
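The advice to validate metadata before any operation can be sketched as a small pre-flight check. This is a minimal sketch, not the skill's own validator: the function name `validate_asset_metadata` and the specific rules (a non-empty `name`, unique string columns) are assumptions based on the example metadata shown above.

```python
import json


def validate_asset_metadata(raw: str) -> dict:
    """Parse a JSON metadata string and check the fields used in this document."""
    try:
        metadata = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"metadata is not valid JSON: {exc}") from exc

    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")

    # Assumed rule: every asset needs a non-empty string name.
    name = metadata.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("metadata needs a non-empty string 'name'")

    # Assumed rule: if a schema is given, its columns are unique strings.
    schema = metadata.get("schema", {})
    if not isinstance(schema, dict):
        raise ValueError("'schema' must be a JSON object")
    columns = schema.get("columns", [])
    if not isinstance(columns, list) or not all(isinstance(c, str) for c in columns):
        raise ValueError("'schema.columns' must be a list of strings")
    if len(set(columns)) != len(columns):
        raise ValueError("'schema.columns' contains duplicates")

    return metadata
```

Calling this before `dcatalog register` or a POST to the assets endpoint surfaces malformed metadata locally, as a `ValueError`, before anything is sent to the catalog.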
Use the dcatalog CLI or REST API for interactions. Authentication requires $DATA_CATALOG_API_KEY in requests.
CLI Commands:
dcatalog register --asset-name sales_data --type dataset --metadata '{"schema": ["id", "amount"]}' --api-key "$DATA_CATALOG_API_KEY"
dcatalog search --query "sales" --tags metadata --limit 10
dcatalog update-lineage --source raw_data --target processed_data --relation depends_on
API Endpoints:
curl -X POST -H "Authorization: Bearer $DATA_CATALOG_API_KEY" -H "Content-Type: application/json" -d '{"name": "sales_data", "tags": ["metadata"]}' https://api.opencclaw.com/api/v1/assets
curl -H "Authorization: Bearer $DATA_CATALOG_API_KEY" "https://api.opencclaw.com/api/v1/assets/search?query=sales"

import os
import requests

# Read the key from the environment rather than hard-coding it
headers = {"Authorization": f"Bearer {os.environ['DATA_CATALOG_API_KEY']}"}
response = requests.put('https://api.opencclaw.com/api/v1/lineage', headers=headers, json={"source": "raw_data", "target": "report"})
Configuration is JSON-based. For example, the CLI config file (~/.dcatalog/config.json):
{"default_tags": ["data-governance"], "api_endpoint": "https://api.opencclaw.com"}
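A script that shares this config file with the CLI might load it as sketched below. The helper `load_config` is hypothetical, and treating the example values as fallbacks when the file is missing is an assumption; the source does not say how the CLI itself resolves defaults.

```python
import copy
import json
from pathlib import Path
from typing import Optional

# Values copied from the example config; using them as defaults is an assumption.
DEFAULT_CONFIG = {
    "default_tags": ["data-governance"],
    "api_endpoint": "https://api.opencclaw.com",
}


def load_config(path: Optional[Path] = None) -> dict:
    """Return the JSON config at `path` merged over DEFAULT_CONFIG."""
    if path is None:
        path = Path.home() / ".dcatalog" / "config.json"

    config = copy.deepcopy(DEFAULT_CONFIG)
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            loaded = json.load(fh)
        if not isinstance(loaded, dict):
            raise ValueError(f"{path} must contain a JSON object")
        config.update(loaded)

    # Strip trailing slashes so callers can append "/api/v1/..." safely.
    config["api_endpoint"] = str(config["api_endpoint"]).rstrip("/")
    return config
```

Keys present in the file override the defaults one by one, so a config that sets only `api_endpoint` still gets the default tags.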
Integrate this skill with data tools like Apache Airflow or AWS Glue by wrapping API calls in custom operators. For example, in a Python script, import the API client and pass $DATA_CATALOG_API_KEY. Ensure compatibility by matching schema versions; use JSON configs for mappings, e.g., link to S3 buckets via {"bucket": "my-bucket", "prefix": "data/"}. Test integrations in a sandbox environment before production.
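One way to wrap the API call for an orchestrator is a plain callable, sketched below. The function names and the `location` payload field are hypothetical: the source shows the S3 mapping `{"bucket": ..., "prefix": ...}` but not where it belongs in the request. The sketch uses the standard library's `urllib` instead of `requests` only to stay dependency-free.

```python
import json
import os
import urllib.request

API_ENDPOINT = "https://api.opencclaw.com"  # from the config example in this document


def build_register_request(
    name: str, bucket: str, prefix: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/assets for a pipeline output."""
    payload = {
        "name": name,
        "tags": ["metadata"],
        # Hypothetical field: the payload key for the S3 mapping is an assumption.
        "location": {"bucket": bucket, "prefix": prefix},
    }
    return urllib.request.Request(
        f"{API_ENDPOINT}/api/v1/assets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def register_pipeline_output(name: str, bucket: str, prefix: str) -> int:
    """Task-style callable: send the request and return the HTTP status code."""
    request = build_register_request(
        name, bucket, prefix, os.environ["DATA_CATALOG_API_KEY"]
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Separating request construction from sending lets the payload be checked in a sandbox without network access. A function like `register_pipeline_output` could then be passed as the `python_callable` of an Airflow `PythonOperator`.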
Handle errors by checking the HTTP status code of each API response; for example, on a 401, prompt the user to revalidate $DATA_CATALOG_API_KEY. For the CLI, wrap calls in try/except in Python scripts:
import subprocess

try:
    # capture_output makes the CLI's stderr available on the exception
    subprocess.run(["dcatalog", "register", "--asset-name", "test"], check=True, capture_output=True, text=True)
except subprocess.CalledProcessError as e:
    print(f"Error: {e.returncode} - {e.stderr}")
Common issues include invalid JSON metadata (fix by validating with json.loads() before sending) or authentication failures (retry with refreshed keys). Log errors with timestamps for debugging.
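The status-code checks and timestamped logging described above could be combined as in the sketch below. The action labels and the function `next_action` are hypothetical. Only the 401 handling comes from the text; treating 400/422 as payload errors and 429/5xx as retryable is an assumption about how this API behaves.

```python
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s",  # timestamp on every record
    level=logging.INFO,
)
log = logging.getLogger("dcatalog")


def next_action(status_code: int) -> str:
    """Map an HTTP status code to a coarse action name (hypothetical labels)."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 401:
        # From the text: a 401 means the API key needs revalidation.
        log.error("401 Unauthorized: revalidate DATA_CATALOG_API_KEY")
        return "reauthenticate"
    if status_code in (400, 422):
        # Assumption: these codes signal rejected metadata.
        log.error("%s: metadata rejected, validate the JSON before resending", status_code)
        return "fix_payload"
    if status_code == 429 or status_code >= 500:
        # Assumption: rate limits and server errors are transient.
        log.warning("%s: transient failure, retry with backoff", status_code)
        return "retry"
    log.error("%s: unexpected status", status_code)
    return "fail"
```

A caller would pass `response.status_code` from `requests` and branch on the returned label, so the retry and re-authentication policy stays in one place.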