plugins/adf-master/skills/fabric-onelake-2025/SKILL.md
ADF + Microsoft Fabric / OneLake 2025 integration. PROACTIVELY activate for: (1) Fabric Lakehouse connector in ADF, (2) Fabric Warehouse connector in ADF, (3) OneLake shortcuts and cross-workspace data, (4) Invoke Pipeline activity for cross-platform orchestration (ADF -> Fabric, Fabric -> ADF), (5) copying data between ADF and Microsoft Fabric workspaces, (6) authenticating with Fabric workspace identity, (7) cross-platform parameter passing, (8) hybrid ADF + Fabric pipeline patterns. Provides: Fabric connector setup, Invoke Pipeline templates, OneLake shortcut patterns, and ADF-to-Fabric migration guidance.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add JosiahSiegel/claude-plugin-marketplace fabric-onelake-2025
Microsoft Fabric is a unified SaaS analytics platform combining Power BI, Azure Synapse Analytics, and Azure Data Factory capabilities. ADF provides native connectors for Fabric Lakehouse and Fabric Warehouse, enabling seamless data movement between ADF and Fabric workspaces.
The Fabric Lakehouse connector enables read and write operations to Microsoft Fabric Lakehouse for tables and files.
| Activity | Supported |
|----------|-----------|
| Copy Activity (source and sink) | Yes |
| Lookup Activity | Yes |
| Get Metadata Activity | Yes |
| Delete Activity | Yes |
- Linked service type: Lakehouse
- Required properties: workspaceId and artifactId
- Sink types: LakehouseTableSink (tables), LakehouseFileSink (files)
- Source type: LakehouseTableSource
- Table action: append or overwrite

Finding Workspace and Artifact IDs:
https://app.powerbi.com/groups/<workspaceId>/...

For complete linked service, dataset, and copy activity JSON examples, see references/lakehouse-examples.md.
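The workspace ID can be read out of the workspace URL, and the two IDs then go into the linked service. The following is a minimal Python sketch that builds the JSON as a dict. It assumes the workspace ID is the path segment that follows groups/. The helper names workspace_id_from_url and lakehouse_linked_service are hypothetical, and the typeProperties layout is an assumption based on the property names listed in this section, so compare it with references/lakehouse-examples.md before relying on it.

```python
import json
from urllib.parse import urlparse


def workspace_id_from_url(url: str) -> str:
    """Return the path segment that follows 'groups' in a workspace URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "groups" not in segments:
        raise ValueError("URL has no /groups/<workspaceId> segment")
    index = segments.index("groups") + 1
    if index >= len(segments):
        raise ValueError("URL ends before the workspace ID")
    return segments[index]


def lakehouse_linked_service(name: str, workspace_id: str, artifact_id: str) -> dict:
    """Build a linked service dict of type 'Lakehouse'.

    The nesting under 'properties' / 'typeProperties' is an assumption;
    only the type name and the two ID properties come from the text.
    """
    return {
        "name": name,
        "properties": {
            "type": "Lakehouse",
            "typeProperties": {
                "workspaceId": workspace_id,
                "artifactId": artifact_id,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder GUIDs, for illustration only.
    url = "https://app.powerbi.com/groups/11111111-2222-3333-4444-555555555555/"
    workspace_id = workspace_id_from_url(url)
    artifact_id = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
    definition = lakehouse_linked_service("FabricLakehouse", workspace_id, artifact_id)
    print(json.dumps(definition, indent=2))
```

Authentication settings are left out on purpose; they depend on whether a managed identity or a service principal is used.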
The Fabric Warehouse connector provides T-SQL based data warehousing capabilities within the Fabric ecosystem.
| Activity | Supported |
|----------|-----------|
| Copy Activity (source and sink) | Yes |
| Lookup Activity | Yes |
| Get Metadata Activity | Yes |
| Script Activity | Yes |
| Stored Procedure Activity | Yes |
- Linked service type: Warehouse
- Required properties: endpoint, warehouse
- Sink type: WarehouseSink
- Write behavior: insert or upsert
- Table option: autoCreate (creates table if missing)

For complete linked service, copy activity, stored procedure, and script activity JSON examples, see references/warehouse-examples.md.
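The sink options above can be combined into a single Copy activity sink block. The following is a minimal Python sketch, assuming the write mode is carried in a property named writeBehavior (an assumption; the text only names the values insert and upsert). The helper warehouse_sink is hypothetical.

```python
import json

# Values named in this section for the Warehouse sink.
ALLOWED_WRITE_BEHAVIORS = {"insert", "upsert"}


def warehouse_sink(write_behavior: str = "insert", auto_create: bool = True) -> dict:
    """Build the 'sink' part of a Copy activity that targets Fabric Warehouse."""
    if write_behavior not in ALLOWED_WRITE_BEHAVIORS:
        raise ValueError(f"write_behavior must be one of {sorted(ALLOWED_WRITE_BEHAVIORS)}")
    sink = {
        "type": "WarehouseSink",
        # Property name 'writeBehavior' is an assumption, not taken from the text.
        "writeBehavior": write_behavior,
    }
    if auto_create:
        # Creates the target table if it is missing.
        sink["tableOption"] = "autoCreate"
    return sink


if __name__ == "__main__":
    print(json.dumps(warehouse_sink("upsert"), indent=2))
```

An upsert normally also needs key columns; those settings are not covered here, so see references/warehouse-examples.md for the full shape.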
ADF supports three integration patterns with OneLake:
| Pattern | Description | Key Benefit |
|---------|-------------|-------------|
| ADLS Gen2 Shortcuts | Reference ADLS data via OneLake shortcuts (zero-copy) | No data duplication |
| Incremental Load | Watermark-based incremental copy to Lakehouse | Efficient updates |
| Cross-Platform Invoke | Use InvokePipeline activity to call Fabric pipelines | Hybrid orchestration |
OneLake Shortcuts are the preferred approach when data already exists in ADLS Gen2 -- they provide instant zero-copy access without data movement. Use ADF Copy Activity only when data transformation or format conversion is needed.
For complete pipeline JSON examples for all three patterns, see references/onelake-patterns.md.
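To illustrate the Incremental Load pattern, a watermark-based copy reads only the rows changed since the last successful run. The following is a minimal Python sketch of the source query such a pipeline might issue. The function incremental_query, the table dbo.Orders, and the column ModifiedAt are hypothetical; the pattern files in references/onelake-patterns.md may build the query differently.

```python
import re
from datetime import datetime

# Accepts 'name' or 'schema.name'; anything else is rejected so that
# identifiers cannot smuggle SQL into the query text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def incremental_query(table: str, watermark_column: str,
                      last_watermark: datetime, new_watermark: datetime) -> str:
    """Select rows changed after the old watermark, up to and including the new one."""
    for identifier in (table, watermark_column):
        if not _IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    if new_watermark < last_watermark:
        raise ValueError("new_watermark must not be earlier than last_watermark")
    fmt = "%Y-%m-%dT%H:%M:%S"
    return (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_column} > '{last_watermark.strftime(fmt)}' "
        f"AND {watermark_column} <= '{new_watermark.strftime(fmt)}'"
    )


if __name__ == "__main__":
    print(incremental_query("dbo.Orders", "ModifiedAt",
                            datetime(2025, 1, 1), datetime(2025, 1, 2)))
```

After a successful copy, the stored watermark is advanced to the new value so the next run starts where this one stopped.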
For Fabric Lakehouse:
For Fabric Warehouse:
CREATE USER [your-adf-name] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [your-adf-name];
ALTER ROLE db_datawriter ADD MEMBER [your-adf-name];
App Registration Setup:
Use Managed Identity -- System-assigned for single ADF, user-assigned for multiple. Avoid service principal keys when possible. Store any secrets in Key Vault.
Enable Staging for Large Loads -- Use staging with compression for data volumes > 1 GB, complex transformations, or Fabric Warehouse loads.
Leverage OneLake Shortcuts -- Use ADLS Gen2 -> OneLake Shortcut -> Direct Access instead of ADLS Gen2 -> Copy Activity -> Lakehouse. No data movement, instant availability, reduced costs.
Monitor Fabric Capacity Units (CU) -- Track CU consumption per pipeline run, peak usage, and throttling. Optimize with incremental loads, off-peak scheduling, and right-sized parallelism.
Use Table Option AutoCreate -- Set tableOption: "autoCreate" on WarehouseSink for automatic schema management and faster development.
Implement Error Handling -- Configure retry policies on Copy activities and add WebActivity-based failure logging with dependencyConditions: ["Failed"].
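The error-handling practice can be sketched as two activities: a Copy activity with a retry policy, and a Web activity that runs only when the copy fails. The following is a minimal Python sketch that builds both as dicts. The helper names, the activity names, the retry values, and the logging URL https://example.com/log are hypothetical; the source and sink types reuse names from this document, and the remaining source and sink settings are omitted.

```python
import json


def copy_with_retry(name: str, retries: int = 3, interval_seconds: int = 60) -> dict:
    """Copy activity with a retry policy (source/sink details omitted)."""
    return {
        "name": name,
        "type": "Copy",
        "policy": {
            "retry": retries,
            "retryIntervalInSeconds": interval_seconds,
        },
        "typeProperties": {
            "source": {"type": "LakehouseTableSource"},
            "sink": {"type": "WarehouseSink", "tableOption": "autoCreate"},
        },
    }


def failure_logger(copy_name: str, log_url: str) -> dict:
    """Web activity that posts a log record only when the copy has failed."""
    return {
        "name": f"LogFailure_{copy_name}",
        "type": "WebActivity",
        "dependsOn": [
            {"activity": copy_name, "dependencyConditions": ["Failed"]}
        ],
        "typeProperties": {
            "url": log_url,  # hypothetical logging endpoint
            "method": "POST",
            "body": {
                "pipelineRunId": "@{pipeline().RunId}",
                "failedActivity": copy_name,
            },
        },
    }


if __name__ == "__main__":
    activities = [
        copy_with_retry("CopyToWarehouse"),
        failure_logger("CopyToWarehouse", "https://example.com/log"),
    ]
    print(json.dumps(activities, indent=2))
```

Because the logger depends on the Failed condition, it is skipped on successful runs and fires only after the retries are exhausted.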
| Issue | Error Message | Solution |
|-------|--------------|----------|
| Permission Denied | "User does not have permission to access Fabric workspace" | Add ADF managed identity as Contributor; for Warehouse, create SQL user; allow 5 min propagation |
| Endpoint Not Found | "Unable to connect to endpoint" | Verify workspaceId/artifactId; check workspace URL; ensure Lakehouse/Warehouse is not paused |
| Schema Mismatch | "Column types do not match" | Use tableOption: "autoCreate" or explicit column mappings in translator |
| Performance Degradation | Slow copy performance | Enable staging, increase parallelCopies (4-8), increase DIUs (8-32), check CU throttling |
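For the Performance Degradation row, the two tuning knobs can be applied to a Copy activity's typeProperties. The following is a minimal Python sketch that clamps the values to the ranges suggested in the table (4-8 parallel copies, 8-32 DIUs). The helper tune_copy is hypothetical, and the ranges are this document's starting points, not platform limits.

```python
def _clamp(value: int, low: int, high: int) -> int:
    """Keep value inside the inclusive range [low, high]."""
    return max(low, min(high, value))


def tune_copy(type_properties: dict, parallel_copies: int = 4, dius: int = 8) -> dict:
    """Return a copy of typeProperties with the tuning settings applied."""
    tuned = dict(type_properties)  # shallow copy; the input is left unchanged
    tuned["parallelCopies"] = _clamp(parallel_copies, 4, 8)
    tuned["dataIntegrationUnits"] = _clamp(dius, 8, 32)
    return tuned


if __name__ == "__main__":
    base = {"source": {"type": "LakehouseTableSource"},
            "sink": {"type": "WarehouseSink"}}
    print(tune_copy(base, parallel_copies=16, dius=4))
```

Raising these values increases load on the Fabric capacity, so check CU throttling before and after each change.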
- references/lakehouse-examples.md - Complete linked service, dataset, copy activity, and lookup JSON examples
- references/warehouse-examples.md - Complete linked service, copy activity, stored procedure, and script activity JSON examples
- references/onelake-patterns.md - Pipeline patterns for shortcuts, incremental loads, and cross-platform Invoke Pipeline