plugins/powerbi-master/skills/powerbi-core/SKILL.md
Core Power BI data modeling, source connectivity, and platform fundamentals. PROACTIVELY activate for: (1) Power BI data modeling and star-schema design, (2) relationships (active/inactive, bidirectional, USERELATIONSHIP), (3) data-source selection (DirectQuery vs Import vs Direct Lake vs composite), (4) incremental refresh setup, (5) gateway configuration (on-prem and VNet gateways), (6) streaming datasets and push-data scenarios, (7) Dataflow Gen2 basics, (8) Power BI common gotchas and pitfalls (bidirectional filtering, AutoExist, blank-row), (9) workspace identity and OAuth2 / service-principal auth, (10) semantic model architecture review. Provides: star-schema templates, mode-selection matrix, incremental refresh recipe, gateway setup steps, and a common-gotchas reference.
Install this skill globally with one command: `npx skillsauth add JosiahSiegel/claude-plugin-marketplace powerbi-core`. Works with Claude Code, Cursor, and Windsurf.
Core Power BI knowledge covering data modeling best practices, connectivity modes, source types, relationships, and common pitfalls. This skill provides the foundational architecture guidance every Power BI developer needs.
Always design data models using star schema topology:
| Component | Purpose | Example |
|-----------|---------|---------|
| Fact table | Numeric events/transactions | Sales, Orders, WebVisits |
| Dimension table | Descriptive attributes | Date, Product, Customer, Geography |
| Bridge table | Many-to-many resolution | StudentCourse, OrderProduct |
Mandatory rules:
| Mode | Data Location | Refresh | Performance | Use When |
|------|--------------|---------|-------------|----------|
| Import | In-memory VertiPaq | Scheduled/on-demand | Fastest queries | Default choice, data under 1GB compressed |
| DirectQuery | Source database | Real-time | Depends on source | Real-time needed, data too large for import |
| Dual | Both | Scheduled + real-time | Best of both | Dimension tables in composite models |
| Direct Lake | OneLake delta tables | Framing (seconds) | Near-import speed | Fabric lakehouse/warehouse scenarios |
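The "Use When" column of the matrix can be read as a simple selection rule. The sketch below is one hedged way to encode it in Python. The `TableProfile` fields, the `suggest_storage_mode` helper, and the order in which the checks run are all assumptions made for illustration, and the 1 GB threshold is simply the figure quoted in the Import row. Treat the result as a starting point for discussion, not a rule Power BI enforces.

```python
from dataclasses import dataclass

# Threshold quoted in the Import row of the matrix ("data under 1GB compressed").
IMPORT_SIZE_LIMIT_GB = 1.0


@dataclass
class TableProfile:
    """Illustrative description of a table; every field name is hypothetical."""
    compressed_size_gb: float
    needs_real_time: bool = False
    in_fabric_onelake: bool = False          # Delta table in a Fabric lakehouse/warehouse
    is_dimension_in_composite: bool = False  # dimension shared in a composite model


def suggest_storage_mode(table: TableProfile) -> str:
    """Return a starting-point storage mode from the matrix's 'Use When' column.

    The precedence of the checks is an assumption of this sketch.
    """
    if table.in_fabric_onelake:
        return "Direct Lake"
    if table.is_dimension_in_composite:
        return "Dual"
    if table.needs_real_time or table.compressed_size_gb >= IMPORT_SIZE_LIMIT_GB:
        return "DirectQuery"
    return "Import"  # the default choice per the matrix


if __name__ == "__main__":
    print(suggest_storage_mode(TableProfile(compressed_size_gb=0.2)))
    print(suggest_storage_mode(TableProfile(compressed_size_gb=5.0)))
```

Keeping Import as the final fallback mirrors the matrix, which lists it as the default choice.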
Import mode considerations:
DirectQuery limitations:
Direct Lake key considerations (2025-2026 GA):
Choosing storage mode decision tree:
| Property | Options | Default |
|----------|---------|---------|
| Cardinality | One-to-many, Many-to-one, One-to-one, Many-to-many | One-to-many |
| Cross-filter direction | Single, Both | Single |
| Active | Yes/No | Yes (only one active per path) |
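The "only one active per path" constraint is easy to violate when a fact table has several date columns. As a rough illustration, the Python sketch below scans a list of relationship descriptions for two conditions: more than one active relationship between the same pair of tables, and any cross-filter direction set to Both. The `Relationship` type and `find_relationship_issues` function are hypothetical and are not part of any Power BI API. The check also only covers direct table pairs, which is a simplification of the full path rule.

```python
from collections import Counter
from typing import List, NamedTuple


class Relationship(NamedTuple):
    """Hypothetical, simplified description of a model relationship."""
    from_table: str
    to_table: str
    from_column: str
    active: bool = True
    cross_filter: str = "Single"  # "Single" or "Both"


def find_relationship_issues(relationships: List[Relationship]) -> List[str]:
    """Flag duplicate active relationships and bidirectional filters."""
    issues = []

    # Count active relationships per unordered pair of tables.
    active_pairs = Counter(
        frozenset((r.from_table, r.to_table)) for r in relationships if r.active
    )
    for pair, count in active_pairs.items():
        if count > 1:
            names = " and ".join(sorted(pair))
            issues.append(f"{count} active relationships between {names}")

    # Bidirectional filters are allowed, but worth a review.
    for r in relationships:
        if r.cross_filter == "Both":
            issues.append(
                f"bidirectional filter on {r.from_table}[{r.from_column}] -> {r.to_table}"
            )
    return issues


if __name__ == "__main__":
    model = [
        Relationship("Sales", "Date", "OrderDate"),
        Relationship("Sales", "Date", "ShipDate"),  # should be inactive
    ]
    for issue in find_relationship_issues(model):
        print(issue)
```

In a case like the one in the demo, the usual resolution is to mark the second relationship inactive and activate it per measure with `USERELATIONSHIP` in DAX.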
Relationship rules:
| Category | Sources |
|----------|---------|
| Microsoft SQL | SQL Server, Azure SQL, Azure Synapse, SQL Server Analysis Services |
| Azure | Cosmos DB, Data Explorer (Kusto), Blob Storage, Data Lake, Fabric Lakehouse/Warehouse |
| Cloud Databases | Snowflake, Databricks, Google BigQuery, Amazon Redshift, Amazon Athena |
| Files | Excel, CSV/TSV, JSON, XML, Parquet, PDF |
| Services | SharePoint, Dynamics 365, Salesforce, Google Analytics, Azure DevOps |
| Protocols | OData, REST API, ODBC, OLEDB |
| Streaming | Azure Stream Analytics, PubNub, REST API push |
Configure incremental refresh for large Import tables to avoid full refresh:
RangeStart and RangeEnd parameters (type DateTime) in Power Query

Requirements: Premium, PPU, or Fabric capacity for more than basic incremental refresh. Pro workspaces support incremental refresh but with limitations.
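The date filter built on `RangeStart` and `RangeEnd` should include the boundary on only one side, otherwise a row that falls exactly on a partition edge can be loaded twice. In a real model this filter is written in Power Query M. The Python sketch below only illustrates the half-open logic, and the `in_refresh_window` helper is a hypothetical name.

```python
from datetime import datetime


def in_refresh_window(row_time: datetime, range_start: datetime, range_end: datetime) -> bool:
    """Half-open window: RangeStart is included, RangeEnd is excluded."""
    return range_start <= row_time < range_end


if __name__ == "__main__":
    # Two adjacent monthly partitions that share the boundary 2025-02-01 00:00.
    january = (datetime(2025, 1, 1), datetime(2025, 2, 1))
    february = (datetime(2025, 2, 1), datetime(2025, 3, 1))

    boundary_row = datetime(2025, 2, 1)
    matches = [
        name
        for name, (start, end) in (("january", january), ("february", february))
        if in_refresh_window(boundary_row, start, end)
    ]
    print(matches)
```

With `<=` on both sides the boundary row would match both partitions. The half-open form assigns it to exactly one.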
2025-2026 improvements:
On-premises data gateway bridges on-premises sources to Power BI Service:
| Gateway Type | Use Case |
|-------------|----------|
| Standard (enterprise) | Shared by multiple users, centrally managed |
| Personal | Single user, development/testing only |
| Virtual Network (VNet) | Azure VNet-connected sources, no on-prem hardware |
VNet data gateway (2025-2026):
Gateway releases (2025-2026):
Common gateway failures:
| Method | Use Case | Best For |
|--------|----------|----------|
| OAuth2 | Cloud sources (Azure SQL, Snowflake, Databricks) | Interactive use, SSO |
| Service Principal | Automated refresh, CI/CD pipelines | Unattended operations |
| Workspace Identity | Fabric workspaces (no secret to manage) | Fabric-native models |
| Managed Identity | Dataflows Gen2 to Azure sources | Zero-secret PaaS access |
| Username/Password | Legacy on-prem sources | Gateway-bound sources |
Workspace Identity (2025-2026):
OAuth2 token limitation: When set via REST API (not UI), OAuth2 credentials lack a refresh token and expire after 1 hour. Use service principal for long-running automation.
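For unattended automation, a service principal can request a fresh access token whenever one is needed, with no user interaction, through the OAuth2 client-credentials grant. The sketch below only builds the token request and does not send it. The `build_token_request` helper and the placeholder tenant, client ID, and secret are assumptions for illustration. In practice the secret would come from a vault or environment variable, and a library such as MSAL would normally handle the call.

```python
from urllib.parse import urlencode

# Microsoft Entra ID v2.0 token endpoint and the Power BI REST API scope.
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    Hypothetical helper: it prepares the request but performs no network I/O.
    """
    url = TOKEN_URL_TEMPLATE.format(tenant_id=tenant_id)
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": POWER_BI_SCOPE,
        }
    )
    return url, body


if __name__ == "__main__":
    # Placeholder values; never hard-code a real secret.
    url, body = build_token_request("contoso-tenant", "app-id", "s3cret")
    print(url)
    print(body)
```

The body is sent as `application/x-www-form-urlencoded` in a POST to the returned URL, and the response JSON carries the `access_token` used as a bearer token for Power BI REST calls.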
Connection pooling best practices:
try/otherwise

| Pitfall | Impact | Fix |
|---------|--------|-----|
| Auto date/time enabled | Hidden date tables bloat model (one per date column) | Disable in Options > Data Load |
| Implicit measures (drag numeric to visual) | No control over aggregation, no reuse | Create explicit DAX measures |
| Bidirectional cross-filter | Ambiguity, performance degradation, wrong results | Use single-direction, handle in DAX |
| Too many columns in fact tables | Bloated model, slow refresh, wasted memory | Keep facts narrow: keys + numeric values |
| BLANK vs 0 vs null confusion | DAX treats BLANK differently from 0; visuals hide BLANK rows | Use IF/COALESCE to handle explicitly |
| Circular dependency errors | Usually from calculated columns referencing each other or bidirectional filters | Restructure model, break the cycle |
| 1GB PBIX limit | Cannot save file locally | Remove unused columns, optimize cardinality |
| Power BI Service vs Desktop gap | Some features only available in one or the other | Check feature matrix before designing |
| Calculated columns vs measures | Calculated columns consume memory, stored per row | Prefer measures (computed at query time) |
| String columns in fact tables | High cardinality strings destroy VertiPaq compression | Move to dimension table, use key reference |
- `references/data-sources-detail.md` -- Detailed connector configuration for all source types
- `references/gotchas-deep-dive.md` -- Extended pitfall analysis with examples and resolution patterns