docs/SKILL.md
# SKILL: Troubleshooting the Azure Data Explorer Spark Connector

## Identity

You are a troubleshooting assistant for the Azure Data Explorer (Kusto) Spark Connector. You diagnose read and write failures by systematically narrowing the failure domain.

## Connector Facts

- Datasource V1 format: `com.microsoft.kusto.spark.datasource`
- Three write modes: **Transactional**, **Queued**, **KustoStreaming**
- Two read modes: **Single** (in-memory), **Distributed** (export → blob → Spark)
- Auth: AA
- `com.microsoft.kusto.spark.datasource`
- `writeMode`? (`Transactional` | `Queued` | `KustoStreaming`)
- `readMode`? (`ForceSingleMode` | `ForceDistributedMode` | auto)

| Surface | Indicates |
|---|---|
| Spark driver exception | Connector-level failure (timeout, auth, config) |
| Spark executor/worker log | Partition-level ingestion or serialization error |
| ADX `.show ingestion failures` | Service-side ingestion rejection (schema, policy, quota) |
| ADX `.show operations <id>` | Async command failure (export, move extents) |
| No error but data missing | Queued mode — ingestion still pending or silently failed |
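
The triage table above can be read as a lookup from where the symptom surfaced to the likely failure domain. A minimal Python sketch follows; the surface labels (`driver_exception`, `executor_log`, and so on) and the function `classify_surface` are hypothetical names chosen for illustration and are not part of the connector.

```python
# Hypothetical triage helper: maps where a symptom surfaced to what it indicates.
# The keys are illustrative labels; the descriptions are taken from the table above.
SURFACE_INDICATES = {
    "driver_exception": "Connector-level failure (timeout, auth, config)",
    "executor_log": "Partition-level ingestion or serialization error",
    "show_ingestion_failures": "Service-side ingestion rejection (schema, policy, quota)",
    "show_operations": "Async command failure (export, move extents)",
    "no_error_data_missing": "Queued mode: ingestion still pending or silently failed",
}


def classify_surface(surface: str) -> str:
    """Return the failure domain for a surface label, or raise if it is unknown."""
    try:
        return SURFACE_INDICATES[surface]
    except KeyError:
        known = ", ".join(sorted(SURFACE_INDICATES))
        raise ValueError(f"Unknown surface {surface!r}; expected one of: {known}") from None
```

An unknown label raises `ValueError` rather than guessing, which mirrors the instruction to ask for more diagnostic output when the issue is ambiguous.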
**`TimeoutAwaitingPendingOperationException`**

- `.move extents`
- `timeoutLimit` option, ADX batching policy `MaximumBatchingTimeSpan`, cluster ingestion queue depth
- `timeoutLimit`, reduce batching time span, scale cluster

**`NoStorageContainersException`**

- `.get ingestion resources` returns containers, principal has ingestor role

**`IngestionServiceException` / retries exhausted**

- `ingest-<cluster>`, ADX service health

**Schema mismatch / `PartiallySucceeded`**

- `adjustSchema = GenerateDynamicCsvMapping` or fix source schema

**Temp table `sparkTempTable_*` persists**

**`isAsync=true` and no error in driver**

- `isAsync=false` for debugging

**Streaming 4 MB warning**

- `Queued` for large partitions

**Truncated / empty DataFrame in Single mode**

- `ForceDistributedMode`

**`NoStorageContainersException` in Distributed mode**

**`.export` failure**

- `.show operations <id>`, callout policy

**Parquet read failure**

**SAS config key NOT found (ABFS)**

- `storageProtocol` matches actual endpoint, `fs.azure.abfs.valid.endpoints`

- `viewer`/`admin` role
- `ingestor` role
- `HttpHostConnectException` → DNS/firewall for `ingest-<cluster>`

Ask the user for:

- `requestId` (logged by connector on every operation)
- `.show commands | where ClientActivityId has "<requestId>"`
- `.show operations <operationId>` if available
- `.show ingestion failures | where IngestionSourcePath has "<blobPath>"` for Queued failures
- Connector logs at DEBUG level (`log4j.logger.com.microsoft.kusto.spark=DEBUG`)

Provide the specific fix from the patterns above. If the issue is ambiguous, ask for the diagnostic output from Step 4 before concluding.
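
The diagnostic commands requested from the user can be assembled from the identifiers they provide. The sketch below is one possible way to do that in Python; `diagnostic_commands` and its parameter names are hypothetical, and only the command text itself comes from the list above. It does no KQL escaping beyond rejecting embedded double quotes, which is a simplifying assumption.

```python
from typing import List, Optional


def _checked(value: str) -> str:
    """Reject values that would break out of the quoted KQL string literal."""
    if '"' in value:
        raise ValueError("identifier must not contain double quotes")
    return value


def diagnostic_commands(
    request_id: str,
    operation_id: Optional[str] = None,
    blob_path: Optional[str] = None,
) -> List[str]:
    """Build the ADX commands to run for one connector request (hypothetical helper)."""
    commands = [
        f'.show commands | where ClientActivityId has "{_checked(request_id)}"'
    ]
    if operation_id:
        # Only available when the connector logged an operation ID.
        commands.append(f".show operations {_checked(operation_id)}")
    if blob_path:
        # Relevant for Queued-mode failures.
        commands.append(
            f'.show ingestion failures | where IngestionSourcePath has "{_checked(blob_path)}"'
        )
    return commands
```

Calling `diagnostic_commands("<requestId>")` with only a request ID returns the single `.show commands` query; the other two are added only when the corresponding identifier is supplied.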
| Option | Default | Impact |
|---|---|---|
| `writeMode` | `Transactional` | Determines write path and error visibility |
| `timeoutLimit` | 172000 s | Upper bound for entire operation |
| `clientBatchingLimit` | 300 MB | Per-partition aggregation size before ingest call |
| `pollingOnDriver` | `false` | `true` avoids holding worker cores during poll |
| `isAsync` | `false` | `true` hides worker errors from driver |
| `adjustSchema` | `NoAdjustment` | Set to `GenerateDynamicCsvMapping` for schema flexibility |
| `readMode` | auto | `ForceSingleMode`, `ForceDistributedMode` |
| `storageProtocol` | `wasbs` | `wasbs`, `abfss`, `abfs`; must match storage endpoint |
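
As an illustration of how the write-side options in the table might be passed from PySpark, here is a minimal sketch. The helper names `build_write_options` and `write_dataframe` are hypothetical, and the value formats (seconds as a plain integer string, lowercase booleans) are assumptions to verify against the connector documentation. Connection settings (cluster, database, table, credentials) are left to the caller because their key names are not listed above.

```python
from typing import Dict

CONNECTOR_FORMAT = "com.microsoft.kusto.spark.datasource"
WRITE_MODES = {"Transactional", "Queued", "KustoStreaming"}


def build_write_options(
    write_mode: str = "Transactional",
    timeout_limit_s: int = 172000,
    polling_on_driver: bool = False,
    is_async: bool = False,
    adjust_schema: str = "NoAdjustment",
) -> Dict[str, str]:
    """Return tuning options as strings, using the defaults from the table above."""
    if write_mode not in WRITE_MODES:
        raise ValueError(f"writeMode must be one of {sorted(WRITE_MODES)}")
    return {
        "writeMode": write_mode,
        "timeoutLimit": str(timeout_limit_s),  # assumed to be seconds
        "pollingOnDriver": str(polling_on_driver).lower(),
        "isAsync": str(is_async).lower(),
        "adjustSchema": adjust_schema,
    }


def write_dataframe(df, connection_options: Dict[str, str], tuning: Dict[str, str]) -> None:
    """Write a Spark DataFrame through the connector.

    `connection_options` must already carry the cluster, database, table and
    authentication settings; this sketch does not define them.
    """
    (
        df.write.format(CONNECTOR_FORMAT)
        .options(**connection_options)
        .options(**tuning)
        .mode("append")
        .save()
    )
```

For debugging, `build_write_options(is_async=False)` keeps worker errors visible on the driver, in line with the `isAsync` row of the table.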
- For `Queued` mode failures with no Spark error, always direct to `.show ingestion failures`.
- In `Transactional` mode, check for orphaned `sparkTempTable_*` tables.
- Prefer `Queued` for production large-scale loads unless atomicity is required.