skills/oh-arkruntime-thread-safety-audit/SKILL.md
Use this skill when auditing, reviewing, or fixing thread-safety issues in ArkCompiler Runtime Core, especially ArkTS-Sta ETS stdlib and plugin code under static_core/plugins/ets. It covers static mutable state, singleton initialization, shared maps/caches/counters/timers, taskpool/EAWorker concurrency, TSAN follow-up, and concurrent test design.
Use this skill to inspect, review, or fix thread-safety issues in ArkCompiler Runtime Core, especially in:
- `static_core/plugins/ets/stdlib/**/*.ets`
- `static_core/plugins/ets/**`
- `static_core/plugins/ets/tests/`

The common trigger is ArkTS-Sta's concurrent execution model: public stdlib APIs can be called from taskpool, EAWorker, or multiple coroutines/workers, so static mutable state and global singletons must be treated as shared across threads unless proven otherwise.
Use only APIs/types that can be found in this checkout or in official OpenHarmony documentation. Examples below are source-backed by this repository; treat non-exported helpers as internal implementation references, not public API recommendations.
| API/type | Source |
|---|---|
| AtomicInt, AtomicLong, AtomicBoolean | static_core/plugins/ets/stdlib/std/concurrency/Atomics.ets:83, static_core/plugins/ets/stdlib/std/concurrency/Atomics.ets:194, static_core/plugins/ets/stdlib/std/concurrency/Atomics.ets:782 |
| Atomics.load | static_core/plugins/ets/stdlib/std/concurrency/LegacyAtomics.ets:253 |
| ConcurrentHashMap | static_core/plugins/ets/stdlib/std/containers/ConcurrentHashMap.ets:131 |
| Mutex, QueueSpinlock | static_core/plugins/ets/stdlib/std/core/SyncPrimitives.ets:43, static_core/plugins/ets/stdlib/std/core/SyncPrimitives.ets:156 |
| ConcurrencyHelpers.mutexCreate, ConcurrencyHelpers.lockGuard | static_core/plugins/ets/stdlib/std/core/ConcurrencyHelpers.ets:27, static_core/plugins/ets/stdlib/std/core/ConcurrencyHelpers.ets:31; package-local variants also exist under std/concurrency and std/containers |
| taskpool.execute | static_core/plugins/ets/stdlib/std/concurrency/taskpool.ets:1841, static_core/plugins/ets/stdlib/std/concurrency/taskpool.ets:1852, static_core/plugins/ets/stdlib/std/concurrency/taskpool.ets:1863 |
| EAWorker | static_core/plugins/ets/stdlib/std/core/EAWorker.ets:65 |
| CoroutineExtras.stopTaskpool | static_core/plugins/ets/stdlib/std/debug/concurrency/CoroutineExtras.ets:42 |
If a suggested example uses an API/type that cannot be resolved with rg in the target checkout, remove the example or mark it as internal pseudocode.
Start from the user-provided file, diff, or API surface. Use <target-paths> below for those files or their nearest owning directory. Only expand to all of static_core/plugins/ets when the request has no narrower scope.
Find static fields, global-looking mutable objects, caches, locks, and atomics in ETS/TS files:
rg -n "static .*[:=]|private static|public static|const .*[:=]|let .*[:=]|new Map|new Array|StringBuilder|ConcurrentHashMap|Mutex|QueueSpinlock|Atomic" \
--type-add 'ets:*.ets' --type-add 'ets:*.ts' -t ets \
<target-paths>
Decide whether the candidate state is reachable from concurrent paths:
- taskpool.execute / EAWorker

Use local call-site searches around the candidate API:
rg -n "taskpool\\.execute|EAWorker|callback|setTimeout|setInterval|native .*\\(|public .*\\(" \
--type-add 'ets:*.ets' --type-add 'ets:*.ts' -t ets \
<target-paths>
Registration or initialization functions are not concurrent entry points by themselves. Treat them as risks only if there is evidence they can be called concurrently after workers or user code start.
Check native code only when the ETS candidate is a native method, the stack trace points to C++, or the user explicitly asks for native/TSAN analysis.
rg -n "static .*|std::once_flag|std::mutex|os::memory::Mutex|std::atomic|thread_local|std::vector|std::map|std::set|ani_native_function" \
--type cpp \
<related-native-paths>
After these phases, classify each candidate before proposing a fix:
| Category | Usually safe? | What to check |
| ------------------------------------------------ | ------------: | ------------------------------------------ |
| static readonly immutable constants | Yes | No mutable object hidden inside |
| static primitive counters/flags | No | ++, direct read/write, visibility |
| static Map/Array/StringBuilder | No | concurrent mutation, check-then-act |
| singleton instance | No | lazy initialization race |
| registry/cache | No | get-or-create, stale pairing, invalidation |
| instance fields in per-object objects | Maybe | object may be globally shared |
| ConcurrentHashMap/Atomic/Mutex protected state | Maybe | critical section covers whole invariant |
Do not call a file safe only because it has a lock somewhere. Verify the lock protects the full invariant.
Use four conditions to identify a meaningful risk:
Do not assign High just because a scan matched static, ConcurrentHashMap, Map, or a lazy-init shape. First prove the concurrent path and the user-visible or runtime-visible impact. If no concrete failure mode can be described, prefer leaving the code unchanged.
Risk levels:
When in doubt, report the issue as Candidate instead of upgrading the severity. Prefer no code change for Candidate and No risk findings.
The snippets in this section are source-backed patterns, not standalone tests. Before copying them into code, verify imports, package visibility, and whether the helper is public or stdlib-internal in the target file.
Bad:
if (Console.instance == undefined) {
Console.instance = new Console()
}
return Console.instance!
Fix:
Console.instanceLock.guard(() => {
if (Console.instance == undefined) {
Console.instance = new Console()
}
})
return Console.instance!
The check and assignment must be in the same critical section.
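The lost-initialization interleaving can be replayed deterministically in a single-threaded model. The sketch below is plain TypeScript, not ArkTS, and every name in it (`ServiceModel`, `HolderModel`, the replay functions) is hypothetical. A function body stands in for the critical section, so it illustrates only the ordering argument and makes no claim about the stdlib lock APIs.

```typescript
// Plain TypeScript model (not ArkTS stdlib). All names are hypothetical.
class ServiceModel {
  static created = 0
  readonly id: number
  constructor() {
    ServiceModel.created += 1
    this.id = ServiceModel.created
  }
}

class HolderModel {
  instance: ServiceModel | undefined = undefined
}

// The unsafe shape, split into its two steps so they can be interleaved by hand.
function isMissing(holder: HolderModel): boolean {
  return holder.instance === undefined
}

function assignNew(holder: HolderModel): ServiceModel {
  const created = new ServiceModel()
  holder.instance = created
  return created
}

// Replay: callers A and B both run the check before either one assigns.
function replayUnsafe(): [ServiceModel, ServiceModel] {
  const holder = new HolderModel()
  const aSawMissing = isMissing(holder)
  const bSawMissing = isMissing(holder)
  const a = aSawMissing ? assignNew(holder) : holder.instance!
  const b = bSawMissing ? assignNew(holder) : holder.instance!
  return [a, b]
}

// Guarded shape: check and assignment happen as one indivisible step.
function getGuarded(holder: HolderModel): ServiceModel {
  let current = holder.instance
  if (current === undefined) {
    current = new ServiceModel()
    holder.instance = current
  }
  return current
}

function replayGuarded(): [ServiceModel, ServiceModel] {
  const holder = new HolderModel()
  return [getGuarded(holder), getGuarded(holder)]
}

const [unsafeA, unsafeB] = replayUnsafe()
console.log(unsafeA === unsafeB) // false: two instances were created
const [guardedA, guardedB] = replayGuarded()
console.log(guardedA === guardedB) // true
```

In the unsafe replay the second assignment overwrites the first, so caller A keeps a reference to an instance that is no longer the singleton.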
When an API builds a string and then emits it, protect the whole operation:
this.outputLock.guard(() => {
let buf = this.lvl2Buf.get(level)!
buf.append(s)
this.printString(buf.toString(), level.valueOf())
this.lvl2Buf.set(level, new StringBuilder)
})
Locking only append or only printString can still mix buffers.
Bad:
let slot = map.get(key)
if (slot == undefined) {
slot = new Slot()
map.set(key, slot)
}
return slot
Fix pattern:
const existing = states.get(key)
if (existing !== undefined) {
return existing
}
slotLock.lockGuard(() => {
if (states.get(key) === undefined) {
states.set(key, new Slot())
}
})
const slot = states.get(key)
if (slot === undefined) {
throw new Error(`Failed to create slot for '${key}'`)
}
return slot
Use ConcurrentHashMap for concurrent storage and a separate lock for lazy creation when no atomic computeIfAbsent exists.
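The double-checked get-or-create shape can be factored into a helper. The following is a minimal sketch in plain TypeScript, assuming a hypothetical `GuardLike` interface and an `InlineGuard` stand-in that simply runs its callback. It uses a built-in `Map` in place of ConcurrentHashMap, so it shows the control flow only. The unlocked fast-path read is acceptable only when the real store is itself safe for concurrent reads.

```typescript
// Hypothetical stand-in for a lock with a guard-style API; it runs the callback inline.
interface GuardLike {
  guard(body: () => void): void
}

class InlineGuard implements GuardLike {
  entered = 0
  guard(body: () => void): void {
    this.entered += 1
    body()
  }
}

// Double-checked get-or-create: fast path without the lock, re-check inside it.
function getOrCreate<K, V>(store: Map<K, V>, key: K, lock: GuardLike, create: () => V): V {
  const existing = store.get(key)
  if (existing !== undefined) {
    return existing
  }
  lock.guard(() => {
    // Re-check: another caller may have created the value before the lock was acquired.
    if (store.get(key) === undefined) {
      store.set(key, create())
    }
  })
  const created = store.get(key)
  if (created === undefined) {
    throw new Error("get-or-create failed")
  }
  return created
}

const slotStore = new Map<string, { id: number }>()
const slotGuard = new InlineGuard()
let factoryCalls = 0
const makeSlot = () => ({ id: ++factoryCalls })

const firstSlot = getOrCreate(slotStore, "timer", slotGuard, makeSlot)
const secondSlot = getOrCreate(slotStore, "timer", slotGuard, makeSlot)
console.log(firstSlot === secondSlot, factoryCalls, slotGuard.entered) // true 1 1
```

The second call returns on the fast path, which is why the guard is entered only once.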
Bad:
private static uniqueNameCounter: long = -1
return (++Proxy.uniqueNameCounter).toString()
Fix:
private static uniqueNameCounter: AtomicLong = new AtomicLong(0)
return Proxy.uniqueNameCounter.fetchAdd(1).toString()
For post-increment from 0:
oldCounter++
the atomic equivalent is also:
counter.fetchAdd(1)
with initial value 0.
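The initial-value adjustment can be checked with a small model. This sketch is plain TypeScript and assumes only what the text above implies, namely that fetchAdd returns the value held before the addition. `FetchAddModel` is a hypothetical class that reproduces the return-value semantics; it is not atomic and is not the stdlib AtomicLong.

```typescript
// Models fetch-add return semantics only; NOT atomic, NOT the stdlib AtomicLong.
class FetchAddModel {
  private value: number
  constructor(initial: number) {
    this.value = initial
  }
  // Returns the value held before the addition.
  fetchAdd(delta: number): number {
    const old = this.value
    this.value = old + delta
    return old
  }
}

// Original shape: pre-increment starting from -1.
let plainCounter = -1
const fromPreIncrement = [++plainCounter, ++plainCounter, ++plainCounter]

// Replacement shape: fetchAdd(1) starting from 0.
const adjusted = new FetchAddModel(0)
const fromFetchAdd = [adjusted.fetchAdd(1), adjusted.fetchAdd(1), adjusted.fetchAdd(1)]

// Keeping -1 as the initial value shifts every result by one.
const unadjusted = new FetchAddModel(-1)
const fromWrongInitial = [unadjusted.fetchAdd(1), unadjusted.fetchAdd(1), unadjusted.fetchAdd(1)]

console.log(fromPreIncrement.join(",")) // 0,1,2
console.log(fromFetchAdd.join(","))     // 0,1,2
console.log(fromWrongInitial.join(",")) // -1,0,1
```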
Bad:
private static LOG_ENABLED: boolean = false
Logger.LOG_ENABLED = true
if (Logger.LOG_ENABLED) { ... }
Fix:
private static LOG_ENABLED: AtomicBoolean = new AtomicBoolean(false)
Logger.LOG_ENABLED.store(true)
if (Logger.LOG_ENABLED.load()) { ... }
Bad:
private static lastParsedTag: string | undefined = undefined
private static lastParsedResult: ParsedLocaleData | undefined = undefined
If tag and result must match, they are one invariant. Protect reads and writes with one lock, or replace them with one immutable cache entry object.
Minimal fix:
private static localeCacheMutex: Object = ConcurrencyHelpers.mutexCreate()
private static getCachedLocale(tag: string): ParsedLocaleData | undefined {
let result: ParsedLocaleData | undefined = undefined
ConcurrencyHelpers.lockGuard(Locale.localeCacheMutex, () => {
if (Locale.lastParsedTag == tag && Locale.lastParsedResult != undefined) {
result = Locale.lastParsedResult
}
})
return result
}
private static setCachedLocale(tag: string, data: ParsedLocaleData): void {
ConcurrencyHelpers.lockGuard(Locale.localeCacheMutex, () => {
Locale.lastParsedTag = tag
Locale.lastParsedResult = data
})
}
Do not lock only one field. Do not read tag outside the lock and result inside the lock.
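The alternative mentioned above, one immutable cache entry object, might look like the sketch below. It is plain TypeScript with hypothetical names (`ParsedLocaleSketch`, `LocaleCacheEntry`, `LocaleCacheSketch`); the real ParsedLocaleData shape is not shown in this document. Whether a plain reference write publishes the entry safely across workers in ArkTS-Sta is an assumption to verify. If it does not, keep the lock and guard the single field instead of two.

```typescript
// Hypothetical shapes for illustration; not the stdlib ParsedLocaleData.
interface ParsedLocaleSketch {
  readonly language: string
  readonly region: string
}

interface LocaleCacheEntry {
  readonly tag: string
  readonly data: ParsedLocaleSketch
}

class LocaleCacheSketch {
  // One reference holds both halves of the invariant, so they cannot be paired wrongly.
  private static last: LocaleCacheEntry | undefined = undefined

  static get(tag: string): ParsedLocaleSketch | undefined {
    const entry = LocaleCacheSketch.last // read the reference once, then use only the local copy
    return entry !== undefined && entry.tag === tag ? entry.data : undefined
  }

  static set(tag: string, data: ParsedLocaleSketch): void {
    // Build the complete entry first, then publish it with a single assignment.
    LocaleCacheSketch.last = Object.freeze({ tag, data })
  }
}

LocaleCacheSketch.set("en-US", { language: "en", region: "US" })
console.log(LocaleCacheSketch.get("en-US")?.region) // US
console.log(LocaleCacheSketch.get("de-DE"))         // undefined
```

Reading the reference into a local variable matters: reading the static field twice would reintroduce the tag/result mismatch that the single entry is meant to remove.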
If a value has multiple fields that must change together, use an internal lock:
final class TimerSlot {
public tryStart(now: long): boolean {
let started = false
this.lock.guard(() => {
if (!this.active) {
this.startTime = now
this.active = true
started = true
}
})
return started
}
private startTime: long = 0
private active: boolean = false
private lock: QueueSpinlock = new QueueSpinlock()
}
Do not use separate atomics for multi-field invariants unless the state machine is deliberately designed and reviewed.
Choose the narrowest synchronization that protects the real invariant:
| Problem | Preferred fix |
| ---------------------------------- | ------------------------------------------ |
| single numeric counter | AtomicInt / AtomicLong |
| single boolean flag | AtomicBoolean |
| pair/triple fields that must match | one mutex around all reads/writes |
| singleton lazy init | lock-guarded check-then-act |
| shared key-value registry | ConcurrentHashMap + locked get-or-create |
| per-key mutable compound state | per-slot lock |
| global output line construction | one output lock around build + emit |
Avoid:
When reviewing a patch, verify:
- ++x vs fetchAdd(1) requires adjusting initial value.
- x++ maps naturally to fetchAdd(1).
- ConcurrentHashMap values do not contain unsynchronized mutable fields.

Use tests proportional to risk.
Use existing repository tests as templates instead of inventing new API shapes:
- taskpool.execute with function arguments: static_core/plugins/ets/tests/ets_func_tests/std/containers/BlockingQueue/AddAndPollStress.ets:47
- static_core/plugins/ets/tests/ets-common-tests/taskpool/common_tasks_new.ets:423
- CoroutineExtras.stopTaskpool declaration: static_core/plugins/ets/stdlib/std/debug/concurrency/CoroutineExtras.ets:42
- Atomics.load declaration for typed-array barriers: static_core/plugins/ets/stdlib/std/concurrency/LegacyAtomics.ets:253

Prefer simple worker task signatures used by existing tests. If a start barrier is needed, cite the source for the atomic operation used by the barrier and keep the snippet local to the test file.
When TSAN is part of validation, keep the note short: state the raced variable/function and whether the race is in ETS code or native code.
Add focused tests for the affected shared state:
Be careful: a test that concurrently misses a cache may enter a slower parse path instead of the cached path. Mention this only when it affects the chosen test design.
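For a shared counter or ID generator, one possible assertion shape is to merge the values returned by every worker and check that none repeats. The sketch below is plain TypeScript covering the checking step only. `findDuplicateIds` is a hypothetical helper, and the hand-written batches stand in for results that a real test would collect from taskpool tasks.

```typescript
// Merges per-worker result batches and reports every ID that was handed out more than once.
function findDuplicateIds(batches: string[][]): string[] {
  const seen = new Set<string>()
  const duplicates = new Set<string>()
  for (const batch of batches) {
    for (const id of batch) {
      if (seen.has(id)) {
        duplicates.add(id)
      } else {
        seen.add(id)
      }
    }
  }
  return Array.from(duplicates).sort()
}

// Hand-written batches standing in for values returned by worker tasks.
const healthyBatches = [["0", "1"], ["2", "3"]]
const racedBatches = [["0", "1"], ["1", "2"]] // "1" was handed out twice

console.log(JSON.stringify(findDuplicateIds(healthyBatches))) // []
console.log(JSON.stringify(findDuplicateIds(racedBatches)))   // ["1"]
```

Reporting the duplicated values, rather than only a pass/fail flag, makes a failing run easier to relate to the raced counter.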
Use the repo's local tools when available:
git diff --check -- <changed-files>
ninja -C static_core/build plugins/ets/etsstdlib.abc
static_core/build/bin/es2panda --extension=ets --stdlib=static_core/build/plugins/ets/etsstdlib.abc --output=/tmp/test.abc <test.ets>
static_core/build/bin/ark --boot-panda-files=static_core/build/plugins/ets/etsstdlib.abc --load-runtimes=ets --verification-mode=ahead-of-time /tmp/test.abc <entrypoint>
Entrypoints can be discovered with:
static_core/build/bin/ark_disasm /tmp/test.abc /tmp/test.pa
rg -n "\\.function .* main\\(" /tmp/test.pa