
affaan-m/content-hash-cache-pattern

Name: content-hash-cache-pattern
Author: affaan-m

docs/ja-JP/skills/content-hash-cache-pattern/SKILL.md

npx skillsauth add affaan-m/everything-claude-code content-hash-cache-pattern


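Content-Hash File Cache Pattern

Cache the results of expensive file processing (PDF parsing, text extraction, image analysis) using the file's SHA-256 content hash as the cache key. Unlike path-based caching, this approach survives file moves and renames, and changed content hashes to a new key, so stale results are never returned.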

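When to Activate

Building a file processing pipeline (PDF, images, text extraction)
Processing is expensive and the same files are processed repeatedly
A --cache/--no-cache CLI option is needed
Adding caching to existing pure functions without modifying them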

Core Pattern

1. Content-Hash-Based Cache Key

Use the file's content, not its path, as the cache key:

import hashlib
from pathlib import Path

_HASH_CHUNK_SIZE = 65536  # read in 64 KB chunks so large files are not loaded whole

def compute_file_hash(path: Path) -> str:
    """SHA-256 of the file's contents, read in chunks to handle large files."""
    if not path.is_file():
        raise FileNotFoundError(f"File not found: {path}")
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(_HASH_CHUNK_SIZE)
            if not chunk:
                break
            sha256.update(chunk)
    return sha256.hexdigest()

Why a content hash? A renamed or moved file still produces a cache hit, changed content invalidates the entry automatically, and no index file is needed.
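The rename and edit behaviour described above can be checked with a short, self-contained sketch. It re-declares a compact equivalent of compute_file_hash so the block runs on its own; the file names and byte contents are invented for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

_HASH_CHUNK_SIZE = 65536


def compute_file_hash(path: Path) -> str:
    """Chunked SHA-256 of a file's bytes (same logic as the function above)."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(_HASH_CHUNK_SIZE), b""):
            sha256.update(chunk)
    return sha256.hexdigest()


def demo_rename_and_edit() -> tuple[str, str, str]:
    """Return hashes (original, after rename, after edit) for a temp file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "report.pdf"  # hypothetical file name
        original.write_bytes(b"sample content")
        before = compute_file_hash(original)

        # Renaming changes the path but not the bytes, so the hash is stable.
        renamed = original.rename(Path(tmp) / "renamed.pdf")
        after_rename = compute_file_hash(renamed)

        # Editing the bytes produces a different hash, hence a different cache key.
        renamed.write_bytes(b"sample content, edited")
        after_edit = compute_file_hash(renamed)
    return before, after_rename, after_edit


before, after_rename, after_edit = demo_rename_and_edit()
print(before == after_rename)  # True
print(before == after_edit)    # False
```

Note that an edited file does not delete its old cache entry; the old entry simply stops being looked up because the key has changed.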

2. Frozen Dataclass for Cache Entries

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class CacheEntry:
    file_hash: str
    source_path: str
    document: ExtractedDocument  # the cached result

3. File-Based Cache Storage

Each cache entry is stored as {hash}.json, so a lookup is a single path construction from the hash (O(1)) and no index file is needed.

import json
from typing import Any

def write_cache(cache_dir: Path, entry: CacheEntry) -> None:
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{entry.file_hash}.json"
    data = serialize_entry(entry)
    cache_file.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")

def read_cache(cache_dir: Path, file_hash: str) -> CacheEntry | None:
    cache_file = cache_dir / f"{file_hash}.json"
    if not cache_file.is_file():
        return None
    try:
        raw = cache_file.read_text(encoding="utf-8")
        data = json.loads(raw)
        return deserialize_entry(data)
    except (json.JSONDecodeError, ValueError, KeyError, TypeError):
        return None  # treat a corrupted entry as a cache miss

4. Service Layer Wrapper (SRP)

Keep the processing function pure and add caching as a separate service layer.

import logging

logger = logging.getLogger(__name__)

def extract_with_cache(
    file_path: Path,
    *,
    cache_enabled: bool = True,
    cache_dir: Path = Path(".cache"),
) -> ExtractedDocument:
    """Service layer: check cache -> extract -> write cache."""
    if not cache_enabled:
        return extract_text(file_path)  # pure function, knows nothing about caching

    file_hash = compute_file_hash(file_path)

    # Check the cache
    cached = read_cache(cache_dir, file_hash)
    if cached is not None:
        logger.info("Cache hit: %s (hash=%s)", file_path.name, file_hash[:12])
        return cached.document

    # Cache miss -> extract -> store
    logger.info("Cache miss: %s (hash=%s)", file_path.name, file_hash[:12])
    doc = extract_text(file_path)
    entry = CacheEntry(file_hash=file_hash, source_path=str(file_path), document=doc)
    write_cache(cache_dir, entry)
    return doc
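One way to expose the cache_enabled switch as the --cache/--no-cache option mentioned earlier is argparse's BooleanOptionalAction (Python 3.9+). This is a sketch only: the positional file argument and the --cache-dir flag are assumptions, not part of the original skill.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Extract text with optional caching")
    parser.add_argument("file", type=Path, help="file to process")
    # BooleanOptionalAction generates both --cache and --no-cache from one flag.
    parser.add_argument(
        "--cache",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="reuse cached results keyed by content hash",
    )
    # Hypothetical extra flag mirroring the cache_dir parameter.
    parser.add_argument("--cache-dir", type=Path, default=Path(".cache"))
    return parser


args = build_parser().parse_args(["report.pdf", "--no-cache"])
print(args.cache)      # False
print(args.cache_dir)  # .cache
# In a real tool these would be passed straight through:
# extract_with_cache(args.file, cache_enabled=args.cache, cache_dir=args.cache_dir)
```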

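Key Design Decisions

| Decision | Rationale |
|----------|-----------|
| SHA-256 content hash | Path-independent; content changes invalidate automatically |
| {hash}.json file naming | O(1) lookup, no index file needed |
| Service layer wrapper | SRP: extraction stays pure, caching is a separate concern |
| Manual JSON serialization | Full control over how frozen dataclasses are serialized |
| Corruption returns None | Graceful degradation; the file is reprocessed on the next run |
| cache_dir.mkdir(parents=True) | Lazy directory creation on first write |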

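Best Practices

Hash the content, not the path: paths change, but content identity does not
Hash large files in chunks: avoid loading the whole file into memory
Keep processing functions pure: they should know nothing about caching
Log cache hits and misses with a truncated hash: this makes debugging easier
Handle corruption gracefully: treat invalid cache entries as misses and never crash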

Anti-Patterns to Avoid

# Bad: path-based caching (breaks when a file is moved or renamed)
cache = {"/path/to/file.pdf": result}

# Bad: caching logic inside the processing function (violates SRP)
def extract_text(path, *, cache_enabled=False, cache_dir=None):
    if cache_enabled:  # this function now has two responsibilities
        ...

# Bad: dataclasses.asdict() on nested frozen dataclasses
# (can cause problems with complex nested types)
data = dataclasses.asdict(entry)  # use manual serialization instead
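The asdict() warning above is stated only in general terms. One concrete problem it may be pointing at is the round trip: asdict() flattens nested dataclasses into plain dicts, and nothing rebuilds them on the way back. The sketch below uses a hypothetical one-field ExtractedDocument to show this.

```python
import dataclasses
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedDocument:
    """Hypothetical result type used only for this sketch."""
    text: str


@dataclass(frozen=True)
class CacheEntry:
    file_hash: str
    document: ExtractedDocument


entry = CacheEntry(file_hash="ab" * 32, document=ExtractedDocument(text="hi"))

# asdict() recurses into nested dataclasses and returns plain dicts.
data = json.loads(json.dumps(dataclasses.asdict(entry)))

# Naively rebuilding from that dict leaves `document` as a dict, not an
# ExtractedDocument, because dataclasses do not validate or convert field types.
naive = CacheEntry(**data)
print(type(naive.document).__name__)  # dict
print(naive == entry)                 # False
```

The failure is silent: construction succeeds and the mismatch only surfaces later, when code accesses naive.document.text and gets an AttributeError.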

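When to Use

File processing pipelines (PDF parsing, OCR, text extraction, image analysis)
CLI tools that benefit from a --cache/--no-cache option
Batch processing where the same files appear across multiple runs
Adding caching to existing pure functions without modifying them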

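When Not to Use

Data that must always be fresh (real-time feeds)
Very large cache entries (consider streaming instead)
Results that depend on parameters other than the file content (for example, different extraction settings)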
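The last item matters because a key built from content alone would return the same cached result for two different extraction settings. If you still want caching in that situation, one possible adaptation (not part of the original pattern) is to fold the settings into the key. The helper name compute_cache_key and the settings fields below are hypothetical.

```python
import hashlib
import json


def compute_cache_key(file_hash: str, settings: dict[str, object]) -> str:
    """Hypothetical helper: combine a content hash with extraction settings."""
    # sort_keys makes the JSON canonical, so key order does not change the key.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    combined = f"{file_hash}:{canonical}".encode("utf-8")
    return hashlib.sha256(combined).hexdigest()


content_hash = hashlib.sha256(b"same file bytes").hexdigest()
key_ocr_on = compute_cache_key(content_hash, {"ocr": True, "lang": "en"})
key_ocr_off = compute_cache_key(content_hash, {"ocr": False, "lang": "en"})
key_reordered = compute_cache_key(content_hash, {"lang": "en", "ocr": True})

print(key_ocr_on == key_ocr_off)    # False
print(key_ocr_on == key_reordered)  # True
```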


Cache expensive file processing results using SHA-256 content hashes: path-independent, auto-invalidating, with service-layer separation.

185,766 stars

content-media

Updated May 18, 2026


npx skillsauth add affaan-m/everything-claude-code content-hash-cache-pattern

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Percentage |
|---------|---------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 18, 2026, 4:04 AM · 46.4s · 1 file scanned

SKILL.md

name: content-hash-cache-pattern
description: Cache expensive file processing results using SHA-256 content hashes: path-independent, auto-invalidating, with service-layer separation.
origin: ECC



Install manually

# Clone the repo
git clone https://github.com/affaan-m/everything-claude-code.git

# Copy into Claude Code skills folder (global)
cp -r everything-claude-code/docs/ja-JP/skills/content-hash-cache-pattern ~/.claude/skills/


Repository

affaan-m/everything-claude-code

185,766 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT