.claude/skills/filter-benchmark/SKILL.md
Benchmark and performance-test Django ORM filters on large datasets. Use this skill whenever the user wants to test filter performance, check query plans with EXPLAIN ANALYZE, generate bulk test data for filters, verify index usage, or benchmark any queryset filter in the Saleor codebase. Trigger this even when the user says things like "test the filter on a big dataset", "check if the index is used", "generate data for performance testing", "run explain analyze on this filter", or "benchmark this query".
npx skillsauth add saleor/saleor filter-benchmarkInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Performance-test Django ORM filters by generating bulk data, extracting the SQL query, and running EXPLAIN ANALYZE to verify index usage and query efficiency.
When adding or modifying filters in the Saleor GraphQL layer, it's critical to verify they perform well on large datasets — a filter that looks correct on 10 rows can cause full table scans on 100k+. This skill automates the full benchmarking workflow:
Read the filter code to understand:
Meta.indexes)Exists)Look at the filter function source — it lives in saleor/graphql/<app>/filters.py. Understand the ORM query it builds (.filter(), Q(), Exists(), OuterRef(), etc.) because the test data must exercise all code paths.
Write a Python script that creates test data with enough variety to exercise the filter. The script should:
bulk_create with ignore_conflicts=True for maximum speedtimezone.now() and timedelta for date generationauto_now / auto_now_add fields via bulk_update after creation when neededThe script should be a function that creates ~1000 objects per call. The user will call it in a loop to reach 100k+.
Example structure:
import random
from datetime import timedelta
from decimal import Decimal
from django.utils import timezone
from saleor.payment.models import TransactionItem, TransactionEvent
from saleor.payment import TransactionEventType
def populate(batch_size=1000):
"""Create batch_size TransactionItems with varied events."""
now = timezone.now()
# Create TransactionItems
transactions = TransactionItem.objects.bulk_create(
[
TransactionItem(
currency="USD",
charged_value=Decimal("10.00"),
# Vary dates across 2 years
created_at=now - timedelta(days=random.randint(0, 730)),
)
for _ in range(batch_size)
]
)
# Override auto_now/auto_now_add fields with varied values via bulk_update
for t in transactions:
t.created_at = now - timedelta(days=random.randint(0, 730))
t.modified_at = now - timedelta(days=random.randint(0, 365))
TransactionItem.objects.bulk_update(transactions, ["created_at", "modified_at"])
# Create events for each transaction
event_types = [e.value for e in TransactionEventType]
events = []
for t in transactions:
num_events = random.randint(1, 4)
for _ in range(num_events):
events.append(
TransactionEvent(
transaction=t,
type=random.choice(event_types),
amount_value=Decimal("10.00"),
currency="USD",
created_at=now - timedelta(days=random.randint(0, 730)),
)
)
TransactionEvent.objects.bulk_create(events)
print(
f"Created {len(transactions)} transactions and {len(events)} events. "
f"Total: {TransactionItem.objects.count()} transactions, "
f"{TransactionEvent.objects.count()} events"
)
Adapt this pattern to whatever model and filter is being tested. The key principle: the data must vary on exactly the fields the filter touches.
Run the population script in Django shell. Activate the venv first:
source .venv/bin/activate && python manage.py shell
Then in the shell, run the function in a loop:
for i in range(100):
populate()
This reaches 100k objects. Monitor the output to confirm counts are growing. If the database already has data from a previous run, check counts first and only add what's needed.
Get the actual SQL that the filter produces. There are two approaches — use whichever fits best:
Open a Django shell and call the filter function with representative input values, then print the query:
from saleor.payment.models import TransactionItem
from saleor.graphql.payment.filters import filter_where_created_at_range
qs = TransactionItem.objects.all()
filtered = filter_where_created_at_range(qs, None, {"gte": "2025-01-01T00:00:00Z", "lte": "2025-06-01T00:00:00Z"})
print(str(filtered.query))
Add a breakpoint() right after the filter call in the filter function, run a test that hits it, then in the debugger:
print(str(qs.query))
Use this approach when the filter input is complex (e.g., involves GraphQL variable resolution or multiple conditions).
.explain() directlyprint(qs.explain(analyze=True, verbose=True, buffers=True))
This runs EXPLAIN ANALYZE directly from Django without needing psql. Useful for a quick check, but the data.sql + make explain approach gives JSON output which is easier to analyze.
Use the helper script at .claude/skills/filter-benchmark/analyze_query.py. It does everything in one call per queryset:
enable_seqscan = OFF to verify index availability/tmpRun in Django shell:
import sys; sys.path.insert(0, ".claude/skills/filter-benchmark")
from analyze_query import analyze, report
# Analyze each queryset — prints quick summary + warnings inline
analyze(filtered_qs, "created_at range")
analyze(another_qs, "events by type")
# Print a full markdown report with Dalibo links
report()
Before analyzing, verify the querysets actually return rows. If EXPLAIN ANALYZE shows 0 rows, the filter input values don't match the generated data — adjust filter parameters (widen date range, use existing event types) and re-run.
The report() output includes a summary table and per-query details. Use it as the basis for the final report to the user, adding:
1. Dataset size — how many objects of each model, how field values are distributed
2. Recommendations — if there are warnings, suggest concrete fixes:
AddIndexConcurrently)After presenting the report with Dalibo links, ask the user if they want to clean up the test data. If yes, delete the objects in reverse dependency order (child models first, then parents) to avoid foreign key violations. Use .delete() with filtering to target only the bulk-created data — for example, filter by the date range or other markers used during generation. Print counts of deleted objects for confirmation.
development
Generate Django schema migrations for the Saleor codebase using manage.py makemigrations. Use when (1) a new Django model is created, (2) model fields are added/altered/removed, (3) user explicitly asks to create a migration. This skill covers ONLY synchronous schema migrations (CreateModel, AddField, AlterField, RemoveField, etc.) — NOT data migrations with RunPython or RunSQL. If unsure whether the migration is schema-only, ask the user.
development
Commit changes in the Saleor codebase with pre-commit hook error handling. Use when asked to "commit", "commit changes", "make a commit", or any git commit task in the Saleor project. Handles virtual env activation, staging, commit message writing, and automatic resolution of pre-commit hook failures (ruff, mypy, schema, migrations).
testing
Run pytest tests with automatic virtual environment activation. Use this skill whenever running tests, executing pytest, or when asked to "run tests", "test this", or any test execution task. Ensures venv is always activated before pytest runs.
development
Maintainer-only workflow for handling GitHub Secret Scanning alerts on OpenClaw. Use when Codex needs to triage, redact, clean up, and resolve secret leakage found in issue comments, issue bodies, PR comments, or other GitHub content.