plugins/rai/skills/rai-graph-analysis/SKILL.md
Graph algorithm selection and execution on PyRel v1 models — construction from ontology patterns, parameter tuning, and result extraction. Use for questions about a network's structure — centrality and importance, community detection, connectivity and components, reachability and dependencies, shortest paths and distance, and node similarity.
npx skillsauth add RelationalAI/rai-agent-skills rai-graph-analysis
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
What: Graph algorithm selection and execution — building Graph instances from PyRel ontology patterns and running the right algorithm to answer structural questions about the data.
When to use:
Building a Graph instance from an existing PyRel model (choosing node concept, edge construction, directed/weighted)
When NOT to use:
rai-discovery, rai-pyrel-coding, rai-ontology-design, rai-prescriptive-problem-formulation, rai-rules-authoring, rai-predictive-modeling
Overview (process steps):
# Imports
from relationalai.semantics import Model, Float, Integer, String, where, define, data
from relationalai.semantics.reasoners.graph import Graph
from relationalai.semantics.std import aggregates, floats
model = Model("my_model")
Note: where, define, and data are available as standalone imports and as model.where(), model.define(), model.data() methods. Both are equivalent for single-model scripts. Use the model.* form when multiple Models exist — standalone functions fail with "Multiple Models have been defined.". See rai-pyrel-coding for data loading patterns with model.data().
Pattern 1: No existing node or edge concepts — the most common pattern. Use when nodes are drawn from multiple concepts, or when not all instances of a concept should be nodes. Edges are defined manually with Edge.new().
graph = Graph(
    model,
    directed=True,   # or False: edge directionality
    weighted=False,  # or True: whether edges carry weights
)
Node, Edge = graph.Node, graph.Edge
# Define nodes and edges manually with Edge.new(src=..., dst=...)
Pattern 2: Existing node concept, manual edges — use when a single concept covers all desired nodes and you want all instances as nodes (including isolated nodes with no edges). Isolated nodes can also be added explicitly via model.define(Node(my_instance)).
graph = Graph(
    model,
    directed=True,   # or False
    weighted=False,  # or True
    node_concept=MyConcept,  # All instances of MyConcept become nodes; graph.Node is bound to MyConcept
)
Node, Edge = graph.Node, graph.Edge
# graph.Node IS MyConcept, so properties assigned to graph.Node are directly available on MyConcept
# Define edges manually with Edge.new(src=..., dst=...)
Pattern 3: Existing node concept + edge concept — use when each interaction is already modeled as its own concept with source/destination relationships. Rather than deriving new nodes and edges, graph.Node, graph.Edge, graph.EdgeSrc, graph.EdgeDst, and graph.EdgeWeight bind directly to the provided concepts and relationships — avoiding extra computation, which matters for large graphs.
graph = Graph(
    model,
    directed=True,
    weighted=True,
    node_concept=Account,
    edge_concept=Transaction,                     # Each Transaction instance becomes an edge
    edge_src_relationship=Transaction.payer,      # Source endpoint
    edge_dst_relationship=Transaction.payee,      # Destination endpoint
    edge_weight_relationship=Transaction.amount,  # Edge weight (required when weighted=True)
)
Important:
When you pass edge_concept, you must also pass node_concept, edge_src_relationship, and edge_dst_relationship. Add edge_weight_relationship when the graph is weighted.
Weights must be floats. Use floats.float() to cast from other numeric types: weight=floats.float(Transaction.amount).
edge_src_relationship, edge_dst_relationship, and edge_weight_relationship accept only a Relationship or Chain, not a Property and not an Expression. If your ontology models the link as a Property (e.g., f"{Edge} from {Node:source}"), or if you need to cast (floats.float(...)), use Pattern 1 or 2 with manual Edge.new(src=..., dst=..., weight=...) instead. Otherwise Pattern 3 fails silently: it emits only a model warning and produces 0 edges.
Graph constructor aggregator parameter guidance
aggregator defaults to None; multi-edges (parallel edges between the same node pair) emit warnings when present. aggregator="sum" (the only supported alternative) collapses multi-edges by summing weights, and it works on unweighted graphs too.
Use it only when multi-edges are expected (co-occurrence with multiple shared attributes, or intermediary concepts where multiple operations connect the same pair). Omit otherwise — silencing unexpected multi-edges masks bugs in edge-derivation logic.
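To make the aggregator behavior concrete, the following is a minimal plain-Python sketch of what "collapsing multi-edges by summing weights" means for an edge list. It does not use the RAI API; the helper name `collapse_multi_edges` and the sample data are hypothetical, and the treatment of undirected pairs is an assumption made for illustration.

```python
from collections import defaultdict

def collapse_multi_edges(edges, directed=False):
    """Sum the weights of parallel edges between the same node pair.

    `edges` is an iterable of (src, dst, weight) tuples. For undirected
    graphs, (a, b) and (b, a) are treated as the same pair.
    """
    totals = defaultdict(float)
    for src, dst, weight in edges:
        key = (src, dst) if directed else tuple(sorted((src, dst)))
        totals[key] += weight
    return dict(totals)

# Hypothetical sample: two operations connect sites A and B.
edges = [("A", "B", 2.0), ("B", "A", 3.0), ("B", "C", 1.0)]
collapsed = collapse_multi_edges(edges)

# Number of edge rows that were merged into an existing pair.
multi_edge_count = len(edges) - len(collapsed)
```

A nonzero `multi_edge_count` on data where parallel edges were not expected is a hint to review the edge-derivation logic before reaching for aggregator="sum".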
Compatibility constraints in this section are non-exhaustive. For any algorithm you plan to call, load algorithm-selection.md and confirm its compatibility row before writing the Graph(...) constructor. The Pre-flight table in Step 7 summarizes the most common gotchas, but is also non-exhaustive.
| Family | Methods | Output Shape | Typical Use |
|--------|---------|-------------|-------------|
| Basic Stats | num_nodes(), num_edges(), num_triangles() | relation; use .inspect() to print | Graph validation, sanity checks |
| Neighbors | neighbor(), inneighbor(), outneighbor(), common_neighbor() | (node, node) or (node, node, node) | Neighborhood exploration |
| Degree | degree(), indegree(), outdegree(), weighted_degree(), weighted_indegree(), weighted_outdegree() | (node, value) | Connection counts |
| Centrality | eigenvector_centrality(), betweenness_centrality(), degree_centrality(), pagerank() | (node, score) | Importance, influence, bottlenecks |
| Community | louvain(), infomap(), label_propagation() | (node, label) | Natural groupings, clusters |
| Components | weakly_connected_component(), is_connected() | (node, component_id) or scalar | Fragmentation, isolation |
| Reachability | reachable() | (source, target) pairs | Dependency tracing, impact analysis |
| Distance | distance(), diameter_range() | (start, end, length) | Shortest paths, network diameter |
| Similarity | jaccard_similarity(), cosine_similarity(), adamic_adar(), preferential_attachment() | (node1, node2, score) | Entity comparison, link prediction |
| Clustering | local_clustering_coefficient(), average_clustering_coefficient(), triangle_count(), triangle(), unique_triangle() | (node, value) or (n1, n2, n3) | Local density, tightness |
Bridge edges (single points of failure on the edge side) have no named primitive — express them as WCC with per-edge ablation, in RAI. betweenness_centrality() is not a substitute: it scores nodes on a relative scale and does not answer "would removing this edge disconnect the graph." See Bridge edges in algorithm-selection.md for the recipe.
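The logic behind "WCC with per-edge ablation" can be sketched in plain Python. This only illustrates the idea and is not the RAI recipe, which lives in algorithm-selection.md. The function names and sample graph are hypothetical, and the brute-force loop is only practical for small candidate sets.

```python
def count_components(nodes, edges):
    """Count connected components of an undirected edge list with union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        parent[find(src)] = find(dst)
    return len({find(n) for n in nodes})

def bridge_edges(nodes, edges):
    """Return edges whose removal increases the component count."""
    baseline = count_components(nodes, edges)
    bridges = []
    for i, edge in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]  # ablate exactly one edge instance
        if count_components(nodes, remaining) > baseline:
            bridges.append(edge)
    return bridges

# Hypothetical graph: a triangle A-B-C with a pendant node D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
bridges = bridge_edges(nodes, edges)

# Two parallel edges between the same pair: neither one is a bridge.
parallel_bridges = bridge_edges(["A", "B"], [("A", "B"), ("A", "B")])
```

Because each edge instance is ablated separately, parallel edges between the same endpoints are correctly reported as non-bridges, which is the reason not to collapse them before this check.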
For compatibility constraints (directed/weighted limits per algorithm), see Step 7.
Before writing any graph or enrichment code, read the existing model file to understand:
Import the base model module (e.g., from my_model import model, Site, Operation) so that all base define() rules are in scope. Creating a standalone Model("same name") reference without the base definitions will produce an empty graph because concept instances won't exist.
Start from the question, not the ontology. The question determines which concepts become nodes, what edges you need, and how to derive them.
Step 2 — Scope the question:
Step 3 — Identify the nodes:
Step 4 — Determine the edges:
Step 5 — Derive edges from relationships:
Key principle: A single ontology can support multiple graph constructions. Different questions about the same data lead to different node/edge choices. The question always comes first — the ontology is the source of available structure, not the driver.
Recognizing edge sources in the ontology:
Once you know what nodes and edges you need, scan the ontology for these structural signals to find them:
| Ontology Signal | What to Look For | How It Becomes Edges |
|-----------------|-----------------|---------------------|
| Direct relationship | Concept A has a relationship to Concept B (or to itself) | The relationship itself is an edge — use Edge.new(src=a, dst=b) |
| Intermediary concept | A concept C with two relationships pointing to the same node type (e.g., source and destination both referencing Concept A) | Each instance of C becomes an edge between the two endpoints — use Edge.new() or edge_concept (edge_concept only when the src/dst are Relationships) |
| Shared attribute | Two instances of Concept A both relate to the same instance of Concept B (e.g., two users sharing an address, two customers ordering the same product) | Co-occurrence — entities sharing an attribute are connected. Requires id < guard to prevent duplicates |
| Self-referencing | A concept with a relationship back to itself (e.g., parent, reports_to, depends_on, subcomponent, follows) | Instance-to-instance edges within one concept — may be acyclic (hierarchies, DAGs) or contain cycles |
| Multi-concept co-occurrence | Multiple distinct attributes shared between entities (e.g., shared address OR shared phone OR shared email) | Each shared attribute type creates edges — combine in a single graph for richer connectivity |
Tip: When the ontology has an interaction concept with source/destination relationships and edge-relevant properties (volume, weight, intensity), prefer edge_concept over manual Edge.new() — it's more concise and ensures every instance is included.
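The effect of the id < guard in the shared-attribute row can be seen in a small plain-Python self-join. This is a sketch over hypothetical (customer_id, product) rows, not PyRel code; it only shows which pairs the guard keeps.

```python
# Hypothetical order rows: (customer_id, product).
orders = [(1, "widget"), (2, "widget"), (3, "widget"), (2, "gadget"), (3, "gadget")]

# Unguarded self-join: includes self-loops (1, 1) and mirrored pairs (1, 2) / (2, 1).
unguarded = [
    (l_cust, r_cust, l_prod)
    for l_cust, l_prod in orders
    for r_cust, r_prod in orders
    if l_prod == r_prod
]

# Guarded self-join: left id < right id keeps each unordered pair once per product.
guarded = [
    (l_cust, r_cust, l_prod)
    for l_cust, l_prod in orders
    for r_cust, r_prod in orders
    if l_prod == r_prod and l_cust < r_cust
]
```

In this sample, customers 2 and 3 share two products, so the guarded result still contains two edges for that pair. That is an expected multi-edge, the situation aggregator="sum" is meant for.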
Map the ontology signal you identified in Steps 2–5 to a construction pattern and decide how to implement it:
| Ontology Signal (from Steps 2–5) | Construction Pattern | Edge Method | Example |
|-------------------------------|---------------------|-------------|---------|
| Direct relationship | Entity-level | Edge.new(src=a, dst=b) | Graph Construction from Ontology |
| Intermediary concept | Infrastructure-level | Edge.new() from intermediary, or edge_concept if concept has src/dst/weight | centrality_weighted_undirected.py, edge_concept_multi_algorithm.py |
| Shared attribute | Co-occurrence | Edge.new() with id < guard | community_to_derived_concept.py |
| Multi-concept co-occurrence | Multi-attribute co-occurrence | Multiple Edge.new() calls (one per shared attribute) | chained_graph_rules.py |
| Self-referencing | Hierarchy, recursive, or cyclic | Edge.new(src=parent, dst=child) | graph-construction.md |
Then decide the remaining axes:
Edge.new() or edge_concept? Prefer edge_concept when the interaction is already modeled as a concept with src/dst relationships.
See graph-construction.md for detailed patterns including filtered edges, multi-graph, and weight construction.
Start from the question, not the algorithm name. First check whether the question reduces to an aggregation — per-entity counts and one-hop group-bys ("how many sources per target?", "which targets have only one predecessor?") run as aggs.count(distinct(...)).per(target) with no Graph constructor needed. Reserve graph reasoners for multi-hop traversal, partitioning, and centrality.
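As an illustration of a question that reduces to an aggregation, the sketch below answers "how many sources per target?" and "which targets have only one predecessor?" with a distinct count per group. It is plain Python over hypothetical (source, target) rows, shown only to make the one-hop nature of the computation visible; in a model this would be the aggregate expression mentioned above.

```python
from collections import defaultdict

# Hypothetical (source, target) rows; note the duplicated ("s1", "t2") row.
rows = [("s1", "t1"), ("s2", "t1"), ("s1", "t2"), ("s1", "t2"), ("s3", "t3")]

sources_per_target = defaultdict(set)
for source, target in rows:
    sources_per_target[target].add(source)  # a set gives distinct counting

distinct_source_count = {t: len(s) for t, s in sources_per_target.items()}
single_predecessor_targets = sorted(
    t for t, n in distinct_source_count.items() if n == 1
)
```

No traversal beyond one hop is involved, which is why no graph needs to be constructed for this class of question.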
| Question Type | Algorithm Family | Default Choice |
|--------------|-----------------|----------------|
| "Who/what is most important?" (directed graph) | Centrality | pagerank() (inbound flow) |
| "Who/what is most important?" (undirected graph) | Centrality | eigenvector_centrality() (mutual importance) |
| "Which nodes are bottlenecks?" | Centrality | betweenness_centrality() (bridge nodes) |
| "What natural groups exist?" | Community | louvain() (undirected) or infomap() (directed) |
| "Is the network fragmented?" | Components | weakly_connected_component() |
| "Which edges are single points of failure (bridges)?" | Components | No named primitive — WCC with per-edge ablation. See algorithm-selection.md |
| "What are all transitive dependencies?" | Reachability | reachable() |
| "What depends on X? / What does X affect?" | Reachability | reachable() |
| "How far apart are X and Y?" | Distance | distance() — supports weighted (non-negative) |
| "Which entities are most similar?" | Similarity | jaccard_similarity() |
| "How tightly connected locally?" | Clustering | local_clustering_coefficient() — requires undirected |
Within the Centrality family, pick the variant by the structural test:
eigenvector_centrality() requires directed=False. Use for structural-importance / centrality questions (symmetric, co-occurrence, similarity graphs). See eigenvector vs. PageRank.
betweenness_centrality() requires weighted=False.
One algorithm per question. Pick the single algorithm that matches the structural test above. Don't combine multiple centrality measures into a "composite criticality score": the resulting ranking has no clean interpretation, depends on arbitrary normalization choices, and typically reorders results vs. any one of its inputs. If the question genuinely requires multiple lenses, run them as separate columns in the output, but rank by one and explain why.
Pre-flight compatibility check. Before Step 8, confirm the chosen algorithm fits the graph's directed/weighted settings:
| Algorithm | Cannot use with |
|-----------|----------------|
| betweenness_centrality() | weighted=True |
| eigenvector_centrality() | directed=True (use pagerank() for directed) |
| louvain() | directed=True (use infomap() for directed) |
| local_clustering_coefficient() | directed=True (requires undirected) |
| diameter_range() | directed=True or weighted=True (requires undirected, unweighted) |
pagerank() / reachable() work on undirected but are most meaningful on directed.
louvain(), infomap(), label_propagation() are non-deterministic; report ranges or structural assertions, not exact values.
For per-algorithm deep dives (parameters, output shapes, interpretation, compatibility matrix), see algorithm-selection.md and confirm the row for any algorithm you call.
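The pre-flight check can be encoded as a small lookup so that a mismatch is caught before the Graph(...) constructor is written. The sketch below is plain Python, not part of the RAI API: the `INCOMPATIBLE` mapping is transcribed from the Pre-flight table above (and is therefore also non-exhaustive), and the `preflight` helper is hypothetical.

```python
# Flag values each algorithm cannot be used with, per the Pre-flight table.
INCOMPATIBLE = {
    "betweenness_centrality": {"weighted": True},
    "eigenvector_centrality": {"directed": True},
    "louvain": {"directed": True},
    "local_clustering_coefficient": {"directed": True},
    "diameter_range": {"directed": True, "weighted": True},
}

def preflight(algorithm, directed, weighted):
    """Return the graph flags whose current value conflicts with `algorithm`."""
    settings = {"directed": directed, "weighted": weighted}
    blocked = INCOMPATIBLE.get(algorithm, {})
    return [flag for flag, bad in blocked.items() if settings[flag] == bad]

conflicts = preflight("diameter_range", directed=True, weighted=False)
```

An empty list means no conflict was found in this table; it does not replace confirming the algorithm's row in algorithm-selection.md.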
Canonical assign → bind → query skeleton:
graph = Graph(model, directed=False, weighted=True, node_concept=MyConcept, aggregator="sum") # match flags to the algorithm (Step 7 pre-flight): eigenvector needs directed=False
model.define(graph.Edge.new(src=..., dst=..., weight=...)) # define edges
graph.Node.score = graph.eigenvector_centrality() # execute — substitute any algorithm
model.where(graph.Node == MyConcept).define(MyConcept.score(graph.Node.score)) # bind back
model.select(MyConcept.id, MyConcept.score).to_df() # query
For the directed / weighted / aggregator decisions, see Parameter Guidance. Expensive algorithms (reachable, jaccard_similarity, distance, triangle, …) require explicit domain constraints (of=/from_=/to=/between=/full=True) to run — see Domain Constraints. For the full assign → bind → query walkthrough and per-algorithm extraction, see Result Extraction and Binding and result-extraction.md.
The question determines which construction to use — the same ontology can yield multiple valid graphs.
| Pattern | Nodes | Edges | Typical Use |
|---------|-------|-------|-------------|
| Entity-level directed | Business entities | Direct relationships (ships_to) | PageRank, reachability |
| Infrastructure undirected weighted | Locations/sites | Intermediary concepts (Operation) | Centrality, WCC, bridges |
| Co-occurrence / shared-attribute | Entities | Shared membership/purchases | Community detection |
| Self-referencing | Single concept | Instance-to-instance refs (parent-child, depends_on, etc.) | Reachability, tree traversal, BOM, dependency graphs |
| edge_concept | Any | Existing interaction concept | When edges are already modeled |
Manual edges with Edge.new() — most common:
# Entity-level: direct relationship between concepts
graph = Graph(model, directed=True, weighted=False, node_concept=Business)
b1, b2 = Business.ref(), Business.ref()
model.where(b1.ships_to(b2)).define(graph.Edge.new(src=b1, dst=b2))
# Infrastructure-level: edges from intermediary concept
graph = Graph(model, directed=False, weighted=True, node_concept=Site, aggregator="sum")
op = Operation.ref()
model.where(
    op.source_site(site1 := Site.ref()),
    op.output_site(site2 := Site.ref()),
).define(graph.Edge.new(src=site1, dst=site2, weight=op.shipment_count))
# Co-occurrence: shared attribute (id < guard prevents duplicates)
graph = Graph(model, directed=False, weighted=True, node_concept=Customer, aggregator="sum")
left_order, right_order = Order.ref(), Order.ref()
model.where(
    left_order.product == right_order.product,
    left_order.customer.id < right_order.customer.id,
).define(graph.Edge.new(src=left_order.customer, dst=right_order.customer, weight=1.0))
Concept-based edges with edge_concept — when each interaction is already a concept, pass it via edge_concept= + the three required relationships (edge_src_relationship, edge_dst_relationship, and edge_weight_relationship when weighted). See Quick Reference Pattern 3 for the full constructor signature.
Filtered edges: restrict via .where(t.amount >= 100.0).define(graph.Edge.new(src=t.payer, dst=t.payee)).
For detailed patterns (multi-intermediary, hierarchy, self-referencing, multi-graph, weight construction, validation), see graph-construction.md. Graph algorithms and prescriptive optimization coexist on the same Model — use domain concepts directly as node_concept (no mirror concepts or separate models needed).
Defaults: directed=False, weighted=False. Only deviate when the question's structure justifies it. Flipping either flag changes algorithm outputs significantly — Louvain communities partition differently, PageRank scores invert, eigenvector is undefined on directed graphs. Add direction or weight only when the question explicitly demands it (see "When to use" below).
| Parameter | Values | When to use |
|-----------|--------|-------------|
| directed | True | Promote to directed only when the question is about flow, dependency, or causality — i.e., the edge's direction has semantic meaning the answer depends on. |
| | False | Default. Symmetric relationships, co-occurrence, infrastructure connectivity, similarity. |
| weighted | True | Promote to weighted only when an edge-property quantity is explicitly named in the question. Don't add weights speculatively because a numeric column on the edge "looks relevant" — for community detection in particular, weighting changes the partition. Weights must be floats — cast with floats.float(). |
| | False | Default. Only connection existence matters. |
| aggregator | "sum" | Only supported alternative to the default None. Collapses multi-edges by summing weights. Only use when multi-edges are expected — see Aggregator guidance. |
| node_concept | Concept | Which concept forms nodes. Required with edge_concept. Optional otherwise (inferred from edges). |
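Why the directed flag should follow the question's semantics can be illustrated with reachability over the same edge list under both settings. This is a plain-Python breadth-first sketch, not the RAI reachable() primitive. The `reachable_from` helper and the sample edges are hypothetical, and excluding the start node from the result is an arbitrary choice of this sketch.

```python
from collections import deque

def reachable_from(start, edges, directed):
    """Return the nodes reachable from `start`, excluding `start` itself."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        if not directed:
            adjacency.setdefault(dst, set()).add(src)  # traverse both ways
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical dependency edges: src feeds into dst.
edges = [("supplier", "raw"), ("raw", "part"), ("part", "product")]
downstream = reachable_from("part", edges, directed=True)
everything = reachable_from("part", edges, directed=False)
```

With direction, the result answers "what does part affect?". Without it, every node in the connected graph comes back, which is the trivial outcome described for undirected reachability.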
Domain constraints control which subset of an output relationship gets materialized. Some algorithms (e.g., preferential_attachment, common_neighbor, jaccard_similarity, reachable, triangle, distance) are expensive to materialize in full and require explicit domain constraints via of=, from_=, to=, between=, or full=True to proceed.
| Keyword | Meaning |
|---------|---------|
| of=R | Constrain to nodes in relationship R (binary relationships: degree, neighbor, etc.) |
| from_=R | Source / first-argument nodes (paths, reachability) |
| to=R | Destination / second-argument nodes |
| from_=R1, to=R2 | Separately constrain both axes |
| between=R | Jointly constrain to specific node pairs (binary Relationship of pairs) |
| full=True | Override guard and compute full relationship |
For detailed guidance — which relationships support which keywords, the full=True guard rationale, and a worked constrained-similarity example — see domain-constraints.md.
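The cost difference between a full and a constrained similarity relationship can be seen in a small plain-Python sketch. It is not the RAI API: the `similarity` helper is hypothetical, its `from_` parameter merely mirrors the keyword name used above, and the neighbor sets are made-up data.

```python
def jaccard(neighbors, a, b):
    """Jaccard similarity of two nodes' neighbor sets."""
    union = neighbors[a] | neighbors[b]
    return len(neighbors[a] & neighbors[b]) / len(union) if union else 0.0

def similarity(neighbors, from_=None):
    """Score ordered node pairs; `from_` restricts the first element."""
    sources = neighbors if from_ is None else from_
    return {
        (a, b): jaccard(neighbors, a, b)
        for a in sources
        for b in neighbors
        if a != b
    }

# Hypothetical neighbor sets (e.g., products each user bought).
neighbors = {
    "u1": {"p1", "p2"},
    "u2": {"p2", "p3"},
    "u3": {"p1", "p2"},
    "u4": {"p4"},
}

full = similarity(neighbors)                       # n * (n - 1) ordered pairs
constrained = similarity(neighbors, from_=["u1"])  # only pairs starting at u1
```

With four nodes the full relationship already holds 12 pairs against 3 for the constrained one; the gap grows quadratically with the node count, which is the motivation for requiring an explicit constraint or full=True.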
# 1. Run algorithm — result lives on graph.Node
graph.Node.centrality_score = graph.eigenvector_centrality()
# 2. Bind to original concept (makes it available for rules, optimization, queries)
Site.centrality_score = model.Property(f"{Site} has {Float:centrality_score}")
model.where(graph.Node == Site).define(Site.centrality_score(graph.Node.centrality_score))
# 3. Query as DataFrame
df = model.select(Site.id, Site.centrality_score).to_df().sort_values("centrality_score", ascending=False)
Shorthand with node_concept
When node_concept is set (e.g., node_concept=User), graph.Node IS the concept, so assigning to graph.Node creates the property directly on the concept without a separate binding step:
# graph.Node IS User when node_concept=User — property is automatically available on User
graph.Node.community = graph.weakly_connected_component()
# Query directly via User — no explicit binding needed
df = model.select(User.name, User.community).to_df()
graph.Node.community_label = graph.louvain()
Segment = model.Concept("Segment", identify_by={"id": Integer})
model.define(Segment.new(id=graph.Node.community_label))
Customer.segment = model.Property(f"{Customer} belongs to {Segment:segment}")
model.where(graph.Node == Customer).define(
    Customer.segment(Segment.filter_by(id=graph.Node.community_label))
)
Critical: Community detection IDs (Louvain, Infomap) always return as Int128Array — cast with .astype(int) before pandas operations and before model.data(). WCC is different: its output is a Node, so the shorthand produces string hashes (.astype(str)) while the direct-query pattern using graph.Node.ref() exposes the node identifier type directly. See result-extraction.md for full casting guidance including both WCC access modes.
graph.num_nodes().inspect() # 0 = edge definitions don't match data
graph.num_edges().inspect() # 0 = relationship/join path is wrong
For per-algorithm query patterns (binary, ternary, reachability, similarity filtering, aggregation), see result-extraction.md.
Graph outputs become model properties that other reasoning consumes:
Graph metric feeds rule (e.g., centrality < 0.1 → at-risk flag)
Multiple Graph() instances on the same model for complementary perspectives (e.g., directed flow graph + undirected connectivity graph)
# Graph metric feeds optimization
problem.maximize(aggregates.sum(x * Site.centrality_score))
# Graph metric feeds rule
Site.is_at_risk = model.Relationship(f"{Site} is at risk")
model.where(Site.centrality_score < 0.1).define(Site.is_at_risk())
| Mistake | Cause | Fix |
|---------|-------|-----|
| louvain() fails on directed graph | Louvain requires undirected | Set directed=False or use infomap() for directed |
| Empty graph (no edges) | Edge definition doesn't match data — wrong relationship or join path | Verify edge source/destination properties exist and have data; query edge count before running algorithms |
| 0 edges from a multi-argument relationship | Endpoints bound from a relationship packing several positional fields, with the slots in the wrong order — the binding matches no rows | Confirm the field order with inspect.fields(rel) before binding, and check graph.num_edges().inspect() immediately after defining edges. Binding endpoints from raw source columns (node.id == table.col) is a reliable fallback when the field order is ambiguous |
| Int128Array errors in pandas or model.data() | Community detection IDs (Louvain, Infomap) are always Int128 — incompatible with pandas ops and model.data() type inference | Cast: df["col"] = df["col"].astype(int) before pandas operations and before passing to model.data(). WCC is different — its output is a Node, not an integer: via shorthand the column arrives as string hashes (.astype(str)), via direct-query comp_ref.id it inherits the node's identifier type — Int128 when identify_by={"id": Integer} (cast with .astype(int)), a plain string column otherwise. See result-extraction.md |
| Duplicate/self-loop edges | Missing guard in co-occurrence pattern | Add left.id < right.id to .where() clause |
| aggregator missing | Weighted graph with multi-edges requires aggregator for parallel edges | Add aggregator="sum" — but only when multi-edges are expected (see Aggregator guidance) |
| Parallel edges mask bridges | Two edges between the same node pair mean removing one doesn't disconnect the pair — neither is a true bridge | Before reporting bridges, check whether the underlying data has parallel edges between the same endpoints. Do not collapse with aggregator="sum" before bridge detection — that merges parallel edges into one, making the single result appear to be a bridge when physically it isn't. For the recipe, see Bridge edges |
| Reaching for an external Python graph library when the algorithm isn't in the cheat sheet | Common reflex when the named primitive doesn't exist (e.g., bridge edges, edge betweenness) — but switching to NetworkX/igraph loses the concept binding that lets results feed downstream rules and optimization on the same model, and forces a pandas round-trip back into the model | First check whether the question can be expressed compositionally with the existing primitives — for example, bridge edges fall out of WCC + per-edge ablation, see Bridge edges. Stay in the RAI Graph reasoner; if scale is the concern, scope the candidate set first |
| Weight type error | Weights must be floats, but property is Integer/Number | Cast with floats.float(property) in Edge.new weight parameter |
| ValueError: edge_weight_relationship must have type Float, is Number(38,0) | Weight property is Integer (count, duration in seconds, etc.), but graph weights must be Float | Cast inline with weight=floats.float(Entity.int_prop) in Edge.new(), or define a Float-typed derived property on the node concept |
| Centrality all equal / graph too dense | Two causes: (1) co-occurrence edges built on shared attribute values rather than shared specific entities produce near-complete graphs; (2) even with correct edges, all nodes have identical connectivity | (1) Build edges from shared specific entities (same FK value), not shared attribute categories; (2) add weighted edges to differentiate. If centrality is still uniform after both fixes, the underlying data may lack structural bottlenecks |
| Similarity produces too many results | O(n^2) output for n nodes | Use domain constraints to limit computation: e.g. graph.jaccard_similarity(from_=seed_nodes). If you need the full relationship, filter by minimum threshold or limit to top-k per node after full=True. |
| Reachability on undirected connected graph gives trivial results | On undirected connected graphs, all nodes are reachable from all others — results aren't useful | Set directed=True for meaningful reachability/impact analysis. (On disconnected undirected graphs, reachable can still be useful for discovering components for specific nodes.) |
| Wrong node concept | Using intermediary concept as nodes instead of entity concept | Intermediary concepts form edges, not nodes — e.g., Operation is an edge between Site nodes |
| Graph results not visible on original concept | Results bound to graph.Node but not to the source concept | Add explicit binding: model.where(graph.Node == MyConcept).define(...) |
| TypeError with large local models | In local execution mode, models with many (200+) Relationships in scope alongside a Graph can exceed type inference limits, producing a TypeError | First try: isolate the Graph in a separate script/model that imports only the concepts the Graph needs, reducing the type inference scope. Fallback: keep large datasets as pandas DataFrames for Python-side analysis. This limitation is specific to local execution; Snowflake-backed models handle larger schemas. |
| Empty graph when extending existing model | Script creates Model("name") without importing base model definitions — concepts exist but have no instances | Import the base model module (e.g., from my_model import model, Site) so base define() rules are in scope |
| ValidationError: Unused variable when using rank() with graph properties | Using rank(desc(graph.Node.betweenness)) alongside other graph properties in select() triggers the unused variable validator | Sort in pandas instead: .to_df().sort_values("betweenness", ascending=False).reset_index(drop=True) — avoid rank() in graph queries |
| DataFrame column name doesn't match the property name you assigned | Column names follow the algorithm's default output binding (often centrality, score, community) or the variable name used in .select() — they don't inherit the attribute name from graph.Node.my_name = graph.algorithm() | Always set names explicitly with .alias("my_name") in select(...). Don't rely on assignment-name propagation: model.select(Site.id, score.alias("centrality")).to_df() |
| RAIException: Ungrounded variables when mixing chained derived properties + Graph + boolean rules | Defining chained derived properties (e.g., peak_forecast → future_headroom) alongside Graph construction and boolean Relationship rules in the same model causes ungrounded variable errors | Query raw data via simple selects and compute derived values / rules in pandas. Root cause is related to the type inference limit noted above — chained derivations compound the issue |
Each example targets a distinct combination of edge construction, topology, algorithm, and result pattern.
| Primary Pattern | Construction | Topology | Algorithms | Result Pattern | File |
|----------------|-------------|----------|------------|----------------|------|
| Infrastructure Edge.new() | Intermediary concept (Operation) creates edges | Undirected, weighted | Eigenvector centrality | Simple property binding | centrality_weighted_undirected.py |
| Co-occurrence Edge.new() | Shared-attribute edges with id < guard | Undirected, unweighted | WCC + betweenness | Hybrid risk: graph metric + domain attribute | co_occurrence_wcc_bottleneck.py |
| edge_concept + computed weight | Interaction concept as edges, multi-factor weight | Directed, weighted | PageRank + degree centrality | Multi-algorithm classification | edge_concept_multi_algorithm.py |
| Directed reachability | edge_concept for dependency chain | Directed, unweighted | Reachability (4 modes) + betweenness | Graph + ontology enrichment | reachability_four_modes.py |
| Louvain → derived concept | Co-occurrence edges, community labels become entities | Undirected, weighted | Louvain + degree centrality | Community → concept + hub-per-community | community_to_derived_concept.py |
| Graph + rules combo | Multi-concept co-occurrence (shared address/phone/email) | Undirected, unweighted | WCC | Layered Relationship flags on graph results | chained_graph_rules.py |
| Identity graph self-join | Self-join edges from shared identifiers (phone, email) | Undirected, unweighted | WCC | Identity cluster detection | self_join_wcc_subtypes.py |
| Multiple graphs, same model | Multiple Graph instances on same node concept | Weighted + unweighted | Eigenvector + betweenness | Parallel graph views, separate Edge defs | multi_graph_same_model.py |
| Jaccard similarity | Co-occurrence edges via shared attribute | Undirected, unweighted | Jaccard similarity | Top-k similar pairs extraction | similarity_jaccard.py |
| Shortest path distances | edge_concept with cost weight | Directed, weighted | Distance | All-pairs shortest paths, filter by source/target | shortest_path_distance.py |
| Reference | Description | File |
|-----------|-------------|------|
| Graph construction | Detailed construction patterns from ontology — entity-level, infrastructure, hierarchy, bridge, self-referencing, filtered, multi-graph | graph-construction.md |
| Algorithm selection | Per-algorithm deep dive — when to use, parameters, output shape, complexity, decision guidance | algorithm-selection.md |
| Domain constraints | Why they matter, which relationships support which keywords, full=True guard rationale, worked constrained-similarity example | domain-constraints.md |
| Result extraction | Query patterns for each algorithm output shape, model binding, DataFrame extraction, type handling | result-extraction.md |