Generate structured concept maps from academic texts automatically
This skill automatically generates structured concept maps from academic texts, lecture notes, and research papers. It covers concept extraction with NLP techniques, relationship identification, hierarchical organization, and export to visual formats. Unlike mind maps, concept maps explicitly label the relationships between concepts, which makes them better suited to representing scientific knowledge.
A concept map consists of three elements: concepts (nodes), linking phrases (labeled edges), and propositions (concept-link-concept triples that form meaningful statements).
Concept Map Elements:

Concept: A perceived regularity or pattern designated by a label.
- Examples: "DNA replication", "enzyme", "natural selection"
- Representation: boxes or ovals containing short noun phrases

Linking Phrase: Words that connect two concepts to form a proposition.
- Examples: "is catalyzed by", "requires", "leads to", "is a type of"
- Representation: labeled arrows between concept nodes

Proposition: A meaningful statement formed by two concepts and a link.
- Example: [DNA replication] --requires--> [DNA polymerase]
- This reads: "DNA replication requires DNA polymerase"

Cross-links: Connections between concepts in different domains or branches of the map, showing integrative understanding.
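
The elements above map naturally onto a small data structure. The sketch below is illustrative only: the `Proposition` class and its two methods are hypothetical names, not part of this skill's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """One concept-link-concept triple (hypothetical helper type)."""
    source: str  # concept the arrow starts from
    link: str    # linking phrase that labels the arrow
    target: str  # concept the arrow points to

    def as_sentence(self) -> str:
        # Reading the triple left to right yields the statement it encodes
        return f"{self.source} {self.link} {self.target}"

    def as_arrow(self) -> str:
        # Same notation as the proposition example in the text
        return f"[{self.source}] --{self.link}--> [{self.target}]"


prop = Proposition("DNA replication", "requires", "DNA polymerase")
print(prop.as_arrow())     # [DNA replication] --requires--> [DNA polymerase]
print(prop.as_sentence())  # DNA replication requires DNA polymerase
```

Keeping the linking phrase as its own field is what separates this from a mind-map edge, which would carry only the two endpoints.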

| Feature | Concept Map | Mind Map |
| --- | --- | --- |
| Structure | Network (graph) | Tree (hierarchical) |
| Relationships | Labeled explicitly | Implied by proximity |
| Root node | May have multiple | Single central topic |
| Cross-links | Encouraged | Rare |
| Best for | Deep understanding | Brainstorming |
| Scientific use | Knowledge modeling | Idea generation |
| Reading direction | Follow arrow labels | Center outward |

```python
from collections import Counter

import spacy


def extract_concepts(text, nlp_model="en_core_web_sm"):
    """
    Extract candidate concepts from academic text using NLP.

    Strategy:
    1. Extract noun phrases as concept candidates
    2. Strip determiners and keep phrases of at most four words
    3. Keep candidates that occur at least twice
    4. Return them ranked by frequency (most frequent first)
    """
    nlp = spacy.load(nlp_model)
    doc = nlp(text)

    # Extract noun phrases
    candidates = []
    for chunk in doc.noun_chunks:
        # Drop determiners ("the", "a", "this") for cleaner concepts
        words = [tok.text.lower() for tok in chunk if tok.pos_ != "DET"]
        if 1 <= len(words) <= 4:  # Keep manageable length
            candidates.append(" ".join(words))

    # Count frequencies
    freq = Counter(candidates)

    # Filter: keep concepts mentioned at least twice
    concepts = [c for c, count in freq.most_common() if count >= 2]
    return concepts
```
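
Raw frequency tends to favor generic phrases that occur everywhere. One possible refinement is to rank candidates by TF-IDF across the sections of a document. The sketch below uses only the standard library; the name `rank_by_tfidf`, the per-section input shape, and the smoothed IDF formula are assumptions for illustration, not part of the skill.

```python
import math
from collections import Counter


def rank_by_tfidf(section_concepts):
    """
    Rank concepts by their best TF-IDF score across sections.

    section_concepts: list of lists, one list of extracted concept
    strings per section (hypothetical input shape).
    """
    n_sections = len(section_concepts)

    # Document frequency: in how many sections does each concept occur?
    doc_freq = Counter()
    for concepts in section_concepts:
        doc_freq.update(set(concepts))

    scores = {}
    for concepts in section_concepts:
        counts = Counter(concepts)
        total = sum(counts.values())
        for concept, count in counts.items():
            tf = count / total
            # Smoothed IDF so a concept found in every section still scores > 0
            idf = math.log((1 + n_sections) / (1 + doc_freq[concept])) + 1
            scores[concept] = max(scores.get(concept, 0.0), tf * idf)

    return sorted(scores, key=scores.get, reverse=True)


sections = [
    ["model", "random forest", "random forest", "model"],
    ["model", "k-means clustering", "k-means clustering", "k-means clustering"],
    ["model", "model", "model", "results"],
]
print(rank_by_tfidf(sections))
```

In this toy input, "model" has the highest raw count but appears in every section, so the section-specific method names rank above it.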
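```python
import re

import spacy


def extract_relationships(text, concepts, nlp_model="en_core_web_sm"):
    """
    Extract relationships between concepts from sentence co-occurrence.

    Identifies verbs (by POS tag) connecting known concepts in the same
    sentence. The concept that appears first becomes the source.
    """
    nlp = spacy.load(nlp_model)
    doc = nlp(text)
    triples = []
    for sent in doc.sents:
        sent_text = sent.text.lower()
        # Find which concepts appear in this sentence as whole words
        positions = {}
        for concept in dict.fromkeys(concepts):
            match = re.search(rf"(?<!\w){re.escape(concept)}(?!\w)", sent_text)
            if match:
                positions[concept] = match.start()
        found = sorted(positions, key=positions.get)  # reading order
        # Verb lemmas with their character offsets inside the sentence
        verbs = [(tok.idx - sent.start_char, tok.lemma_)
                 for tok in sent if tok.pos_ == "VERB"]
        if len(found) < 2 or not verbs:
            continue
        for i in range(len(found)):
            for j in range(i + 1, len(found)):
                # Prefer a verb located between the two concepts;
                # fall back to the first verb in the sentence
                between = [lemma for offset, lemma in verbs
                           if positions[found[i]] < offset < positions[found[j]]]
                triples.append({
                    "source": found[i],
                    "target": found[j],
                    "relation": between[0] if between else verbs[0][1],
                    "sentence": sent.text,
                })
    return triples
```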
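
The `relation` field above holds a raw verb lemma. Before drawing the map, such lemmas can be grouped into coarse relationship types (definitional, causal, mereological, and so on) with a lookup table. The table below is a small hypothetical sample rather than a validated lexicon, and `classify_relation` is an illustrative name.

```python
# Hypothetical verb-lemma -> relationship-type lookup; extend per domain
RELATION_TYPES = {
    "define": "definitional",
    "measure": "operationalization",
    "cause": "causal",
    "lead": "causal",
    "correlate": "associative",
    "contain": "mereological",
    "comprise": "mereological",
    "contradict": "conflicting findings",
    "extend": "building on prior work",
}


def classify_relation(verb_lemma):
    """Map a verb lemma to a coarse relationship type."""
    return RELATION_TYPES.get(verb_lemma.lower(), "unclassified")


example_triples = [
    {"source": "anxiety", "target": "questionnaire score", "relation": "measure"},
    {"source": "dna replication", "target": "dna polymerase", "relation": "require"},
]
for triple in example_triples:
    triple["type"] = classify_relation(triple["relation"])
    print(triple["relation"], "->", triple["type"])
```

The types are deliberately direction-agnostic: a lemma alone does not reveal whether the sentence was active or passive, so choosing the final linking phrase ("measures" versus "is measured by") still needs the sentence itself.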
Academic concept maps benefit from hierarchical organization, placing the most general, inclusive concepts at the top and progressively more specific concepts below.
Hierarchy Construction Algorithm:

1. Identify superordinate concepts:
   - Concepts that appear in titles, abstracts, section headings
   - Concepts with the most outgoing relationships
   - Concepts that subsume other concepts (hypernyms)
2. Identify subordinate concepts:
   - Concepts that are instances or types of superordinates
   - Concepts with high specificity (long noun phrases)
   - Concepts that appear only in methods/results sections
3. Assign levels:
   - Level 0: Domain (e.g., "machine learning")
   - Level 1: Subdomains (e.g., "supervised learning", "unsupervised learning")
   - Level 2: Methods (e.g., "random forests", "k-means clustering")
   - Level 3: Details (e.g., "Gini impurity", "elbow method")
4. Add cross-links between branches:
   - e.g., "random forests" --uses--> "bootstrap sampling" (links supervised learning to statistical methods)
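
Step 3 can be sketched as a breadth-first traversal over taxonomic ("is a type of") links, starting from the domain concept. The function name `assign_levels` and the (child, parent) input shape are illustrative assumptions.

```python
from collections import deque


def assign_levels(is_a_edges, root):
    """
    Assign hierarchy levels by breadth-first search from the root.

    is_a_edges: iterable of (child, parent) pairs, read as
    "child is a type of parent" (hypothetical input shape).
    """
    children = {}
    for child, parent in is_a_edges:
        children.setdefault(parent, []).append(child)

    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in levels:  # keep the shallowest level found
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


edges = [
    ("supervised learning", "machine learning"),
    ("unsupervised learning", "machine learning"),
    ("random forests", "supervised learning"),
    ("k-means clustering", "unsupervised learning"),
]
levels = assign_levels(edges, "machine learning")
print(levels["random forests"])  # 2
```

Concepts that are unreachable from the root get no level, which is a useful signal that they either belong to another branch or need a cross-link.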
Input: Research paper

Output: Concept map organized by paper structure

Section-Based Extraction:
- Introduction -> Key concepts, research questions, theoretical framework
- Methods -> Methodological concepts, tools, techniques, variables
- Results -> Findings, measurements, statistical outcomes
- Discussion -> Interpretations, implications, limitations
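
Section-based extraction first needs the paper split into its sections. The following is a minimal sketch, assuming plain text in which each section heading sits on its own line, optionally numbered. The heading list and the name `split_sections` are illustrative assumptions; real papers use many heading variants.

```python
import re

# Hypothetical heading names; real papers vary ("Methodology", "Findings", ...)
HEADINGS = ("Introduction", "Methods", "Results", "Discussion")


def split_sections(paper_text):
    """Return {heading: body} for headings that appear on their own line."""
    pattern = re.compile(
        r"^[ \t]*(?:\d+\.?[ \t]+)?(" + "|".join(HEADINGS) + r")[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )
    matches = list(pattern.finditer(paper_text))
    sections = {}
    for i, match in enumerate(matches):
        # A section body runs until the next heading or the end of the text
        end = matches[i + 1].start() if i + 1 < len(matches) else len(paper_text)
        sections[match.group(1).title()] = paper_text[match.end():end].strip()
    return sections


paper = """1. Introduction
We study enzyme kinetics.
2. Methods
Assays were run at 37 C.
3. Results
Reaction rate doubled.
"""
sections = split_sections(paper)
print(list(sections))  # ['Introduction', 'Methods', 'Results']
```

Each section body can then be passed to `extract_concepts` separately, so that concepts carry the section they came from.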

Connection Types in Academic Maps:
- "is defined as": definitional relationships
- "is measured by": operationalization
- "causes / leads to": causal relationships
- "is correlated with": associative relationships
- "is a type of": taxonomic relationships
- "is part of": mereological relationships
- "contradicts": conflicting findings
- "extends": building on prior work
```python
import networkx as nx


def export_to_graphml(concepts, relationships, output_path):
    """
    Export concept map to GraphML format for Gephi, yEd, or Cytoscape.
    """
    G = nx.DiGraph()
    for concept in concepts:
        G.add_node(concept, label=concept)
    for rel in relationships:
        G.add_edge(
            rel["source"],
            rel["target"],
            label=rel["relation"]
        )
    nx.write_graphml(G, output_path)
    return output_path
```
```python
from xml.sax.saxutils import quoteattr


def export_to_cmap(concepts, relationships, output_path):
    """
    Export to a simplified CXL-style XML file for CmapTools (IHMC).
    CmapTools is widely used concept mapping software in education.
    """
    # CXL is an XML format specific to CmapTools. This writer emits a
    # simplified structure; check it against the CXL schema before import.
    ids = {concept: f"c{i}" for i, concept in enumerate(concepts)}
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<cmap xmlns="http://cmap.ihmc.us/xml/cmap/">',
        '  <map>',
    ]
    for concept, cid in ids.items():
        # quoteattr escapes &, <, > and quotes, and adds the outer quotes
        lines.append(f'    <concept id="{cid}" label={quoteattr(concept)}/>')
    for j, rel in enumerate(relationships):
        # Skip relationships whose endpoints are not known concepts
        if rel["source"] not in ids or rel["target"] not in ids:
            continue
        lines.append(
            f'    <connection id="conn{j}" '
            f'from-id="{ids[rel["source"]]}" to-id="{ids[rel["target"]]}" '
            f'label={quoteattr(rel["relation"])}/>'
        )
    lines += ['  </map>', '</cmap>']
    with open(output_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return output_path
```
CmapTools (IHMC):
- Free desktop application specifically designed for concept maps
- Collaborative editing, cloud hosting
- Export: CXL, image, PDF, web page
- Best for: Educational concept maps, collaborative projects
yEd Graph Editor:
- Free desktop application with auto-layout algorithms
- Import: GraphML, Excel, CSV
- Hierarchical, organic, circular layouts
- Best for: Large concept maps needing automatic layout
Mermaid.js (text-based):
- Embed concept maps in Markdown documents
- Version-controllable (plain text)
- Best for: Documentation, README files, lab notebooks
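
For the Mermaid.js option, a concept map can be written out as a Mermaid flowchart, whose `A -->|label| B` edge syntax carries the linking phrase. This is a sketch: `export_to_mermaid` is a hypothetical helper that takes the same `concepts` and `relationships` arguments as the exporters in this document, and its label cleaning is deliberately crude.

```python
def export_to_mermaid(concepts, relationships):
    """Return a Mermaid flowchart definition as a string."""
    def clean(text):
        # Double quotes and pipes would break Mermaid's label syntax
        return text.replace('"', "'").replace("|", "/")

    # Short stable ids keep node references free of spaces
    ids = {concept: f"c{i}" for i, concept in enumerate(concepts)}
    lines = ["graph TD"]
    for concept, cid in ids.items():
        lines.append(f'    {cid}["{clean(concept)}"]')
    for rel in relationships:
        src, tgt = ids.get(rel["source"]), ids.get(rel["target"])
        if src is None or tgt is None:
            continue  # skip edges that reference unknown concepts
        label = clean(rel["relation"])
        lines.append(f"    {src} -->|{label}| {tgt}")
    return "\n".join(lines)


diagram = export_to_mermaid(
    ["DNA replication", "DNA polymerase"],
    [{"source": "DNA replication", "target": "DNA polymerase",
      "relation": "requires"}],
)
print(diagram)
```

The returned text can be pasted into a `mermaid` code block in a Markdown file, which keeps the map diffable under version control.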
Evaluation Rubric:
Comprehensiveness: Does the map capture the key concepts?
- All major concepts from the source text should appear
- No important relationships should be missing
Accuracy: Are the propositions correct?
- Each concept-link-concept triple should be factually accurate
- Linking phrases should precisely describe the relationship
Hierarchy: Is the map well-organized?
- Most general concepts at top, specific at bottom
- Logical grouping of related concepts
Cross-links: Does the map show integrative understanding?
- Links between different branches demonstrate deep understanding
- Cross-links are the most valuable part of a concept map
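
Two of the rubric criteria lend themselves to simple automated checks: comprehensiveness as the share of reference concepts the map covers, and cross-links as the number of edges that join different branches. The sketch below assumes a hand-made reference concept list and a concept-to-branch assignment; both inputs and both function names are hypothetical. Accuracy and hierarchy still need human judgment.

```python
def comprehensiveness(map_concepts, reference_concepts):
    """Fraction of reference concepts that appear in the map (0.0 to 1.0)."""
    reference = {c.lower() for c in reference_concepts}
    if not reference:
        return 0.0
    covered = reference & {c.lower() for c in map_concepts}
    return len(covered) / len(reference)


def count_cross_links(relationships, branch_of):
    """Count edges whose endpoints belong to different branches."""
    return sum(
        1 for rel in relationships
        if rel["source"] in branch_of and rel["target"] in branch_of
        and branch_of[rel["source"]] != branch_of[rel["target"]]
    )


branch_of = {
    "random forests": "supervised learning",
    "Gini impurity": "supervised learning",
    "bootstrap sampling": "statistical methods",
}
rels = [
    {"source": "random forests", "target": "bootstrap sampling", "relation": "uses"},
    {"source": "random forests", "target": "Gini impurity", "relation": "uses"},
]
coverage = comprehensiveness(
    ["Random forests", "k-means clustering"],
    ["random forests", "k-means clustering", "bootstrap sampling", "Gini impurity"],
)
print(count_cross_links(rels, branch_of))  # 1
print(coverage)                            # 0.5
```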
Concept maps serve as both learning tools and knowledge artifacts. In research, they help teams align on shared understanding of complex domains, identify knowledge gaps, and communicate theoretical frameworks to collaborators and reviewers.