skills/huggingface-paper-publisher/SKILL.md
Publish and manage research papers on Hugging Face Hub. Supports creating paper pages, linking papers to models/datasets, claiming authorship, and generating professional markdown-based research articles.
npx skillsauth add huggingface/skills huggingface-paper-publisher
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill provides comprehensive tools for AI engineers and researchers to publish, manage, and link research papers on the Hugging Face Hub. It streamlines the workflow from paper creation to publication, including integration with arXiv, model/dataset linking, and authorship management.
Version: 1.0.0
The included script uses PEP 723 inline dependencies. Prefer uv run over
manual environment setup.
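For readers unfamiliar with PEP 723, the sketch below shows what an inline metadata header looks like and how a tool can locate it. The dependency names in `EXAMPLE_SCRIPT` are placeholders, not the actual header of scripts/paper_manager.py, and `read_script_metadata` is a hypothetical helper. The regular expression is the one given in the PEP 723 reference implementation.

```python
import re
from typing import Optional

# Hypothetical header for illustration; the real dependency list is
# whatever scripts/paper_manager.py declares.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "huggingface_hub",
#     "requests",
# ]
# ///
print("paper manager placeholder")
'''

# Block pattern from the PEP 723 reference implementation.
PEP723_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` block, or None if there is none."""
    blocks = [m for m in PEP723_BLOCK.finditer(source) if m.group("type") == "script"]
    if not blocks:
        return None
    if len(blocks) > 1:
        raise ValueError("multiple `script` blocks found")
    # Strip the leading "# " (or a bare "#") from every line of the block.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in blocks[0].group("content").splitlines(keepends=True)
    )


toml_text = read_script_metadata(EXAMPLE_SCRIPT)
print(toml_text)
```

`uv run` reads this kind of block and prepares a matching environment before executing the script, which is why no manual setup is needed. On Python 3.11 or later, `tomllib.loads(toml_text)` would turn the extracted text into a dictionary.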
The skill includes Python scripts in scripts/ for paper publishing operations.
- uv run (dependencies are resolved from the script header)
- HF_TOKEN environment variable with Write-access token
All paths are relative to the directory containing this SKILL.md file. Before running any script, first cd to that directory or use the full path.
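A minimal sketch of these two prerequisites when the script is driven from Python: fail early if HF_TOKEN is missing, and hand `uv run` an absolute script path so the current directory no longer matters. Both helper names are hypothetical and not part of the skill.

```python
import os
from pathlib import Path
from typing import List, Mapping


def require_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the HF_TOKEN value, or raise if it is missing or blank."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export a token with write access first")
    return token


def uv_command(skill_dir: Path, *args: str) -> List[str]:
    """Build a `uv run` argument list that uses the script's absolute path."""
    script = skill_dir.resolve() / "scripts" / "paper_manager.py"
    return ["uv", "run", str(script), *args]


# Example: the `check` subcommand, runnable from any working directory.
command = uv_command(Path("."), "check", "--arxiv-id", "2301.12345")
print(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` once `require_token()` has succeeded.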
Add a paper to Hugging Face Paper Pages from arXiv.
Basic Usage:
uv run scripts/paper_manager.py index \
--arxiv-id "2301.12345"
Check If Paper Exists:
uv run scripts/paper_manager.py check \
--arxiv-id "2301.12345"
Direct URL Access:
You can also visit https://huggingface.co/papers/{arxiv-id} directly to index a paper.
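The URL pattern above can be built programmatically. The sketch below validates a new-style arXiv identifier (YYMM.NNNN or YYMM.NNNNN, optionally followed by a version such as v2) and returns the paper page URL. `paper_page_url` is a hypothetical helper, and dropping the version suffix is an assumption about how paper pages are addressed.

```python
import re

# New-style arXiv identifiers: YYMM.NNNN or YYMM.NNNNN, optionally followed by vN.
ARXIV_ID = re.compile(r"^(?P<base>\d{4}\.\d{4,5})(?:v\d+)?$")


def paper_page_url(arxiv_id: str) -> str:
    """Return the Hugging Face paper page URL for a new-style arXiv ID."""
    match = ARXIV_ID.match(arxiv_id.strip())
    if match is None:
        raise ValueError(f"not a new-style arXiv ID: {arxiv_id!r}")
    # Assumption: the paper page is addressed by the ID without its version suffix.
    return f"https://huggingface.co/papers/{match.group('base')}"


print(paper_page_url("2301.12345"))
```

Old-style identifiers such as hep-th/9901001 are rejected by this pattern on purpose; handling them would need a second rule.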
Add paper references to model or dataset README with proper YAML metadata.
Add to Model Card:
uv run scripts/paper_manager.py link \
--repo-id "username/model-name" \
--repo-type "model" \
--arxiv-id "2301.12345"
Add to Dataset Card:
uv run scripts/paper_manager.py link \
--repo-id "username/dataset-name" \
--repo-type "dataset" \
--arxiv-id "2301.12345"
Add Multiple Papers:
uv run scripts/paper_manager.py link \
--repo-id "username/model-name" \
--repo-type "model" \
--arxiv-ids "2301.12345,2302.67890,2303.11111"
With Custom Citation:
uv run scripts/paper_manager.py link \
--repo-id "username/model-name" \
--repo-type "model" \
--arxiv-id "2301.12345" \
--citation "$(cat citation.txt)"
When you add an arXiv paper link to a model or dataset README:
- The tag arxiv:<PAPER_ID> is automatically added to the repository
Verify your authorship on papers published on Hugging Face.
Start Claim Process:
uv run scripts/paper_manager.py claim \
--arxiv-id "2301.12345" \
--email "your.email@edu"
Manual Process:
https://huggingface.co/papers/{arxiv-id}
Check Authorship Status:
uv run scripts/paper_manager.py check-authorship \
--arxiv-id "2301.12345"
Control which verified papers appear on your public profile.
List Your Papers:
uv run scripts/paper_manager.py list-my-papers
Toggle Visibility:
uv run scripts/paper_manager.py toggle-visibility \
--arxiv-id "2301.12345" \
--show true
Manage in Settings: Navigate to your account settings → Papers section to toggle "Show on profile" for each paper.
Generate a professional markdown-based research paper using modern templates.
Create from Template:
uv run scripts/paper_manager.py create \
--template "standard" \
--title "Your Paper Title" \
--output "paper.md"
Available Templates:
- standard - Traditional scientific paper structure
- modern - Clean, web-friendly format inspired by Distill
- arxiv - arXiv-style formatting
- ml-report - Machine learning experiment report
Generate Complete Paper:
uv run scripts/paper_manager.py create \
--template "modern" \
--title "Fine-Tuning Large Language Models with LoRA" \
--authors "Jane Doe, John Smith" \
--abstract "$(cat abstract.txt)" \
--output "paper.md"
Convert to HTML:
uv run scripts/paper_manager.py convert \
--input "paper.md" \
--output "paper.html" \
--style "modern"
Standard Research Paper Sections:
---
title: Your Paper Title
authors: Jane Doe, John Smith
affiliations: University X, Lab Y
date: 2025-01-15
arxiv: 2301.12345
tags: [machine-learning, nlp, fine-tuning]
---
# Abstract
Brief summary of the paper...
# 1. Introduction
Background and motivation...
# 2. Related Work
Previous research and context...
# 3. Methodology
Approach and implementation...
# 4. Experiments
Setup, datasets, and procedures...
# 5. Results
Findings and analysis...
# 6. Discussion
Interpretation and implications...
# 7. Conclusion
Summary and future work...
# References
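The front matter at the top of the template above can be produced from structured fields. The following is a minimal sketch, not the script's actual generator: `render_front_matter` is a hypothetical helper that writes values verbatim, using the same keys as the template.

```python
from typing import Sequence


def render_front_matter(
    title: str,
    authors: Sequence[str],
    affiliations: Sequence[str],
    date: str,
    arxiv: str,
    tags: Sequence[str],
) -> str:
    """Render the front matter block used by the template, one key per line."""
    lines = [
        "---",
        f"title: {title}",
        f"authors: {', '.join(authors)}",
        f"affiliations: {', '.join(affiliations)}",
        f"date: {date}",
        f"arxiv: {arxiv}",
        f"tags: [{', '.join(tags)}]",
        "---",
    ]
    return "\n".join(lines) + "\n"


front_matter = render_front_matter(
    title="Your Paper Title",
    authors=["Jane Doe", "John Smith"],
    affiliations=["University X", "Lab Y"],
    date="2025-01-15",
    arxiv="2301.12345",
    tags=["machine-learning", "nlp", "fine-tuning"],
)
print(front_matter)
```

Because values are written verbatim, a title containing a colon would need quoting. A YAML parser also reads an unquoted 2301.12345 as a number, so quote the arxiv value if a consumer needs the exact string.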
Modern Template Features:
Index Paper:
uv run scripts/paper_manager.py index --arxiv-id "2301.12345"
Link to Repository:
uv run scripts/paper_manager.py link \
--repo-id "username/repo-name" \
--repo-type "model|dataset|space" \
--arxiv-id "2301.12345" \
[--citation "Full citation text"] \
[--create-pr]
Claim Authorship:
uv run scripts/paper_manager.py claim \
--arxiv-id "2301.12345" \
--email "your.email@edu"
Manage Visibility:
uv run scripts/paper_manager.py toggle-visibility \
--arxiv-id "2301.12345" \
--show true|false
Create Research Article:
uv run scripts/paper_manager.py create \
--template "standard|modern|arxiv|ml-report" \
--title "Paper Title" \
[--authors "Author1, Author2"] \
[--abstract "Abstract text"] \
[--output "filename.md"]
Convert Markdown to HTML:
uv run scripts/paper_manager.py convert \
--input "paper.md" \
--output "paper.html" \
[--style "modern|classic"]
Check Paper Status:
uv run scripts/paper_manager.py check --arxiv-id "2301.12345"
List Your Papers:
uv run scripts/paper_manager.py list-my-papers
Search Papers:
uv run scripts/paper_manager.py search --query "transformer attention"
When linking papers to models or datasets, proper YAML frontmatter is required:
Model Card Example:
---
language:
- en
license: apache-2.0
tags:
- text-generation
- transformers
- llm
library_name: transformers
---
# Model Name
This model is based on the approach described in [Our Paper](https://arxiv.org/abs/2301.12345).
## Citation
```bibtex
@article{doe2023paper,
  title={Your Paper Title},
  author={Doe, Jane and Smith, John},
  journal={arXiv preprint arXiv:2301.12345},
  year={2023}
}
```
Dataset Card Example:
---
language:
- en
license: cc-by-4.0
task_categories:
- text-generation
- question-answering
size_categories:
- 10K<n<100K
---
# Dataset Name
Dataset introduced in [Our Paper](https://arxiv.org/abs/2301.12345).
For more details, see the [paper page](https://huggingface.co/papers/2301.12345).
The Hub automatically extracts arXiv IDs from these links and creates arxiv:2301.12345 tags.
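To preview which tags a README is likely to produce, the extraction described above can be approximated locally. This is a sketch under stated assumptions: the Hub's actual matching rules are not shown in this document, and both the pattern and the `arxiv_tags` helper are hypothetical.

```python
import re
from typing import List

# Assumed link shapes: arxiv.org/abs/<id>, arxiv.org/pdf/<id>, huggingface.co/papers/<id>.
PAPER_LINK = re.compile(
    r"(?:arxiv\.org/(?:abs|pdf)|huggingface\.co/papers)/(\d{4}\.\d{4,5})"
)


def arxiv_tags(readme_text: str) -> List[str]:
    """Return de-duplicated arxiv:<id> tags in order of first appearance."""
    tags: List[str] = []
    for paper_id in PAPER_LINK.findall(readme_text):
        tag = f"arxiv:{paper_id}"
        if tag not in tags:
            tags.append(tag)
    return tags


readme = (
    "Dataset introduced in [Our Paper](https://arxiv.org/abs/2301.12345).\n"
    "For more details, see the [paper page](https://huggingface.co/papers/2301.12345).\n"
)
print(arxiv_tags(readme))
```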
Workflow 1: Publish New Research
# 1. Create research article
uv run scripts/paper_manager.py create \
--template "modern" \
--title "Novel Fine-Tuning Approach" \
--output "paper.md"
# 2. Edit paper.md with your content
# 3. Submit to arXiv (external process)
# Upload to arxiv.org, get arXiv ID
# 4. Index on Hugging Face
uv run scripts/paper_manager.py index --arxiv-id "2301.12345"
# 5. Link to your model
uv run scripts/paper_manager.py link \
--repo-id "your-username/your-model" \
--repo-type "model" \
--arxiv-id "2301.12345"
# 6. Claim authorship
uv run scripts/paper_manager.py claim \
--arxiv-id "2301.12345" \
--email "your.email@edu"
Workflow 2: Link Existing Paper
# 1. Check if paper exists
uv run scripts/paper_manager.py check --arxiv-id "2301.12345"
# 2. Index if needed
uv run scripts/paper_manager.py index --arxiv-id "2301.12345"
# 3. Link to multiple repositories
uv run scripts/paper_manager.py link \
--repo-id "username/model-v1" \
--repo-type "model" \
--arxiv-id "2301.12345"
uv run scripts/paper_manager.py link \
--repo-id "username/training-data" \
--repo-type "dataset" \
--arxiv-id "2301.12345"
uv run scripts/paper_manager.py link \
--repo-id "username/demo-space" \
--repo-type "space" \
--arxiv-id "2301.12345"
Workflow 3: Update Model with Paper Reference
# 1. Get current README
hf download username/model-name README.md
# 2. Add paper link
uv run scripts/paper_manager.py link \
--repo-id "username/model-name" \
--repo-type "model" \
--arxiv-id "2301.12345" \
--citation "Full citation for the paper"
# The script will:
# - Add YAML metadata if missing
# - Insert arXiv link in README
# - Add formatted citation
# - Preserve existing content
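The four behaviours listed above can be sketched as a pure function over the README text. This is not the script's implementation: `add_paper_reference` is a hypothetical helper, the line format of the inserted link is an assumption, and the empty front matter block stands in for whatever fields the script actually writes.

```python
def add_paper_reference(readme: str, arxiv_id: str, citation: str = "") -> str:
    """Return README text with front matter, an arXiv link and a citation added."""
    url = f"https://arxiv.org/abs/{arxiv_id}"
    # Add an empty YAML front matter block if the README has none.
    if not readme.startswith("---\n"):
        readme = "---\n---\n\n" + readme
    # Insert the arXiv link once; existing content is kept as-is.
    if url not in readme:
        readme = readme.rstrip("\n") + f"\n\nPaper: [arXiv:{arxiv_id}]({url})\n"
    # Append the citation only when one is given and not already present.
    if citation and citation not in readme:
        readme = readme.rstrip("\n") + f"\n\n## Citation\n\n{citation}\n"
    return readme


updated = add_paper_reference("# My Model\n", "2301.12345", "Doe and Smith (2023).")
print(updated)
```

Every step checks before writing, so running the function twice on the same README returns the same text.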
Paper Indexing
Metadata Management
Authorship
Repository Linking
Research Articles
Batch Link Papers:
# Link multiple papers to one repository
for arxiv_id in "2301.12345" "2302.67890" "2303.11111"; do
uv run scripts/paper_manager.py link \
--repo-id "username/model-name" \
--repo-type "model" \
--arxiv-id "$arxiv_id"
done
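The same loop can be driven from Python when the list of papers comes from another program. The sketch below uses only the flags shown in the shell loop above. `link_commands` and `run_all` are hypothetical helpers, and the dry-run default is a design choice of this example.

```python
import shlex
import subprocess
from typing import Iterable, List


def link_commands(repo_id: str, repo_type: str, arxiv_ids: Iterable[str]) -> List[List[str]]:
    """Build one `link` invocation per arXiv ID."""
    return [
        ["uv", "run", "scripts/paper_manager.py", "link",
         "--repo-id", repo_id, "--repo-type", repo_type, "--arxiv-id", arxiv_id]
        for arxiv_id in arxiv_ids
    ]


def run_all(commands: Iterable[List[str]], dry_run: bool = True) -> List[str]:
    """Return the shell form of each command; execute them unless dry_run is set."""
    printed = []
    for command in commands:
        printed.append(shlex.join(command))
        if not dry_run:
            # check=True stops the batch at the first failing link.
            subprocess.run(command, check=True)
    return printed


commands = link_commands("username/model-name", "model", ["2301.12345", "2302.67890", "2303.11111"])
for line in run_all(commands):
    print(line)
```

Unlike the shell loop, `run_all(commands, dry_run=False)` aborts on the first non-zero exit code instead of continuing with the remaining papers.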
Extract Paper Info:
# Get paper metadata from arXiv
uv run scripts/paper_manager.py info \
--arxiv-id "2301.12345" \
--format "json"
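How the `info` subcommand obtains its data is not shown here. One plausible source is the public arXiv API, which returns an Atom feed for a query such as `http://export.arxiv.org/api/query?id_list=2301.12345`. The sketch below parses such a feed offline. `SAMPLE_FEED` is a trimmed, made-up response, and the output field names are this example's choice rather than the script's.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed in the general shape of an arXiv API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/2301.12345v1</id>
    <published>2023-01-29T00:00:00Z</published>
    <title>Your Paper
  Title</title>
    <summary>  Brief summary of the paper.  </summary>
    <author><name>Jane Doe</name></author>
    <author><name>John Smith</name></author>
  </entry>
</feed>
"""


def parse_entry(feed_xml: str) -> Dict[str, object]:
    """Extract basic metadata from the first entry of an Atom feed."""
    entry = ET.fromstring(feed_xml).find("atom:entry", ATOM)
    if entry is None:
        raise ValueError("feed contains no entry")

    def text(tag: str) -> str:
        # Collapse the line breaks and padding that feeds put inside text fields.
        raw = entry.findtext(f"atom:{tag}", default="", namespaces=ATOM)
        return " ".join(raw.split())

    return {
        "id": text("id").rsplit("/", 1)[-1],
        "title": text("title"),
        "authors": [
            author.findtext("atom:name", default="", namespaces=ATOM)
            for author in entry.findall("atom:author", ATOM)
        ],
        "published": text("published"),
        "abstract": text("summary"),
    }


info = parse_entry(SAMPLE_FEED)
print(json.dumps(info, indent=2))
```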
Generate Citation:
# Create BibTeX citation
uv run scripts/paper_manager.py citation \
--arxiv-id "2301.12345" \
--format "bibtex"
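For reference, a BibTeX entry in the shape used by the model card example can be assembled from a few fields. This is a sketch, not the output format of the `citation` subcommand: `bibtex_entry` is a hypothetical helper, and it derives the year from the first two digits of a new-style arXiv ID (2301 means January 2023).

```python
from typing import Sequence


def bibtex_entry(key: str, title: str, authors: Sequence[str], arxiv_id: str) -> str:
    """Format an @article entry for an arXiv preprint."""
    # New-style arXiv IDs start with YYMM, so the year is 2000 + YY.
    year = 2000 + int(arxiv_id[:2])
    return "\n".join([
        f"@article{{{key},",
        f"  title={{{title}}},",
        f"  author={{{' and '.join(authors)}}},",
        f"  journal={{arXiv preprint arXiv:{arxiv_id}}},",
        f"  year={{{year}}}",
        "}",
    ])


entry = bibtex_entry(
    key="doe2023paper",
    title="Your Paper Title",
    authors=["Doe, Jane", "Smith, John"],
    arxiv_id="2301.12345",
)
print(entry)
```

The year derived this way is the submission year encoded in the identifier, which can differ from the year of a later journal version.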
Validate Links:
# Check all paper links in a repository
uv run scripts/paper_manager.py validate \
--repo-id "username/model-name" \
--repo-type "model"
Issue: "Paper not found on Hugging Face"
- Visit hf.co/papers/{arxiv-id} to trigger indexing
Issue: "Authorship claim not verified"
Issue: "arXiv tag not appearing"
Issue: "Cannot link to repository"
Issue: "Template rendering errors"
This skill complements tfrere's research article template by providing:
You can use tfrere's template for writing, then use this skill to publish and link the paper on Hugging Face Hub.
Pattern 1: New Paper Publication
# Write → Publish → Index → Link
uv run scripts/paper_manager.py create --template modern --output paper.md
# (Submit to arXiv)
uv run scripts/paper_manager.py index --arxiv-id "2301.12345"
uv run scripts/paper_manager.py link --repo-id "user/model" --arxiv-id "2301.12345"
Pattern 2: Existing Paper Discovery
# Search → Check → Link
uv run scripts/paper_manager.py search --query "transformers"
uv run scripts/paper_manager.py check --arxiv-id "2301.12345"
uv run scripts/paper_manager.py link --repo-id "user/model" --arxiv-id "2301.12345"
Pattern 3: Author Portfolio Management
# Claim → Verify → Organize
uv run scripts/paper_manager.py claim --arxiv-id "2301.12345"
uv run scripts/paper_manager.py list-my-papers
uv run scripts/paper_manager.py toggle-visibility --arxiv-id "2301.12345" --show true
Python Script Example:
# Run from the directory containing SKILL.md so that `scripts` is importable
import os
from scripts.paper_manager import PaperManager
# Read the token from the environment rather than hard-coding it
pm = PaperManager(hf_token=os.environ["HF_TOKEN"])
# Index paper
pm.index_paper("2301.12345")
# Link to model
pm.link_paper(
    repo_id="username/model",
    repo_type="model",
    arxiv_id="2301.12345",
    citation="Full citation text",
)
# Check status
status = pm.check_paper("2301.12345")
print(status)
Planned features for future versions: