garrettroi

114 verified skills

pytorch-fsdp

Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision, CPU offloading, FSDP2

testing

findmy

Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture.

development

github-pr-workflow

Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl.

tools

hermes-agent-spawning

Spawn additional Hermes Agent instances as autonomous subprocesses for independent long-running tasks. Supports non-interactive one-shot mode (-q) and interactive PTY mode for multi-turn collaboration. Unlike delegate_task, this runs a full, separate hermes process.

data-ai

github-repo-management

Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl.

tools

skills/postiz

# Postiz Social Media Skill ## Overview Postiz is the self-hosted social media scheduling platform for Garrett's brands. Sabrina uses this to schedule and publish posts across all connected social media accounts. ## Instance - **URL**: `https://postiz-production-14aa.up.railway.app` - **API Key**: Set via `POSTIZ_API_KEY` environment variable - **Auth Method**: Cookie-based JWT (login once, reuse token) OR API key for supported endpoints ## Authentication Postiz uses a cookie-based JWT. The a

development

qdrant-vector-search

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.

development

nano-pdf

Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing.

tools

fine-tuning-with-trl

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with HuggingFace Transformers.

data-ai

obsidian

Read, search, and create notes in the Obsidian vault.

tools

unsloth

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

data-ai

pinecone

Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.

devops

google-workspace

Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration via Python. Uses OAuth2 with automatic token refresh. No external binaries needed — runs entirely with Google's Python client libraries in the Hermes venv.

tools

notion

Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal.

development

ocr-and-documents

Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill.

development

railway-deployer

Deploy and manage Railway services using templates via the Railway API. Use for deploying Railway templates (FastAPI, Next.js, databases), creating multi-service projects, managing Railway deployments, setting up infrastructure from pre-configured templates, or deploying custom template definitions.

development

arxiv

Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content.

development

blogwatcher

Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI. Add blogs, scan for new articles, and track what you've read.

tools

domain-intel

Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required.

development

powerpoint

Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions "deck," "slides," "presentation," or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.

documentation

similarweb-analytics

Analyze websites and domains using SimilarWeb traffic data. Get traffic metrics, engagement stats, global rankings, traffic sources, and geographic distribution for comprehensive website research.

development

openhue

Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes.

tools

code-review

Guidelines for performing thorough code reviews with security and quality focus

development

subagent-driven-development

Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality).

development

test-driven-development

Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach.

development

video-generator

Professional AI video production workflow. Use when creating videos, short films, commercials, or any video content using AI generation tools.

tools

polymarket

Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed.

development

n8n-automation-builder

Build and deploy n8n workflows from scratch. Use for creating n8n automation workflows, exploring credentials, and deploying to the user's n8n instance.

tools

duckduckgo-search

Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content.

tools

skills/task-planner

# Task Planner This skill replicates the core functionality of the Manus.im `plan` tool. It allows you to create, manage, and display a structured, multi-phase task plan to guide your execution. ## When to Use Use this skill at the beginning of every complex, multi-step task. Update the plan whenever the user changes requirements or you discover significant new information. Advance the plan when you complete a phase. ## Commands ### Create or Update a Plan ``` /task-planner update --goal "

tools

skills/voice_sanitizer

# Voice Sanitizer This skill cleans up text before it is sent to the Text-to-Speech (TTS) engine. It removes technical jargon, code blocks, and long URLs to ensure the agent sounds natural and conversational in voice chat. ## Usage To sanitize text for speech, run the following command in the terminal: ```bash python3 /app/skills/voice_sanitizer/sanitizer.py "Your long, technical text with `code` and https://links.com/long-url" ``` ### Example Output ```text Your long, technical text with a

development

writing-plans

Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples.

development

internet-skill-finder

Search and recommend Agent Skills from verified GitHub repositories. Use when users ask to find, discover, search for, or recommend skills/plugins for specific tasks, domains, or workflows.

tools

distributed-llm-pretraining-torchtitan

Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.

development

chroma

Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local development and open-source projects.

development

requesting-code-review

Use when completing tasks, implementing major features, or before merging. Validates work meets requirements through systematic review process.

development

peft-fine-tuning

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library integrated with transformers ecosystem.

development

himalaya

CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language).

tools

blackbox

Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key.

tools

github-auth

Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically.

tools

sparse-autoencoder-training

Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.

development

slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

development

guidance

Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance - Microsoft Research's constrained generation framework

development

llama-cpp

Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU.

devops

openclaw-migration

Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be migrated and why.

data-ai

dogfood

Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports

development

find-nearby

Find nearby places (restaurants, cafes, bars, pharmacies, etc.) using OpenStreetMap. Works with coordinates, addresses, cities, zip codes, or Telegram location pins. No API keys needed.

development

github-code-review

Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl.

tools

agentmail

Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses.

data-ai

documenso-api

Interact with Documenso API for document signing workflows. Use for creating documents/templates, managing recipients and signature fields, sending documents for signing, tracking signing status, and downloading completed documents. Supports all Documenso API v2 operations including envelopes, recipients, fields, items, and attachments.

development

pokemon-player

Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.

testing

google-drive

Google Drive, Docs, Sheets, and Slides access for all agents. Use to read, write, share, and organize files in Garrett's vowsok.com Google Drive. Requires gws CLI authentication.

tools

gguf-quantization

GGUF format and llama.cpp quantization for efficient CPU/GPU inference. Use when deploying models on consumer hardware, Apple Silicon, or when needing flexible quantization from 2-8 bit without GPU requirements.

development

solana

Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required.

development

imessage

Send and receive iMessages/SMS via the imsg CLI on macOS.

tools

calcom

Cal.com scheduling and booking management. Use to check Garrett's calendar, create bookings, view availability, and manage appointments for all three businesses (DJ, Real Estate, Cana).

testing

excalidraw

Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links.

development

claude-code

Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed.

tools

codex

Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository.

tools

minecraft-modpack-server

Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts.

development

excel-generator

Professional Excel spreadsheet creation with a focus on aesthetics and data analysis. Use when creating spreadsheets for organizing, analyzing, and presenting structured data in a clear and professional format.

development

github-gem-seeker

Search GitHub for battle-tested solutions instead of reinventing the wheel. Use when the user's problem is universal enough that open source developers have probably solved it already—especially for: format conversion (video/audio/image/document), media downloading, file manipulation, web scraping/archiving, automation scripts, and CLI tools. Prefer this skill over writing custom code for well-trodden problems.

tools

massive-api

Comprehensive access to Massive.com API (evolution of polygon.io) for financial market data. Use for retrieving stock prices, options chains, futures contracts, forex rates, cryptocurrency data, economic indicators, company fundamentals, corporate actions, analyst ratings, and real-time market data across all major U.S. exchanges and global markets.

development

mcporter

Use the mcporter CLI to list, configure, auth, and call MCP servers/tools directly (HTTP or stdio), including ad-hoc servers, config edits, and CLI/type generation.

tools

gif-search

Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.

development

heartmula

Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support.

data-ai

songsee

Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.

tools

youtube-content

Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts).

content-media

lambda-labs-gpu-cloud

Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances with simple SSH access, persistent filesystems, or high-performance multi-node clusters for large-scale training.

devops

nemo-curator

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.

development

tensorrt-llm

Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.

devops

obliteratus

Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations. Use when a user wants to uncensor, abliterate, or remove refusal from an LLM.

tools

instructor

Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor - battle-tested structured output library

development

audiocraft-audio-generation

PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.

development

codebase-inspection

Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats.

development

meta-ads-analyzer

Provides expert-level analysis and diagnosis for Meta Ads campaigns. Use this skill to interpret performance data, identify root causes of issues, and generate actionable recommendations, with a special focus on correctly handling the 'Breakdown Effect'.

testing

skills/hive_mind

# Hive Mind: Shared Memory System The Hive Mind is the team's shared knowledge base stored in Redis. All agents can read from it. All agents can submit lessons to it. Only Lexi curates and distributes knowledge. ## How to Use ### Reading Knowledge (All Agents) Search the hive mind for lessons relevant to your current task: ```bash python3 /app/skills/hive_mind/hive_search.py --query "your search terms" --agent "your_name" ``` This searches across all categories using keyword matching

development

skills/inter_agent_comm

# Inter-Agent Communication Skill This skill enables agents to communicate with each other through a shared Redis message bus. Use this to delegate tasks to other agents, request information, or report status. ## How It Works Each agent has a Redis queue named `agent:{agent_name}:inbox`. To send a task to another agent, push a JSON message to their inbox queue. To check for incoming messages, read from your own inbox. ## Usage ### Send a task to another agent ```bash python3 /app/skills/in

development
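
The inbox pattern described in the inter_agent_comm preview above can be sketched as follows. This is a minimal illustration, not the skill's actual script: only the queue name `agent:{agent_name}:inbox` and the use of JSON messages come from the description. The function names, the message fields, and the in-memory stand-in for Redis lists are assumptions made so the example runs without a server.

```python
import json
from collections import defaultdict, deque

# In-memory stand-in for Redis lists (assumption: lets the sketch run offline).
queues = defaultdict(deque)


def inbox_key(agent_name: str) -> str:
    """Build the queue name using the pattern given in the skill description."""
    return f"agent:{agent_name}:inbox"


def send_task(sender: str, recipient: str, task: str) -> None:
    """Push a JSON message onto the recipient's inbox queue."""
    # The "from" and "task" fields are hypothetical; the description
    # only says that a JSON message is pushed.
    message = json.dumps({"from": sender, "task": task})
    queues[inbox_key(recipient)].append(message)


def read_inbox(agent_name: str):
    """Pop the oldest message from the agent's own inbox, or return None if empty."""
    queue = queues[inbox_key(agent_name)]
    return json.loads(queue.popleft()) if queue else None


# Usage: one agent delegates a task, the other reads it from its own inbox.
send_task("harmony", "cora", "generate a banner image")
print(read_inbox("cora"))
```

With a real Redis server, the append and popleft calls would correspond to list push and pop commands on the same key.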

gws-best-practices

Best practices for using the gws CLI with supported Google Workspace services (Drive, Docs, Sheets, Slides). Use when performing any operation with the gws CLI.

tools

ascii-video

Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.

development

github-issues

Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl.

tools

ascii-art

Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required.

development

qmd

Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.

tools

apple-notes

Manage Apple Notes via the memo CLI on macOS (create, view, search, edit).

tools

evaluating-llms-harness

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.

development

weights-and-biases

Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform

data-ai

apple-reminders

Manage Apple Reminders via remindctl CLI (list, add, complete, delete).

tools

modal-serverless-gpu

Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.

development

serving-llms-vllm

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.

development

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training

testing

pytorch-lightning

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops with built-in best practices.

development

dspy

Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming

development

stable-diffusion-image-generation

State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.

development

whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

data-ai

axolotl

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

data-ai

optimizing-attention-flash

Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), hitting GPU memory limits in attention, or needing faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.

development

hermes-atropos-environments

Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, evaluation with tools, wandb logging, and the three CLI modes (serve/process/evaluate). Use when creating, reviewing, or fixing RL environments in the hermes-agent repo.

tools

huggingface-accelerate

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

development

systematic-debugging

Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation — NO fixes without understanding the problem first.

development

stock-analysis

Analyze stocks and companies using financial market data. Get company profiles, technical insights, price charts, insider holdings, and SEC filings for comprehensive stock research.

testing

agent-email-system

Multi-agent email system with Google Service Accounts, comprehensive safeguards (recipient allowlisting, rate limiting, keyword filtering, manual approval queue), and REST API service for secure agent-to-agent email communication.

development

native-mcp

Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection.

tools

clip

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.

tools

llava

Large Language and Vision Assistant. Enables visual instruction tuning and image-based conversations. Combines CLIP vision encoder with Vicuna/LLaMA language models. Supports multi-turn image chat, visual question answering, and instruction following. Use for vision-language chatbots or image understanding tasks. Best for conversational image analysis.

tools

skills/task_board

# Task Board — Persistent Task Tracking for Open Manus This skill provides a shared task board backed by Redis. Harmony uses it to track delegated work across all agents, and agents use it to report progress and completion. ## When to Use - **Harmony**: Use this whenever you delegate a task to an agent. Add the task to the board, then check the board periodically to follow up. - **Worker Agents**: Use this to update your task status or mark tasks as complete. ## Commands ### Add a new task

testing

vault_client

Secure API key access from the centralized vault. Fetch keys on-demand without storing them in environment variables.

tools

ml-paper-writing

Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verification workflows.

testing

faiss

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

development

huggingface-tokenizers

Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.

development

simpo-training

Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use for preference alignment when you want simpler, faster training than DPO/PPO.

testing

skill-creator

Guide for creating or updating skills that extend Manus via specialized knowledge, workflows, or tool integrations. For any modification or improvement request, MUST first read this skill and follow its update workflow instead of editing files directly.

tools

skills/image-generation

# Image Generation Skill (Cora) ## Overview Cora uses OpenRouter to access AI image generation models. This skill provides access to multiple image generation models through a single unified API. ## API Configuration - **Provider**: OpenRouter (`https://openrouter.ai/api/v1`) - **API Key**: Set via `OPENROUTER_API_KEY` environment variable - **Primary Model**: `black-forest-labs/flux-1.1-pro` (best quality) - **Fast Model**: `black-forest-labs/flux-schnell` (faster, cheaper) - **Alternative**:

development

segment-anything-model

Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.

data-ai

outlines

Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library

development
