sdk/python-sdk/SKILL.md
Python SDK for inference.sh - run AI apps, build agents, and integrate with 250+ models. Package: inferencesh (pip install inferencesh). Supports sync/async, streaming, file uploads. Build agents with template or ad-hoc patterns, tool builder API, skills, and human approval. Use for: Python integration, AI apps, agent development, RAG pipelines, automation. Triggers: python sdk, inferencesh, pip install, python api, python client, async inference, python agent, tool builder python, programmatic ai, python integration, sdk python
Install the belt CLI skill:
npx skills add belt-sh/cli
Build AI applications with the inference.sh Python SDK.

pip install inferencesh
from inferencesh import inference
client = inference(api_key="inf_your_key")
# Run an AI app
result = client.run({
    "app": "infsh/flux-1-dev",
    "input": {"prompt": "A sunset over mountains"}
})
print(result["output"])
# Standard installation
pip install inferencesh
# With async support
pip install inferencesh[async]
Requirements: Python 3.8+
import os
from inferencesh import inference
# Direct API key
client = inference(api_key="inf_your_key")
# From environment variable (recommended)
client = inference(api_key=os.environ["INFERENCE_API_KEY"])
Get your API key: Settings → API Keys → Create API Key
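Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is missing. The sketch below is one way to fail with a clearer message instead. The helper name `load_api_key` is hypothetical and not part of the SDK; only the `INFERENCE_API_KEY` variable name comes from the example above.

```python
import os


def load_api_key(env=None, var_name="INFERENCE_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    `load_api_key` is an illustrative helper, not an SDK function.
    `env` defaults to os.environ; passing a plain dict makes it easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Create a key under Settings > API Keys "
            "and export it before running."
        )
    return key
```

The returned value can then be passed as `api_key=` when constructing the client.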
result = client.run({
    "app": "infsh/flux-1-dev",
    "input": {"prompt": "A cat astronaut"}
})
print(result["status"])  # "completed"
print(result["output"])  # Output data
task = client.run({
    "app": "google/veo-3-1-fast",
    "input": {"prompt": "Drone flying over mountains"}
}, wait=False)
print(f"Task ID: {task['id']}")
# Check later with client.get_task(task['id'])
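A minimal polling sketch for the fire-and-forget case follows. It assumes that `client.get_task(task_id)` returns a dict with a `"status"` key, like the result of `client.run`. Only the `"completed"` status appears in this document; `"failed"` and `"cancelled"` are assumed names for other terminal states, so adjust them to whatever the API actually reports.

```python
import time

# Only "completed" is confirmed by the examples above; the rest are assumptions.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_task(client, task_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll client.get_task until the task reaches a terminal status.

    `client` is any object exposing get_task(task_id) -> dict.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = client.get_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} did not finish within {timeout}s")
        sleep(interval)
```

Usage would look like `final = wait_for_task(client, task["id"])`. For long-running video apps, a larger `interval` keeps the number of status requests down.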
for update in client.run({
    "app": "google/veo-3-1-fast",
    "input": {"prompt": "Ocean waves at sunset"}
}, stream=True):
    print(f"Status: {update['status']}")
    if update.get("logs"):
        print(update["logs"][-1])
| Parameter | Type | Description |
|-----------|------|-------------|
| app | string | App ID (namespace/name@version) |
| input | dict | Input matching app schema |
| setup | dict | Hidden setup configuration |
| infra | string | 'cloud' or 'private' |
| session | string | Session ID for stateful execution |
| session_timeout | int | Idle timeout (1-3600 seconds) |
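The constraints in the table can be checked client-side before a request is sent. The sketch below is illustrative only: `parse_app_id` and `build_run_request` are hypothetical helpers, not SDK functions. The namespace and version are treated as optional because examples in this document also use bare names such as `my-app`.

```python
def parse_app_id(app):
    """Split 'namespace/name@version' into its parts (namespace and version optional)."""
    ref, _, version = app.partition("@")
    namespace, _, name = ref.rpartition("/")
    if not name:
        raise ValueError(f"Invalid app ID: {app!r}")
    return {"namespace": namespace or None, "name": name, "version": version or None}


def build_run_request(app, input_data, *, setup=None, infra=None,
                      session=None, session_timeout=None):
    """Build the dict passed to client.run, validating the documented limits."""
    parse_app_id(app)  # raises ValueError on a malformed ID
    if infra is not None and infra not in ("cloud", "private"):
        raise ValueError("infra must be 'cloud' or 'private'")
    if session_timeout is not None and not 1 <= session_timeout <= 3600:
        raise ValueError("session_timeout must be between 1 and 3600 seconds")

    request = {"app": app, "input": input_data}
    optional = {
        "setup": setup,
        "infra": infra,
        "session": session,
        "session_timeout": session_timeout,
    }
    # Leave out parameters the caller did not set.
    request.update({k: v for k, v in optional.items() if v is not None})
    return request
```

The resulting dict has the same shape as the literals passed to `client.run` elsewhere in this document.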
result = client.run({
    "app": "image-processor",
    "input": {
        "image": "/path/to/image.png"  # Auto-uploaded
    }
})
from inferencesh import UploadFileOptions
# Basic upload
file = client.upload_file("/path/to/image.png")
# With options
file = client.upload_file(
    "/path/to/image.png",
    UploadFileOptions(
        filename="custom_name.png",
        content_type="image/png",
        public=True
    )
)
result = client.run({
    "app": "image-processor",
    "input": {"image": file["uri"]}
})
Keep workers warm across multiple calls:
# Start new session
result = client.run({
    "app": "my-app",
    "input": {"action": "init"},
    "session": "new",
    "session_timeout": 300  # 5 minutes
})
session_id = result["session_id"]
# Continue in same session
result = client.run({
    "app": "my-app",
    "input": {"action": "process"},
    "session": session_id
})
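Passing the session ID by hand between calls is easy to get wrong. The sketch below wraps the pattern in a small class. `SessionRunner` is a hypothetical convenience, not part of the SDK. It assumes, as in the example above, that the first response carries a `"session_id"` key and that `session_timeout` only needs to be sent when the session is created.

```python
class SessionRunner:
    """Run an app repeatedly in one session, tracking the session ID.

    `client` is any object exposing run(request_dict) -> dict.
    """

    def __init__(self, client, app, session_timeout=300):
        self.client = client
        self.app = app
        self.session_timeout = session_timeout
        self.session_id = None  # unknown until the first call returns

    def run(self, input_data):
        request = {
            "app": self.app,
            "input": input_data,
            "session": self.session_id or "new",
        }
        if self.session_id is None:
            # Assumption: the idle timeout is set once, when the session starts.
            request["session_timeout"] = self.session_timeout
        result = self.client.run(request)
        # Keep the previous ID if a later response omits it.
        self.session_id = result.get("session_id", self.session_id)
        return result
```

With this wrapper, `runner = SessionRunner(client, "my-app")` followed by repeated `runner.run({...})` calls reproduces the two-step example above.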
Use pre-built agents from your workspace:
agent = client.agent("my-team/support-agent@latest")
# Send message
response = agent.send_message("Hello!")
print(response.text)
# Multi-turn conversation
response = agent.send_message("Tell me more")
# Reset conversation
agent.reset()
# Get chat history
chat = agent.get_chat()
Create custom agents programmatically:
from inferencesh import tool, string, number, app_tool
# Define tools
calculator = (
    tool("calculate")
    .describe("Perform a calculation")
    .param("expression", string("Math expression"))
    .build()
)
image_gen = (
    app_tool("generate_image", "infsh/flux-1-dev@latest")
    .describe("Generate an image")
    .param("prompt", string("Image description"))
    .build()
)
# Create agent
agent = client.agent({
    "core_app": {"ref": "infsh/claude-sonnet-4@latest"},
    "system_prompt": "You are a helpful assistant.",
    "tools": [calculator, image_gen],
    "temperature": 0.7,
    "max_tokens": 4096
})
response = agent.send_message("What is 25 * 4?")
| Model | App Reference |
|-------|---------------|
| Claude Sonnet 4 | infsh/claude-sonnet-4@latest |
| Claude 3.5 Haiku | infsh/claude-haiku-35@latest |
| GPT-4o | infsh/gpt-4o@latest |
| GPT-4o Mini | infsh/gpt-4o-mini@latest |
from inferencesh import (
    string, number, integer, boolean,
    enum_of, array, obj, optional
)
name = string("User's name")
age = integer("Age in years")
score = number("Score 0-1")
active = boolean("Is active")
priority = enum_of(["low", "medium", "high"], "Priority")
tags = array(string("Tag"), "List of tags")
address = obj({
    "street": string("Street"),
    "city": string("City"),
    "zip": optional(string("ZIP"))
}, "Address")
greet = (
    tool("greet")
    .display("Greet User")
    .describe("Greets a user by name")
    .param("name", string("Name to greet"))
    .require_approval()
    .build()
)
generate = (
    app_tool("generate_image", "infsh/flux-1-dev@latest")
    .describe("Generate an image from text")
    .param("prompt", string("Image description"))
    .setup({"model": "schnell"})
    .input({"steps": 20})
    .require_approval()
    .build()
)
from inferencesh import agent_tool
researcher = (
    agent_tool("research", "my-org/researcher@v1")
    .describe("Research a topic")
    .param("topic", string("Topic to research"))
    .build()
)
from inferencesh import webhook_tool
notify = (
    webhook_tool("slack", "https://hooks.slack.com/...")
    .describe("Send Slack notification")
    .secret("SLACK_SECRET")
    .param("channel", string("Channel"))
    .param("message", string("Message"))
    .build()
)
from inferencesh import internal_tools
config = (
    internal_tools()
    .plan()
    .memory()
    .web_search(True)
    .code_execution(True)
    .image_generation({
        "enabled": True,
        "app_ref": "infsh/flux@latest"
    })
    .build()
)
agent = client.agent({
    "core_app": {"ref": "infsh/claude-sonnet-4@latest"},
    "internal_tools": config
})
def handle_message(msg):
    if msg.get("content"):
        print(msg["content"], end="", flush=True)

def handle_tool(call):
    print(f"\n[Tool: {call.name}]")
    result = execute_tool(call.name, call.args)
    agent.submit_tool_result(call.id, result)

response = agent.send_message(
    "Explain quantum computing",
    on_message=handle_message,
    on_tool_call=handle_tool
)
# From file path
with open("image.png", "rb") as f:
    response = agent.send_message(
        "What's in this image?",
        files=[f.read()]
    )
# From base64
response = agent.send_message(
    "Analyze this",
    files=["data:image/png;base64,iVBORw0KGgo..."]
)
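The base64 form above is a standard data URI. The sketch below shows one way to build such a string from raw bytes or from a file on disk, using only the standard library. `to_data_uri` and `file_to_data_uri` are hypothetical helper names, not SDK functions.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(data, content_type):
    """Encode raw bytes as a 'data:<type>;base64,<payload>' string."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{content_type};base64,{payload}"


def file_to_data_uri(path):
    """Read a file and encode it, guessing the content type from its extension."""
    content_type, _ = mimetypes.guess_type(str(path))
    if content_type is None:
        content_type = "application/octet-stream"  # fallback for unknown extensions
    return to_data_uri(Path(path).read_bytes(), content_type)
```

The resulting string can be placed in the `files=[...]` list in the same position as the literal data URI in the example above.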
agent = client.agent({
    "core_app": {"ref": "infsh/claude-sonnet-4@latest"},
    "skills": [
        {
            "name": "code-review",
            "description": "Code review guidelines",
            "content": "# Code Review\n\n1. Check security\n2. Check performance..."
        },
        {
            "name": "api-docs",
            "description": "API documentation",
            "url": "https://example.com/skills/api-docs.md"
        }
    ]
})
from inferencesh import async_inference
import asyncio
async def main():
    client = async_inference(api_key="inf_...")

    # Async app execution
    result = await client.run({
        "app": "infsh/flux-1-dev",
        "input": {"prompt": "A galaxy"}
    })

    # Async agent
    agent = client.agent("my-org/assistant@latest")
    response = await agent.send_message("Hello!")

    # Async streaming
    async for msg in agent.stream_messages():
        print(msg)

asyncio.run(main())
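A common reason to use the async client is to run several apps concurrently. The sketch below uses `asyncio.gather` with a semaphore to cap the number of in-flight requests. `run_many` is a hypothetical helper; it assumes only that the client exposes an awaitable `run(request)` method, as in the example above.

```python
import asyncio


async def run_many(client, requests, max_concurrency=4):
    """Run several requests concurrently, at most `max_concurrency` at a time.

    Results come back in the same order as `requests`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(request):
        async with semaphore:
            return await client.run(request)

    return await asyncio.gather(*(run_one(r) for r in requests))
```

Inside an `async def main()`, this would be called as `results = await run_many(client, [...])`. Note that `gather` raises the first exception it encounters by default, so one failing request surfaces as an error for the whole batch.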
from inferencesh import RequirementsNotMetException
try:
    result = client.run({"app": "my-app", "input": {...}})
except RequirementsNotMetException as e:
    print("Missing requirements:")
    for err in e.errors:
        print(f"  - {err['type']}: {err['key']}")
except RuntimeError as e:
    print(f"Error: {e}")
def handle_tool(call):
    if call.requires_approval:
        # Show to user, get confirmation
        approved = prompt_user(f"Allow {call.name}?")
        if approved:
            result = execute_tool(call.name, call.args)
            agent.submit_tool_result(call.id, result)
        else:
            agent.submit_tool_result(call.id, {"error": "Denied by user"})

response = agent.send_message(
    "Delete all temp files",
    on_tool_call=handle_tool
)
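The handlers above call `execute_tool` and `prompt_user`, which the application has to supply. The sketch below shows one possible shape for both: a name-to-function registry for dispatch and a yes/no console prompt. Everything here is an assumption rather than SDK behaviour, in particular that `call.args` is a dict of keyword arguments and that a `{"result": ...}` or `{"error": ...}` dict is an acceptable tool result.

```python
TOOLS = {}  # maps tool name -> local Python function


def register(name):
    """Decorator that registers a function as the handler for a tool name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator


def execute_tool(name, args):
    """Look up the handler for `name` and call it with `args` as keywords."""
    fn = TOOLS.get(name)
    if fn is None:
        return {"error": f"Unknown tool: {name}"}
    try:
        return {"result": fn(**args)}
    except Exception as exc:  # report the failure to the agent instead of crashing
        return {"error": str(exc)}


def prompt_user(question, ask=input):
    """Ask a yes/no question; anything other than y/yes counts as a denial."""
    answer = ask(f"{question} [y/N] ")
    return answer.strip().lower() in ("y", "yes")


@register("greet")
def greet(name):
    # Local handler matching the "greet" tool defined earlier in this document.
    return f"Hello, {name}!"
```

Returning an error dict for unknown tools or raised exceptions lets the agent see the failure and respond to it, rather than leaving the tool call unanswered.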
# JavaScript SDK
npx skills add inference-sh/skills@javascript-sdk
# Full platform skill (all 250+ apps via CLI)
npx skills add inference-sh/skills@infsh-cli
# LLM models
npx skills add inference-sh/skills@llm-models
# Image generation
npx skills add inference-sh/skills@ai-image-generation