cmd/sgai/skel/.sgai/skills/new-openai-sdk-app/SKILL.md
Create and set up a new OpenAI Agents SDK application with interactive guidance for language choice, agent type selection (Basic, Voice, Realtime), project setup, and automatic verification.
You are tasked with helping the user create a new OpenAI Agents SDK application. Follow these steps carefully:
Before starting, review the official documentation to ensure you provide accurate and up-to-date guidance. Use WebFetch to read these pages:
Start with the overview:
Based on the user's language and agent type choice, read the appropriate SDK reference:
Read relevant guides mentioned in the overview such as:
IMPORTANT: Always check for and use the latest versions of packages. Use WebSearch or WebFetch to verify current versions before installation.
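For Python projects, one way to confirm the current release is to read the package metadata that PyPI publishes as JSON. The following is a minimal sketch, not part of the skill itself: it assumes the public PyPI JSON endpoint (`https://pypi.org/pypi/<package>/json`), and the helper names `latest_version_from_payload` and `fetch_latest_version` are hypothetical. The sample payload and its version number are made up for the offline demonstration.

```python
import json
from urllib.request import urlopen


def latest_version_from_payload(payload: str) -> str:
    """Extract the latest release version from a PyPI JSON API response body."""
    return json.loads(payload)["info"]["version"]


def fetch_latest_version(package: str) -> str:
    """Ask PyPI for the newest published version of `package` (requires network access)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json", timeout=10) as response:
        return latest_version_from_payload(response.read().decode("utf-8"))


# Offline demonstration with a trimmed, made-up payload (the version is illustrative only).
sample = '{"info": {"name": "openai-agents", "version": "9.9.9"}}'
print(latest_version_from_payload(sample))
```

Calling `fetch_latest_version("openai-agents")` would perform the live lookup; WebSearch or WebFetch remain the tools the skill itself names for this step.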
IMPORTANT: Ask these questions one at a time. Wait for the user's response before asking the next question. This makes it easier for the user to respond.
Ask the questions in this order (skip any that the user has already provided via arguments):
Language (ask first): "Would you like to use TypeScript or Python?"
Project name (ask second): "What would you like to name your project?"
Agent type (ask third): "What type of agent would you like to create?
Basic Agent: Standard text-based agent for chat, coding assistance, or automation tasks
Voice Agent: Speech-to-text and text-to-speech pipeline for voice interactions
Realtime Agent: Low-latency voice conversations using OpenAI's Realtime API (Beta)"
Wait for response before continuing
Agent purpose (ask fourth, but skip if #3 was sufficiently detailed): "What kind of agent are you building? Some examples:
Starting point (ask fifth): "Would you like:
Tooling choice (ask sixth): Let the user know what tools you'll use, and confirm with them that these are the tools they want to use (for example, they may prefer pnpm or bun over npm for TypeScript, or poetry over pip for Python). Respect the user's preferences when executing on the requirements.
After all questions are answered, proceed to create the setup plan.
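The one-at-a-time rule above, including skipping anything the user already supplied via arguments, can be sketched as a small lookup over the question order. This is an illustrative sketch only: the key names in `QUESTION_ORDER` and the `next_question` helper are hypothetical, and the conditional skip of the agent purpose question is left out for brevity.

```python
from typing import Dict, Optional

# Question keys in the order the skill asks them (the key names are illustrative).
QUESTION_ORDER = [
    "language",
    "project_name",
    "agent_type",
    "agent_purpose",
    "starting_point",
    "tooling",
]


def next_question(answers: Dict[str, str]) -> Optional[str]:
    """Return the first question that has no answer yet, or None when all are answered."""
    for key in QUESTION_ORDER:
        if not answers.get(key):
            return key
    return None


# The user passed the language as an argument, so the project name is asked first.
print(next_question({"language": "Python"}))
```

Because the function returns a single key, the caller asks exactly one question, records the reply, and calls it again. That keeps the wait-for-response behavior explicit.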
Based on the user's answers, create a plan that includes:
Project initialization:
TypeScript: run npm init -y and set up package.json with type: "module" and scripts (include a "typecheck" script)
Python: create requirements.txt or use poetry init
TypeScript: add tsconfig.json with proper settings for the SDK
Check for Latest Versions:
SDK Installation:
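TypeScript, Basic Agent: npm install @openai/agents@latest
TypeScript, Voice Agent: npm install @openai/agents@latest (voice is included in the main package)
TypeScript, Realtime Agent: npm install @openai/agents@latest (realtime is included in the main package)
Python, Basic Agent: pip install openai-agents
Python, Voice Agent: pip install 'openai-agents[voice]' (includes sounddevice, numpy dependencies)
Python, Realtime Agent: pip install openai-agents (realtime is included in the main package)
Verify the TypeScript installation: npm list @openai/agents
Verify the Python installation: pip show openai-agents
Create starter files: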
Based on the agent type, create appropriate starter code:
Basic Agent (TypeScript):

import { Agent, run } from '@openai/agents';

const agent = new Agent({
  name: 'Assistant',
  instructions: 'You are a helpful assistant.',
});

async function main() {
  const result = await run(agent, 'Hello! How can you help me?');
  console.log(result.finalOutput);
}

main().catch(console.error);

Basic Agent (Python):

from agents import Agent, Runner

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
)

result = Runner.run_sync(agent, "Hello! How can you help me?")
print(result.final_output)

Voice Agent (Python):

import asyncio

import numpy as np
import sounddevice as sd

from agents import Agent, function_tool
from agents.voice import AudioInput, SingleAgentVoiceWorkflow, VoicePipeline


@function_tool
def get_weather(city: str) -> str:
    """Get the weather for a given city."""
    return f"The weather in {city} is sunny."


agent = Agent(
    name="VoiceAssistant",
    instructions="You are a helpful voice assistant. Be concise.",
    tools=[get_weather],
)


async def main():
    pipeline = VoicePipeline(workflow=SingleAgentVoiceWorkflow(agent))

    # For demo: 3 seconds of silence (replace with actual microphone input)
    buffer = np.zeros(24000 * 3, dtype=np.int16)
    audio_input = AudioInput(buffer=buffer)

    result = await pipeline.run(audio_input)

    # Play the synthesized speech as it streams back
    player = sd.OutputStream(samplerate=24000, channels=1, dtype=np.int16)
    player.start()

    async for event in result.stream():
        if event.type == "voice_stream_event_audio":
            player.write(event.data)


if __name__ == "__main__":
    asyncio.run(main())

Voice Agent (TypeScript):

import { Agent, tool } from '@openai/agents';
import { RealtimeAgent, RealtimeSession, OpenAIRealtimeWebSocket } from '@openai/agents/realtime';
import { z } from 'zod';

const getWeather = tool({
  name: 'get_weather',
  description: 'Get the weather for a given city',
  parameters: z.object({
    city: z.string().describe('The city to get weather for'),
  }),
  execute: async ({ city }) => {
    return `The weather in ${city} is sunny.`;
  },
});

const agent = new RealtimeAgent({
  name: 'VoiceAssistant',
  instructions: 'You are a helpful voice assistant. Be concise.',
  tools: [getWeather],
});

async function main() {
  const transport = new OpenAIRealtimeWebSocket();
  const session = new RealtimeSession(agent, { transport });

  await session.connect();
  console.log('Voice session connected! Ready for audio input.');

  // Handle session events
  session.on('audio', (event) => {
    // Process audio output
    console.log('Received audio chunk');
  });

  session.on('error', (event) => {
    console.error('Error:', event.error);
  });
}

main().catch(console.error);

Realtime Agent (Python):

import asyncio

from agents.realtime import RealtimeAgent, RealtimeRunner

agent = RealtimeAgent(
    name="RealtimeAssistant",
    instructions="You are a helpful voice assistant. Keep responses brief and conversational.",
)


async def main():
    runner = RealtimeRunner(
        starting_agent=agent,
        config={
            "model_settings": {
                "model_name": "gpt-realtime",
                "voice": "ash",
                "modalities": ["audio"],
                "input_audio_format": "pcm16",
                "output_audio_format": "pcm16",
                "input_audio_transcription": {"model": "gpt-4o-mini-transcribe"},
                "turn_detection": {"type": "semantic_vad", "interrupt_response": True},
            }
        },
    )

    session = await runner.run()

    async with session:
        print("Realtime session started! Streaming audio responses...")
        async for event in session:
            if event.type == "agent_start":
                print(f"Agent started: {event.agent.name}")
            elif event.type == "audio":
                # Handle audio output
                pass
            elif event.type == "error":
                print(f"Error: {event.error}")


if __name__ == "__main__":
    asyncio.run(main())

Realtime Agent (TypeScript):

import { RealtimeAgent, RealtimeSession, OpenAIRealtimeWebSocket } from '@openai/agents/realtime';

const agent = new RealtimeAgent({
  name: 'RealtimeAssistant',
  instructions: 'You are a helpful voice assistant. Keep responses brief and conversational.',
});

async function main() {
  const transport = new OpenAIRealtimeWebSocket({
    model: 'gpt-4o-realtime-preview',
  });

  const session = new RealtimeSession(agent, {
    transport,
    config: {
      voice: 'ash',
      modalities: ['audio'],
      inputAudioFormat: 'pcm16',
      outputAudioFormat: 'pcm16',
      turnDetection: { type: 'semantic_vad', interruptResponse: true },
    },
  });

  await session.connect();
  console.log('Realtime session connected!');

  session.on('agent_start', (event) => {
    console.log(`Agent started: ${event.agent.name}`);
  });

  session.on('audio', (event) => {
    // Handle audio output chunks
    console.log('Received audio chunk');
  });

  session.on('error', (event) => {
    console.error('Error:', event.error);
  });
}

main().catch(console.error);
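
The package choices listed under SDK Installation reduce to a lookup keyed by language and agent type. A minimal sketch, assuming only the command strings given in that step; the `INSTALL_COMMANDS` table and the `install_command` helper are hypothetical names introduced for illustration.

```python
# Install commands copied from the SDK Installation step, keyed by (language, agent type).
INSTALL_COMMANDS = {
    ("typescript", "basic"): "npm install @openai/agents@latest",
    ("typescript", "voice"): "npm install @openai/agents@latest",
    ("typescript", "realtime"): "npm install @openai/agents@latest",
    ("python", "basic"): "pip install openai-agents",
    ("python", "voice"): "pip install 'openai-agents[voice]'",
    ("python", "realtime"): "pip install openai-agents",
}


def install_command(language: str, agent_type: str) -> str:
    """Return the install command for a language and agent type, ignoring case and padding."""
    key = (language.strip().lower(), agent_type.strip().lower())
    try:
        return INSTALL_COMMANDS[key]
    except KeyError:
        raise ValueError(f"Unsupported combination: {language!r}, {agent_type!r}") from None


print(install_command("Python", "Voice"))
```

Only the Python voice agent needs an extra; every other combination installs the main package. If the user prefers pnpm, bun, or poetry, the command would be adapted to that tool rather than taken from this table.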
Environment setup:
Create a .env.example file with OPENAI_API_KEY=your_api_key_here
Add .env to .gitignore
Optional: Create .sgai directory structure:
Create a .sgai/ directory for agents, commands, and settings
After gathering requirements and getting user confirmation on the plan:
TypeScript: run npx tsc --noEmit to check for type errors
Python: run python -m py_compile <file>
After all files are created and dependencies are installed, use the appropriate verifier agent to validate that the Agent SDK application is properly configured and ready for use:
Once setup is complete and verified, provide the user with:
Next steps:
Set your API key: export OPENAI_API_KEY=your_api_key_here
Run the TypeScript agent: npx ts-node index.ts or npm start
Run the Python agent: python main.py
Useful resources:
Common next steps:
Add tools with @function_tool (Python) or tool() (TypeScript)
TypeScript: run npx tsc --noEmit and fix ALL type errors before finishing
Python: check syntax with python -m py_compile
Begin by asking the FIRST requirement question only. Wait for the user's answer before proceeding to the next question.
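
For Python projects, the python -m py_compile check named above can also be run over several files at once from a script. A minimal sketch using the standard library's `py_compile` module; the `syntax_errors` helper and the temporary file names are hypothetical and exist only for this demonstration.

```python
import py_compile
import tempfile
from pathlib import Path


def syntax_errors(paths):
    """Return a {path: message} mapping for every file that fails to byte-compile."""
    failures = {}
    for path in paths:
        try:
            # doraise=True turns a syntax error into an exception instead of a printed message.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures[str(path)] = exc.msg
    return failures


# Demonstration on two throwaway files: one valid, one with a syntax error.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.py"
    bad = Path(tmp) / "bad.py"
    good.write_text("print('ok')\n")
    bad.write_text("def broken(:\n")
    report = syntax_errors([good, bad])
    print(sorted(Path(name).name for name in report))
```

An empty result means every file compiled. This checks syntax only; it does not import the SDK or call the API.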