Chanjing TTS

功能说明

调用蝉镜 TTS Open API：列举音色、创建合成任务、轮询并从接口返回 URL 下载音频。脚本不依赖 ffmpeg/ffprobe。

运行依赖

python3 与同仓库 scripts/*.py
无 ffmpeg/ffprobe 门控

环境变量与机器可读声明

环境变量键名与说明：manifest.yaml（environment 段）及本文
变量、凭据、合规 permissions、clientPermissions、agentPolicy：manifest.yaml

使用命令

ClawHub（slug 以注册表为准）：clawhub run chanjing-tts
本仓库：python skills/chanjing-tts/scripts/create_task.py …（见正文 How to Use）

登记与审稿（单一事实来源）

主凭据、下载行为、primaryEnv 省略等：以 manifest.yaml 为准。本篇 How to Use 起为 API 步骤说明。

When to Use This Skill

Use this skill when the user needs to generate audio from text.

Chanjing TTS supports:

both Chinese and English
multiple system voices
adjustment of speech speed
sentence-level timestamp in result

How to Use This Skill

前置条件（权限验证）：执行本 Skill 前，必须先通过 chanjing-credentials-guard 完成 AK/SK 与 Token 校验。本 Skill 与 guard 使用同一套凭证（~/.chanjing/credentials.json）；脚本在无凭证时会执行 open_login_page.py 脚本，在默认浏览器打开 AK/SK 注册/登录页，并提示配置命令。凭据与审稿对表见 manifest.yaml。

Security & credentials（引用）

详见 manifest.yaml 中 credentials 与 clientPermissions（及合规顶层 permissions）。

Multiple APIs need to be invoked. All share the domain: "https://open-api.chanjing.cc". All requests communicate using json. You should use utf-8 to encode and decode text throughout this task.

Obtain an access_token, which is required for all subsequent API calls
List all voice IDs and select one to use
Call the Create Speech API, record task_id
Poll the Query Speech Status API until success, then download generated audio file using the url in response

Obtain AccessToken

从 ~/.chanjing/credentials.json 读取 app_id 和 secret_key，若无有效 Token 则调用：

POST /open/v1/access_token
Content-Type: application/json

请求体（使用本地配置的 app_id、secret_key）：

{
  "app_id": "<从 credentials.json 读取>",
  "secret_key": "<从 credentials.json 读取>"
}

Response example:

{
  "trace_id": "8ff3fcd57b33566048ef28568c6cee96",
  "code": 0,
  "msg": "success",
  "data": {
    "access_token": "1208CuZcV1Vlzj8MxqbO0kd1Wcl4yxwoHl6pYIzvAGoP3DpwmCCa73zmgR5NCrNu",
    "expire_in": 1721289220
  }
}

Response field description:

| First-level Field | Second-level Field | Description | |---|---|---| | code | | Response status code | | msg | | Response message | | data | | Response data | | | access_token | Valid for one day, previous token will be invalidated | | | expire_in | Token expiration time |

Response Status Code Description

| Code | Description | |---|---| | 0 | Success | | 400 | Invalid parameter format | | 40000 | Parameter error | | 50000 | System internal error |

Select a Voice ID

Obtain all available voice IDs via API, and select one that fits the task at hand. The dialect/accent can be deduced from the voice name.

GET /open/v1/list_common_audio
access_token: {{access_token}}

Use the following request body:

{
  "page": 1,
  "size": 100
}

Response example:

```json
{
  "trace_id": "25eb6794ffdaaf3672c25ed9efbe49c6",
  "code": 0,
  "msg": "success",
  "data": {
    "list": [
      {
        "id": "f9248f3b1b42447fb9282829321cfcf2",
        "grade": 0,
        "name": "带货小芸",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-06-05/7945e0474b8cb526e884ee7e28e4af8d.wav"
      },
      {
        "id": "f5e69c1bbe414bec860da3294e177625",
        "grade": 0,
        "name": "方言口音老奶奶",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-04-30/1b248ad05953028db5a6bcba9a951164.wav"
      },
      ...
    ],
    "page_info": {
      "page": 1,
      "size": 100,
      "total_count": 98,
      "total_page": 1
    }
  }
}

Response field description:

| First-level Field | Second-level Field | Third-level Field | Description | |---|---|---|---| | code | | | Response status code | | message | | | Response message | | data | | | Response data | | | list | List data | Public voice - list data | | | | id | Voice ID | | | | name | Voice name, if it includes a place name, the generated speech is in dialect | | | | gender | Gender | | | | lang | Language | | | | desc | Description | | | | speed | Speech speed | | | | pitch | Pitch | | | | audition | Audition link | | | | grade | Grade |

Response status code description:

| Code | Description | |---|---| | 0 | Response successful | | 10400 | AccessToken verification failed | | 40000 | Parameter error | | 50000 | System internal error | | 51000 | System internal error |

Create Speech API

Submit a speech creating task, which returns a task ID for polling later.

POST /open/v1/create_audio_task
access_token: {{access_token}}
Content-Type: application/json

Request body example:

{
  "audio_man": "89843d52ccd04e2d854decd28d6143ce ",
  "speed": 1,
  "pitch": 1,
  "text": {
    "text": "Hello, I am your AI assistant."
  }
}

Request field description:

| Parameter Name | Type | Nested Key | Required | Example | Description | |---|---|---|---|---|---| | audio_man | string | | Yes | 89843d52ccd04e2d854decd28d6143ce | Voice ID | | speed | number | | Yes | 1 | Speech speed: 0.5 (slow) - 2 (fast) | | pitch | number | | Yes | 1 | Just set to 1 | | text | object | text | Yes | Hello, I am your AI assistant. | Rich text, length must be less than 4000 characters | | aigc_watermark | bool | | No | false | Whether to add visible watermark to audio, default to false |

Response example:

{
  "trace_id": "dd09f123a25b43cf2119a2449daea6de",
  "code": 0,
  "msg": "success",
  "data": {
    "task_id": "88f635dd9b8e4a898abb9d4679e0edc8"
  }
}

Response field description:

| Field | Description | |---|---| | code | Response status code | | msg | Response message | | task_id | Task ID, to be used in subsequent polling step |

Response status code description:

| code | Description | | --- | --- | | 0 | Response successful | | 400 | Invalid parameter format | | 10400 | AccessToken verification failed | | 40000 | Parameter error | | 40001 | Exceeds QPS limit | | 40002 | Production duration reached limit | | 50000 | System internal error |

Poll Query Speech Status API

Poll the following API until speech is generated.

POST /open/v1/audio_task_state
access_token: {{access_token}}
Content-Type: application/json

Request example:

{
  "task_id": "88f635dd9b8e4a898abb9d4679e0edc8"
}

Request field description:

| Parameter Name | Type | Required | Example | Description | |---|---|---|---|---| | task_id | string | Yes | 88f789dd9b8e4a121abb9d4679e0edc8 | Speech synthesis task ID |

Response example:

{
  "trace_id": "ab18b14574bbcc31df864099d474080e",
  "code": 0,
  "msg": "success",
  "data": {
    "id": "9546a0fb1f0a4ae3b5c7489b77e4a94d",
    "type": "tts",
    "status": 9,
    "text": [
      "猫在跌落时能够在空中调整身体，通常能够四脚着地，这种”猫右自己“反射显示了它们惊人的身体协调能力和灵活性。核磁共振成像技术通过利用人体细胞中氢原子的磁性来生成详细的内部图像，为医学诊断提供了重要工具。"
    ],
    "full": {
      "url": "https://cy-cds-test-innovation.cds8.cn/chanjing/res/upload/tts/2025-04-08/093a59021d85a72d28a491f21820ece4.wav",
      "path": "093a59013d85a72d28a491f21820ece4.wav",
      "duration": 18.81
    },
    "slice": null,
    "errMsg": "",
    "errReason": "",
    "subtitles": [
      {
        "key": "20c53ff8cce9831a8d9c347263a400a54d72be15",
        "start_time": 0,
        "end_time": 2.77,
        "subtitle": "猫在跌落时能够在空中调整身体"
      },
      {
        "key": "e19f481b6cd2219225fa4ff67836448e054b2271",
        "start_time": 2.77,
        "end_time": 4.49,
        "subtitle": "通常能够四脚着地"
      },
      {
        "key": "140beae4046bd7a99fbe4706295c19aedfeeb843",
        "start_time": 4.49,
        "end_time": 5.73,
        "subtitle": "这种，猫右自己"
      },
      {
        "key": "e851881271876ab5a90f4be754fde2dc6b5498fd",
        "start_time": 5.73,
        "end_time": 7.97,
        "subtitle": "反射显示了它们惊人的身体"
      },
      {
        "key": "fbb0b4138bad189b9fc02669fe1f95116e9991b4",
        "start_time": 7.97,
        "end_time": 9.45,
        "subtitle": "协调能力和灵活性"
      },
      {
        "key": "f73404d135feaf84dd8fbea13af32eac847ac26d",
        "start_time": 9.45,
        "end_time": 12.49,
        "subtitle": "核磁共振成像技术通过利用人体"
      },
      {
        "key": "e18827931223962e477b14b2b8046947039ac222",
        "start_time": 12.49,
        "end_time": 14.77,
        "subtitle": "细胞中氢原子的磁性来生成"
      },
      {
        "key": "d137bf2b0c8b7a39e3f6753b7cf5d92bd877d2d9",
        "start_time": 14.77,
        "end_time": 15.97,
        "subtitle": "详细的内部图像"
      },
      {
        "key": "0773911ae0dbaa763a64352abdb6bdac3ff8f149",
        "start_time": 15.97,
        "end_time": 18.41,
        "subtitle": "为医学诊断提供了重要工具"
      }
    ]
  }
}

Response field description:

| First-level Field | Second-level Field | Third-level Field | Description | |---|---|---|---| | code | | | Response status code | | msg | | | Response message | | data | id | | Audio ID | | | type | | | | | status | | 1: generating; 9: completed | | | text | | Speech text | | | full | url | url to download the generated audio file | | | | path | | | | | duration | Audio duration | | | slice | | | | | errMsg | | Error message | | | errReason | | Error reason | | | subtitles (array type) | key | Subtitle ID | | | | start_time | Subtitle start time | | | | end_time | Subtitle end time | | | | subtitle | Subtitle text |

Response status code description:

| code | Description | |---|---| | 0 | Response successful | | 10400 | AccessToken verification failed | | 40000 | Parameter error | | 50000 | System internal error |

Scripts

本 Skill 提供脚本（skills/chanjing-tts/scripts/），带权限验证：与 chanjing-credentials-guard 使用同一配置文件；无 AK/SK 时会执行 guard 的 open_login_page.py 脚本，在浏览器打开注册/登录页，并提示配置命令。

| 脚本 | 说明 | |------|------| | list_voices.py | 列出公共声音人，默认输出 id/name 表，可选 --json 输出完整数据 | | create_task.py | 创建 TTS 任务，输出 task_id | | poll_task.py | 轮询任务直到完成，输出音频下载 URL（full.url） |

示例（在项目根或 skill 目录下执行）：

# 1. 列出可用声音，选取一个 id
python skills/chanjing-tts/scripts/list_voices.py

# 2. 创建合成任务
TASK_ID=$(python skills/chanjing-tts/scripts/create_task.py \
  --audio-man "f9248f3b1b42447fb9282829321cfcf2" \
  --text "Hello, I am your AI assistant.")

# 3. 轮询到完成，得到音频下载链接
python skills/chanjing-tts/scripts/poll_task.py --task-id "$TASK_ID"

Chanjing TTS

功能说明

调用蝉镜 TTS Open API：列举音色、创建合成任务、轮询并从接口返回 URL 下载音频。脚本不依赖 ffmpeg/ffprobe。

运行依赖

python3 与同仓库 scripts/*.py
无 ffmpeg/ffprobe 门控

环境变量与机器可读声明

环境变量键名与说明：manifest.yaml（environment 段）及本文
变量、凭据、合规 permissions、clientPermissions、agentPolicy：manifest.yaml

使用命令

ClawHub（slug 以注册表为准）：clawhub run chanjing-tts
本仓库：python skills/chanjing-tts/scripts/create_task.py …（见正文 How to Use）

登记与审稿（单一事实来源）

主凭据、下载行为、primaryEnv 省略等：以 manifest.yaml 为准。本篇 How to Use 起为 API 步骤说明。

When to Use This Skill

Use this skill when the user needs to generate audio from text.

Chanjing TTS supports:

both Chinese and English
multiple system voices
adjustment of speech speed
sentence-level timestamp in result

How to Use This Skill

Security & credentials（引用）

详见 manifest.yaml 中 credentials 与 clientPermissions（及合规顶层 permissions）。

Multiple APIs need to be invoked. All share the domain: "https://open-api.chanjing.cc". All requests communicate using json. You should use utf-8 to encode and decode text throughout this task.

Obtain an access_token, which is required for all subsequent API calls
List all voice IDs and select one to use
Call the Create Speech API, record task_id
Poll the Query Speech Status API until success, then download generated audio file using the url in response

Obtain AccessToken

从 ~/.chanjing/credentials.json 读取 app_id 和 secret_key，若无有效 Token 则调用：

POST /open/v1/access_token
Content-Type: application/json

请求体（使用本地配置的 app_id、secret_key）：

{
  "app_id": "<从 credentials.json 读取>",
  "secret_key": "<从 credentials.json 读取>"
}

Response example:

{
  "trace_id": "8ff3fcd57b33566048ef28568c6cee96",
  "code": 0,
  "msg": "success",
  "data": {
    "access_token": "1208CuZcV1Vlzj8MxqbO0kd1Wcl4yxwoHl6pYIzvAGoP3DpwmCCa73zmgR5NCrNu",
    "expire_in": 1721289220
  }
}

Response field description:

Response Status Code Description

| Code | Description | |---|---| | 0 | Success | | 400 | Invalid parameter format | | 40000 | Parameter error | | 50000 | System internal error |

Select a Voice ID

Obtain all available voice IDs via API, and select one that fits the task at hand. The dialect/accent can be deduced from the voice name.

GET /open/v1/list_common_audio
access_token: {{access_token}}

Use the following request body:

{
  "page": 1,
  "size": 100
}

Response example:

```json
{
  "trace_id": "25eb6794ffdaaf3672c25ed9efbe49c6",
  "code": 0,
  "msg": "success",
  "data": {
    "list": [
      {
        "id": "f9248f3b1b42447fb9282829321cfcf2",
        "grade": 0,
        "name": "带货小芸",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-06-05/7945e0474b8cb526e884ee7e28e4af8d.wav"
      },
      {
        "id": "f5e69c1bbe414bec860da3294e177625",
        "grade": 0,
        "name": "方言口音老奶奶",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-04-30/1b248ad05953028db5a6bcba9a951164.wav"
      },
      ...
    ],
    "page_info": {
      "page": 1,
      "size": 100,
      "total_count": 98,
      "total_page": 1
    }
  }
}

Response field description:

Response status code description:

Create Speech API

Submit a speech creating task, which returns a task ID for polling later.

POST /open/v1/create_audio_task
access_token: {{access_token}}
Content-Type: application/json

Request body example:

{
  "audio_man": "89843d52ccd04e2d854decd28d6143ce ",
  "speed": 1,
  "pitch": 1,
  "text": {
    "text": "Hello, I am your AI assistant."
  }
}

Request field description:

Response example:

{
  "trace_id": "dd09f123a25b43cf2119a2449daea6de",
  "code": 0,
  "msg": "success",
  "data": {
    "task_id": "88f635dd9b8e4a898abb9d4679e0edc8"
  }
}

Response field description:

| Field | Description | |---|---| | code | Response status code | | msg | Response message | | task_id | Task ID, to be used in subsequent polling step |

Response status code description:

Poll Query Speech Status API

Poll the following API until speech is generated.

POST /open/v1/audio_task_state
access_token: {{access_token}}
Content-Type: application/json

Request example:

{
  "task_id": "88f635dd9b8e4a898abb9d4679e0edc8"
}

Request field description:

| Parameter Name | Type | Required | Example | Description | |---|---|---|---|---| | task_id | string | Yes | 88f789dd9b8e4a121abb9d4679e0edc8 | Speech synthesis task ID |

Response example:

{
  "trace_id": "ab18b14574bbcc31df864099d474080e",
  "code": 0,
  "msg": "success",
  "data": {
    "id": "9546a0fb1f0a4ae3b5c7489b77e4a94d",
    "type": "tts",
    "status": 9,
    "text": [
      "猫在跌落时能够在空中调整身体，通常能够四脚着地，这种”猫右自己“反射显示了它们惊人的身体协调能力和灵活性。核磁共振成像技术通过利用人体细胞中氢原子的磁性来生成详细的内部图像，为医学诊断提供了重要工具。"
    ],
    "full": {
      "url": "https://cy-cds-test-innovation.cds8.cn/chanjing/res/upload/tts/2025-04-08/093a59021d85a72d28a491f21820ece4.wav",
      "path": "093a59013d85a72d28a491f21820ece4.wav",
      "duration": 18.81
    },
    "slice": null,
    "errMsg": "",
    "errReason": "",
    "subtitles": [
      {
        "key": "20c53ff8cce9831a8d9c347263a400a54d72be15",
        "start_time": 0,
        "end_time": 2.77,
        "subtitle": "猫在跌落时能够在空中调整身体"
      },
      {
        "key": "e19f481b6cd2219225fa4ff67836448e054b2271",
        "start_time": 2.77,
        "end_time": 4.49,
        "subtitle": "通常能够四脚着地"
      },
      {
        "key": "140beae4046bd7a99fbe4706295c19aedfeeb843",
        "start_time": 4.49,
        "end_time": 5.73,
        "subtitle": "这种，猫右自己"
      },
      {
        "key": "e851881271876ab5a90f4be754fde2dc6b5498fd",
        "start_time": 5.73,
        "end_time": 7.97,
        "subtitle": "反射显示了它们惊人的身体"
      },
      {
        "key": "fbb0b4138bad189b9fc02669fe1f95116e9991b4",
        "start_time": 7.97,
        "end_time": 9.45,
        "subtitle": "协调能力和灵活性"
      },
      {
        "key": "f73404d135feaf84dd8fbea13af32eac847ac26d",
        "start_time": 9.45,
        "end_time": 12.49,
        "subtitle": "核磁共振成像技术通过利用人体"
      },
      {
        "key": "e18827931223962e477b14b2b8046947039ac222",
        "start_time": 12.49,
        "end_time": 14.77,
        "subtitle": "细胞中氢原子的磁性来生成"
      },
      {
        "key": "d137bf2b0c8b7a39e3f6753b7cf5d92bd877d2d9",
        "start_time": 14.77,
        "end_time": 15.97,
        "subtitle": "详细的内部图像"
      },
      {
        "key": "0773911ae0dbaa763a64352abdb6bdac3ff8f149",
        "start_time": 15.97,
        "end_time": 18.41,
        "subtitle": "为医学诊断提供了重要工具"
      }
    ]
  }
}

Response field description:

Response status code description:

| code | Description | |---|---| | 0 | Response successful | | 10400 | AccessToken verification failed | | 40000 | Parameter error | | 50000 | System internal error |

Scripts

示例（在项目根或 skill 目录下执行）：

# 1. 列出可用声音，选取一个 id
python skills/chanjing-tts/scripts/list_voices.py

# 2. 创建合成任务
TASK_ID=$(python skills/chanjing-tts/scripts/create_task.py \
  --audio-man "f9248f3b1b42447fb9282829321cfcf2" \
  --text "Hello, I am your AI assistant.")

# 3. 轮询到完成，得到音频下载链接
python skills/chanjing-tts/scripts/poll_task.py --task-id "$TASK_ID"

Adoption

chanjing-ai/chanjing-tts

$ install --global

Security Scan Results

SKILL.md

Chanjing TTS

功能说明

运行依赖

环境变量与机器可读声明

使用命令

登记与审稿（单一事实来源）

When to Use This Skill

How to Use This Skill

Security & credentials（引用）

Obtain AccessToken

Select a Voice ID

Create Speech API

Poll Query Speech Status API

Scripts

Related Skills

chanjing-ai/chanjing-video-compose