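Model invocation code examples

This document provides code examples for calling large-model services through the AI acceleration gateway. It covers two invocation methods: the OpenAI-compatible protocol and protocol passthrough.

Overview of invocation methods

The AI acceleration gateway supports two API invocation methods:

  • OpenAI-compatible protocol: The gateway converts each model provider's requests and responses into the OpenAI format. Whichever provider sits behind the gateway, you can call it with the standard OpenAI SDK and protocol.
  • Protocol passthrough: The gateway forwards each model provider's requests and responses unchanged (request headers, request body, and response body) and performs no protocol conversion. This method supports request acceleration only; other capabilities such as model routing, semantic caching, and rate limiting are not available.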

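Prerequisites

Create an AI acceleration gateway instance, then obtain the information required for invocation on the Instance details page.

BaseUrl composition

An instance's BaseUrl depends on the invocation method. The Instance details > Basic information page in the console assembles the complete BaseUrl for the method you select (switch between OpenAI-compatible protocol and Protocol passthrough in the Request method area), so you can copy it directly instead of building it by hand: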

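Invocation method | BaseUrl form | Description
--- | --- | ---
OpenAI-compatible protocol | https://{acceleration domain}/v1/{instance ID} | Consists of the acceleration domain and the instance ID.
Protocol passthrough | https://{acceleration domain}/v1/{instance ID}/{provider ID} | Appends a provider identifier (such as zhipu or anthropic) after the acceleration domain and instance ID, which routes the request to the specified model provider.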
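The two BaseUrl forms in the table can be sketched as a small helper. This is only an illustration of the URL shape: the function name `build_base_url` and the domain and instance ID values are hypothetical, and in practice the console already assembles the BaseUrl for you.

```python
from typing import Optional


def build_base_url(domain: str, instance_id: str, provider_id: Optional[str] = None) -> str:
    """Assemble a BaseUrl following the two forms listed in the table above."""
    base = f"https://{domain}/v1/{instance_id}"
    # Protocol passthrough appends the provider ID; the OpenAI-compatible form does not.
    if provider_id:
        base = f"{base}/{provider_id}"
    return base


# Hypothetical values, for illustration only.
openai_compatible = build_base_url("gateway.example.com", "inst-123")
passthrough = build_base_url("gateway.example.com", "inst-123", "anthropic")
print(openai_compatible)
print(passthrough)
```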

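Information required for each invocation method

  • OpenAI-compatible protocol: Obtain the BaseUrl and the gateway API Key.

    Note

    The OpenAI-compatible protocol requires that a model API Key was supplied when the instance was created. If none was supplied, use protocol passthrough instead.

  • Protocol passthrough: Obtain the BaseUrl for the target provider, and authenticate with the model provider's own API Key.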

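OpenAI-compatible protocol

With the OpenAI-compatible protocol, the gateway converts each model provider's raw response into the OpenAI format before returning it.
In the examples below, replace the variables with your actual values:

  • $BASE_URL: The instance's BaseUrl, in the form https://{acceleration domain}/v1/{instance ID}. You can copy it directly from the Instance details > Basic information page.
  • $AI_GATEWAY_API_KEY: The gateway API Key.
  • $MODEL_NAME: The model name you configured in the gateway.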

Text generation (Text)

Curl

curl $BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Python

# pip install openai
# https://platform.openai.com/docs/api-reference
from openai import OpenAI
client = OpenAI(
    base_url="$BASE_URL",
    api_key="$AI_GATEWAY_API_KEY",
)


completion = client.chat.completions.create(
  model="$MODEL_NAME",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
)

print(completion.choices[0].message)
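If you prefer not to depend on the OpenAI SDK, the same request can be put together with the Python standard library. The sketch below only builds the request object and does not send it; the helper name `build_chat_request` and the placeholder BaseUrl, key, and model name are assumptions made for illustration.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to {base_url}/chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical placeholder values; substitute your own BaseUrl, key, and model name.
request = build_chat_request(
    "https://gateway.example.com/v1/inst-123", "sk-placeholder", "my-model", "Say this is a test!"
)
print(request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request)` and decode the JSON response body.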

Image generation (Image)

Curl

curl $BASE_URL/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "prompt": "A cute baby sea otter"
   }'

Python

# pip install openai
# https://platform.openai.com/docs/api-reference
from openai import OpenAI
client = OpenAI(
    base_url="$BASE_URL",
    api_key="$AI_GATEWAY_API_KEY",
)

images = client.images.generate(
    model="$MODEL_NAME",
    prompt="A cute baby sea otter",  # required; same prompt as the curl example
    size="512x512",
    response_format="url",
)

print(images.data[0].url)

Speech synthesis (Speech/TTS)

Python

# prerequisites
# pip install websockets==12.0 numpy soundfile scipy
import asyncio
import base64
import json
import numpy as np
from scipy.signal import resample
import websockets


def resample_audio(audio_data, original_sample_rate, target_sample_rate):
    number_of_samples = round(len(audio_data) * float(target_sample_rate) / original_sample_rate)
    resampled_audio = resample(audio_data, number_of_samples)
    return resampled_audio.astype(np.int16)


async def send_text(client, text: str):
    for t in text:
        await asyncio.sleep(0.05)
        event = {
            "type": "input_text.append",
            "delta": t
        }
        await client.send(json.dumps(event))
    event = {
        "type": "input_text.done"
    }
    await client.send(json.dumps(event))


# Helper that writes audio data to a stream
def write_audio_data(stream, data):
    stream.write(data)


async def receive_messages(client, file_path="response_audio.pcm"):
    audio_list = bytearray()
    while not client.closed:
        message = await client.recv()
        if message is None:
            print("===None Message===")
            continue
        event = json.loads(message)
        message_type = event.get("type")
        if message_type == "response.audio.delta":
            audio_bytes = base64.b64decode(event["delta"])
            audio_list.extend(audio_bytes)
            del event['delta']
            print(event)
            continue

        print(event)

        with open(file_path, 'wb') as ff:
            ff.write(audio_list)

        if message_type == "response.audio.done":
            break

        continue


def get_session_update_msg():
    config = {
        "voice": "your_voice",
        "output_audio_format": "pcm",
        "output_audio_sample_rate": 24000,  # your_sample_rate
    }
    event = {
        "type": "tts_session.update",
        "session": config
    }
    return json.dumps(event)


async def with_openai():
    key = "$AI_GATEWAY_API_KEY"
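    # Substitute $BASE_URL with the BaseUrl minus its https:// prefix; the wss:// scheme is already supplied here.
    ws_url = "wss://$BASE_URL/realtime?intent=text-to-speech&model=$MODEL_NAME"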

    headers = {
        "Authorization": f"Bearer {key}",
    }
    async with websockets.connect(ws_url, ping_interval=None, extra_headers=headers) as client:
        session_msg = get_session_update_msg()
        await client.send(session_msg)
        await asyncio.gather(send_text(client, "你好呀"), receive_messages(client))


if __name__ == "__main__":
    asyncio.run(with_openai())
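The script above saves headerless PCM to response_audio.pcm, which most players cannot open directly. A minimal sketch for wrapping that file in a WAV container with the standard library follows. The 24000 Hz rate comes from the session config above; the 16-bit mono layout is an assumption, so adjust it if your voice's output differs. The helper name `pcm_to_wav` is hypothetical.

```python
import wave


def pcm_to_wav(pcm_path: str, wav_path: str, sample_rate: int = 24000) -> int:
    """Wrap raw PCM bytes in a WAV container and return the number of frames written.

    Assumes 16-bit mono samples (an assumption, not stated by the gateway docs).
    """
    with open(pcm_path, "rb") as src:
        pcm_bytes = src.read()
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(1)      # mono (assumed)
        dst.setsampwidth(2)      # 2 bytes per sample = 16-bit (assumed)
        dst.setframerate(sample_rate)
        dst.writeframes(pcm_bytes)
    return len(pcm_bytes) // 2
```

After the TTS script finishes, calling `pcm_to_wav("response_audio.pcm", "response_audio.wav")` would produce a file that ordinary audio players can open.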

Speech recognition (Audio/ASR)

Python

# prerequisites
# pip install websockets==12.0 numpy soundfile scipy
import asyncio
import base64
import json
import numpy as np
import soundfile as sf
from scipy.signal import resample
import websockets

SAMPLE_RATE = 16000  # your_sample_rate


def resample_audio(audio_data, original_sample_rate, target_sample_rate):
    number_of_samples = round(len(audio_data) * float(target_sample_rate) / original_sample_rate)
    resampled_audio = resample(audio_data, number_of_samples)
    return resampled_audio.astype(np.int16)


async def send_audio(client, audio_file_path: str):
    duration_ms = 100
    samples_per_chunk = SAMPLE_RATE * (duration_ms / 1000)
    bytes_per_sample = 2
    bytes_per_chunk = int(samples_per_chunk * bytes_per_sample)

    extra_params = {}
    if audio_file_path.endswith(".raw"):
        extra_params = {
            "samplerate": SAMPLE_RATE,
            "channels": 1,
            "subtype": "PCM_16",
        }

    audio_data, original_sample_rate = sf.read(audio_file_path, dtype="int16", **extra_params)

    if original_sample_rate != SAMPLE_RATE:
        audio_data = resample_audio(audio_data, original_sample_rate, SAMPLE_RATE)

    audio_bytes = audio_data.tobytes()
    for i in range(0, len(audio_bytes), bytes_per_chunk):
        await asyncio.sleep((duration_ms - 20) / 1000)
        chunk = audio_bytes[i: i + bytes_per_chunk]
        base64_audio = base64.b64encode(chunk).decode("utf-8")
        append_event = {
            "type": "input_audio_buffer.append",
            "audio": base64_audio
        }
        await client.send(json.dumps(append_event))
    print("send complete")

    commit_event = {
        "type": "input_audio_buffer.commit"
    }
    await client.send(json.dumps(commit_event))


async def receive_messages(client):
    while not client.closed:
        message = await client.recv()
        print(message)
        event = json.loads(message)
        if event.get("type") == "conversation.item.input_audio_transcription.completed":
            return


def get_session_update_msg():
    config = {
        "input_audio_format": "pcm",
        "input_audio_sample_rate": SAMPLE_RATE,
        "input_audio_bits": 16,
        "input_audio_channel": 1,
    }
    event = {
        "type": "transcription_session.update",
        "session": config
    }
    return json.dumps(event)


async def with_openai(audio_file_path: str):
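    # Substitute $BASE_URL with the BaseUrl minus its https:// prefix; the wss:// scheme is already supplied here.
    ws_url = "wss://$BASE_URL/realtime?intent=transcription&model=$MODEL_NAME"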
    key = "$AI_GATEWAY_API_KEY"

    headers = {
        "Authorization": f"Bearer {key}",
    }

    async with websockets.connect(ws_url, ping_interval=None, extra_headers=headers) as client:
        session_msg = get_session_update_msg()
        await client.send(session_msg)
        await asyncio.gather(send_audio(client, audio_file_path), receive_messages(client))


if __name__ == "__main__":
    file_path = "recording.mp3"  # your_audio_file
    asyncio.run(with_openai(file_path))
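Both realtime scripts connect to a wss:// address, while the BaseUrl copied from the console starts with https://. The sketch below derives the WebSocket URL from an https BaseUrl so the prefix does not have to be edited by hand. The helper name `build_realtime_url` and the sample values are hypothetical; the /realtime path and the intent and model query parameters are the ones used in the scripts above.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_realtime_url(base_url: str, intent: str, model: str) -> str:
    """Turn an https BaseUrl into the wss realtime URL used by the scripts above."""
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/") + "/realtime"
    query = urlencode({"intent": intent, "model": model})
    # Swap the scheme to wss and keep the host and path unchanged.
    return urlunsplit(("wss", parts.netloc, path, query, ""))


# Hypothetical BaseUrl and model name, for illustration only.
asr_url = build_realtime_url("https://gateway.example.com/v1/inst-123", "transcription", "my-asr-model")
print(asr_url)
```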

Embedding models (Embedding)

Curl

curl $BASE_URL/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "input": "The food was delicious and the waiter...",
     "encoding_format": "float"
   }'

Python

# pip install openai
# https://platform.openai.com/docs/api-reference
from openai import OpenAI
client = OpenAI(
    base_url="$BASE_URL",
    api_key="$AI_GATEWAY_API_KEY",
)


client.embeddings.create(
  model="$MODEL_NAME",
  input="The food was delicious and the waiter...",
  encoding_format="float"
)
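The Python example above discards the return value. As a rough sketch of how the vectors could be read from an OpenAI-format embeddings response, the helper below works on the decoded JSON body; `extract_embeddings` is a hypothetical name and the sample payload is hand-written with made-up values.

```python
from typing import Any, Dict, List


def extract_embeddings(response: Dict[str, Any]) -> List[List[float]]:
    """Return the embedding vectors from an OpenAI-format response, ordered by index."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# A hand-written sample in the OpenAI embeddings response shape (values are made up).
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "my-embedding-model",
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}
vectors = extract_embeddings(sample)
print(len(vectors), vectors[0])
```

With the OpenAI SDK, the equivalent access on the returned object is `response.data[0].embedding`.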

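Protocol passthrough

With protocol passthrough, the gateway forwards each model provider's own interface protocol unchanged (request headers and request body) and performs no conversion or compatibility handling. Only for requests on specific paths (such as /chat/completions and /messages) does the gateway try to parse the usage field in the response body, in order to meter tokens.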
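Because passthrough responses keep each provider's native shape, client code that tracks token consumption has to cope with more than one usage layout. The following is a client-side sketch only, not a description of how the gateway itself meters tokens. It assumes the two layouts documented by OpenAI (prompt_tokens / completion_tokens) and Anthropic (input_tokens / output_tokens); the helper name `extract_usage` is hypothetical.

```python
from typing import Any, Dict, Optional


def extract_usage(body: Dict[str, Any]) -> Optional[Dict[str, int]]:
    """Normalize a response's usage field into input/output/total token counts.

    Returns None when the body has no usage object.
    """
    usage = body.get("usage")
    if not isinstance(usage, dict):
        return None
    # OpenAI-style responses use prompt_tokens/completion_tokens;
    # Anthropic Messages responses use input_tokens/output_tokens.
    input_tokens = usage.get("prompt_tokens", usage.get("input_tokens", 0))
    output_tokens = usage.get("completion_tokens", usage.get("output_tokens", 0))
    total = usage.get("total_tokens", input_tokens + output_tokens)
    return {"input": input_tokens, "output": output_tokens, "total": total}


# Hand-written sample bodies with made-up numbers.
openai_style = {"usage": {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17}}
anthropic_style = {"usage": {"input_tokens": 10, "output_tokens": 4}}
print(extract_usage(openai_style))
print(extract_usage(anthropic_style))
```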
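The main differences from the OpenAI-compatible protocol are as follows:

Comparison item | OpenAI-compatible protocol | Protocol passthrough
--- | --- | ---
Protocol conversion | The gateway converts everything to the OpenAI format | The provider's protocol is passed through unchanged, with no conversion
Authentication | The gateway-generated $AI_GATEWAY_API_KEY | The model provider's own key
Request/response body | Unified OpenAI format | Identical to the model provider's interface
Supported gateway capabilities | Request acceleration, model routing (load balancing / primary-backup failover), semantic caching, rate limiting, and so on | Request acceleration only
Typical use case | You want one unified protocol for calling multiple model providers | You want to keep the provider's native interface behavior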

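Call path explanation

With protocol passthrough, the BaseUrl automatically includes the /{provider ID} suffix, and the call path consists of the following two parts:

{BaseUrl}/{provider request path}

Component | Description | Example
--- | --- | ---
BaseUrl | The instance's BaseUrl, which in protocol passthrough mode already includes the /{provider ID} suffix (such as /tencent, /ali, or /bytedance). Copy it from the Instance details > Basic information page after selecting the target provider. | https://{acceleration domain}/v1/{instance ID}/{provider ID}
/{provider request path} | The provider's original API path. | /v1/chat/completions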

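In short, to use protocol passthrough you only replace the third-party model provider's domain with the instance's BaseUrl. The rest of the request path, the request headers, and the request body stay identical to the provider's interface.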
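The substitution described above can be sketched as a one-step URL rewrite: drop the provider's scheme and host, keep its path and query, and prepend the instance BaseUrl. The helper name `to_gateway_url` and the sample BaseUrl are hypothetical.

```python
from urllib.parse import urlsplit


def to_gateway_url(provider_url: str, base_url: str) -> str:
    """Replace the scheme and host of a provider URL with the instance BaseUrl."""
    parts = urlsplit(provider_url)
    rewritten = base_url.rstrip("/") + parts.path
    if parts.query:
        rewritten += "?" + parts.query
    return rewritten


# Hypothetical BaseUrl; copy the real one from the console.
base_url = "https://gateway.example.com/v1/inst-123/anthropic"
gateway_url = to_gateway_url("https://api.anthropic.com/v1/messages", base_url)
print(gateway_url)
```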

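Call paths and examples by provider

The tables below map each provider's original path to its gateway call path under protocol passthrough. Variables used in the examples:

  • $BASE_URL: The instance's BaseUrl for that provider, in the form https://{acceleration domain}/v1/{instance ID}/{provider ID}. On the console's Instance details page, switch the invocation method to Protocol passthrough and select the provider, then copy it directly.
  • $KEY: The model provider's own API Key (not the gateway API Key).
  • $MODEL_NAME: The model provider's model name.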

Tencent Hunyuan

Provider's original path | Gateway call path
--- | ---
https://api.hunyuan.cloud.tencent.com/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Alibaba Cloud Bailian

Provider's original path | Gateway call path
--- | ---
https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions | $BASE_URL/compatible-mode/v1/chat/completions

curl $BASE_URL/compatible-mode/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Baidu Qianfan

Provider's original path | Gateway call path
--- | ---
https://qianfan.baidubce.com/v2/chat/completions | $BASE_URL/v2/chat/completions

curl $BASE_URL/v2/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Zhipu AI

Provider's original path | Gateway call path
--- | ---
https://open.bigmodel.cn/api/paas/v4/chat/completions | $BASE_URL/api/paas/v4/chat/completions

curl $BASE_URL/api/paas/v4/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

MiniMax

Provider's original path | Gateway call path
--- | ---
https://api.minimaxi.com/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

01.AI (Lingyiwanwu)

Provider's original path | Gateway call path
--- | ---
https://api.lingyiwanwu.com/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

DeepSeek

Provider's original path | Gateway call path
--- | ---
https://api.deepseek.com/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Kimi (Moonshot)

Provider's original path | Gateway call path
--- | ---
https://api.moonshot.cn/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

iFLYTEK Xingchen

Provider's original path | Gateway call path
--- | ---
https://maas-api.cn-huabei-1.xf-yun.com/v2/chat/completions | $BASE_URL/v2/chat/completions

curl $BASE_URL/v2/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

SiliconFlow

Provider's original path | Gateway call path
--- | ---
https://api.siliconflow.cn/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

ByteDance Volcano Ark

In protocol passthrough mode, Volcano Ark supports several interfaces, including but not limited to the Chat API, the Responses API, and WebSocket.

Provider's original path | Gateway call path
--- | ---
https://ark.cn-beijing.volces.com/api/v3/chat/completions | $BASE_URL/api/v3/chat/completions
https://ark.cn-beijing.volces.com/api/v3/responses | $BASE_URL/api/v3/responses
wss://openspeech.bytedance.com/api/v3/sauc/bigmodel | wss://{BaseUrl without the protocol prefix}/api/v3/sauc/bigmodel

Chat API example

curl $BASE_URL/api/v3/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Responses API example

curl $BASE_URL/api/v3/responses \
--header 'Authorization: Bearer $KEY' \
--header 'Content-Type: application/json' \
--data '{
    "model": "doubao-seed-1-6-250615",
    "input": "你好呀。",
    "stream":true
}'
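The request above sets "stream": true, so the response arrives incrementally rather than as one JSON document. Assuming the stream uses Server-Sent Events, as OpenAI-style streaming interfaces do, a client could read the `data:` lines roughly as sketched below. The helper name `iter_sse_json` is hypothetical, and the sample lines are hand-written; the real event names and fields depend on the provider.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON payload of each `data:` line in a Server-Sent Events stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and `event:` lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # terminator used by some OpenAI-style streams
        yield json.loads(payload)


# Hand-written sample lines; real event names and fields depend on the provider.
sample_stream = [
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "Hel"}',
    "",
    'data: {"type": "response.output_text.delta", "delta": "lo"}',
    "data: [DONE]",
]
text = "".join(event["delta"] for event in iter_sse_json(sample_stream))
print(text)
```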

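Speech recognition (WebSocket)
See the large-model streaming speech recognition API documentation, and replace the WebSocket connection address with wss://{BaseUrl without the https:// prefix}/api/v3/sauc/bigmodel.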

Anthropic

Anthropic uses its own Messages API protocol.

Provider's original path | Gateway call path
--- | ---
https://api.anthropic.com/v1/messages | $BASE_URL/v1/messages

Note

Anthropic authenticates with the x-api-key header and also requires the anthropic-version header.

curl $BASE_URL/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
     "model": "$MODEL_NAME",
     "max_tokens": 1024,
     "messages": [{"role": "user", "content": "Say this is a test!"}]
   }'
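The same call can be sketched in Python with the standard library, which makes the Anthropic-specific headers explicit. The block only builds the request and does not send it; the helper name `build_messages_request` and the placeholder values are assumptions for illustration.

```python
import json
import urllib.request


def build_messages_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to {base_url}/v1/messages."""
    body = {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,               # the provider's own key, not the gateway key
            "anthropic-version": "2023-06-01",  # same version as the curl example
        },
        method="POST",
    )


# Hypothetical placeholder values; substitute your own BaseUrl, key, and model name.
request = build_messages_request(
    "https://gateway.example.com/v1/inst-123/anthropic", "provider-key", "my-model", "Say this is a test!"
)
print(request.full_url)
```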

OpenAI

OpenAI uses the standard Chat Completions API protocol.

Provider's original path | Gateway call path
--- | ---
https://api.openai.com/v1/chat/completions | $BASE_URL/v1/chat/completions

curl $BASE_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{
     "model": "$MODEL_NAME",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

Google

Taking Google Gemini's native generateContent interface as an example:

Provider's original path | Gateway call path
--- | ---
https://generativelanguage.googleapis.com/v1beta/models/$MODEL_NAME:generateContent | $BASE_URL/v1beta/models/$MODEL_NAME:generateContent

curl $BASE_URL/v1beta/models/$MODEL_NAME:generateContent \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $KEY" \
  -d '{
     "contents": [
       {
         "parts": [
           {"text": "Say this is a test!"}
         ]
       }
     ]
   }'
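Since passthrough leaves the response in Gemini's native format, the generated text is not under choices[0].message as in the OpenAI examples. A minimal sketch for reading it follows, assuming the response has the generateContent shape documented by Google (candidates, then content, then parts, then text). The helper name `extract_text` is hypothetical and the sample body is hand-written.

```python
from typing import Any, Dict


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text parts of the first candidate in a generateContent response."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


# Hand-written sample in the generateContent response shape.
sample = {
    "candidates": [
        {"content": {"role": "model", "parts": [{"text": "This is "}, {"text": "a test!"}]}}
    ]
}
print(extract_text(sample))
```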
Last updated: 2026.06.11 21:20:01