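AI Voice Platform: Configuration Reference

Overview

This page collects the configuration files and environment variables for every platform component in one place.

Component	Configuration source	Description
callflow-esl	`apps/callflow-esl/config.json`	ESL + HTTP + audioFork + tts + db
sherpa-asr-online-server	`onnx-platform/sherpa-asr-online-server/config.json`	WS + ASR + VAD
sherpa-tts-server	`onnx-platform/sherpa-tts-server/config.json`	HTTP + TTS pool (Kokoro by default)
emotion-analysis-server	`onnx-platform/emotion-analysis-server/config.json`	HTTP + text/audio emotion model directories
callout-server	`apps/callout-server/config.json` + `CALLOUT_*` environment variables	server + auth + scheduler + policy + external API + freeswitch + redis + db (+ ringback detection / emotion analysis)
callflow-server	Environment variables (no config.json)	Admin console API + database connections
callflow-webpage / callout-webpage	`VITE_*` environment variables	API base URL the frontend uses to reach the backend
mod_audio_fork	Environment variables + channel variables	Runtime
FreeSWITCH	`conf/dialplan/*.xml` and related files	Signaling + routing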

callflow-esl `config.json`

Full structure:

{
  "esl-server": {
    "host": "0.0.0.0",
    "port": 9911,
    "commandTimeoutMs": 15000,
    "asrTimeoutMs": 30000,
    "settleDelayMs": 1000
  },
  "logging": {
    "dir": "./logs",
    "level": "info",
    "maxSize": "20m",
    "maxFiles": "14d"
  },
  "http-server": {
    "host": "0.0.0.0",
    "port": 9912
  },
  "freeswitch": {
    "host": "192.168.2.184",
    "port": 8021,
    "password": "ClueCon",
    "callbackHost": "192.168.2.246",
    "callbackPort": 9911,
    "originateDialStringTemplate": "user/{{destinationNumber}}"
  },
  "audioFork": {
    "wsUrl": "ws://192.168.2.246:10096/audio",
    "bugName": "callflow_asr",
    "mixType": "mono",
    "sampleRate": "16k",
    "bidirectionalAudioEnabled": false,
    "bidirectionalAudioStreamEnabled": false,
    "bidirectionalAudioStreamSampleRate": 16000,
    "connectTimeoutMs": 5000,
    "eventSubscriptionMode": "all"
  },
  "tts": {
    "sherpaHttpEndpoint": "http://127.0.0.1:9080/tts",
    "streamingEnabled": true,
    "sherpaStreamEndpoint": "http://127.0.0.1:9080/tts-stream",
    "streamPlaybackPrefix": "shout://",
    "speakerId": 0,
    "speed": 1,
    "requestTimeoutMs": 10000,
    "playbackTarget": "wav-url",
    "fsPlaybackBaseDir": ""
  },
  "recording": {
    "directory": "/usr/local/freeswitch/record",
    "publicBaseUrl": ""
  },
  "db": {
    "url": "postgres://postgres:postgres@localhost:5432/freeswitch"
  }
}

`esl-server`

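Field	Description
`host` / `port`	Listen address and port for ESL outbound connections
`commandTimeoutMs`	Timeout for waiting on the reply to an ESL command
`asrTimeoutMs`	Default timeout for a single `hear()` turn (callers can override it through the call arguments)
`settleDelayMs`	Recommended wait after a manual `answer`, giving signaling time to settle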

`http-server`

Field	Description
`host` / `port`	HTTP listen address and port; serves `/outbound-calls`

`freeswitch`

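Field	Description
`host` / `port` / `password`	Inbound ESL connection (used for originate)
`callbackHost` / `callbackPort`	Address FreeSWITCH uses to connect back to this ESL service once an outbound call is answered; must be reachable from the FS process
`originateDialStringTemplate`	Default dial string template for all outbound call scenarios; must contain `{{destinationNumber}}`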
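
As an illustration of the `{{destinationNumber}}` requirement, the substitution could look roughly like the sketch below. `renderDialString` is a hypothetical helper written for this page, not a function from the callflow-esl source, and the real implementation may validate or escape the number differently.

```typescript
// Hypothetical helper (not from the callflow-esl source): renders a dial
// string template such as "user/{{destinationNumber}}".
function renderDialString(template: string, destinationNumber: string): string {
  const placeholder = "{{destinationNumber}}";
  if (!template.includes(placeholder)) {
    // Mirrors the documented rule that the template must contain the placeholder.
    throw new Error(`dial string template must contain ${placeholder}`);
  }
  // split/join replaces every occurrence without regex escaping concerns.
  return template.split(placeholder).join(destinationNumber);
}

const dialString = renderDialString("user/{{destinationNumber}}", "1001");
console.log(dialString); // user/1001
```

Rejecting a template without the placeholder at load time, rather than at call time, would surface a misconfiguration before the first outbound call is attempted.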

`audioFork`

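Field	Description
`wsUrl`	ASR WebSocket URL that mod_audio_fork connects to
`bugName`	Media bug name, used to target `uuid_audio_fork stop` precisely
`mixType` / `sampleRate`	Upstream audio channel layout and sample rate; `mono`/`16k` suits most ASR engines
`bidirectionalAudioEnabled` / `bidirectionalAudioStreamEnabled`	Downstream audio channel; keep both off in hear-only mode
`bidirectionalAudioStreamSampleRate`	Sample rate of the downstream audio stream
`connectTimeoutMs`	Timeout for waiting on the WebSocket connect-success event
`eventSubscriptionMode`	ESL event subscription strategy: `"all"` (recommended) / `"custom-subclass"` / `"channel-plus-custom"`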

`tts`

Field	Description
`sherpaHttpEndpoint`	URL of sherpa-tts-server's `POST /tts`
`streamingEnabled`	Whether streaming TTS is enabled; when `true`, `ctx.speak({kind:"tts"})` first calls `sherpaStreamEndpoint` to obtain a streaming link, which FreeSWITCH then plays while it downloads
`sherpaStreamEndpoint`	URL of sherpa-tts-server's `POST /tts-stream`
`streamPlaybackPrefix`	Prefix prepended to the streaming link before it is handed to FreeSWITCH `playback`; usually `"shout://"` when `mod_shout` is enabled
`speakerId` / `speed`	Defaults used when `ctx.speak({kind:"tts"})` does not specify them
`requestTimeoutMs`	HTTP request timeout
`playbackTarget`	`"wav-url"` (default) or `"file-path"`
`fsPlaybackBaseDir`	Directory prefix used when `playbackTarget=file-path`

`recording`

Field	Description
`directory`	Recording directory writable by the FreeSWITCH process
`publicBaseUrl`	Optional public URL prefix; when empty, `ctx.startRecording()` returns `fileUrl=null`
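
The `publicBaseUrl` rule can be pictured with a small sketch. `buildRecordingFileUrl` and the file name below are hypothetical and only illustrate the documented behavior (an empty prefix yields `null`); how callflow-esl actually joins the prefix and the file name is an assumption here.

```typescript
// Hypothetical helper (not from the callflow-esl source): derives the public
// URL of a recording from recording.publicBaseUrl.
function buildRecordingFileUrl(publicBaseUrl: string, fileName: string): string | null {
  // Documented behavior: an empty publicBaseUrl means no public URL is returned.
  if (publicBaseUrl.trim() === "") {
    return null;
  }
  // Assumption: the URL is the prefix plus the file name, joined with one slash.
  const base = publicBaseUrl.replace(/\/+$/, "");
  return `${base}/${encodeURIComponent(fileName)}`;
}

console.log(buildRecordingFileUrl("", "call-1.wav")); // null
console.log(buildRecordingFileUrl("https://cdn.example.com/rec/", "call-1.wav"));
// https://cdn.example.com/rec/call-1.wav
```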

`db`

PostgreSQL connection settings. The current Drizzle schema uses the callflow schema:
callflow.call_businesses, callflow.call_business_number_mappings, and
callflow.callflow_runtime_configs, defined in apps/callflow-esl/src/db/schema.ts.

sherpa-asr-online-server `config.json`

Service and I/O

Field	Default	Description
`listenHost`	`127.0.0.1`	Listen address
`listenPort`	`10096`	Listen port
`healthPath`	`/health`	HTTP health check path
`wsPath`	`/audio`	WebSocket upgrade path
`wsSubprotocol`	`audio.drachtio.org`	Subprotocol; not enforced when empty
`maxSessions`	`16`	Maximum number of concurrent sessions
`ioWorkers`	Number of CPU cores	Number of I/O worker threads
`sessionPoolSize`	`16`	Number of pre-warmed ASR sessions
`maxFrameBytes`	`4194304`	Maximum bytes per frame
`readBufferBytes`	`65536`	Per-connection recv buffer
`writeBufferBytes`	`65536`	Per-connection send buffer (reserved)
`socketRecvBufferBytes`	`262144`	`SO_RCVBUF`; 0 leaves it unchanged
`socketSendBufferBytes`	`262144`	`SO_SNDBUF`; 0 leaves it unchanged
`acceptBacklog`	`SOMAXCONN`	`listen(backlog)`
`workerPollTimeoutMs`	`10`	Per-iteration select timeout for a worker, clamped to `[1, 1000]`
`tcpNoDelay`	`true`	Enables `TCP_NODELAY`
`debugLogTextFrames`	`false`	Logs every metadata text frame
`debugLogAudioFrames`	`false`	Logs the recognition state for every binary frame
`debugLogRecognitionState`	`false`	Logs every recognition state change
`recordAudioEnabled`	`false`	Enables recording of upstream audio
`recordAudioDir`	`recordings`	Recording directory

Sherpa ASR / VAD

Field	Default	Description
`asrModelName`	`sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30`	Name of the built-in ASR model preset (default: Chinese int8 streaming zipformer); can be switched to `sherpa-onnx-streaming-zipformer-bilingual-zh-en` (older Chinese-English bilingual model)
`asrEncoder` / `asrDecoder` / `asrJoiner` / `asrTokens`	Derived from `asrModelName`	ONNX model paths; override the preset when set explicitly
`asrBpeVocab`	`""`	Optional BPE vocabulary (for English use cases)
`vadModel`	`silero_vad.onnx`	silero-vad ONNX model
`provider`	`cpu`	ONNX inference backend
`sampleRate`	`16000`	Upstream sample rate (only 16 kHz is supported)
`numThreads`	`1`	Number of ONNX threads
`debug`	`false`	sherpa-onnx internal debug output
`enableVad`	`true`	Enables VAD segmentation
`enableEndpoint`	`true`	Enables endpoint detection
`sendPartialResults`	`true`	Whether partial text frames are sent
`endpointRule1MinTrailingSilence`	`2.4`	sherpa endpoint rule 1
`endpointRule2MinTrailingSilence`	`1.2`	sherpa endpoint rule 2
`endpointRule3MinUtteranceLength`	`20.0`	sherpa endpoint rule 3
`asrModelType`	`zipformer`	sherpa-onnx model type
`asrModelingUnit`	`cjkchar`	Modeling unit
`partialMinIntervalMs`	`300`	Minimum interval between partials for the same utterance
`decodeBatchMs`	`80`	Batch decode cadence
`vadBufferSizeSeconds`	`30.0`	silero-vad ring buffer size
`vadThreshold`	`0.5`	VAD confidence threshold
`vadMinSilenceDuration`	`0.5`	Minimum silence duration that ends a segment
`vadMinSpeechDuration`	`0.25`	Minimum duration counted as valid speech
`vadWindowSize`	`512`	VAD window size (samples)
`vadMaxSpeechDuration`	`20.0`	Maximum speech duration per segment

sherpa-tts-server `config.json`

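Field	Default	Description
`listenHost`	`127.0.0.1`	Actual listen address
`listenPort`	`9080`	Actual listen port
`publicBaseUrl`	`http://127.0.0.1:9080`	Prefix of the wav URLs returned to clients; must be changed when the server sits behind a reverse proxy
`workerThreads`	Number of CPU cores	Number of HTTP handler threads
`ttsPoolSize`	`max(CPU/2, 1)`	Number of pre-warmed `OfflineTts` instances
`mp3EncoderPath`	`ffmpeg`	Path to the `ffmpeg` executable that streaming TTS uses to encode PCM to MP3 in real time; required for `mod_shout` streaming playback
`mp3BitrateKbps`	`192`	CBR bitrate of the streaming MP3, range `64..320`
`mp3VolumeGainDb`	`3.0`	Loudness gain of the streaming MP3 (dB), range `-20..20`; a positive gain also enables the limiter
`wavSendChunkBytes`	`65536`	Chunk size for streamed sends
`maxRequestBodyBytes`	`1048576`	Maximum bytes per request body
`maxQueuedRequests`	`0` (unlimited)	Wait queue limit; requests beyond it get an immediate 503
`startupScanCache`	`true`	Scans `public/wav/` at startup to build the file index
`ttsModelType`	`kokoro`	Default TTS model type: `kokoro`, `matcha`, or `vits`
`ttsModelDir`	`kokoro-multi-lang-v1_1`	Model directory relative to `onnx-platform/models/`
`ttsKokoro` / `ttsMatcha` / `ttsVits*`	See README	Model file names for each model type (the default Kokoro setup uses `model.onnx` / `voices.bin` / `tokens.txt` / `lexicon-us-en.txt,lexicon-zh.txt` / `espeak-ng-data` / `dict`)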
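
The defaults and ranges in this table can be read as the arithmetic below. This is only a sketch: `resolveTtsTuning` is a hypothetical function, the real server is a separate implementation, and both the integer division for `CPU/2` and the choice to clamp (rather than reject) out-of-range values are assumptions.

```typescript
interface TtsTuning {
  ttsPoolSize: number;
  mp3BitrateKbps: number;
  mp3VolumeGainDb: number;
  limiterEnabled: boolean;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Hypothetical resolver: applies the documented defaults and ranges.
function resolveTtsTuning(cpuCount: number, overrides: Partial<TtsTuning> = {}): TtsTuning {
  // Documented default: max(CPU/2, 1). Integer division is an assumption.
  const defaultPool = Math.max(Math.floor(cpuCount / 2), 1);
  // Assumption: out-of-range values are clamped into 64..320 and -20..20.
  const bitrate = clamp(overrides.mp3BitrateKbps ?? 192, 64, 320);
  const gain = clamp(overrides.mp3VolumeGainDb ?? 3.0, -20, 20);
  return {
    ttsPoolSize: overrides.ttsPoolSize ?? defaultPool,
    mp3BitrateKbps: bitrate,
    mp3VolumeGainDb: gain,
    // Documented behavior: a positive gain also enables the limiter.
    limiterEnabled: gain > 0,
  };
}

console.log(resolveTtsTuning(8));
console.log(resolveTtsTuning(1, { mp3VolumeGainDb: -5 }));
```

On an 8-core host this yields a pool of 4 instances with the limiter on; on a single-core host the pool floor of 1 applies.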

emotion-analysis-server `config.json`

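Post-call emotion analysis service (C++ HTTP, default 127.0.0.1:9090). Whether it is called is decided by
emotionAnalysis.enabled on the outbound-calling side; the service itself has only a few settings: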

{
  "listenHost": "127.0.0.1",
  "listenPort": 9090,
  "workerThreads": 2,
  "maxQueuedRequests": 32,
  "maxRequestBodyBytes": 2097152,
  "textModelDir": "emotion-text",
  "audioModelDir": "emotion-audio"
}

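Field	Default	Description
`listenHost` / `listenPort`	`127.0.0.1` / `9090`	HTTP listen address
`workerThreads`	`2`	Number of HTTP handler threads
`maxQueuedRequests`	`32`	Wait queue limit; requests beyond it get a 503
`maxRequestBodyBytes`	`2097152`	Maximum bytes per request body
`textModelDir` / `audioModelDir`	`emotion-text` / `emotion-audio`	Model directories relative to `onnx-platform/models/`

See Emotion Analysis Service for details.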

callout-server `config.json`

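The outbound calling subsystem has many settings, grouped by function. Below is the repository's default config.json (for the
full field list and environment-variable overrides, apps/callout-server/src/config.ts and Smart Outbound Calling are authoritative):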

{
  "server": {
    "host": "0.0.0.0",
    "port": 9920,
    "allowedOrigins": ["http://localhost:19920", "http://127.0.0.1:19920"]
  },
  "auth": { "enabled": true, "sessionTtlSeconds": 86400 },
  "scheduler": { "enabled": true, "intervalMs": 5000, "batchSize": 20 },
  "redis": {
    "host": "127.0.0.1", "port": 6379, "username": "default", "password": "redis",
    "db": 0, "keyPrefix": "callout", "lockTtlMs": 60000
  },
  "emotionAnalysis": {
    "enabled": true, "endpoint": "http://127.0.0.1:9090/analyze",
    "timeoutMs": 30000, "intervalMs": 5000, "maxConcurrency": 2,
    "retryIntervalMs": 60000, "maxRetries": 3
  },
  "internalApi": { "callResultToken": "" },
  "freeswitch": {
    "host": "192.168.2.184", "port": 8021, "password": "ClueCon",
    "callbackHost": "192.168.2.246", "callbackPort": 9911,
    "originateDialStringTemplate": "user/{{destinationNumber}}",
    "commandTimeoutMs": 15000
  },
  "db": { "url": "postgres://postgres:postgres@localhost:5432/freeswitch" }
}

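Group	Description
`server`	HTTP listen address; `allowedOrigins` is the CORS allowlist for the admin console (overridable with `CALLOUT_CORS_ALLOWED_ORIGINS`)
`auth`	Switch for role-based authentication on the admin API, plus session lifetime; on by default, can be turned off for local integration testing with `CALLOUT_AUTH_ENABLED=false`; when enabled, a bootstrap admin is created automatically
`scheduler`	Background dispatcher switch, polling interval, and per-campaign batch size per round
`redis`	Distributed lock for multi-instance deployments (the server runs as a single instance when this group is omitted; `CALLOUT_REDIS_ENABLED=false` forces it off)
`emotionAnalysis`	Post-call emotion analysis switch, `emotion-analysis-server` endpoint, and timeout / polling / concurrency / retry settings
`internalApi.callResultToken`	Shared token used when callflow-esl writes back to `/api/call-results` (request header `X-Callout-Internal-Token`)
`freeswitch`	Inbound ESL + callback address + dial string templates (`agentDialStringTemplate` is the template for dialing out to an agent softphone, default `user/{{agentExtension}}`)
`db.url`	PostgreSQL; callout derives the `callout` and `callflow` schemas from it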

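In addition, config.ts supports these optional groups, each with built-in defaults: lifecycleReconcile (compensation for stuck contacts),
conferenceReconcile (reconciliation of zombie sessions in transfer-to-agent conferences, by default one pass every 30 s with a 60 s grace period), importJobs (asynchronous import worker),
notifications (alert webhook for runtime events), ringbackDetection (ringback tone / beep detection), and
freeswitchEvents (inbound ESL answer/hangup fallback).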

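Settings can be overridden with CALLOUT_-prefixed environment variables (for example CALLOUT_SERVER_PORT, CALLOUT_FREESWITCH_HOST,
CALLOUT_ORIGINATE_DIAL_STRING_TEMPLATE, CALLOUT_AUTH_ENABLED, CALLOUT_EMOTION_ANALYSIS_ENABLED).
Ringback detection is controlled by the ringbackDetection block (off by default in code, but on by default in the repository's config.json, including
mod_spandsp beep detection) and can also be overridden with CALLOUT_RINGBACK_DETECTION_ENABLED, ..._WS_URL, ..._MIX_TYPE,
..._SAMPLE_RATE, ..._TIMEOUT_MS, ..._TONE_ENABLED, ..._TONE_FREQUENCY, and ..._TONE_HITS. Campaign-, contact-, and
line-level policies (concurrency, retries, working hours, do-not-call, rate limiting, and so on) live in the database rather than the config file; see Smart Outbound Calling.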
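
A minimal sketch of how such overrides might be layered on top of config.json follows. The variable names come from the list above, but `applyEnvOverrides`, the field each variable maps to, and the parsing rules (only `"true"` counts as true, invalid ports fall back to the file value) are assumptions; apps/callout-server/src/config.ts remains the authority.

```typescript
type Env = Record<string, string | undefined>;

interface CalloutConfigSubset {
  server: { port: number };
  freeswitch: { host: string; originateDialStringTemplate: string };
  auth: { enabled: boolean };
  emotionAnalysis: { enabled: boolean };
}

// Assumption: only the literal "true" (case-insensitive) enables a flag.
function parseBool(raw: string | undefined, fallback: boolean): boolean {
  return raw === undefined ? fallback : raw.trim().toLowerCase() === "true";
}

// Assumption: an unset or invalid port keeps the value from config.json.
function parsePort(raw: string | undefined, fallback: number): number {
  const port = Number(raw);
  return raw !== undefined && Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

// Hypothetical merge step: environment variables win over file values.
function applyEnvOverrides(file: CalloutConfigSubset, env: Env): CalloutConfigSubset {
  return {
    server: { port: parsePort(env.CALLOUT_SERVER_PORT, file.server.port) },
    freeswitch: {
      host: env.CALLOUT_FREESWITCH_HOST ?? file.freeswitch.host,
      originateDialStringTemplate:
        env.CALLOUT_ORIGINATE_DIAL_STRING_TEMPLATE ?? file.freeswitch.originateDialStringTemplate,
    },
    auth: { enabled: parseBool(env.CALLOUT_AUTH_ENABLED, file.auth.enabled) },
    emotionAnalysis: {
      enabled: parseBool(env.CALLOUT_EMOTION_ANALYSIS_ENABLED, file.emotionAnalysis.enabled),
    },
  };
}

const fileConfig: CalloutConfigSubset = {
  server: { port: 9920 },
  freeswitch: { host: "192.168.2.184", originateDialStringTemplate: "user/{{destinationNumber}}" },
  auth: { enabled: true },
  emotionAnalysis: { enabled: true },
};
console.log(applyEnvOverrides(fileConfig, { CALLOUT_SERVER_PORT: "9930", CALLOUT_AUTH_ENABLED: "false" }));
```

Passing the environment in as a plain object, instead of reading a global, keeps the merge step easy to test.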

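Admin console and frontend environment variables

callflow-server (admin console API; no config.json, configured entirely through environment variables):

Environment variable	Default	Description
`CALLFLOW_SERVER_HOST` / `CALLFLOW_SERVER_PORT`	`0.0.0.0` / `9913`	HTTP listen address
`FREESWITCH_DB_URL`	`…@192.168.2.246:5432/freeswitch`	FreeSWITCH database (read-only view)
`CALLFLOW_DB_URL`	`…@localhost:5432/freeswitch?options=-c search_path=callflow`	callflow database (read-write)
`CALLFLOW_ESL_BASE_URL`	`http://127.0.0.1:9912`	callflow-esl HTTP endpoint, used to reload runtime configuration

Frontend (Quasar; VITE_* variables are injected at build/dev time):

App	Environment variable	Default	Dev port
callflow-webpage	`VITE_CALLFLOW_API_BASE`	`http://127.0.0.1:9913`	`19913`
callout-webpage	`VITE_CALLOUT_API_BASE_URL`	`http://127.0.0.1:9920`	`19920`

The admin console writes TTS / recording runtime configuration into the same callflow schema that callflow-esl uses and then triggers
a hot reload, so the console and callflow-esl must connect to the same callflow database.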

mod_audio_fork environment variables

Variable	Default	Description
`MOD_AUDIO_FORK_SUBPROTOCOL_NAME`	`audio.drachtio.org`	WebSocket subprotocol name
`MOD_AUDIO_FORK_SERVICE_THREADS`	`1`	Number of service threads (1-5)
`MOD_AUDIO_FORK_BUFFER_SECS`	`2`	Buffer length in seconds (1-5)

mod_audio_fork channel variables

These can be set in the dialplan or with uuid_setvar:

Variable	Purpose
`MOD_AUDIO_BASIC_AUTH_USERNAME`	Basic Auth username for the WSS endpoint
`MOD_AUDIO_BASIC_AUTH_PASSWORD`	Basic Auth password for the WSS endpoint
`MOD_AUDIO_FORK_ALLOW_SELFSIGNED`	Allow self-signed TLS certificates
`MOD_AUDIO_FORK_SKIP_SERVER_CERT_HOSTNAME_CHECK`	Skip TLS hostname verification
`MOD_AUDIO_FORK_ALLOW_EXPIRED`	Allow expired certificates

Key FreeSWITCH dialplan snippets

Route calls to callflow-esl:

<extension name="route-to-callflow">
  <condition field="destination_number" expression="^(.+)$">
    <action application="set" data="hangup_after_bridge=true"/>
    <action application="socket" data="<callflow-esl-host>:9911 async full"/>
  </condition>
</extension>

Optional: set channel variables early in the dialplan so the business logic can read them:

<action application="set" data="campaign_id=may-test"/>
<action application="set" data="business_code=my-business"/>

Enable mod_audio_fork:

<!-- conf/autoload_configs/modules.conf.xml -->
<load module="mod_audio_fork"/>

ESL inbound (used by callflow-esl to call into FS):

<!-- conf/autoload_configs/event_socket.conf.xml -->
<configuration name="event_socket.conf" description="Socket Client">
  <settings>
    <param name="listen-ip" value="0.0.0.0"/>
    <param name="listen-port" value="8021"/>
    <param name="password" value="ClueCon"/>
    <param name="apply-inbound-acl" value="localnet.auto"/>
  </settings>
</configuration>

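Configuration change quick reference

Goal	What to change
Change the ASR listen port	`sherpa-asr-online-server/config.json` `listenPort` + `audioFork.wsUrl`
Change the TTS listen port	`sherpa-tts-server/config.json` `listenPort` + `tts.sherpaHttpEndpoint` + `tts.sherpaStreamEndpoint` + `publicBaseUrl`
Change the callflow-esl ESL/HTTP port	`esl-server.port` / `http-server.port` + the socket address in the FS dialplan
Change the FS callback address	`freeswitch.callbackHost`
Change the ASR model	`sherpa-asr-online-server/config.json` `asrModelName` (built-in preset) or explicit `asrEncoder/Decoder/Joiner/Tokens`
Change the TTS model	`ttsModelType` + `ttsModelDir`; the default is `kokoro-multi-lang-v1_1`, and you can switch back to `matcha` / `vits`
Change the default hear timeout	`esl-server.asrTimeoutMs` (callers can override it through the `hear()` arguments)
Change ASR concurrency	`maxSessions` + `sessionPoolSize`
Change TTS concurrency	`workerThreads` + `ttsPoolSize`
Enable / disable streaming TTS	`tts.streamingEnabled`; before enabling, make sure both sherpa-tts-server and FS are ready for `mod_shout` streaming
Change the `ffmpeg` path	`sherpa-tts-server/config.json` `mp3EncoderPath`
Enable ASR upstream recording	`recordAudioEnabled: true`
Switch business routing	PostgreSQL tables `callflow.call_businesses` + `callflow.call_business_number_mappings`
Change the outbound API port	`apps/callout-server/config.json` `server.port` (or `CALLOUT_SERVER_PORT`)
Tune outbound concurrency / retries / working hours	Campaign-level fields (stored in the database); see Smart Outbound Calling
Enable ringback detection	`CALLOUT_RINGBACK_DETECTION_ENABLED=true` + `..._WS_URL`
Enable post-call emotion analysis	callout `emotionAnalysis.enabled=true` + `endpoint`; also start emotion-analysis-server, see Emotion Analysis Service
Change the admin console API port	`CALLFLOW_SERVER_PORT` + frontend `VITE_CALLFLOW_API_BASE`
Adjust TTS / recording settings online	Save on the admin console's `/callflow` page, then click reload; see Admin Console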
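
Several rows above require changing a listen port and the URLs that point at it together. A small consistency check, sketched below, could catch a missed edit. `findPortMismatches` and the `EndpointCheck` shape are hypothetical and not part of the repository; the values in the example mirror the defaults shown on this page, with the TTS port deliberately changed on one side only.

```typescript
interface EndpointCheck {
  name: string;          // config key holding the URL, for the report
  url: string;           // URL configured on the client side
  expectedPort: number;  // listenPort configured on the server side
  expectedPath?: string; // optional path the server serves, e.g. wsPath
}

// Ports implied by a URL scheme when the URL omits an explicit port.
const defaultPorts: Record<string, number> = { "http:": 80, "ws:": 80, "https:": 443, "wss:": 443 };

// Hypothetical check: reports client URLs that disagree with the server port/path.
function findPortMismatches(checks: EndpointCheck[]): string[] {
  const problems: string[] = [];
  for (const check of checks) {
    const parsed = new URL(check.url);
    const port = parsed.port === "" ? defaultPorts[parsed.protocol] : Number(parsed.port);
    if (port !== check.expectedPort) {
      problems.push(`${check.name}: port ${port} does not match listenPort ${check.expectedPort}`);
    }
    if (check.expectedPath !== undefined && parsed.pathname !== check.expectedPath) {
      problems.push(`${check.name}: path ${parsed.pathname} does not match ${check.expectedPath}`);
    }
  }
  return problems;
}

// Example: the TTS listenPort was moved to 9081 but one endpoint was not updated.
const problems = findPortMismatches([
  { name: "audioFork.wsUrl", url: "ws://192.168.2.246:10096/audio", expectedPort: 10096, expectedPath: "/audio" },
  { name: "tts.sherpaHttpEndpoint", url: "http://127.0.0.1:9080/tts", expectedPort: 9081 },
  { name: "tts.sherpaStreamEndpoint", url: "http://127.0.0.1:9081/tts-stream", expectedPort: 9081 },
]);
console.log(problems);
```

The check only compares ports and paths; hosts are left alone because a listen address such as `0.0.0.0` or `127.0.0.1` legitimately differs from the address clients dial.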
