video_translate/docs/plans/2026-03-19-tts-language-contract-design.md
Song367 04072dc94b
2026-03-19 20:13:24 +08:00


# TTS Language Contract Design
## Goal
Upgrade subtitle generation so the model always produces English subtitle text for display, plus a separate TTS translation and language code for dubbing.
## Scope
- Keep current upload and editor UI unchanged for this step.
- Add backend contract support for `ttsText` and `ttsLanguage`.
- Let dubbing prefer `ttsText` over `translatedText`.
- Keep existing calls backward compatible by defaulting `ttsLanguage` to the current target language when none is provided.
## Design
### Prompt contract
- `translatedText` becomes the on-screen subtitle text and must always be English.
- `ttsText` becomes the spoken dubbing text in the requested TTS language.
- `ttsLanguage` must be returned on every subtitle item and must exactly match the requested TTS language code.
- The system and user prompts should clearly separate subtitle language from TTS language.
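One way the separation between subtitle language and TTS language might look in prompt-building code is sketched below. The helper name `buildSubtitlePrompts`, the `PromptPair` type, and the prompt wording are all illustrative assumptions, not the project's actual prompts; `"fr"` in the usage note is only a placeholder language code.

```typescript
// Hypothetical sketch: function name, type name, and prompt wording are assumptions.
interface PromptPair {
  system: string;
  user: string;
}

function buildSubtitlePrompts(ttsLanguage: string): PromptPair {
  // The system prompt states the per-item contract for all three fields.
  const system = [
    "You generate subtitles for a video.",
    "For every subtitle item return these fields:",
    '- "translatedText": the on-screen subtitle text, always in English.',
    `- "ttsText": the spoken dubbing text, written in the TTS language (${ttsLanguage}).`,
    `- "ttsLanguage": exactly "${ttsLanguage}".`,
  ].join("\n");

  // The user prompt names the two languages on separate lines so they cannot be conflated.
  const user = [
    "Subtitle language: English.",
    `TTS language: ${ttsLanguage}.`,
    "Return a JSON array of subtitle items.",
  ].join("\n");

  return { system, user };
}
```

Calling `buildSubtitlePrompts("fr")` would yield prompts that pin `translatedText` to English while asking for `ttsText` and `ttsLanguage` in the requested TTS language.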
### Data model
- Extend `Subtitle` with optional `ttsText` and `ttsLanguage`.
- Extend raw model subtitle parsing to accept these fields.
- Extend pipeline result metadata to track `ttsLanguage`.
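A minimal sketch of the extended model and a tolerant parser follows. Only `text`, `translatedText`, `ttsText`, and `ttsLanguage` come from this design; the real `Subtitle` type presumably carries other fields (timing, ids) that are omitted here. The helper names `asTrimmedString` and `normalizeRawSubtitle`, and the choice to fill a missing `ttsLanguage` from the requested one, are assumptions.

```typescript
// Sketch only: the real Subtitle type likely has more fields than shown here.
interface Subtitle {
  text: string;
  translatedText?: string;
  ttsText?: string; // new, optional
  ttsLanguage?: string; // new, optional
}

// Hypothetical helper: returns a trimmed non-empty string, or undefined otherwise.
function asTrimmedString(value: unknown): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// Hypothetical parser for one raw model item. Assumption: when the model omits
// ttsLanguage, the requested TTS language is recorded instead.
function normalizeRawSubtitle(
  raw: Record<string, unknown>,
  requestedTtsLanguage: string,
): Subtitle {
  return {
    text: asTrimmedString(raw.text) ?? "",
    translatedText: asTrimmedString(raw.translatedText),
    ttsText: asTrimmedString(raw.ttsText),
    ttsLanguage: asTrimmedString(raw.ttsLanguage) ?? requestedTtsLanguage,
  };
}
```

Keeping both new fields optional means subtitles stored before this change still type-check without migration.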
### Runtime behavior
- Subtitle generation should accept an optional `ttsLanguage`.
- If `ttsLanguage` is not provided, fall back to `targetLanguage` so existing flows keep working.
- Voice catalog selection should use the TTS language, not the subtitle language.
- TTS generation should read `subtitle.ttsText` first, then fall back to `translatedText`, then `text`.
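The two fallback rules above can be sketched as small pure functions. The names `resolveTtsLanguage`, `resolveSpokenText`, and `SpokenTextSource` are hypothetical; treating an empty string as "missing" in the text chain is an assumption rather than something this design specifies.

```typescript
// Minimal shape needed for the fallback chain; the real Subtitle type has more fields.
interface SpokenTextSource {
  text: string;
  translatedText?: string;
  ttsText?: string;
}

// Hypothetical helper: only a missing (undefined) ttsLanguage triggers the default.
function resolveTtsLanguage(targetLanguage: string, ttsLanguage?: string): string {
  return ttsLanguage ?? targetLanguage;
}

// Hypothetical helper: ttsText first, then translatedText, then the original text.
// Assumption: `||` is used so that empty strings also fall through to the next field.
function resolveSpokenText(subtitle: SpokenTextSource): string {
  return subtitle.ttsText || subtitle.translatedText || subtitle.text;
}
```

Under this sketch, voice catalog selection would be driven by the result of `resolveTtsLanguage`, and the text sent to the TTS engine by `resolveSpokenText`.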
### Testing
- Add prompt tests asserting that the new system and user prompts reference English subtitles and the TTS language.
- Add parsing tests asserting `ttsText` and `ttsLanguage` are normalized into subtitles.
- Add service tests asserting `ttsLanguage` is forwarded through the subtitle pipeline request body.
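As an illustration of the service-level check, the sketch below captures the request body through an injected sender and asserts that `ttsLanguage` is forwarded. It is framework-agnostic on purpose, since the project's test runner is not stated here; `SubtitleRequestBody`, `requestSubtitles`, `captureRequestBody`, and `check` are all hypothetical names, and the language codes are placeholders.

```typescript
// Hypothetical request body shape: only the two language fields are modeled.
interface SubtitleRequestBody {
  targetLanguage: string;
  ttsLanguage: string;
}

type Send = (body: SubtitleRequestBody) => void;

// Hypothetical stand-in for the service call; the real one would POST to the backend.
function requestSubtitles(send: Send, targetLanguage: string, ttsLanguage?: string): void {
  send({ targetLanguage, ttsLanguage: ttsLanguage ?? targetLanguage });
}

// Runs the service with a recording sender and returns the body it tried to send.
function captureRequestBody(targetLanguage: string, ttsLanguage?: string): SubtitleRequestBody {
  const sent: SubtitleRequestBody[] = [];
  requestSubtitles((body) => {
    sent.push(body);
  }, targetLanguage, ttsLanguage);
  const first = sent[0];
  if (first === undefined) throw new Error("no request was sent");
  return first;
}

// Tiny assertion helper so the sketch does not depend on a test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

check(captureRequestBody("en", "ja").ttsLanguage === "ja", "explicit ttsLanguage is forwarded");
check(captureRequestBody("en").ttsLanguage === "en", "missing ttsLanguage defaults to targetLanguage");
```

In a real suite the recording sender would be replaced by whatever mock of the HTTP layer the project already uses.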