video_translate/docs/plans/2026-03-19-tts-language-contract-design.md
Song367 04072dc94b
2026-03-19 20:13:24 +08:00

TTS Language Contract Design

Goal

Upgrade subtitle generation so the model always produces English subtitle text for display, plus a separate TTS translation and language code for dubbing.

Scope

  • Keep current upload and editor UI unchanged for this step.
  • Add backend contract support for ttsText and ttsLanguage.
  • Let dubbing prefer ttsText over translatedText.
  • Keep existing calls backward compatible by defaulting ttsLanguage to the current target language when none is provided.

Design

Prompt contract

  • translatedText becomes the on-screen subtitle text and must always be English.
  • ttsText becomes the spoken dubbing text in the requested TTS language.
  • ttsLanguage must be returned on every subtitle item and must exactly match the requested TTS language code.
  • The system and user prompts should clearly separate subtitle language from TTS language.
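
The prompt contract above could be sketched roughly as follows. This is only an illustration: the actual prompt wording is not given in this document, and buildSubtitlePrompts, findTtsLanguageMismatches, and the PromptParams shape are hypothetical names introduced for the example.

```typescript
// Hypothetical shapes; the real request parameters may differ.
interface PromptParams {
  targetLanguage: string;
  ttsLanguage?: string;
}

interface PromptPair {
  system: string;
  user: string;
}

// Builds a system/user prompt pair that keeps the subtitle language (always
// English) separate from the TTS language. The wording is illustrative only.
function buildSubtitlePrompts(params: PromptParams): PromptPair {
  const ttsLanguage = params.ttsLanguage ?? params.targetLanguage;
  const system = [
    "You generate subtitles for a video.",
    "translatedText is the on-screen subtitle text and must always be English.",
    `ttsText is the spoken dubbing text and must be written in the TTS language (${ttsLanguage}).`,
    `ttsLanguage must be present on every subtitle item and must be exactly "${ttsLanguage}".`,
  ].join("\n");
  const user = `Subtitle language: English\nTTS language: ${ttsLanguage}`;
  return { system, user };
}

// Returns the indexes of items whose ttsLanguage does not exactly match the
// requested code, so a caller can reject or repair a non-conforming response.
function findTtsLanguageMismatches(
  items: Array<{ ttsLanguage?: string }>,
  requested: string,
): number[] {
  const mismatches: number[] = [];
  items.forEach((item, index) => {
    if (item.ttsLanguage !== requested) {
      mismatches.push(index);
    }
  });
  return mismatches;
}
```

The exact-match check is deliberately strict (case-sensitive, no trimming) to mirror the "must exactly match" wording; whether a looser comparison is acceptable is a design decision this document does not settle.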

Data model

  • Extend Subtitle with optional ttsText and ttsLanguage.
  • Extend raw model subtitle parsing to accept these fields.
  • Extend pipeline result metadata to track ttsLanguage.
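
A minimal sketch of the extended Subtitle type and the raw parsing step, assuming the model returns plain JSON objects. The full Subtitle shape is not shown in this document, so only the fields named here are included, and normalizeRawSubtitle is a hypothetical helper.

```typescript
// Only the fields mentioned in this design are listed; the real Subtitle
// type presumably carries more (for example timing information).
interface Subtitle {
  text: string;
  translatedText: string;
  ttsText?: string;
  ttsLanguage?: string;
}

// Returns a trimmed non-empty string, or undefined for anything else.
function asNonEmptyString(value: unknown): string | undefined {
  if (typeof value !== "string") {
    return undefined;
  }
  const trimmed = value.trim();
  return trimmed === "" ? undefined : trimmed;
}

// Hypothetical normalizer: accepts an untyped model item and copies the new
// optional fields only when they are usable strings.
function normalizeRawSubtitle(raw: unknown): Subtitle {
  const record = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const subtitle: Subtitle = {
    text: asNonEmptyString(record.text) ?? "",
    translatedText: asNonEmptyString(record.translatedText) ?? "",
  };
  const ttsText = asNonEmptyString(record.ttsText);
  const ttsLanguage = asNonEmptyString(record.ttsLanguage);
  if (ttsText !== undefined) {
    subtitle.ttsText = ttsText;
  }
  if (ttsLanguage !== undefined) {
    subtitle.ttsLanguage = ttsLanguage;
  }
  return subtitle;
}
```

Keeping both new fields optional means items produced before this change still parse into valid Subtitle values.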

Runtime behavior

  • Subtitle generation should accept an optional ttsLanguage.
  • If ttsLanguage is not provided, fall back to targetLanguage so existing flows keep working.
  • Voice catalog selection should use the TTS language, not the subtitle language.
  • TTS generation should read subtitle.ttsText first, then fall back to translatedText, then text.
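
The runtime rules above might look like the following sketch. The function names and the Voice shape are assumptions made for illustration, as is the choice to treat blank strings the same as missing values.

```typescript
interface Subtitle {
  text: string;
  translatedText?: string;
  ttsText?: string;
  ttsLanguage?: string;
}

// Hypothetical voice catalog entry.
interface Voice {
  id: string;
  language: string;
}

// Defaults the TTS language to the target language when none is provided.
function resolveTtsLanguage(targetLanguage: string, ttsLanguage?: string): string {
  return ttsLanguage !== undefined && ttsLanguage.trim() !== "" ? ttsLanguage : targetLanguage;
}

// Picks the text to speak: ttsText first, then translatedText, then text.
// Blank candidates are skipped (an assumption, not stated in the design).
function pickSpokenText(subtitle: Subtitle): string {
  const candidates = [subtitle.ttsText, subtitle.translatedText, subtitle.text];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate;
    }
  }
  return "";
}

// Selects voices by the TTS language rather than the subtitle language.
function voicesForTtsLanguage(catalog: Voice[], ttsLanguage: string): Voice[] {
  return catalog.filter((voice) => voice.language === ttsLanguage);
}
```

With this ordering, a subtitle that predates the new contract (no ttsText) is still dubbed from translatedText, which matches the previous behavior.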

Testing

  • Add prompt tests asserting that the new system and user prompts reference both English subtitles and the TTS language.
  • Add parsing tests asserting ttsText and ttsLanguage are normalized into subtitles.
  • Add service tests asserting ttsLanguage is forwarded through the subtitle pipeline request body.
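
As one possible shape for the service test, the sketch below uses a recording stub in place of the real HTTP layer and inspects the captured request body. The requestSubtitles function, the Transport type, and the "/api/subtitles" path are all hypothetical stand-ins; the real service and its test framework are not described in this document.

```typescript
// Hypothetical transport abstraction so the test can capture the body.
type Transport = (url: string, body: string) => void;

interface SubtitleRequest {
  targetLanguage: string;
  ttsLanguage?: string;
}

// Stand-in for the service call: forwards ttsLanguage in the request body,
// defaulting it to targetLanguage when the caller omits it.
function requestSubtitles(send: Transport, request: SubtitleRequest): void {
  const body = {
    targetLanguage: request.targetLanguage,
    ttsLanguage: request.ttsLanguage ?? request.targetLanguage,
  };
  send("/api/subtitles", JSON.stringify(body)); // placeholder path
}

// Recording stub: stores every call instead of performing network I/O.
function createRecordingTransport(): { calls: Array<{ url: string; body: string }>; send: Transport } {
  const calls: Array<{ url: string; body: string }> = [];
  const send: Transport = (url, body) => {
    calls.push({ url, body });
  };
  return { calls, send };
}

// Test-style check: the forwarded body carries the requested TTS language.
const recorder = createRecordingTransport();
requestSubtitles(recorder.send, { targetLanguage: "en", ttsLanguage: "ja" });
const forwardedBody = JSON.parse(recorder.calls[0].body) as { ttsLanguage: string };
if (forwardedBody.ttsLanguage !== "ja") {
  throw new Error("ttsLanguage was not forwarded");
}
```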