4+1 Subtitle Pipeline Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Replace the current one-shot subtitle generation flow with a 4+1 staged pipeline that isolates transcription, segmentation, translation, voice matching, and validation.
Architecture: Introduce stage-specific server modules under a new subtitleStages folder and route the existing /generate-subtitles backend entry point through a new orchestrator. Keep the final SubtitlePipelineResult contract stable for the editor while adding internal stage contracts and diagnostics.
Tech Stack: TypeScript, Node server pipeline, Vitest, React client services, async subtitle job polling
Task 1: Define stage contracts and lock them with tests
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\types.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\stageTypes.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\stageTypes.test.ts
Step 1: Write the failing test
- Assert stage types support:
  - transcription output with confidence and needsReview
  - translated output with ttsText and ttsLanguage
  - validation issue output with code and severity
Step 2: Run test to verify it fails
Run:
npm run test -- src/server/subtitleStages/stageTypes.test.ts
Expected: FAIL because the new stage files and contracts do not exist yet.
Step 3: Write minimal implementation
- Create stageTypes.ts with:
  - TranscriptSegment
  - SegmentedSubtitle
  - TranslatedSubtitle
  - VoiceMatchedSubtitle
  - ValidationIssue
  - any stage diagnostics helpers needed by the orchestrator
- Extend src/types.ts only where the public result contract needs optional diagnostics.
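As a rough illustration of what these contracts might look like, here is a minimal sketch. Only the type names and the fields confidence, needsReview, originalText, translatedText, ttsText, ttsLanguage, voiceId, code, and severity come from this plan; every other field (id, startTime, endTime, speaker, sourceSegmentId, message), the severity values, and the isValidationIssue helper are assumptions made for the example.

```typescript
// Hypothetical shapes for stageTypes.ts. `export` keywords are omitted so the
// snippet also runs as a standalone script.

interface TranscriptSegment {
  id: string;          // assumed stable identifier
  startTime: number;   // assumed: seconds from the start of the video
  endTime: number;
  originalText: string;
  speaker?: string;    // assumed optional speaker label
  confidence: number;  // assumed range 0..1
  needsReview: boolean;
}

// Segmentation may split one transcript segment into several subtitles.
interface SegmentedSubtitle extends TranscriptSegment {
  sourceSegmentId: string; // assumed back-reference to the transcript segment
}

interface TranslatedSubtitle extends SegmentedSubtitle {
  translatedText: string;
  ttsText: string;
  ttsLanguage: string;
}

interface VoiceMatchedSubtitle extends TranslatedSubtitle {
  voiceId: string;
}

// Assumed severity levels; the plan only requires that a severity exists.
type ValidationSeverity = "info" | "warning" | "error";

interface ValidationIssue {
  code: string;
  severity: ValidationSeverity;
  subtitleId?: string;
  message: string;
}

// Interfaces vanish at runtime, so a small guard gives tests something to assert on.
function isValidationIssue(value: unknown): value is ValidationIssue {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const severity = v["severity"];
  return (
    typeof v["code"] === "string" &&
    typeof v["message"] === "string" &&
    (severity === "info" || severity === "warning" || severity === "error")
  );
}
```

Chaining the interfaces with extends means each stage only adds fields, so a later stage cannot silently drop timing or source text.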
Step 4: Run test to verify it passes
Run:
npm run test -- src/server/subtitleStages/stageTypes.test.ts
Expected: PASS
Step 5: Commit
Skip commit for now.
Task 2: Add the transcription stage
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\transcriptionStage.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\transcriptionStage.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\videoSubtitleGeneration.ts
Step 1: Write the failing tests
- Assert the transcription stage prompt asks only for:
  - faithful transcription
  - timestamps
  - speaker metadata
- Assert it does not request translation or voice selection.
- Assert parser output normalizes low-confidence and missing speaker fields safely.
Step 2: Run tests to verify they fail
Run:
npm run test -- src/server/subtitleStages/transcriptionStage.test.ts src/server/videoSubtitleGeneration.test.ts
Expected: FAIL because the transcription stage does not exist and the current prompt is still all-in-one.
Step 3: Write minimal implementation
- Extract provider-specific transcription logic from videoSubtitleGeneration.ts into transcriptionStage.ts.
- Narrow the transcription prompt and JSON schema to transcription-only fields.
- Return TranscriptSegment[].
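One way the parser could normalize low-confidence and missing speaker fields is sketched below. The function name normalizeTranscriptSegment, the 0.6 threshold, the "unknown" speaker label, and the raw field names are all assumptions for illustration; the real provider payload may look different.

```typescript
interface TranscriptSegment {
  startTime: number;
  endTime: number;
  originalText: string;
  speaker: string;
  confidence: number;
  needsReview: boolean;
}

// Assumed values; the plan does not specify a threshold or a placeholder label.
const LOW_CONFIDENCE_THRESHOLD = 0.6;
const UNKNOWN_SPEAKER = "unknown";

// Turns one untrusted provider object into a TranscriptSegment, or null when
// the timing or text is unusable.
function normalizeTranscriptSegment(raw: unknown): TranscriptSegment | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const rawStart = r["startTime"];
  const rawEnd = r["endTime"];
  const rawText = r["originalText"];
  const rawSpeaker = r["speaker"];
  const rawConfidence = r["confidence"];

  if (typeof rawStart !== "number" || typeof rawEnd !== "number") return null;
  if (!Number.isFinite(rawStart) || !Number.isFinite(rawEnd) || rawEnd <= rawStart) return null;

  const originalText = typeof rawText === "string" ? rawText.trim() : "";
  if (originalText === "") return null;

  // A missing or blank speaker falls back to a placeholder instead of failing.
  const speaker =
    typeof rawSpeaker === "string" && rawSpeaker.trim() !== "" ? rawSpeaker.trim() : UNKNOWN_SPEAKER;

  // A missing confidence is treated as 0; out-of-range values are clamped to 0..1.
  const confidence =
    typeof rawConfidence === "number" && Number.isFinite(rawConfidence)
      ? Math.min(1, Math.max(0, rawConfidence))
      : 0;

  return {
    startTime: rawStart,
    endTime: rawEnd,
    originalText,
    speaker,
    confidence,
    needsReview: confidence < LOW_CONFIDENCE_THRESHOLD || speaker === UNKNOWN_SPEAKER,
  };
}
```

Flagging doubtful segments with needsReview, rather than dropping them, leaves the decision to the validation stage and the editor.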
Step 4: Run tests to verify they pass
Run:
npm run test -- src/server/subtitleStages/transcriptionStage.test.ts src/server/videoSubtitleGeneration.test.ts
Expected: PASS
Step 5: Commit
Skip commit for now.
Task 3: Add the segmentation stage
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\segmentationStage.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\segmentationStage.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitlePipeline.ts
Step 1: Write the failing tests
- Assert long transcript segments are split into subtitle-friendly chunks.
- Assert segmentation preserves originalText, timing order, and speaker identity.
- Assert no paraphrasing occurs during segmentation.
Step 2: Run tests to verify they fail
Run:
npm run test -- src/server/subtitleStages/segmentationStage.test.ts src/server/subtitlePipeline.test.ts
Expected: FAIL because there is no explicit segmentation stage.
Step 3: Write minimal implementation
- Reuse normalization helpers from subtitlePipeline.ts.
- Implement deterministic segmentation that:
  - preserves chronology
  - keeps original text intact
  - marks impossible cases for later review instead of rewriting
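The three properties above could be met by a greedy word-packing split such as the sketch below. The 42-character limit, the proportional timing allocation, and the names Segment and segmentTranscript are assumptions for the example; the plan itself does not prescribe an algorithm.

```typescript
interface Segment {
  startTime: number;
  endTime: number;
  originalText: string;
  speaker: string;
  needsReview: boolean;
}

const MAX_CHARS = 42; // assumed subtitle line budget

// Splits one transcript segment into chunks of whole words. Words are never
// changed or reordered; only runs of whitespace are collapsed to single spaces.
function segmentTranscript(segment: Segment, maxChars: number = MAX_CHARS): Segment[] {
  const words = segment.originalText.split(/\s+/).filter((w) => w.length > 0);
  if (words.length === 0) return [];

  // Greedily pack words until the next one would exceed the budget.
  const chunks: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current === "" || candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = word;
    }
  }
  chunks.push(current);

  // Share the original duration between chunks in proportion to their length.
  const totalChars = chunks.reduce((sum, c) => sum + c.length, 0);
  const duration = segment.endTime - segment.startTime;
  let cursor = segment.startTime;
  return chunks.map((text, index) => {
    const isLast = index === chunks.length - 1;
    const end = isLast ? segment.endTime : cursor + (duration * text.length) / totalChars;
    const piece: Segment = {
      startTime: cursor,
      endTime: end,
      originalText: text,
      speaker: segment.speaker,
      // A single word longer than the budget cannot be split without rewriting,
      // so it is flagged for review instead.
      needsReview: segment.needsReview || text.length > maxChars,
    };
    cursor = end;
    return piece;
  });
}
```

Because the split depends only on the input text and timing, the same transcript always yields the same subtitles, which keeps the stage easy to test.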
Step 4: Run tests to verify they pass
Run:
npm run test -- src/server/subtitleStages/segmentationStage.test.ts src/server/subtitlePipeline.test.ts
Expected: PASS
Step 5: Commit
Skip commit for now.
Task 4: Add the translation stage
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\translationStage.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\translationStage.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleGeneration.ts
Step 1: Write the failing tests
- Assert translation stage input is originalText from segmentation, not raw provider output.
- Assert it returns:
  - translatedText
  - ttsText
  - ttsLanguage
- Assert it never changes timestamps.
Step 2: Run tests to verify they fail
Run:
npm run test -- src/server/subtitleStages/translationStage.test.ts src/server/subtitleGeneration.test.ts
Expected: FAIL because translation is not separated yet.
Step 3: Write minimal implementation
- Build a translation-only stage that consumes segmented subtitles.
- Keep English subtitle generation and TTS-language generation separate but paired.
- Return TranslatedSubtitle[].
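The "never changes timestamps" guarantee is easiest to keep if the provider call only returns text and a pure merge step attaches it to the segmented subtitles. The sketch below shows such a merge; applyTranslations, TranslationResult, and the id field are hypothetical names, and the provider call itself is left out.

```typescript
interface SegmentedSubtitle {
  id: string; // assumed identifier used to pair provider output with segments
  startTime: number;
  endTime: number;
  originalText: string;
}

// Hypothetical provider output: text only, no timing.
interface TranslationResult {
  id: string;
  translatedText: string; // subtitle text shown in the editor
  ttsText: string;        // text spoken by the TTS voice
}

interface TranslatedSubtitle extends SegmentedSubtitle {
  translatedText: string;
  ttsText: string;
  ttsLanguage: string;
}

// Merges provider results into segmented subtitles by id. Timing and
// originalText are copied from the segment, never from the provider.
function applyTranslations(
  segments: readonly SegmentedSubtitle[],
  results: readonly TranslationResult[],
  ttsLanguage: string,
): TranslatedSubtitle[] {
  const byId = new Map<string, TranslationResult>();
  for (const result of results) byId.set(result.id, result);

  return segments.map((segment) => {
    const match = byId.get(segment.id);
    return {
      ...segment,
      // A missing result leaves empty strings so a later validation stage can flag it.
      translatedText: match?.translatedText.trim() ?? "",
      ttsText: match?.ttsText.trim() ?? "",
      ttsLanguage,
    };
  });
}
```

Keeping translatedText and ttsText as separate fields on the same record is one way to keep the two outputs "separate but paired".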
Step 4: Run tests to verify they pass
Run:
npm run test -- src/server/subtitleStages/translationStage.test.ts src/server/subtitleGeneration.test.ts
Expected: PASS
Step 5: Commit
Skip commit for now.
Task 5: Add voice matching and validation stages
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\voiceMatchingStage.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\voiceMatchingStage.test.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\validationStage.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleStages\validationStage.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\voices.ts
Step 1: Write the failing tests
- Assert voice matching only picks from the current language-specific catalog.
- Assert it falls back safely when gender or speaker tone is missing.
- Assert validation returns warnings for:
  - low confidence transcript
  - voice language mismatch
  - empty translation
  - timing overlap
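The four warning cases above could be checked by a validator along these lines. This is only a sketch: the issue codes, the 0.6 confidence threshold, the voiceLanguage field, and the assumption that subtitles arrive sorted by start time are all illustrative choices, not part of the plan.

```typescript
interface FinalSubtitle {
  id: string;
  startTime: number;
  endTime: number;
  translatedText: string;
  confidence: number;
  ttsLanguage: string;
  voiceLanguage: string; // assumed: language of the matched voice
}

interface ValidationIssue {
  code: string;
  severity: "warning" | "error";
  subtitleId: string;
}

const LOW_CONFIDENCE_THRESHOLD = 0.6; // assumed

// Inspects subtitles in order and reports problems without modifying them.
// Assumes the input is sorted by startTime.
function validateSubtitles(subtitles: readonly FinalSubtitle[]): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const warn = (code: string, subtitleId: string): void => {
    issues.push({ code, severity: "warning", subtitleId });
  };

  let previousEnd = Number.NEGATIVE_INFINITY;
  for (const s of subtitles) {
    if (s.confidence < LOW_CONFIDENCE_THRESHOLD) warn("LOW_CONFIDENCE", s.id);
    if (s.voiceLanguage !== s.ttsLanguage) warn("VOICE_LANGUAGE_MISMATCH", s.id);
    if (s.translatedText.trim() === "") warn("EMPTY_TRANSLATION", s.id);
    // Overlap: this subtitle starts before an earlier one has ended.
    if (s.startTime < previousEnd) warn("TIMING_OVERLAP", s.id);
    previousEnd = Math.max(previousEnd, s.endTime);
  }
  return issues;
}
```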
Step 2: Run tests to verify they fail
Run:
npm run test -- src/server/subtitleStages/voiceMatchingStage.test.ts src/server/subtitleStages/validationStage.test.ts
Expected: FAIL because neither stage exists yet.
Step 3: Write minimal implementation
- Implement a pure voice matcher that adds voiceId and never rewrites text.
- Implement a validator that inspects final subtitles and returns ValidationIssue[].
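A pure voice matcher in the sense described above might look like the following sketch. The Voice shape, the speakerGender field, and the choice to return a null voiceId when the catalog has no voice for the language are assumptions; the actual catalog in src/voices.ts may be structured differently.

```typescript
type Gender = "male" | "female";

// Hypothetical catalog entry.
interface Voice {
  id: string;
  language: string;
  gender: Gender;
}

interface TranslatedSubtitle {
  id: string;
  ttsText: string;
  ttsLanguage: string;
  speakerGender?: Gender; // assumed optional metadata from transcription
}

interface VoiceMatchedSubtitle extends TranslatedSubtitle {
  voiceId: string | null; // null: no catalog voice exists for ttsLanguage
}

// Adds a voiceId to each subtitle. Text and language fields are copied as-is,
// and neither the subtitles nor the catalog are mutated.
function matchVoices(
  subtitles: readonly TranslatedSubtitle[],
  catalog: readonly Voice[],
): VoiceMatchedSubtitle[] {
  return subtitles.map((subtitle) => {
    // Only voices for the subtitle's TTS language are eligible.
    const candidates = catalog.filter((v) => v.language === subtitle.ttsLanguage);
    // Prefer a gender match; otherwise fall back to the first voice in that language.
    const pick: Voice | undefined =
      candidates.find((v) => v.gender === subtitle.speakerGender) ?? candidates[0];
    return { ...subtitle, voiceId: pick ? pick.id : null };
  });
}
```

Returning null instead of borrowing a voice from another language keeps the matcher within the language-specific catalog and leaves the gap visible to validation.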
Step 4: Run tests to verify they pass
Run:
npm run test -- src/server/subtitleStages/voiceMatchingStage.test.ts src/server/subtitleStages/validationStage.test.ts
Expected: PASS
Step 5: Commit
Skip commit for now.
Task 6: Integrate the orchestrator and async job progress
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\multiStageSubtitleGeneration.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\multiStageSubtitleGeneration.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleGeneration.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleJobs.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\services\subtitleService.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\types.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\server.ts
Step 1: Write the failing tests
- Assert the orchestrator runs stages in order:
  - transcription
  - segmentation
  - translation
  - voice matching
  - validation
- Assert async progress updates expose stage-specific messages.
- Assert final SubtitlePipelineResult stays backward compatible for the editor.
Step 2: Run tests to verify they fail
Run:
npm run test -- src/server/multiStageSubtitleGeneration.test.ts src/server/subtitleJobs.test.ts src/services/subtitleService.test.ts
Expected: FAIL because the orchestrator and new stage progress do not exist yet.
Step 3: Write minimal implementation
- Add multiStageSubtitleGeneration.ts.
- Route existing backend entry points through the orchestrator.
- Keep /generate-subtitles and polling payloads stable.
- Include optional validation diagnostics in the final result.
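The stage ordering and progress reporting could be wired together roughly as follows. This is a simplified sketch: the stage functions are injected and synchronous to keep the example short, whereas the real stages would presumably return promises and be awaited. The names runSubtitlePipeline, PipelineStages, StageName, and the progress messages are assumptions.

```typescript
type StageName = "transcription" | "segmentation" | "translation" | "voiceMatching" | "validation";

// Hypothetical bundle of stage functions; each consumes the previous stage's output.
interface PipelineStages<T, S, R, V, I> {
  transcribe: () => T;
  segment: (transcript: T) => S;
  translate: (segmented: S) => R;
  matchVoices: (translated: R) => V;
  validate: (matched: V) => I;
}

type ProgressCallback = (stage: StageName, message: string) => void;

// Runs the five stages in a fixed order and reports progress before each one.
// Validation output is returned alongside the subtitles as diagnostics, so the
// subtitle payload itself keeps its existing shape.
function runSubtitlePipeline<T, S, R, V, I>(
  stages: PipelineStages<T, S, R, V, I>,
  onProgress: ProgressCallback,
): { subtitles: V; diagnostics: I } {
  onProgress("transcription", "Transcribing audio");
  const transcript = stages.transcribe();

  onProgress("segmentation", "Splitting transcript into subtitles");
  const segmented = stages.segment(transcript);

  onProgress("translation", "Translating subtitles");
  const translated = stages.translate(segmented);

  onProgress("voiceMatching", "Matching voices");
  const subtitles = stages.matchVoices(translated);

  onProgress("validation", "Validating result");
  const diagnostics = stages.validate(subtitles);

  return { subtitles, diagnostics };
}
```

Injecting the stages makes the ordering test independent of any provider: a test can pass stub functions and assert only on the sequence of progress calls.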
Step 4: Run tests to verify they pass
Run:
npm run test -- src/server/multiStageSubtitleGeneration.test.ts src/server/subtitleJobs.test.ts src/services/subtitleService.test.ts
Expected: PASS
Step 5: Run focused regression tests
Run:
npm run test -- src/server/videoSubtitleGeneration.test.ts src/server/subtitleGeneration.test.ts src/server/subtitleJobs.test.ts src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx
Expected: PASS
Step 6: Commit
Skip commit for now.