# Precise Dialogue Localization Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Build a high-precision subtitle pipeline that returns accurate sentence boundaries, word-level timings, and real speaker attribution while preserving the current editor flow.

**Architecture:** Keep the React app and `server.ts` as the public entry points, but move timing-critical work into a dedicated alignment adapter. The backend normalizes aligned words into sentence subtitles, translates text without changing timing, and returns quality metadata so the editor can enable or disable precision UI safely.

**Tech Stack:** React 19, TypeScript, Vite, Express, FFmpeg, OpenAI SDK, a new test runner (`vitest`), and a high-precision alignment backend adapter.

---
### Task 1: Add Test Infrastructure

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\package.json`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\vitest.config.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\test\setup.ts`

**Step 1: Write the failing test**

Create a minimal smoke test first so the test runner has a real target.

```ts
import { describe, expect, it } from 'vitest';

describe('test harness', () => {
  it('runs vitest in this workspace', () => {
    expect(true).toBe(true);
  });
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run`
Expected: FAIL because no `test` script or Vitest config exists yet.

**Step 3: Write minimal implementation**

1. Add `test` and `test:watch` scripts to `package.json`.
2. Add `vitest` as a dev dependency.
3. Create `vitest.config.ts` with a Node environment default.
4. Add `src/test/setup.ts` for shared setup.

```ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    setupFiles: ['./src/test/setup.ts'],
  },
});
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run`
Expected: PASS with the smoke test.

**Step 5: Commit**

```bash
git add package.json vitest.config.ts src/test/setup.ts
git commit -m "test: add vitest infrastructure"
```
### Task 2: Extract Subtitle Pipeline Types and Normalizers

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\types.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\subtitlePipeline.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\subtitlePipeline.test.ts`

**Step 1: Write the failing test**

Write tests for normalization from aligned word payloads to UI-ready subtitles.

```ts
it('derives subtitle boundaries from first and last word', () => {
  const result = normalizeAlignedSentence({
    id: 's1',
    speakerId: 'spk_0',
    words: [
      { text: 'Hello', startTime: 1.2, endTime: 1.5, speakerId: 'spk_0', confidence: 0.99 },
      { text: 'world', startTime: 1.6, endTime: 2.0, speakerId: 'spk_0', confidence: 0.98 },
    ],
    originalText: 'Hello world',
    translatedText: '你好世界',
  });

  expect(result.startTime).toBe(1.2);
  expect(result.endTime).toBe(2.0);
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/subtitlePipeline.test.ts`
Expected: FAIL because the new module and extended types do not exist.

**Step 3: Write minimal implementation**

1. Extend `Subtitle` in `src/types.ts` with `speakerId`, `words`, and `confidence`.
2. Create a pure helper module that normalizes backend payloads into frontend subtitles.

```ts
export const deriveSubtitleBounds = (words: WordTiming[]) => ({
  startTime: words[0]?.startTime ?? 0,
  endTime: words[words.length - 1]?.endTime ?? 0,
});
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/subtitlePipeline.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/types.ts src/lib/subtitlePipeline.ts src/lib/subtitlePipeline.test.ts
git commit -m "feat: add subtitle pipeline normalizers"
```
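
The Step 1 test calls `normalizeAlignedSentence`, while the implementation sketch only shows `deriveSubtitleBounds`. One way the two could fit together, assuming the `WordTiming` and sentence shapes from the plan's fixtures (the minimum-confidence rule is an illustrative choice, not settled design):

```ts
// Word and sentence shapes assumed from the test fixtures in this plan.
interface WordTiming {
  text: string;
  startTime: number;
  endTime: number;
  speakerId: string;
  confidence: number;
}

interface AlignedSentence {
  id: string;
  speakerId: string;
  words: WordTiming[];
  originalText: string;
  translatedText: string;
}

export const deriveSubtitleBounds = (words: WordTiming[]) => ({
  startTime: words[0]?.startTime ?? 0,
  endTime: words[words.length - 1]?.endTime ?? 0,
});

// Compose the bounds helper with a sentence-level confidence:
// here, the minimum word confidence (an assumption, not a requirement).
export const normalizeAlignedSentence = (sentence: AlignedSentence) => ({
  ...sentence,
  ...deriveSubtitleBounds(sentence.words),
  confidence: sentence.words.reduce((min, w) => Math.min(min, w.confidence), 1),
});
```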
### Task 3: Implement Sentence Reconstruction Helpers

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\alignment\sentenceReconstruction.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\alignment\sentenceReconstruction.test.ts`

**Step 1: Write the failing test**

Cover pause splitting and speaker splitting.

```ts
it('splits sentences when speaker changes', () => {
  const result = rebuildSentences([
    { text: 'Hi', startTime: 0.0, endTime: 0.2, speakerId: 'spk_0', confidence: 0.9 },
    { text: 'there', startTime: 0.25, endTime: 0.5, speakerId: 'spk_0', confidence: 0.9 },
    { text: 'no', startTime: 0.55, endTime: 0.7, speakerId: 'spk_1', confidence: 0.9 },
  ]);

  expect(result).toHaveLength(2);
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/alignment/sentenceReconstruction.test.ts`
Expected: FAIL because the helper module is missing.

**Step 3: Write minimal implementation**

Implement pure splitting rules:

1. Split on `speakerId` change.
2. Split when the gap between adjacent words exceeds `0.45` seconds.
3. Split when the accumulated sentence duration exceeds `8` seconds.

```ts
if (nextWord.speakerId !== currentSpeakerId) {
  flushSentence();
}
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/alignment/sentenceReconstruction.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/lib/alignment/sentenceReconstruction.ts src/lib/alignment/sentenceReconstruction.test.ts
git commit -m "feat: add sentence reconstruction rules"
```
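
The three splitting rules above can be sketched as a single pure pass over the word list. This is a minimal illustration assuming the `WordTiming` shape used throughout this plan, and it returns raw word groups rather than full sentence objects:

```ts
interface WordTiming {
  text: string;
  startTime: number;
  endTime: number;
  speakerId: string;
  confidence: number;
}

const MAX_WORD_GAP = 0.45; // seconds of silence that force a split
const MAX_SENTENCE_DURATION = 8; // seconds before a sentence is cut

export const rebuildSentences = (words: WordTiming[]): WordTiming[][] => {
  const sentences: WordTiming[][] = [];
  let current: WordTiming[] = [];

  for (const word of words) {
    const last = current[current.length - 1];
    // Rule 1: speaker change; rule 2: long pause; rule 3: overlong sentence.
    const shouldSplit =
      last !== undefined &&
      (word.speakerId !== last.speakerId ||
        word.startTime - last.endTime > MAX_WORD_GAP ||
        word.endTime - current[0].startTime > MAX_SENTENCE_DURATION);
    if (shouldSplit) {
      sentences.push(current);
      current = [];
    }
    current.push(word);
  }
  if (current.length > 0) sentences.push(current);
  return sentences;
};
```

The real module would then map each group through the Task 2 normalizer to produce sentence objects.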
### Task 4: Implement Speaker Assignment Helpers

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\alignment\speakerAssignment.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\alignment\speakerAssignment.test.ts`

**Step 1: Write the failing test**

Test overlap-based speaker assignment.

```ts
it('assigns each word to the speaker segment with maximum overlap', () => {
  const word = { text: 'hello', startTime: 1.0, endTime: 1.4 };
  const speakers = [
    { speakerId: 'spk_0', startTime: 0.8, endTime: 1.1 },
    { speakerId: 'spk_1', startTime: 1.1, endTime: 1.6 },
  ];

  expect(assignSpeakerToWord(word, speakers)).toBe('spk_1');
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/alignment/speakerAssignment.test.ts`
Expected: FAIL because speaker assignment logic does not exist.

**Step 3: Write minimal implementation**

Add a pure overlap calculator and default to `unknown` when no segment overlaps.

```ts
const overlap = Math.max(
  0,
  Math.min(word.endTime, segment.endTime) - Math.max(word.startTime, segment.startTime),
);
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/alignment/speakerAssignment.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/lib/alignment/speakerAssignment.ts src/lib/alignment/speakerAssignment.test.ts
git commit -m "feat: add speaker assignment helpers"
```
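
Combining the overlap calculator with the `unknown` default from Step 3, a self-contained sketch of `assignSpeakerToWord` might look like this (the span and segment shapes are assumptions drawn from the test fixture):

```ts
interface TimedSpan {
  startTime: number;
  endTime: number;
}

interface SpeakerSegment extends TimedSpan {
  speakerId: string;
}

// Pick the diarization segment with the largest temporal overlap;
// fall back to 'unknown' when no segment overlaps the word at all.
export const assignSpeakerToWord = (word: TimedSpan, segments: SpeakerSegment[]): string => {
  let best = 'unknown';
  let bestOverlap = 0;
  for (const segment of segments) {
    const overlap = Math.max(
      0,
      Math.min(word.endTime, segment.endTime) - Math.max(word.startTime, segment.startTime),
    );
    if (overlap > bestOverlap) {
      bestOverlap = overlap;
      best = segment.speakerId;
    }
  }
  return best;
};
```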
### Task 5: Isolate Backend Pipeline Logic from `server.ts`

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitlePipeline.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitlePipeline.test.ts`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\server.ts`

**Step 1: Write the failing test**

Add tests for orchestration-level fallback behavior.

```ts
it('returns partial quality when diarization is unavailable', async () => {
  const result = await buildSubtitlePayload({
    alignmentResult: {
      words: [{ text: 'hi', startTime: 0, endTime: 0.2, speakerId: 'unknown', confidence: 0.9 }],
      speakerSegments: [],
      quality: 'partial',
    },
  });

  expect(result.quality).toBe('partial');
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/server/subtitlePipeline.test.ts`
Expected: FAIL because orchestration code is still embedded in `server.ts`.

**Step 3: Write minimal implementation**

1. Move payload-building logic into `src/server/subtitlePipeline.ts`.
2. Make `server.ts` call the helper and only handle HTTP concerns.

```ts
export const buildSubtitlePayload = async (deps: SubtitlePipelineDeps) => {
  // normalize alignment result
  // translate text
  // return { subtitles, speakers, quality, ... }
};
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/server/subtitlePipeline.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/server/subtitlePipeline.ts src/server/subtitlePipeline.test.ts server.ts
git commit -m "refactor: isolate subtitle pipeline orchestration"
```
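
As a rough illustration of the orchestration boundary, the sketch below fills in `buildSubtitlePayload` with an injected `translate` dependency and a single whole-clip subtitle. The `translate` parameter is a hypothetical stand-in for the real translation call, and a real implementation would run the Task 3 sentence reconstruction instead of emitting one subtitle:

```ts
interface WordTiming {
  text: string;
  startTime: number;
  endTime: number;
  speakerId: string;
  confidence: number;
}

interface AlignmentResult {
  words: WordTiming[];
  speakerSegments: { speakerId: string; startTime: number; endTime: number }[];
  quality: 'full' | 'partial' | 'fallback';
}

interface SubtitlePipelineDeps {
  alignmentResult: AlignmentResult;
  translate?: (text: string) => Promise<string>; // injected so tests can stub it
}

export const buildSubtitlePayload = async ({ alignmentResult, translate }: SubtitlePipelineDeps) => {
  const originalText = alignmentResult.words.map((w) => w.text).join(' ');
  // Translation never touches timing: only the text field changes.
  const translatedText = translate ? await translate(originalText) : originalText;
  return {
    subtitles: [
      {
        startTime: alignmentResult.words[0]?.startTime ?? 0,
        endTime: alignmentResult.words[alignmentResult.words.length - 1]?.endTime ?? 0,
        originalText,
        translatedText,
        words: alignmentResult.words,
      },
    ],
    speakers: Array.from(new Set(alignmentResult.words.map((w) => w.speakerId))),
    quality: alignmentResult.quality, // propagated untouched so the UI can react
  };
};
```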
### Task 6: Add an Alignment Service Adapter

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\server\alignmentAdapter.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\server\alignmentAdapter.test.ts`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\server.ts`

**Step 1: Write the failing test**

Test that the adapter maps raw alignment responses into normalized internal types.

```ts
it('maps aligned words and speaker segments from the adapter response', async () => {
  const result = await parseAlignmentResponse({
    words: [{ word: 'hello', start: 1.0, end: 1.2, speaker: 'spk_0', score: 0.95 }],
    speakers: [{ speaker: 'spk_0', start: 0.8, end: 1.6 }],
  });

  expect(result.words[0].speakerId).toBe('spk_0');
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/server/alignmentAdapter.test.ts`
Expected: FAIL because no adapter exists.

**Step 3: Write minimal implementation**

Create an adapter boundary with one public function such as `requestAlignedTranscript(audioPath)`.

```ts
export const requestAlignedTranscript = async (audioPath: string) => {
  // call local or remote alignment backend
  // normalize response shape
};
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/server/alignmentAdapter.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/server/alignmentAdapter.ts src/server/alignmentAdapter.test.ts server.ts
git commit -m "feat: add alignment service adapter"
```
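
One plausible shape for the response-mapping half of the adapter, assuming the raw field names (`word`, `start`, `end`, `speaker`, `score`) from the Step 1 fixture. The quality rule shown here (no diarization segments means `partial`) is an assumption, not settled behavior:

```ts
// Raw field names assumed from the Step 1 test fixture.
interface RawAlignmentResponse {
  words?: { word: string; start: number; end: number; speaker?: string; score?: number }[];
  speakers?: { speaker: string; start: number; end: number }[];
}

export const parseAlignmentResponse = async (raw: RawAlignmentResponse) => ({
  // Rename backend fields into the internal WordTiming shape,
  // defaulting missing speakers and scores defensively.
  words: (raw.words ?? []).map((w) => ({
    text: w.word,
    startTime: w.start,
    endTime: w.end,
    speakerId: w.speaker ?? 'unknown',
    confidence: w.score ?? 0,
  })),
  speakerSegments: (raw.speakers ?? []).map((s) => ({
    speakerId: s.speaker,
    startTime: s.start,
    endTime: s.end,
  })),
  // Assumed rule: diarization absent means only partial quality.
  quality: (raw.speakers ?? []).length > 0 ? 'full' : 'partial',
});
```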
### Task 7: Upgrade `/api/process-audio-pipeline` Response Shape

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\server.ts`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\services\geminiService.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\services\geminiService.test.ts`

**Step 1: Write the failing test**

Add a client-side test for parsing `quality`, `speakers`, and `words`.

```ts
it('maps the enriched audio pipeline response into subtitle objects', async () => {
  const payload = {
    subtitles: [
      {
        id: 'sub_1',
        startTime: 1,
        endTime: 2,
        originalText: 'Hello',
        translatedText: '你好',
        speaker: 'Speaker 1',
        speakerId: 'spk_0',
        words: [{ text: 'Hello', startTime: 1, endTime: 2, speakerId: 'spk_0', confidence: 0.9 }],
        confidence: 0.9,
      },
    ],
    speakers: [{ speakerId: 'spk_0', label: 'Speaker 1' }],
    quality: 'full',
  };

  expect(mapPipelineResponse(payload).subtitles[0].words).toHaveLength(1);
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/services/geminiService.test.ts`
Expected: FAIL because the mapping helper does not exist.

**Step 3: Write minimal implementation**

1. Add a response-mapping helper in `src/services/geminiService.ts`.
2. Preserve the existing fallback path.
3. Carry `quality` metadata to the UI.

```ts
const quality = data.quality ?? 'fallback';
const subtitles = (data.subtitles ?? []).map(mapSubtitleFromApi);
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/services/geminiService.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add server.ts src/services/geminiService.ts src/services/geminiService.test.ts
git commit -m "feat: return enriched subtitle pipeline payloads"
```
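
A defensive version of `mapPipelineResponse` could look like the sketch below. The `ApiSubtitle` shape mirrors the Step 1 payload, and the fallback defaults (`'unknown'`, empty `words`, quality `'fallback'`) are illustrative choices for older servers that omit the new fields:

```ts
interface ApiWord {
  text: string;
  startTime: number;
  endTime: number;
  speakerId: string;
  confidence: number;
}

// Shape mirrored from the Step 1 test payload; new fields are optional
// because pre-upgrade servers will not send them.
interface ApiSubtitle {
  id: string;
  startTime: number;
  endTime: number;
  originalText: string;
  translatedText: string;
  speaker?: string;
  speakerId?: string;
  words?: ApiWord[];
  confidence?: number;
}

interface PipelinePayload {
  subtitles?: ApiSubtitle[];
  speakers?: { speakerId: string; label: string }[];
  quality?: 'full' | 'partial' | 'fallback';
}

export const mapPipelineResponse = (data: PipelinePayload) => ({
  subtitles: (data.subtitles ?? []).map((s) => ({
    ...s,
    speakerId: s.speakerId ?? 'unknown',
    words: s.words ?? [],
    confidence: s.confidence ?? 0,
  })),
  speakers: data.speakers ?? [],
  quality: data.quality ?? 'fallback', // missing metadata means fallback precision
});
```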
### Task 8: Add Precision Metadata to Editor State

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.tsx`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.test.tsx`

**Step 1: Write the failing test**

Add a test for rendering a fallback warning when `quality` is low.

```tsx
it('shows a low-precision notice for fallback subtitle results', () => {
  render(<EditorScreen ... />);
  expect(screen.getByText(/low-precision/i)).toBeInTheDocument();
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/components/EditorScreen.test.tsx`
Expected: FAIL because the component does not track pipeline quality yet.

**Step 3: Write minimal implementation**

1. Add state for `quality` and `speakers`.
2. Surface a small status badge or warning banner.
3. Keep the existing sentence list and timeline intact.

```tsx
{quality === 'fallback' && (
  <p className="text-xs text-amber-700">Low-precision timing detected. Manual review recommended.</p>
)}
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/components/EditorScreen.test.tsx`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/components/EditorScreen.tsx src/components/EditorScreen.test.tsx
git commit -m "feat: surface subtitle precision status in editor"
```
### Task 9: Add Word-Level Playback Helpers

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\playback\wordHighlight.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\playback\wordHighlight.test.ts`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.tsx`

**Step 1: Write the failing test**

Test the active-word lookup helper.

```ts
it('returns the active word for the current playback time', () => {
  const activeWord = getActiveWord([
    { text: 'Hello', startTime: 1, endTime: 1.3, speakerId: 'spk_0', confidence: 0.9 },
  ], 1.1);

  expect(activeWord?.text).toBe('Hello');
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/playback/wordHighlight.test.ts`
Expected: FAIL because playback helpers do not exist.

**Step 3: Write minimal implementation**

1. Create a pure helper for active-word lookup.
2. Use it in `EditorScreen.tsx` to render highlighted word spans when `words` are present.

```ts
export const getActiveWord = (words: WordTiming[], currentTime: number) =>
  words.find((word) => currentTime >= word.startTime && currentTime <= word.endTime);
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/playback/wordHighlight.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/lib/playback/wordHighlight.ts src/lib/playback/wordHighlight.test.ts src/components/EditorScreen.tsx
git commit -m "feat: add word-level playback highlighting"
```
### Task 10: Snap Timeline Edges to Word Boundaries

**Files:**

- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\timeline\snapToWords.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\timeline\snapToWords.test.ts`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.tsx`

**Step 1: Write the failing test**

Test snapping to nearest word edges.

```ts
it('snaps a dragged start edge to the nearest word boundary', () => {
  const next = snapTimeToNearestWordBoundary(
    1.34,
    [
      { text: 'Hello', startTime: 1.0, endTime: 1.3, speakerId: 'spk_0', confidence: 0.9 },
      { text: 'world', startTime: 1.35, endTime: 1.8, speakerId: 'spk_0', confidence: 0.9 },
    ],
  );

  expect(next).toBe(1.35);
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/timeline/snapToWords.test.ts`
Expected: FAIL because no snapping helper exists.

**Step 3: Write minimal implementation**

1. Add a pure snapping helper with a small tolerance window.
2. Use it in the left and right resize timeline handlers.

```ts
export const snapTimeToNearestWordBoundary = (time: number, words: WordTiming[]) => {
  // choose nearest start or end boundary within tolerance
};
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/timeline/snapToWords.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/lib/timeline/snapToWords.ts src/lib/timeline/snapToWords.test.ts src/components/EditorScreen.tsx
git commit -m "feat: snap subtitle edits to word boundaries"
```
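
The tolerance-window rule from Step 3 can be sketched as a nearest-boundary scan. The `0.15`-second tolerance here is an assumed placeholder, not a tuned value:

```ts
interface WordTiming {
  text: string;
  startTime: number;
  endTime: number;
  speakerId: string;
  confidence: number;
}

const SNAP_TOLERANCE = 0.15; // assumed: only snap when a boundary is this close (seconds)

export const snapTimeToNearestWordBoundary = (time: number, words: WordTiming[]): number => {
  // Every word start and end is a candidate snap target.
  const boundaries = words.flatMap((w) => [w.startTime, w.endTime]);
  let best = time; // outside the tolerance window, return the time unchanged
  let bestDistance = SNAP_TOLERANCE;
  for (const boundary of boundaries) {
    const distance = Math.abs(boundary - time);
    if (distance < bestDistance) {
      bestDistance = distance;
      best = boundary;
    }
  }
  return best;
};
```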
### Task 11: Add Speaker-Aware UI State

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.tsx`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\src\voices.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\speakers\speakerPresentation.ts`
- Create: `E:\Downloads\ai-video-dubbing-&-translation\src\lib\speakers\speakerPresentation.test.ts`

**Step 1: Write the failing test**

Test stable color and label generation for speaker tracks.

```ts
it('creates stable display metadata for each speaker id', () => {
  const speaker = buildSpeakerPresentation({ speakerId: 'spk_0', label: 'Speaker 1' });
  expect(speaker.color).toMatch(/^#/);
});
```

**Step 2: Run test to verify it fails**

Run: `npm test -- --run src/lib/speakers/speakerPresentation.test.ts`
Expected: FAIL because no speaker presentation helper exists.

**Step 3: Write minimal implementation**

1. Create a helper that derives display color and fallback label from `speakerId`.
2. Use it to color sentence chips or timeline items.
3. Keep voice assignment behavior backward compatible.

```ts
export const buildSpeakerPresentation = ({ speakerId, label }: SpeakerTrack) => ({
  speakerId,
  label,
  color: '#1677ff',
});
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run src/lib/speakers/speakerPresentation.test.ts`
Expected: PASS.

**Step 5: Commit**

```bash
git add src/components/EditorScreen.tsx src/voices.ts src/lib/speakers/speakerPresentation.ts src/lib/speakers/speakerPresentation.test.ts
git commit -m "feat: add speaker-aware editor presentation"
```
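
The Step 3 minimal implementation hardcodes one color, which passes the test but does not yet derive anything from `speakerId`. If the color should actually be stable per speaker, a deterministic palette lookup is one option; the palette and hash below are illustrative assumptions:

```ts
// Illustrative palette; real colors would come from the app's design tokens.
const SPEAKER_PALETTE = ['#1677ff', '#d4380d', '#389e0d', '#722ed1', '#d48806', '#08979c'];

// Simple deterministic string hash so the same speaker id always
// maps to the same palette entry across renders and sessions.
const hashString = (value: string): number => {
  let hash = 0;
  for (let i = 0; i < value.length; i += 1) {
    hash = (hash * 31 + value.charCodeAt(i)) >>> 0;
  }
  return hash;
};

export const buildSpeakerPresentation = ({ speakerId, label }: { speakerId: string; label?: string }) => ({
  speakerId,
  label: label ?? speakerId, // fall back to the raw id when no label exists
  color: SPEAKER_PALETTE[hashString(speakerId) % SPEAKER_PALETTE.length],
});
```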
### Task 12: Verify End-to-End Behavior and Update Docs

**Files:**

- Modify: `E:\Downloads\ai-video-dubbing-&-translation\README.md`
- Modify: `E:\Downloads\ai-video-dubbing-&-translation\docs\plans\2026-03-17-precise-dialogue-localization-design.md`

**Step 1: Write the failing test**

Write down the manual verification checklist before changing docs so the release criteria are explicit.

```md
- [ ] Single-speaker clip returns `quality: full`
- [ ] Two-speaker clip shows distinct speaker IDs
- [ ] Fallback path shows low-precision notice
- [ ] Timeline resize snaps to word boundaries
```

**Step 2: Run test to verify it fails**

Run: `npm run lint`
Expected: may PASS or FAIL depending on in-progress code; either way, the task remains incomplete until every checklist item has been exercised manually.

**Step 3: Write minimal implementation**

1. Update `README.md` with new environment requirements and pipeline description.
2. Record the manual verification results in the design document or a linked note.

```md
## High-Precision Subtitle Mode

Set the alignment backend environment variables before running the app.
```

**Step 4: Run test to verify it passes**

Run: `npm test -- --run`
Expected: PASS.

Run: `npm run lint`
Expected: PASS.

Run: `npm run build`
Expected: PASS.

**Step 5: Commit**

```bash
git add README.md docs/plans/2026-03-17-precise-dialogue-localization-design.md
git commit -m "docs: document precise dialogue localization workflow"
```
## Notes for Execution

1. This workspace currently has no `.git` directory, so the commit steps cannot be executed until the project is placed in a real Git checkout.
2. Introduce the alignment backend behind environment-based configuration so existing demos can still use the current fallback path.
3. Prefer pure functions for sentence reconstruction, speaker assignment, snapping, and word-highlighting logic so they remain easy to test.