video_translate/docs/plans/2026-03-17-export-preview-parity-design.md
2026-03-18 11:42:00 +08:00

# Export Preview Parity Design
**Date:** 2026-03-17

**Goal:** Make exported videos match the editor preview for audio mixing, subtitle timing, and visible subtitle styling.
## Current State
The editor preview and the export pipeline currently render the same edit session through different implementations:
1. The preview in `src/components/EditorScreen.tsx` overlays subtitle text with React and plays audio through browser media elements plus per-subtitle `Audio` instances.
2. The export in `server.ts` rebuilds subtitles as SRT, mixes audio with FFmpeg, and trims the final output after subtitle timing and TTS delays have already been computed.
This creates three deterministic mismatches:
1. Export mixes original audio even when the preview has muted it because instrumental BGM is present.
2. Export places subtitles and TTS clips using times that are relative to the trimmed editor session, but it trims the final video only afterward, so subtitle and TTS timing ends up shifted or cut.
3. Export ignores `textStyles`, so the rendered subtitle look differs from the preview.
## Chosen Approach
Adopt preview-first export semantics:
1. Treat the editor state as the source of truth.
2. Serialize the preview-visible subtitle data, text styles, and audio volume data explicitly.
3. Convert preview-relative subtitle timing into export timeline timing before FFmpeg rendering.
4. Generate styled subtitle overlays in the backend instead of relying on FFmpeg defaults.
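
The timing conversion in step 3 can be sketched as a small pure function. This is a minimal sketch, not the actual helper: `PreviewCue`, `TrimRange`, and `toExportTimeline` are hypothetical names, times are assumed to be in seconds, and clamping cues to the trim end is an assumption rather than a stated requirement.

```typescript
// Hypothetical shapes; field names are assumptions, not taken from the codebase.
interface PreviewCue {
  text: string;
  start: number; // seconds, relative to the start of the trimmed preview
  end: number;
}

interface TrimRange {
  start: number; // seconds on the full-video timeline
  end: number;
}

// Shift preview-relative cue times onto the full-video timeline so that a
// trim applied after rendering leaves them where the preview showed them.
function toExportTimeline(cues: PreviewCue[], trim: TrimRange | null): PreviewCue[] {
  if (!trim) {
    // No trimming: preview time and export time already coincide.
    return cues.map((cue) => ({ ...cue }));
  }
  return cues
    .map((cue) => ({
      ...cue,
      start: cue.start + trim.start,
      // Assumption: a cue that runs past the trim end is clamped to it.
      end: Math.min(cue.end + trim.start, trim.end),
    }))
    // Drop cues that would start at or after the trim end.
    .filter((cue) => cue.start < trim.end && cue.end > cue.start);
}
```

For example, with a trim range of 10 s to 20 s, a cue shown at 1 s to 3 s in the preview lands at 11 s to 13 s on the full-video timeline.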
## Architecture
### Frontend
The editor passes a richer export payload:
1. Subtitle text
2. Subtitle timing
3. Subtitle audio volume
4. Global text style settings
5. Trim range
6. Instrumental BGM base64 when present
The preview itself stays unchanged and remains the reference behavior.
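
One possible shape for this payload is sketched below. Every type and field name here (`ExportPayload`, `audioVolume`, `instrumentalBgmBase64`, `buildExportPayload`, and so on) is an assumption made for illustration, as is the default subtitle volume of 1; the real payload may differ.

```typescript
// Hypothetical types; names and fields are illustrative assumptions.
interface TextStyles {
  fontFamily: string;
  fontSize: number;
  color: string; // CSS hex colour such as "#FFFFFF"
  align: 'left' | 'center' | 'right';
  bold: boolean;
  italic: boolean;
  underline: boolean;
}

interface EditorSubtitle {
  id: string;
  text: string;
  start: number; // seconds, preview-relative
  end: number;
  audioVolume?: number; // TTS clip volume, 0..1
}

interface ExportPayload {
  subtitles: { text: string; start: number; end: number; audioVolume: number }[];
  textStyles: TextStyles;
  trim: { start: number; end: number } | null;
  instrumentalBgmBase64?: string;
}

// Build the payload in one place instead of shaping fields inline.
function buildExportPayload(
  subtitles: EditorSubtitle[],
  textStyles: TextStyles,
  trim: { start: number; end: number } | null,
  instrumentalBgmBase64?: string,
): ExportPayload {
  const payload: ExportPayload = {
    subtitles: subtitles.map((s) => ({
      text: s.text,
      start: s.start,
      end: s.end,
      // Assumption: a missing volume means full volume.
      audioVolume: s.audioVolume ?? 1,
    })),
    textStyles: { ...textStyles },
    trim: trim ? { ...trim } : null,
  };
  // Only include the BGM field when an instrumental track is present.
  if (instrumentalBgmBase64) {
    payload.instrumentalBgmBase64 = instrumentalBgmBase64;
  }
  return payload;
}
```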
### Backend Export Layer
The export route should move the parity-sensitive logic into pure helpers:
1. Build an export subtitle timeline that shifts relative editor timings back onto the full-video timeline when trimming is enabled.
2. Build an audio mix plan that mirrors preview rules:
   - Use instrumental BGM at preview volume when present.
   - Exclude original source audio when instrumental BGM is present.
   - Otherwise keep original source audio at preview volume.
   - Apply each subtitle TTS clip at its configured volume.
3. Generate ASS subtitle content so font, color, alignment, bold, italic, and underline can be rendered intentionally.
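
The audio rules in item 2 lend themselves to a pure planning function that can be tested without FFmpeg. The following is a sketch under assumptions: `AudioMixInput`, `PlannedSource`, and `planAudioMix` are hypothetical names, volumes are assumed to be linear gains in the range 0 to 1, and TTS start times are assumed to already be on the export timeline.

```typescript
// Hypothetical input and output shapes for the mix plan.
interface AudioMixInput {
  hasInstrumentalBgm: boolean;
  bgmVolume: number; // preview volume of the instrumental BGM
  originalVolume: number; // preview volume of the source audio
  ttsClips: { startSeconds: number; volume: number }[];
}

type PlannedSource =
  | { kind: 'bgm'; volume: number }
  | { kind: 'original'; volume: number }
  | { kind: 'tts'; index: number; delayMs: number; volume: number };

// Decide which audio sources take part in the export mix and at what volume.
function planAudioMix(input: AudioMixInput): PlannedSource[] {
  const sources: PlannedSource[] = [];
  if (input.hasInstrumentalBgm) {
    // BGM replaces the original audio, mirroring the preview's muting rule.
    sources.push({ kind: 'bgm', volume: input.bgmVolume });
  } else {
    sources.push({ kind: 'original', volume: input.originalVolume });
  }
  input.ttsClips.forEach((clip, index) => {
    sources.push({
      kind: 'tts',
      index,
      delayMs: Math.round(clip.startSeconds * 1000),
      volume: clip.volume,
    });
  });
  return sources;
}
```

Keeping the plan as plain data means the route only has to translate each planned source into the corresponding FFmpeg input and filter, while the parity rules themselves stay unit-testable.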
## Data Flow
1. `EditorScreen` passes `textStyles` into `ExportModal`.
2. `ExportModal` builds a structured export payload instead of manually shaping subtitle fields inline.
3. `server.ts` parses `textStyles`, normalizes subtitle timing for export, builds ASS subtitle content, and applies the preview-equivalent audio plan.
4. FFmpeg burns styled subtitles and mixes the planned audio sources.
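
The ASS generation in step 3 could look roughly like the sketch below. The ASS conventions it relies on are standard: colours are written as `&HAABBGGRR`, timestamps as `H:MM:SS.cc`, bold/italic/underline use `-1` for on and `0` for off, and alignment follows the numeric keypad with 1 to 3 on the bottom row. Everything else is an assumption made for illustration: the names `AssStyle`, `AssCue`, and `buildAss`, the 1920x1080 script resolution, the outline and margin values, and the bottom-row placement.

```typescript
// Hypothetical style and cue shapes.
interface AssStyle {
  fontFamily: string;
  fontSize: number;
  color: string; // "#RRGGBB"
  align: 'left' | 'center' | 'right';
  bold: boolean;
  italic: boolean;
  underline: boolean;
}

interface AssCue {
  text: string;
  start: number; // seconds on the export timeline
  end: number;
}

// Convert "#RRGGBB" to the ASS "&HAABBGGRR" form (alpha 00 = opaque).
function cssHexToAss(hex: string): string {
  const m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) return '&H00FFFFFF'; // assumed fallback: opaque white
  return `&H00${m[3]}${m[2]}${m[1]}`.toUpperCase();
}

// Format seconds as H:MM:SS.cc (centisecond precision).
function assTime(seconds: number): string {
  const cs = Math.max(0, Math.round(seconds * 100));
  const pad = (n: number) => (n < 10 ? '0' : '') + n;
  const h = Math.floor(cs / 360000);
  const m = Math.floor((cs % 360000) / 6000);
  const s = Math.floor((cs % 6000) / 100);
  return `${h}:${pad(m)}:${pad(s)}.${pad(cs % 100)}`;
}

function buildAss(cues: AssCue[], style: AssStyle): string {
  const flag = (on: boolean) => (on ? -1 : 0);
  const alignment = { left: 1, center: 2, right: 3 }[style.align];
  const styleLine = [
    'Default', style.fontFamily, style.fontSize, cssHexToAss(style.color),
    '&H000000FF', '&H00000000', '&H00000000',
    flag(style.bold), flag(style.italic), flag(style.underline), 0,
    100, 100, 0, 0, 1, 2, 0, alignment, 10, 10, 10, 1,
  ].join(',');
  const events = cues.map(
    (cue) =>
      `Dialogue: 0,${assTime(cue.start)},${assTime(cue.end)},Default,,0,0,0,,` +
      cue.text.replace(/\r?\n/g, '\\N'), // ASS uses \N for a hard line break
  );
  return [
    '[Script Info]',
    'ScriptType: v4.00+',
    'PlayResX: 1920',
    'PlayResY: 1080',
    '',
    '[V4+ Styles]',
    'Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding',
    `Style: ${styleLine}`,
    '',
    '[Events]',
    'Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text',
    ...events,
    '',
  ].join('\n');
}
```

This sketch does not escape `{` or `}` in subtitle text, which ASS treats as override-tag delimiters; a real helper would need to decide how to handle them.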
## Testing Strategy
Add regression coverage around pure helpers instead of FFmpeg end-to-end tests:
1. Frontend payload builder includes style and volume fields.
2. Export timeline normalization shifts subtitle timing correctly for trimmed clips.
3. Audio mix planning excludes original audio when BGM is present and keeps it at preview volume when BGM is absent.
4. ASS subtitle generation reflects the selected style settings.
## Risks
1. ASS subtitle rendering may still not be pixel-perfect relative to browser CSS.
2. Export requests that omit the style payload, such as those from existing clients, should remain backward compatible by falling back to safe defaults.
3. FFmpeg filter graph assembly becomes slightly more complex, so helper-level tests are required before touching route logic.
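
For the second risk, the fallback can live in a tolerant parser on the backend. This is a sketch only: `parseTextStyles`, the `TextStyles` fields, and the default values shown are assumptions, and it assumes `textStyles` may arrive either as a JSON string or as an already-parsed object.

```typescript
// Hypothetical style shape and defaults; the values are placeholders.
interface TextStyles {
  fontFamily: string;
  fontSize: number;
  color: string;
  align: 'left' | 'center' | 'right';
  bold: boolean;
  italic: boolean;
  underline: boolean;
}

const DEFAULT_TEXT_STYLES: TextStyles = {
  fontFamily: 'Arial',
  fontSize: 48,
  color: '#FFFFFF',
  align: 'center',
  bold: false,
  italic: false,
  underline: false,
};

// Accept a JSON string, an object, or nothing; never throw.
function parseTextStyles(raw: unknown): TextStyles {
  let value: unknown = raw;
  if (typeof raw === 'string') {
    try {
      value = JSON.parse(raw);
    } catch {
      return { ...DEFAULT_TEXT_STYLES };
    }
  }
  if (typeof value !== 'object' || value === null) {
    return { ...DEFAULT_TEXT_STYLES };
  }
  const v = value as Record<string, unknown>;
  const d = DEFAULT_TEXT_STYLES;
  const isAlign = (a: unknown): a is TextStyles['align'] =>
    a === 'left' || a === 'center' || a === 'right';
  // Fall back field by field so a partial payload still works.
  return {
    fontFamily: typeof v.fontFamily === 'string' && v.fontFamily ? v.fontFamily : d.fontFamily,
    fontSize: typeof v.fontSize === 'number' && v.fontSize > 0 ? v.fontSize : d.fontSize,
    color: typeof v.color === 'string' && v.color ? v.color : d.color,
    align: isAlign(v.align) ? v.align : d.align,
    bold: typeof v.bold === 'boolean' ? v.bold : d.bold,
    italic: typeof v.italic === 'boolean' ? v.italic : d.italic,
    underline: typeof v.underline === 'boolean' ? v.underline : d.underline,
  };
}
```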