Doubao LLM Provider Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Add a user-visible LLM switcher that lets subtitle generation use Doubao or Gemini, defaults to Doubao, and keeps TTS fixed on MiniMax with all provider keys sourced from .env.
Architecture: Move subtitle generation behind a new server endpoint, introduce a provider abstraction for Gemini and Doubao, and update the editor to send the selected provider while continuing to use the existing subtitle shape. Keep MiniMax TTS separate and untouched except for regression coverage.
Tech Stack: React, TypeScript, Express, multer, fetch, Vitest
Task 1: Add provider types and configuration resolution
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\llmProvider.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\llmProvider.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\audioPipelineConfig.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\audioPipelineConfig.test.ts
Step 1: Write the failing test
import { describe, expect, it } from 'vitest';
import { normalizeLlmProvider, resolveLlmProviderConfig } from './llmProvider';
describe('llmProvider config', () => {
  it('defaults to doubao when no provider override is set', () => {
    expect(normalizeLlmProvider(undefined)).toBe('doubao');
  });
  it('returns the selected provider key from env', () => {
    expect(
      resolveLlmProviderConfig('doubao', {
        ARK_API_KEY: 'ark-key',
        GEMINI_API_KEY: 'gemini-key',
      }),
    ).toEqual(expect.objectContaining({ provider: 'doubao', apiKey: 'ark-key' }));
  });
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/server/llmProvider.test.ts src/server/audioPipelineConfig.test.ts
Expected: FAIL because llmProvider.ts does not exist and audioPipelineConfig.ts still only exposes Gemini config.
Step 3: Write minimal implementation
export type LlmProvider = 'doubao' | 'gemini';
export const normalizeLlmProvider = (value?: string): LlmProvider =>
  value?.toLowerCase() === 'gemini' ? 'gemini' : 'doubao';
export const resolveLlmProviderConfig = (
  provider: LlmProvider,
  env: NodeJS.ProcessEnv,
) => {
  if (provider === 'doubao') {
    const apiKey = env.ARK_API_KEY?.trim();
    if (!apiKey) throw new Error('ARK_API_KEY is required for Doubao subtitle generation.');
    return {
      provider,
      apiKey,
      model: env.DOUBAO_MODEL?.trim() || 'doubao-seed-2-0-pro-260215',
      baseUrl: 'https://ark.cn-beijing.volces.com/api/v3/responses',
    };
  }
  const apiKey = env.GEMINI_API_KEY?.trim();
  if (!apiKey) throw new Error('GEMINI_API_KEY is required for Gemini subtitle generation.');
  return {
    provider,
    apiKey,
    model: 'gemini-2.5-flash',
  };
};
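The fallback rule above can be exercised on its own. The following is a minimal sketch that restates normalizeLlmProvider so it runs standalone and adds a hypothetical trimmed variant, normalizeLlmProviderTrimmed, which is not part of the plan. It illustrates that any unrecognized value, including one padded with whitespace, falls back to Doubao unless the input is trimmed first.

```typescript
// Restated from the plan so this snippet runs on its own.
type LlmProvider = 'doubao' | 'gemini';

const normalizeLlmProvider = (value?: string): LlmProvider =>
  value?.toLowerCase() === 'gemini' ? 'gemini' : 'doubao';

// Hypothetical variant (an assumption, not in the plan): trims whitespace first,
// so a padded value such as ' gemini ' from a form field still selects Gemini.
const normalizeLlmProviderTrimmed = (value?: string): LlmProvider =>
  normalizeLlmProvider(value?.trim());

const samples: Array<string | undefined> = [undefined, '', 'Gemini', ' gemini ', 'openai'];
for (const sample of samples) {
  // Prints the plan's result and the trimmed variant's result side by side.
  console.log(
    JSON.stringify(sample),
    '->',
    normalizeLlmProvider(sample),
    '/',
    normalizeLlmProviderTrimmed(sample),
  );
}
```

Whether trimming belongs in the normalizer or in the request parser is a design choice; the plan's version leaves it out.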
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/server/llmProvider.test.ts src/server/audioPipelineConfig.test.ts
Expected: PASS
Step 5: Commit
git add src/server/llmProvider.ts src/server/llmProvider.test.ts src/server/audioPipelineConfig.ts src/server/audioPipelineConfig.test.ts
git commit -m "feat: add llm provider configuration"
Task 2: Add the Doubao provider parser and contract tests
Files:
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\doubaoProvider.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\doubaoProvider.test.ts
Step 1: Write the failing test
import { describe, expect, it } from 'vitest';
import { extractDoubaoTextOutput } from './doubaoProvider';
describe('extractDoubaoTextOutput', () => {
  it('reconstructs text from the Ark output array', () => {
    const text = extractDoubaoTextOutput({
      output: [
        {
          type: 'message',
          content: [{ type: 'output_text', text: '[{"id":"1","translatedText":"你好"}]' }],
        },
      ],
    });
    expect(text).toContain('translatedText');
  });
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/server/doubaoProvider.test.ts
Expected: FAIL because doubaoProvider.ts does not exist.
Step 3: Write minimal implementation
export const extractDoubaoTextOutput = (payload: any): string =>
  (payload?.output ?? [])
    .flatMap((item: any) => item?.content ?? [])
    .filter((part: any) => part?.type === 'output_text')
    .map((part: any) => part.text ?? '')
    .join('')
    .trim();
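As a possible follow-up to the minimal parser, the sketch below shows a typed variant without `any`, plus a hypothetical helper, parseTranslatedRows, that turns the extracted text into subtitle rows. The interfaces are assumptions inferred only from the test fixture in this task (output items carrying content parts of type output_text); the real Ark payload may contain more fields.

```typescript
// Assumed minimal payload shape, inferred from the plan's test fixture only.
interface ArkContentPart { type?: string; text?: string }
interface ArkOutputItem { type?: string; content?: ArkContentPart[] }
interface ArkResponsePayload { output?: ArkOutputItem[] }

const extractDoubaoTextOutput = (payload: ArkResponsePayload | null | undefined): string =>
  (payload?.output ?? [])
    .flatMap((item) => item.content ?? [])
    .filter((part) => part.type === 'output_text')
    .map((part) => part.text ?? '')
    .join('')
    .trim();

// Hypothetical row type: only `id` and `translatedText` appear in the plan's fixture.
interface TranslatedRow { id: string; translatedText: string }

const parseTranslatedRows = (payload: ArkResponsePayload): TranslatedRow[] => {
  const parsed: unknown = JSON.parse(extractDoubaoTextOutput(payload));
  if (!Array.isArray(parsed)) {
    throw new Error('Expected a JSON array of subtitle rows.');
  }
  return parsed.map((row) => ({
    id: String(row.id),
    translatedText: String(row.translatedText),
  }));
};

// Usage: text split across two parts is joined before parsing; items without content are skipped.
const rows = parseTranslatedRows({
  output: [
    { type: 'reasoning' },
    {
      type: 'message',
      content: [
        { type: 'output_text', text: '[{"id":"1",' },
        { type: 'output_text', text: '"translatedText":"你好"}]' },
      ],
    },
  ],
});
console.log(rows);
```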
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/server/doubaoProvider.test.ts
Expected: PASS
Step 5: Commit
git add src/server/doubaoProvider.ts src/server/doubaoProvider.test.ts
git commit -m "feat: add doubao response parsing"
Task 3: Add provider-backed translation adapters
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\geminiTranslation.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\providerTranslation.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\providerTranslation.test.ts
- Test: E:\Downloads\ai-video-dubbing-&-translation\src\server\geminiTranslation.test.ts
Step 1: Write the failing test
import { describe, expect, it } from 'vitest';
import { createSentenceTranslator } from './providerTranslation';
describe('createSentenceTranslator', () => {
  it('returns a Doubao translator when provider is doubao', () => {
    const translator = createSentenceTranslator({
      provider: 'doubao',
      apiKey: 'ark-key',
      model: 'doubao-seed-2-0-pro-260215',
    });
    expect(typeof translator).toBe('function');
  });
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/server/providerTranslation.test.ts src/server/geminiTranslation.test.ts
Expected: FAIL because the provider selection layer does not exist.
Step 3: Write minimal implementation
export const createSentenceTranslator = (config: ProviderConfig) => {
  if (config.provider === 'doubao') {
    return createDoubaoSentenceTranslator(config);
  }
  return createGeminiSentenceTranslator(config);
};
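The plan does not spell out what createDoubaoSentenceTranslator does internally. One way it might look is sketched below, under several assumptions: the translator signature (text, targetLanguage) => Promise<string>, the injected fetchImpl parameter, the FetchLike type, and the request body fields model and input are all illustrative and should be checked against the Ark documentation and the existing Gemini translator before use.

```typescript
// Minimal fetch-like contract so the sketch does not depend on DOM or Node typings.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponseLike>;

interface DoubaoConfig {
  apiKey: string;
  model: string;
  baseUrl: string;
}

// Hypothetical translator signature; the plan only requires that a function is returned.
type SentenceTranslator = (text: string, targetLanguage: string) => Promise<string>;

const createDoubaoSentenceTranslator =
  (config: DoubaoConfig, fetchImpl: FetchLike): SentenceTranslator =>
  async (text, targetLanguage) => {
    const response = await fetchImpl(config.baseUrl, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${config.apiKey}`,
        'Content-Type': 'application/json',
      },
      // Assumed request body; the exact fields are not given in the plan.
      body: JSON.stringify({
        model: config.model,
        input: `Translate into ${targetLanguage}: ${text}`,
      }),
    });
    if (!response.ok) {
      // Surface the status so auth failures (for example 401) are easy to spot.
      throw new Error(`Doubao request failed with status ${response.status}.`);
    }
    const payload = (await response.json()) as {
      output?: Array<{ content?: Array<{ type?: string; text?: string }> }>;
    };
    // Same extraction rule as extractDoubaoTextOutput in Task 2.
    return (payload.output ?? [])
      .flatMap((item) => item.content ?? [])
      .filter((part) => part.type === 'output_text')
      .map((part) => part.text ?? '')
      .join('')
      .trim();
  };
```

Injecting fetchImpl keeps the adapter testable without network access, which mirrors how the frontend service in Task 5 accepts a fetch implementation.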
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/server/providerTranslation.test.ts src/server/geminiTranslation.test.ts
Expected: PASS
Step 5: Commit
git add src/server/providerTranslation.ts src/server/providerTranslation.test.ts src/server/geminiTranslation.ts src/server/geminiTranslation.test.ts
git commit -m "feat: add provider-based translation adapters"
Task 4: Add a dedicated subtitle-generation endpoint
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\server.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleRequest.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitleRequest.test.ts
- Test: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitlePipeline.test.ts
Step 1: Write the failing test
import { describe, expect, it } from 'vitest';
import { parseSubtitleRequest } from './subtitleRequest';
describe('parseSubtitleRequest', () => {
  it('defaults provider to doubao', () => {
    expect(parseSubtitleRequest({ body: {} as any }).provider).toBe('doubao');
  });
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/server/subtitleRequest.test.ts src/server/subtitlePipeline.test.ts
Expected: FAIL because the request parser does not exist.
Step 3: Write minimal implementation
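// normalizeLlmProvider comes from Task 1 and lives in the same directory.
import { normalizeLlmProvider } from './llmProvider';
export const parseSubtitleRequest = (req: { body: Record<string, unknown> }) => ({
  provider: normalizeLlmProvider(String(req.body.provider || 'doubao')),
  targetLanguage: String(req.body.targetLanguage || ''),
});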
Then update server.ts to expose POST /api/generate-subtitles, validate input, resolve provider config, and return normalized subtitles.
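The "validate input" part of the endpoint could be sketched as a pure function that the route handler calls before resolving provider config. Everything here is an assumption layered on the plan: validateSubtitleRequest, parseTrimRange, and the 400 result shape are hypothetical names, and the trimRange rule (finite numbers, start >= 0, end > start) is a guess based on the frontend sending trimRange as a JSON string in Task 5.

```typescript
type LlmProvider = 'doubao' | 'gemini';
interface TrimRange { start: number; end: number }

interface SubtitleRequest {
  provider: LlmProvider;
  targetLanguage: string;
  trimRange: TrimRange | null;
}

type ValidationResult =
  | { ok: true; value: SubtitleRequest }
  | { ok: false; status: 400; error: string };

// Multipart fields arrive as strings, so trimRange is expected to be a JSON string.
const parseTrimRange = (raw: unknown): TrimRange | null | 'invalid' => {
  if (raw === undefined || raw === null || raw === '') return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(String(raw));
  } catch {
    return 'invalid';
  }
  if (typeof parsed !== 'object' || parsed === null) return 'invalid';
  const { start, end } = parsed as { start?: unknown; end?: unknown };
  if (typeof start !== 'number' || typeof end !== 'number') return 'invalid';
  if (!Number.isFinite(start) || !Number.isFinite(end)) return 'invalid';
  if (start < 0 || end <= start) return 'invalid';
  return { start, end };
};

const validateSubtitleRequest = (body: Record<string, unknown>): ValidationResult => {
  const targetLanguage = String(body.targetLanguage ?? '').trim();
  if (!targetLanguage) {
    return { ok: false, status: 400, error: 'targetLanguage is required.' };
  }
  const trimRange = parseTrimRange(body.trimRange);
  if (trimRange === 'invalid') {
    return { ok: false, status: 400, error: 'trimRange must be JSON with numeric start < end.' };
  }
  // Same fallback rule as normalizeLlmProvider in Task 1.
  const provider: LlmProvider =
    String(body.provider ?? '').toLowerCase() === 'gemini' ? 'gemini' : 'doubao';
  return { ok: true, value: { provider, targetLanguage, trimRange } };
};

console.log(validateSubtitleRequest({ targetLanguage: 'English', trimRange: '{"start":1,"end":5}' }));
console.log(validateSubtitleRequest({ provider: 'gemini' }));
```

Keeping validation free of Express types lets it be unit-tested the same way as parseSubtitleRequest, with the route handler reduced to translating the result into a response.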
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/server/subtitleRequest.test.ts src/server/subtitlePipeline.test.ts
Expected: PASS
Step 5: Commit
git add server.ts src/server/subtitleRequest.ts src/server/subtitleRequest.test.ts src/server/subtitlePipeline.test.ts
git commit -m "feat: add subtitle generation endpoint"
Task 5: Update the frontend subtitle service to use the new endpoint
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\services\geminiService.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\services\subtitleService.ts
- Create: E:\Downloads\ai-video-dubbing-&-translation\src\services\subtitleService.test.ts
- Test: E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.test.tsx
Step 1: Write the failing test
import { describe, expect, it, vi } from 'vitest';
import { generateSubtitles } from './subtitleService';
describe('generateSubtitles', () => {
  it('posts the selected provider to the server', async () => {
    const fetchMock = vi.fn(async () => ({
      ok: true,
      json: async () => ({ subtitles: [] }),
    }));
    await generateSubtitles(new File(['x'], 'clip.mp4'), 'English', 'doubao', null, fetchMock as any);
    expect(fetchMock.mock.calls[0][0]).toBe('/api/generate-subtitles');
  });
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx
Expected: FAIL because the new service does not exist and the editor still uses the Gemini-specific service directly.
Step 3: Write minimal implementation
export const generateSubtitles = async (
  videoFile: File,
  targetLanguage: string,
  provider: 'doubao' | 'gemini',
  trimRange?: { start: number; end: number } | null,
  fetchImpl: typeof fetch = fetch,
) => {
  const formData = new FormData();
  formData.append('video', videoFile);
  formData.append('targetLanguage', targetLanguage);
  formData.append('provider', provider);
  if (trimRange) {
    // Multipart fields are strings, so the range is sent as JSON.
    formData.append('trimRange', JSON.stringify(trimRange));
  }
  const response = await fetchImpl('/api/generate-subtitles', {
    method: 'POST',
    body: formData,
  });
  // Surface HTTP failures instead of parsing an error body as subtitles.
  if (!response.ok) {
    throw new Error(`Subtitle generation failed with status ${response.status}.`);
  }
  return response.json();
};
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx
Expected: PASS
Step 5: Commit
git add src/services/subtitleService.ts src/services/subtitleService.test.ts src/services/geminiService.ts src/components/EditorScreen.test.tsx
git commit -m "feat: route subtitle generation through the server"
Task 6: Add the editor LLM selector and default it to Doubao
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.tsx
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\components\EditorScreen.test.tsx
Step 1: Write the failing test
it('defaults the llm selector to Doubao', () => {
  render(<EditorScreen videoFile={file} targetLanguage="English" onBack={() => {}} />);
  expect(screen.getByLabelText(/llm/i)).toHaveValue('doubao');
});
Step 2: Run test to verify it fails
Run: node .\node_modules\vitest\vitest.mjs run src/components/EditorScreen.test.tsx
Expected: FAIL because the selector does not exist.
Step 3: Write minimal implementation
const [llmProvider, setLlmProvider] = useState<'doubao' | 'gemini'>('doubao');
<label>
  LLM
  <select
    aria-label="LLM"
    value={llmProvider}
    onChange={(event) => setLlmProvider(event.target.value as 'doubao' | 'gemini')}
  >
    <option value="doubao">Doubao</option>
    <option value="gemini">Gemini</option>
  </select>
</label>
Then pass llmProvider into the subtitle-generation service.
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/components/EditorScreen.test.tsx
Expected: PASS
Step 5: Commit
git add src/components/EditorScreen.tsx src/components/EditorScreen.test.tsx
git commit -m "feat: add llm selector to the editor"
Task 7: Add end-to-end provider and regression coverage
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\subtitlePipeline.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\services\geminiService.test.ts
- Modify: E:\Downloads\ai-video-dubbing-&-translation\src\server\minimaxTts.test.ts
Step 1: Write the failing test
it('does not change TTS behavior when the llm provider changes', async () => {
  expect(true).toBe(true);
});
Step 2: Run test to verify it fails meaningfully
Run: node .\node_modules\vitest\vitest.mjs run src/server/subtitlePipeline.test.ts src/services/geminiService.test.ts src/server/minimaxTts.test.ts
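Expected: FAIL once the placeholder assertion is replaced with real checks. The placeholder itself passes trivially, so strengthen its assertions until the new provider path is covered.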
Step 3: Write minimal implementation
Add regression tests that prove:
- selected provider is forwarded correctly
- Doubao auth failures surface clearly
- Gemini still works when selected
- MiniMax TTS tests continue to pass unchanged
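The properties listed above can be made concrete with injected fakes. The sketch below is illustrative only: runDubbing, PipelineDeps, and makeDeps are hypothetical stand-ins rather than names from the codebase, and the real tests would exercise the actual pipeline and MiniMax modules. It shows the kind of assertion that could replace the placeholder: the TTS step receives the same input whichever LLM provider is selected, and a Doubao auth failure carries a message that names the missing key.

```typescript
type LlmProvider = 'doubao' | 'gemini';

// Hypothetical stand-ins for the real pipeline pieces; shapes are assumptions.
interface PipelineDeps {
  translate: (provider: LlmProvider, text: string) => Promise<string>;
  synthesize: (text: string) => Promise<{ engine: string; text: string }>;
}

const runDubbing = async (provider: LlmProvider, text: string, deps: PipelineDeps) => {
  const translated = await deps.translate(provider, text);
  // TTS receives only the translated text, never the LLM provider.
  return deps.synthesize(translated);
};

const makeDeps = (failDoubaoAuth = false) => {
  const translateCalls: LlmProvider[] = [];
  const deps: PipelineDeps = {
    translate: async (provider, text) => {
      translateCalls.push(provider);
      if (provider === 'doubao' && failDoubaoAuth) {
        throw new Error('Doubao authentication failed (401). Check ARK_API_KEY.');
      }
      return `translated:${text}`;
    },
    synthesize: async (text) => ({ engine: 'minimax', text }),
  };
  return { deps, translateCalls };
};

const checkRegression = async (): Promise<string[]> => {
  const findings: string[] = [];
  const { deps, translateCalls } = makeDeps();
  const viaDoubao = await runDubbing('doubao', 'hello', deps);
  const viaGemini = await runDubbing('gemini', 'hello', deps);
  if (JSON.stringify(viaDoubao) !== JSON.stringify(viaGemini)) {
    findings.push('TTS output changed with the LLM provider');
  }
  if (translateCalls.join(',') !== 'doubao,gemini') {
    findings.push('selected provider was not forwarded');
  }
  try {
    await runDubbing('doubao', 'hello', makeDeps(true).deps);
    findings.push('auth failure was swallowed');
  } catch (error) {
    if (!String(error).includes('ARK_API_KEY')) findings.push('auth error is not descriptive');
  }
  return findings;
};

void checkRegression().then((findings) => console.log(findings));
```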
Step 4: Run test to verify it passes
Run: node .\node_modules\vitest\vitest.mjs run src/server/llmProvider.test.ts src/server/doubaoProvider.test.ts src/server/providerTranslation.test.ts src/server/subtitleRequest.test.ts src/server/subtitlePipeline.test.ts src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx src/server/minimaxTts.test.ts src/services/geminiService.test.ts
Expected: PASS
Step 5: Commit
git add src/server/llmProvider.test.ts src/server/doubaoProvider.test.ts src/server/providerTranslation.test.ts src/server/subtitleRequest.test.ts src/server/subtitlePipeline.test.ts src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx src/server/minimaxTts.test.ts src/services/geminiService.test.ts
git commit -m "test: cover llm provider switching"
Task 8: Verify the live app behavior
Files:
- Modify: E:\Downloads\ai-video-dubbing-&-translation\.env.example
- Modify: E:\Downloads\ai-video-dubbing-&-translation\README.md
Step 1: Write the failing doc check
Add docs assertions by inspection:
- .env.example documents ARK_API_KEY and optional DOUBAO_MODEL
- README explains the editor LLM switcher and that MiniMax remains the TTS engine
Step 2: Run verification commands
Run: node .\node_modules\vitest\vitest.mjs run
Expected: PASS for the new targeted suites or clear identification of pre-existing unrelated failures.
Run: Invoke-WebRequest -UseBasicParsing http://localhost:3000/
Expected: 200
Run manual checks:
- open the editor
- confirm the LLM selector defaults to Doubao
- generate subtitles with Doubao
- switch to Gemini
- generate subtitles again
- confirm TTS still uses MiniMax
Step 3: Write minimal documentation updates
Document:
- required env keys
- default provider
- how the editor switcher works
Step 4: Re-run verification
Run: node .\node_modules\vitest\vitest.mjs run src/server/llmProvider.test.ts src/server/doubaoProvider.test.ts src/server/providerTranslation.test.ts src/server/subtitleRequest.test.ts src/services/subtitleService.test.ts src/components/EditorScreen.test.tsx src/server/minimaxTts.test.ts
Expected: PASS
Step 5: Commit
git add .env.example README.md
git commit -m "docs: document llm provider switching"
Notes
- This workspace is not a Git repository, so the commit steps may not be executable here.
- Existing unrelated TypeScript baseline issues in src/lib/* and src/server/* should be treated as pre-existing unless the new work touches them directly.