Frequently Asked Questions

Question 1

What is Speechlab?

Accepted Answer

Speechlab is a speech-to-speech AI platform for video and audio localization and accessibility. Upload spoken content — video, audio, podcasts, audiobooks — and dub, caption, or subtitle it in 50+ languages with a full editor. This is not document or file translation; Speechlab works exclusively on spoken audio and video.

Question 2

How is Speechlab different from Google Translate or DeepL?

Accepted Answer

Google Translate and DeepL translate text documents. Speechlab translates spoken content — the input is audio or video, and the output is dubbed audio, captions, or subtitles. The entire pipeline is speech-to-speech: ASR transcribes, AI translates, TTS generates the dubbed voice. You get localized media, not a translated text file.

Question 3

What file types and formats does Speechlab support?

Accepted Answer

Input: Video (MP4, MOV, MKV, WebM), audio (MP3, WAV, M4A, FLAC), YouTube link paste, and SRT import. Files up to 1.5 GB. Output: Dubbed video, dubbed audio, SRT subtitles, VTT subtitles, captions (sidecar or burned-in), and plain-text transcripts.
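
As a rough illustration of the input limits above, a pre-upload check could look like the sketch below. The function name check_upload, the grouping of extensions, and reading "1.5 GB" as decimal gigabytes are assumptions made for this example, not Speechlab's actual validation logic.

```python
from pathlib import Path

# Extensions taken from the formats listed above; the grouping is illustrative.
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac"}
SUBTITLE_EXTS = {".srt"}

# Assumption: "1.5 GB" is read as decimal gigabytes (1.5 * 10^9 bytes).
MAX_BYTES = int(1.5 * 1000**3)


def check_upload(filename: str, size_bytes: int) -> str:
    """Return "video", "audio", or "subtitle", or raise ValueError."""
    ext = Path(filename).suffix.lower()
    if size_bytes > MAX_BYTES:
        raise ValueError(f"{filename}: {size_bytes} bytes exceeds the 1.5 GB limit")
    if ext in VIDEO_EXTS:
        return "video"
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in SUBTITLE_EXTS:
        return "subtitle"
    raise ValueError(f"{filename}: unsupported extension {ext!r}")
```

For example, check_upload("talk.MP4", 250_000_000) returns "video", while a .pdf file or a 2 GB file raises ValueError.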

Question 4

How many languages does Speechlab support?

Accepted Answer

50+ languages for dubbing, captioning, and subtitling — including major European, Asian, Middle Eastern, and African languages. Each language has native-voice options. Voice cloning availability varies by language pair.

Question 5

Do I need to install anything?

Accepted Answer

No. Speechlab runs entirely in the browser. No desktop software, no plugins, no downloads.

Question 6

Can I try Speechlab for free?

Accepted Answer

Yes. You get 2 free dubbing projects with all features included. No credit card required.

Question 7

What is AI dubbing?

Accepted Answer

AI dubbing replaces the spoken audio in a video or audio file with synthesized speech in another language. Unlike subtitles, the audience hears the content — they don't read it. It combines automatic speech recognition (ASR), machine translation, and text-to-speech synthesis (TTS) to produce a dubbed version of the original content without a recording studio or voice actors.

Question 8

How does AI dubbing work with Speechlab?

Accepted Answer

The pipeline works in steps: (1) upload your video or audio, (2) ASR transcribes with speaker diarization, (3) AI translates segment by segment, (4) you assign a voice per speaker, either cloning the original or picking a native voice from the Voice Library, (5) TTS generates the dubbed audio, (6) you export dubbed video, audio, or subtitles. Every step is editable.
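
The data flow of steps 3 to 5 can be sketched in a few lines. This is a simplified model, not Speechlab code: the Segment class, run_pipeline, and the toy translate and synthesize stand-ins are all hypothetical, and a real pipeline would call ASR, translation, and TTS services instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    speaker: str           # label from diarization, e.g. "S1"
    start: float           # seconds
    end: float             # seconds
    text: str              # source-language transcript
    translation: str = ""  # filled in by the translation step


def run_pipeline(
    segments: List[Segment],
    translate: Callable[[str], str],
    voices: Dict[str, str],
    synthesize: Callable[[str, str], bytes],
) -> List[bytes]:
    """Translate each segment, then synthesize it with its speaker's voice."""
    clips = []
    for seg in segments:
        seg.translation = translate(seg.text)             # step 3: translate
        voice = voices[seg.speaker]                       # step 4: voice per speaker
        clips.append(synthesize(seg.translation, voice))  # step 5: TTS
    return clips


# Toy stand-ins so the sketch runs without any real ASR, MT, or TTS service.
segments = [Segment("S1", 0.0, 1.5, "hello"), Segment("S2", 1.5, 3.0, "thanks")]
toy_dictionary = {"hello": "hola", "thanks": "gracias"}
clips = run_pipeline(
    segments,
    translate=lambda text: toy_dictionary[text],
    voices={"S1": "clone", "S2": "library-voice"},
    synthesize=lambda text, voice: f"{voice}:{text}".encode(),
)
```

Because each segment carries its own speaker, text, and translation, any step can be edited and re-run for a single segment without touching the rest.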

Question 9

How is dubbing different from subtitles?

Accepted Answer

Subtitles are text overlaid on video — the audience reads them. Dubbing replaces the audio — the audience hears the content in their language. Dubbing is the only localization option for audio-only content (podcasts, audiobooks) and produces a more natural experience for video where reading and watching compete for attention.

Question 10

Can AI dubbing clone the original speaker's voice?

Accepted Answer

Yes. Source-clone mode captures the original speaker's voice characteristics and synthesizes them speaking the target language. For multi-speaker content, each speaker can be cloned independently.

Question 11

What voice options are available?

Accepted Answer

Two modes per speaker: (1) Source clone: the AI replicates the original voice in the new language. (2) Native speaker voice: a natural-sounding voice native to the target language, picked from the Speechlab Voice Library, which keeps a consistent brand voice across projects.

Question 12

Can I fix one sentence without re-dubbing the entire file?

Accepted Answer

Yes. Speechlab supports segment-level re-rendering. Change a word in the translation, click "Merge Changes to Dub," and only the affected segments re-render. Your credits pay for the fix, not a full re-run.

Question 13

How does Speechlab handle multiple speakers?

Accepted Answer

ASR automatically identifies and labels speakers (diarization). Each speaker gets their own voice assignment. You can rename, merge, or reassign speakers across the project. Voices are controlled per speaker, not per file.
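
To make the per-speaker model concrete, here is a small sketch of renaming and merging diarized speakers and then resolving a voice for each segment. The dictionary layout, the relabel_speakers helper, and the voice names are hypothetical and only illustrate the idea.

```python
from typing import Dict, List

Segment = Dict[str, str]


def relabel_speakers(segments: List[Segment], mapping: Dict[str, str]) -> List[Segment]:
    """Return new segments with speaker labels renamed or merged via mapping."""
    return [
        {**seg, "speaker": mapping.get(seg["speaker"], seg["speaker"])}
        for seg in segments
    ]


segments = [
    {"speaker": "Speaker 1", "text": "Welcome back."},
    {"speaker": "Speaker 2", "text": "Glad to be here."},
    {"speaker": "Speaker 3", "text": "Let's get started."},  # same person as Speaker 1
]

# Rename Speaker 1 and 2, and merge Speaker 3 into the same "Host" label.
merged = relabel_speakers(
    segments,
    {"Speaker 1": "Host", "Speaker 3": "Host", "Speaker 2": "Guest"},
)

# Voices are keyed by speaker, so every "Host" segment shares one assignment.
voices = {"Host": "source-clone", "Guest": "library-voice"}
voice_per_segment = [voices[seg["speaker"]] for seg in merged]
```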

Question 14

Does Speechlab offer lip-sync?

Accepted Answer

Lip-sync is available for enterprise accounts on request. Contact sales for details.

Question 15

What content types work for dubbing?

Accepted Answer

Any spoken content: video (films, documentaries, YouTube, product demos), audio (podcasts, audiobooks, training modules, lectures), and more, in any supported format and of any length, up to the 1.5 GB file limit.

Question 16

How does transcription work?

Accepted Answer

Upload a video or audio file (or paste a YouTube link). ASR produces a diarized transcript with timestamps, speaker labels, and editable segments. The transcript appears in the editor where you can fix errors inline, adjust timing, and lock reviewed segments.

Question 17

What languages does Speechlab transcribe?

Accepted Answer

50+ languages, including English, Spanish, French, German, Portuguese, Japanese, Chinese, Arabic, Hindi, Korean, and many more.

Question 18

How accurate is AI transcription?

Accepted Answer

Accuracy depends on audio quality, accent, and background noise. On clean audio, modern ASR models achieve 95%+ word accuracy. Speechlab's inline editor lets you fix any errors directly — no export/re-import cycle.

Question 19

Can I transcribe a YouTube video?

Accepted Answer

Yes. Paste the YouTube URL and Speechlab fetches and transcribes it automatically. No need to download the video first.

Question 20

How long does transcription take?

Accepted Answer

Most files are transcribed in under a minute per 10 minutes of audio. Longer files and bulk uploads are queued and processed sequentially.

Question 21

Can I import an existing SRT instead of transcribing?

Accepted Answer

Yes. Import a .srt file and skip ASR entirely. Speechlab parses the segments, timestamps, and text so you can continue with translation, dubbing, or subtitle editing.

Question 22

Can multiple people edit the same transcript?

Accepted Answer

Yes. Speechlab supports concurrent editing with conflict detection — you'll see who else is editing and which segments they're working on. No silent overwrites.

Question 23

How does translation work in Speechlab?

Accepted Answer

Translation works on the speech in your video or audio. Speechlab transcribes your media, then AI translates each segment, preserving speaker attribution and timing. You edit inline, then the translation feeds directly into dubbing, captions, or subtitles.

Question 24

What translation engines does Speechlab use?

Accepted Answer

Claude, DeepL, and GPT-4 — selected per language pair for best quality. The AI translates segment by segment, preserving the structure of the spoken content.

Question 25

Can I edit the AI translation before it's dubbed?

Accepted Answer

Yes. Every translated segment is editable inline. Fix errors, adjust phrasing, match the register your audience expects. Lock segments you've reviewed to protect them from re-processing.

Question 26

How many languages can I translate into?

Accepted Answer

50+ target languages. Each language gets its own tab within the project. Add as many target languages as you need from a single source.

Question 27

Does translating into more languages cost more per language?

Accepted Answer

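Each language you add costs credits based on the source media duration. The rate per language is flat, so the total grows with the number of languages but the price of each language does not.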

Question 28

What happens if I edit a translation after generating a dub?

Accepted Answer

The dub marks the edited segments as out of sync. Click "Merge Changes to Dub" to re-render only the changed segments — not the entire project.
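
One way to picture out-of-sync tracking is to remember a fingerprint of each segment's text at render time and compare it with the current text. The sketch below is an assumption about how such tracking could work, not a description of Speechlab's internals; the Dub class and its method names are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Stable hash of a segment's translated text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Dub:
    def __init__(self, translations: Dict[int, str]) -> None:
        self.translations = dict(translations)
        # Fingerprints of the text each segment was last rendered from.
        self.rendered = {i: fingerprint(t) for i, t in translations.items()}

    def edit(self, segment_id: int, text: str) -> None:
        self.translations[segment_id] = text

    def out_of_sync(self) -> List[int]:
        """IDs of segments whose text changed since the last render."""
        return sorted(
            i for i, t in self.translations.items()
            if self.rendered.get(i) != fingerprint(t)
        )

    def merge_changes(self) -> List[int]:
        """Re-render only the changed segments and return their IDs."""
        changed = self.out_of_sync()
        for i in changed:
            # A real system would call TTS for segment i here.
            self.rendered[i] = fingerprint(self.translations[i])
        return changed


dub = Dub({1: "Hola", 2: "Gracias", 3: "Hasta luego"})
dub.edit(2, "Muchas gracias")
rerendered = dub.merge_changes()
```

In this example only segment 2 is re-rendered; segments 1 and 3 are left untouched.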

Question 29

Is this the same as Google Translate?

Accepted Answer

No. Google Translate is a text-to-text tool for documents and web pages. Speechlab translates the speech in your video or audio as part of a localization pipeline — the output is dubbed audio, captions, or subtitles, not a translated text file.

Question 30

How do captions work in Speechlab?

Accepted Answer

Captions are generated from the transcription. They inherit speaker labels, timestamps, and segment structure. Edit caption text inline, adjust display settings, then export as SRT/VTT sidecar files or burned-in captions.

Question 31

How accurate are AI-generated captions?

Accepted Answer

Speechlab generates captions from a full ASR transcription pipeline — not a lightweight caption-specific model — so accuracy is typically higher than platform-native auto-captions (YouTube, TikTok, etc.). You can edit any errors inline before exporting.

Question 32

Can I generate captions in multiple languages?

Accepted Answer

Yes. Translate the speech in your video or audio into any target language, then export captions from that translation. A single project can have captions in as many languages as you need.

Question 33

Do captions include speaker identification?

Accepted Answer

Yes. Speaker labels carry through from the transcription. Each caption segment knows which speaker is talking — important for accessibility compliance.

Question 34

Can I use Speechlab for accessibility compliance (WCAG, Section 508, ADA)?

Accepted Answer

Yes. Speechlab captions include speaker identification, accurate timestamps, and editable text — core requirements for WCAG 2.1 Level AA and Section 508 compliance. Export as SRT/VTT sidecar files for web players that support accessible captions.

Question 35

What's the difference between captions and SRT subtitles in Speechlab?

Accepted Answer

Captions are display-ready text synced to your media — adjustable font, position, and styling. SRT subtitles are a separate product surface with broadcast-grade formatting: frame-accurate split/merge, CPS validation, and profile-driven rules for professional distribution. Both export as .srt files, but the SRT product gives you production-level control.

Question 36

Does Speechlab support burned-in captions?

Accepted Answer

Yes. Export captions rendered directly into the video file for social media, downloads, and offline viewing where sidecar files aren't supported.

Question 37

What is an SRT file?

Accepted Answer

An SRT (SubRip Subtitle) file is a plain-text subtitle format used by video players, streaming platforms, and broadcast systems. Each entry contains a sequence number, start/end timestamp, and the subtitle text. It's the most widely supported subtitle format.
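
The structure of an SRT entry is easiest to see in a small parser. The sketch below is a minimal reader written for this FAQ, not Speechlab's importer; the Cue type and parse_srt function are illustrative names, and malformed blocks are simply skipped.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int     # sequence number
    start_ms: int  # start time in milliseconds
    end_ms: int    # end time in milliseconds
    text: str      # subtitle text, possibly multi-line


# SRT timing line, e.g. "00:00:01,000 --> 00:00:03,500"
TIMING = re.compile(
    r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2,}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(data: str) -> List[Cue]:
    cues = []
    # Entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", data.lstrip("\ufeff").strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed or empty entries
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        g = match.groups()
        cues.append(Cue(int(lines[0]), to_ms(*g[:4]), to_ms(*g[4:]), "\n".join(lines[2:])))
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Welcome to the show.

2
00:00:03,600 --> 00:00:06,000
Today we talk about
subtitles.
"""

cues = parse_srt(SAMPLE)
```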

Question 38

What's the difference between SRT and VTT?

Accepted Answer

VTT (WebVTT) is a web-native subtitle format similar to SRT but with additional styling options. Speechlab exports both. SRT is standard for video editing and broadcast; VTT is preferred for web and podcast players.
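
The two formats are close enough that a basic conversion is short: a WebVTT file starts with a WEBVTT header line and uses a period instead of a comma before the milliseconds. The srt_to_vtt function below is a minimal sketch under those two rules, not Speechlab's exporter, and it ignores VTT styling features.

```python
import re

# Matches the "HH:MM:SS,mmm" timestamps used in SRT timing lines.
SRT_TIMESTAMP = re.compile(r"(\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to basic WebVTT, keeping sequence numbers as cue identifiers."""
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    lines = []
    for line in body.split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten; subtitle text is left as-is.
            line = SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


vtt = srt_to_vtt("1\n00:00:01,000 --> 00:00:03,500\nHello, world\n")
```

Note that the comma inside the subtitle text ("Hello, world") is preserved, because only lines containing the timing arrow are changed.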

Question 39

Can I edit subtitle timing in the browser?

Accepted Answer

Yes. Drag segments on a waveform timeline, edit start/end times inline, split and merge segments, and validate against broadcast profiles — all in the browser. No desktop software required.

Question 40

What are CPS limits and why do they matter?

Accepted Answer

CPS (characters per second) measures how fast viewers must read a subtitle. Broadcast standards typically cap it somewhere between 15 and 25 CPS. Exceeding the limit means viewers can't read the subtitle before it disappears. Speechlab validates CPS per segment and highlights violations.
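
As a worked example, CPS is the number of characters divided by the time the subtitle stays on screen. The sketch below counts every character except line breaks, which is one common convention and an assumption here; counting rules differ between standards, and the 17 CPS limit is only an example value.

```python
def cps(text: str, start_s: float, end_s: float) -> float:
    """Characters per second for one subtitle, ignoring line breaks."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("end time must be after start time")
    visible = text.replace("\n", "")
    return len(visible) / duration


def exceeds_limit(text: str, start_s: float, end_s: float, max_cps: float) -> bool:
    return cps(text, start_s, end_s) > max_cps


# "Welcome to the show." has 20 characters.
relaxed = cps("Welcome to the show.", 0.0, 2.0)  # 20 / 2.0 = 10.0
rushed = cps("Welcome to the show.", 0.0, 1.0)   # 20 / 1.0 = 20.0
```

With an example limit of 17 CPS, the two-second version passes and the one-second version is flagged.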

Question 41

Can I generate subtitles in multiple languages?

Accepted Answer

Yes. Translate the speech into any target language, then generate SRT files from each. Every language gets its own formatting rules, CPS calculation, and line-breaking logic.

Question 42

Does Speechlab handle RTL languages (Arabic, Hebrew)?

Accepted Answer

Yes. The SRT Generator applies proper RTL formatting, Unicode handling, and script-direction rules for Arabic, Hebrew, and Farsi subtitle files.

Question 43

What subtitle profiles are available?

Accepted Answer

Profiles define formatting rules — CPS, max line length, max lines per subtitle, min/max duration. Select a profile per project, validate against it, and fix violations before export. Custom profiles can be added for specific distribution requirements.
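
A profile can be thought of as a small set of limits checked against each subtitle. The sketch below shows that idea; the Profile fields mirror the rules named above, but the class, the validate function, and the example values are hypothetical and do not reproduce any built-in Speechlab profile.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Profile:
    max_cps: float
    max_line_length: int
    max_lines: int
    min_duration_s: float
    max_duration_s: float


def validate(text: str, start_s: float, end_s: float, profile: Profile) -> List[str]:
    """Return a list of rule violations for one subtitle (empty if it passes)."""
    problems = []
    lines = text.split("\n")
    duration = end_s - start_s
    if duration < profile.min_duration_s:
        problems.append("duration too short")
    if duration > profile.max_duration_s:
        problems.append("duration too long")
    if len(lines) > profile.max_lines:
        problems.append("too many lines")
    if any(len(line) > profile.max_line_length for line in lines):
        problems.append("line too long")
    if duration > 0 and sum(len(line) for line in lines) / duration > profile.max_cps:
        problems.append("CPS too high")
    return problems


# Example values only; real profiles depend on the distributor's requirements.
EXAMPLE = Profile(max_cps=17.0, max_line_length=42, max_lines=2,
                  min_duration_s=1.0, max_duration_s=7.0)

ok = validate("Welcome to the show.", 0.0, 2.0, EXAMPLE)
bad = validate("Welcome to the show.", 0.0, 0.5, EXAMPLE)
```

Here the first subtitle passes, while the second is reported as both too short and too fast to read.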

Question 44

How does pricing work?

Accepted Answer

Credit-based, per-minute, per-language pricing. You pay credits based on the duration of your source media for each language you add.
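
The per-minute, per-language model reduces to a single multiplication. The function below is a sketch of that arithmetic only; the rate of 1 credit per minute per language is a made-up placeholder, and any rounding or minimums Speechlab applies are not modeled.

```python
def estimate_credits(duration_minutes: float, num_languages: int,
                     credits_per_minute: float) -> float:
    """Credits = source duration (minutes) x target languages x flat rate."""
    if duration_minutes < 0 or num_languages < 0 or credits_per_minute < 0:
        raise ValueError("inputs must be non-negative")
    return duration_minutes * num_languages * credits_per_minute


# Hypothetical rate: 1 credit per minute per language.
one_language = estimate_credits(12, 1, 1.0)     # 12 minutes, 1 language
three_languages = estimate_credits(12, 3, 1.0)  # same file, 3 languages
```

At that placeholder rate, a 12-minute file costs 12 credits for one language and 36 credits for three.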

Question 45

Is there a free plan?

Accepted Answer

Yes. You get 2 free projects with all features included: dubbing, captions, and subtitles. No credit card required.

Question 46

What's included in the Pro plan?

Accepted Answer

Per-minute credits, any file length, up to 4K resolution, API access, all voice modes, all export formats.

Question 47

What's included in the Enterprise plan?

Accepted Answer

Volume discounts, team roles, linguist review, custom voices, lip-sync, bulk processing, API integration, invoice billing, SSO, and custom data retention. Contact sales for details.

Question 48

Are there per-language charges?

Accepted Answer

Yes. Each language you add costs credits based on the source media duration. The rate per language is flat and visible upfront — no hidden multipliers or surprise charges.

Question 49

Do credits expire?

Accepted Answer

No. Credits never expire. Use them whenever you're ready.

Question 50

Is there a file length or size limit?

Accepted Answer

Files up to 1.5 GB, any duration. No per-file length cap on Pro or Enterprise plans.

Question 51

What enterprise features are available?

Accepted Answer

Bulk processing, API integration, linguist-reviewed outputs, custom voice creation, role-based team access, invoice billing, lip-sync, SSO, and custom data retention policies. Contact sales for details.

Question 52

Does Speechlab offer an API?

Accepted Answer

Yes. Pro and Enterprise plans include RESTful API access with per-project endpoints, webhook callbacks, and batch job tracking. Integrate Speechlab into your existing media asset management or content pipeline.

Question 53

How does bulk processing work?

Accepted Answer

Upload hundreds of video and audio files at once via the dashboard or API. Queue localization jobs across languages. Track progress per file, per language, per speaker. Bulk export dubbed media, captions, and subtitles in one click.

Question 54

What is the human-in-the-loop linguist review?

Accepted Answer

Enterprise accounts include professional linguist review on every output — checking translation accuracy, cultural nuance, and brand voice. Flag segments for re-review, approve inline, and export only when quality clears your bar.

Question 55

Can I manage team roles and permissions?

Accepted Answer

Yes. Assign creator, editor, and reviewer roles. Role-based permissions control who can edit, review, approve, and export. Conflict detection prevents silent overwrites when multiple people work on the same project.

Question 56

Does Speechlab support SSO?

Accepted Answer

Yes. Enterprise accounts can configure SSO for team authentication. Contact sales for setup details.

Question 57

Is NET-30 invoicing available?

Accepted Answer

Yes. Enterprise accounts use invoice billing on NET-30 terms, with no credit card required.

Question 58

Can I white-label Speechlab for my clients?

Accepted Answer

White-label options are available for enterprise localization agencies. Contact sales for details.

Question 59

Is my content secure?

Accepted Answer

All uploads are encrypted in transit (TLS) and at rest (AES-256). Files are stored in SOC 2-compliant infrastructure.

Question 60

Who can access my uploaded content?

Accepted Answer

Only you and team members you've explicitly shared the project with. Speechlab employees do not access customer content unless required for support with your explicit permission.

Question 61

Can I control data retention?

Accepted Answer

Enterprise accounts can request custom data retention policies — including automatic deletion after a specified period.

Question 62

Where is data stored?

Accepted Answer

Files are stored in cloud infrastructure with regional availability. Enterprise accounts can request specific data residency requirements. Contact sales for details.

Question 63

Is Speechlab SOC 2 compliant?

Accepted Answer

Yes. Speechlab infrastructure is SOC 2-compliant. Enterprise accounts can request compliance documentation.

Question 64

What does the Speechlab editor look like?

Accepted Answer

A waveform timeline with draggable, resizable segments. Each segment shows the transcript text, speaker label, and timing. Click to seek, drag to reposition, resize to adjust timing. Translation, captions, and dub status are visible alongside the source.
