Subtitle Editor Guide for Marketplace Suppliers
Overview
The Subtitle Editor is Smartcat's dedicated environment for editing video and audio subtitles. As a Marketplace supplier (freelance linguist), you use the Subtitle Editor for two primary workflows: translating subtitles and working with AI dubbing. This guide covers both scenarios.
When to use it
Use this guide when you receive a task invitation for a video translation project. The Subtitle Editor opens automatically when you click on a subtitle file (SRT or VTT format) in the Files section of your project.
Key concepts and terminology
Term | Definition |
Cue | A single subtitle segment with a timecode (start and end time) and text content |
CPS (Characters Per Second) | A measure of reading speed that indicates whether the subtitle text fits within the available time |
Timeline | The visual panel showing subtitle placement and duration over time |
TTS (Text-to-Speech) | The AI technology that generates spoken audio from translated text in dubbing projects |
Requirements and Limitations
You must accept a task invitation to access the Subtitle Editor
⚠️ Cues are divided by timing, not by full sentences. This optimizes subtitles for readability and synchronization.
Part 1: Understanding the Editor Layout
The Subtitle Editor interface consists of three main areas:
Left Panel — Subtitle Segments (Cues)
Each row represents one subtitle cue with a timecode (start and end time) and text content
Source language cues are displayed on the left; your target translation is on the right
Cues are divided by timing, not by full sentences
Right Panel — Video Player and Preview
Displays the reference video with a graphical representation of subtitle cues
Line limit settings: adjust maximum line length and number of lines per cue
The video player automatically scrolls to the subtitle you are currently editing
Bottom Panel — Timeline
Visual representation of how subtitles are distributed over time
Playback controls: pause/play, adjustable speed and volume
Shows subtitle placement and duration
Supports multiple tracks: Subtitles, Voiceover, Background audio, and Original Audio
You can adjust cue timing directly on the timeline with granular control down to fractions of a second
Part 2: Translating Subtitles
This workflow covers standard subtitle translation where you translate subtitle text from one language to another, review timing, and confirm your work.
How it works
Step 1 — Review the source subtitles
Before translating, review the source subtitles while watching the video:
Play the video to understand the context, tone, and pacing
Note any segments where the source text may be unclear or where timing seems off
Check if the client has provided any glossaries or translation memories
Step 2 — Edit the translation
Click on any cue in the target column to begin editing
Type your translation directly in the target text field
Changes are saved automatically a moment after you stop typing
Press Enter within a cue to create multiple lines (for multi-line subtitles)
Monitor the CPS (Characters Per Second) indicator for each cue — this tells you whether the reading speed is appropriate for the subtitle duration
⚠️ If CPS is too high, the text may be too long for the available time. Consider shortening the translation or adjusting timing.
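As a rough illustration of what the CPS indicator measures, the calculation can be sketched in a few lines of Python. This is a minimal sketch, not the editor's actual implementation: the function name `chars_per_second`, the `MAX_CPS` value, and the choice to count spaces and punctuation but not line breaks are all assumptions made for the example.

```python
def chars_per_second(text: str, start_s: float, end_s: float) -> float:
    """Return the reading speed of one cue: characters divided by duration."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("cue end time must be after its start time")
    # Assumption: line breaks are not counted; spaces and punctuation are.
    char_count = len(text.replace("\n", ""))
    return char_count / duration


MAX_CPS = 20  # hypothetical limit; the real limit is configured in the project

cue_text = "Thank you for joining us today."
cps = chars_per_second(cue_text, start_s=12.0, end_s=14.0)
print(f"{cps:.1f} CPS, over limit: {cps > MAX_CPS}")
```

The same text in a shorter cue produces a higher value, which is why both shortening the translation and extending the cue bring CPS down.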
Step 3 — Manage cues (segments)
You have several operations available for managing cues:
Split by lines: If a cue contains multiple lines, click the Split by line button to divide it into separate cues
Merge cues: Select two adjacent cues and merge them into one (useful when a sentence spans two short cues)
Delete a cue: Remove an unnecessary cue via the three-dot menu
Insert a new cue: Hover over the border between two cues and click the "Insert cue" button, or use the three-dot menu to add a cue before or after the current one
Step 4 — Adjust timing
Timing adjustments ensure subtitles appear and disappear at the right moments:
Click on the timecode fields (start time and end time) to modify them directly
Use the timeline panel at the bottom to drag cue boundaries visually
Check for overlaps — the editor flags cases where one cue's end time exceeds the next cue's start time
Ensure subtitle duration matches the spoken content in the video
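The overlap rule described above (one cue's end time is later than the next cue's start time) can be sketched as follows. This only illustrates the rule and is not the editor's own logic; the `Cue` structure and the `find_overlaps` function are hypothetical names introduced for the example.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    index: int      # cue number as shown in the cue list
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds


def find_overlaps(cues: list[Cue]) -> list[tuple[int, int]]:
    """Return (cue, next cue) index pairs where a cue ends after the next one starts."""
    ordered = sorted(cues, key=lambda cue: cue.start_s)
    overlaps = []
    for current, following in zip(ordered, ordered[1:]):
        if current.end_s > following.start_s:
            overlaps.append((current.index, following.index))
    return overlaps


cues = [Cue(1, 0.0, 2.5), Cue(2, 2.4, 5.0), Cue(3, 5.0, 7.0)]
print(find_overlaps(cues))  # cue 1 ends at 2.5 s, after cue 2 starts at 2.4 s
```

In this sketch, a cue that ends exactly when the next one starts is not treated as an overlap.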
Step 5 — Confirm and mark as processed
As you complete each cue:
Mark individual cues as processed when you are satisfied with the translation
You can continue editing cues even after marking them as processed (until you finalize the file)
Click the Progress button to view your completion status
When all cues are reviewed, click Mark Processed in the control bar to finalize your work on the subtitle file
⚠️ You can use the "Mark Processed" button even if you have not individually marked every cue. Marking a file as processed is final from your side. Once submitted, you cannot reopen the task yourself. If you need to make corrections after submitting, contact the project owner and ask them to reopen the task.
Quality checks for subtitle translation
Check | What to Look For |
CPS (Characters Per Second) | Keep within the configured limit (typically 15-25 CPS). High CPS means the text is too long for the available time. |
Line length | Subtitles should not exceed the maximum line length configured by the client. The editor shows line length limits. |
Number of lines per cue | Most subtitles should have 1-2 lines. Three or more lines can be hard to read on screen. |
Timing accuracy | Subtitles should appear when the speaker starts talking and disappear shortly after they stop. |
Overlap detection | The editor flags overlapping cues. Resolve all overlaps before finalizing. |
Glossary compliance | If the client has provided a glossary, ensure terminology is consistent. The AI translation now applies glossary terms automatically, but you should verify. |
Translation memory | Translation memory matches are applied during automatic translation. Review and adjust as needed. |
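The line length and line count checks in the table can be pictured with a short sketch. The limits used here (42 characters per line, 2 lines per cue) are assumptions for illustration only; the actual limits are whatever the client configured in the editor, and `check_cue_lines` is a hypothetical helper, not an editor feature.

```python
def check_cue_lines(text: str, max_line_length: int = 42, max_lines: int = 2) -> list[str]:
    """Return a list of human-readable problems found in one cue's text."""
    problems = []
    lines = text.split("\n")
    if len(lines) > max_lines:
        problems.append(f"{len(lines)} lines (limit {max_lines})")
    for number, line in enumerate(lines, start=1):
        if len(line) > max_line_length:
            problems.append(
                f"line {number} has {len(line)} characters (limit {max_line_length})"
            )
    return problems


print(check_cue_lines("Welcome back to the show.\nToday we have a special guest."))
print(check_cue_lines("One\nTwo\nThree"))
```

An empty list means the cue passes both checks under the assumed limits.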
Downloading your work
After completing translation, you can download the subtitle file in two formats:
SRT (SubRip Subtitle) — the most widely used subtitle format
VTT (WebVTT) — commonly used for web-based video players
Click the Download button and select your preferred format.
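The two formats are very similar. The most visible differences are that a VTT file starts with a WEBVTT header line and writes milliseconds after a period, while SRT uses a comma. The sketch below shows that difference on a single cue. It is only an illustration: the Download button produces either format for you, and `srt_to_vtt` is a hypothetical helper that ignores styling and positioning features.

```python
import re

# Matches an SRT timestamp such as 00:00:01,000 (comma before milliseconds).
TIMECODE = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT by fixing timecodes and adding the header."""
    converted = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are changed; commas in subtitle text are kept.
            line = TIMECODE.sub(r"\1.\2", line)
        converted.append(line)
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"


srt_cue = "1\n00:00:01,000 --> 00:00:04,000\nHello, world\n"
print(srt_to_vtt(srt_cue))
```

The cue number on the first line is kept because WebVTT allows an optional identifier line before each cue's timing line.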
Part 3: AI Dubbing
AI Dubbing is the process of generating synthetic voice-over audio from translated subtitles. In this scenario, you work with the Subtitle Editor to review and refine AI-generated dubbed audio, adjust voice settings, manage multi-speaker assignments, and ensure the dubbed output sounds natural.
What is AI Dubbing in Smartcat?
When a client creates a video translation project with AI dubbing enabled:
The video is transcribed into subtitles (automatically or from an uploaded subtitle file)
The subtitles are translated into the target language (using AI translation with glossary and TM support)
AI voices generate dubbed audio for each translated subtitle cue
You, as the supplier, review and refine both the translated text and the dubbed audio output
How it works
Step 1 — Review the AI translation
Start by reviewing the AI-translated subtitle text:
Play the video to understand the original content
Check each translated cue for accuracy, natural phrasing, and appropriate length
Remember that dubbed audio must fit within the original timing. Overly long translations result in unnaturally fast speech or CPS errors.
Step 2 — Understand CPS for dubbing
CPS is critical in AI dubbing because it directly controls the speaking rate of the AI voice:
Each cue has a CPS value calculated from the text length and the cue duration
High CPS = the AI voice speaks faster (potentially unintelligible)
Low CPS = the AI voice speaks slower (may sound unnatural or leave gaps)
Adjust CPS by either shortening/lengthening the text or adjusting the cue timing
The editor highlights cues where CPS exceeds the configured maximum
Step 3 — Select and configure AI voices
Select a voice from the Audio (AI dubbing) dropdown
In the pop-up menu, review the preferred voice library and the suggested voices, and filter voices by parameters
Select the voice that best matches the content and speaker
Step 4 — Work with multi-speaker content
Many videos feature more than one speaker. The Subtitle Editor supports multi-speaker dubbing workflows:
Speaker labels: Each subtitle cue can be assigned to a specific speaker. Look for the speaker label indicator on each cue in the left panel.
Manual speaker assignment: Currently, speaker detection is manual. Review each cue and assign the correct speaker label. Click on the speaker indicator for a cue and select or create a speaker from the dropdown.
Consistent voice mapping: Once you assign a voice to a speaker, use that voice consistently for all cues belonging to that speaker. This keeps the dubbed output natural and coherent.
Switching between speakers: Use the timeline view (bottom panel) to visually identify speaker transitions. Different speakers may appear on different audio tracks, making it easier to spot where speaker changes occur.
⚠️ Assign speakers correctly before generating audio. Incorrect speaker assignments result in voice inconsistencies in the final output.
Step 5 — Regenerate segments
After the initial AI-generated dubbing, you may need to regenerate specific segments:
Edit the translation text: Modify the translated text in a cue to improve phrasing, fix errors, or adjust for natural speech flow. Shorter, more natural phrases tend to produce better TTS output.
Regenerate audio: After editing the text, click the regenerate button on the cue to produce a new audio rendering with the updated text. The system uses the currently selected voice for that cue.
Try different voices: If a voice does not sound right for a segment, change the voice selection and regenerate. Compare different renditions to find the best fit.
Iterative refinement: Dubbing is an iterative process. You may need to regenerate a segment multiple times, adjusting text, voice, or both, until the result sounds natural.
⚠️ Each regeneration produces a fresh TTS rendering and replaces the previous version. Listen carefully before moving on.
Step 6 — Preview dubbed audio
Before finalizing, preview the dubbed output:
Play individual cues: Click the Play button on any cue to hear its dubbed audio in isolation. This helps you evaluate voice quality, pronunciation, and pacing for that specific segment.
Play continuous preview: Use the video player in the right panel to play the video with the dubbed audio track. This gives you the full experience of how the dubbing sounds in context with the video.
Check synchronization: Confirm the dubbed audio aligns with the visual content. Lip movements will not match perfectly (this is expected with AI dubbing), but the audio should start and end at approximately the right times relative to the visual action.
Listen for transitions: Pay special attention to transitions between cues and between different speakers. Abrupt changes in volume, tone, or pacing between adjacent segments can be jarring for the viewer.
Use the timeline view to scrub through the video and spot-check specific sections.
Step 7 — Adjust timing for natural flow
Timing is critical for dubbing quality:
CPS impact: The CPS value directly affects the speech rate of the generated audio. A higher CPS means the TTS engine must speak faster to fit the text within the cue's time window. If the CPS is too high, the speech sounds rushed and unnatural.
Recommended CPS range: Keep CPS within the acceptable range indicated by the editor (typically highlighted in green). Cues that exceed the recommended CPS are flagged with yellow or red indicators.
Shortening text to reduce CPS: If a cue has a high CPS, shorten the translated text. Use more concise phrasing, remove filler words, or simplify sentence structures, then regenerate the audio.
Adjusting cue boundaries: Adjust the start and end times of cues to give more time for the audio. Drag the cue boundaries in the timeline, or edit the timecodes directly. Avoid overlapping with adjacent cues.
Splitting long cues: If a cue contains too much text, split it into two shorter cues. Once each part has its own timing, you can extend the two cues over a longer combined time window, which reduces the CPS for each individual cue.
The goal is to produce dubbed audio that sounds natural and unhurried while staying synchronized with the video content.
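To decide between shortening the text and extending the cue, it can help to work out the numbers first. The sketch below estimates how many characters fit in a cue and how much extra time the current text would need. The `MAX_CPS` value of 17 and both function names are assumptions for this example; use the limit shown in the editor for your project.

```python
import math


def max_characters(duration_s: float, max_cps: float) -> int:
    """Largest character count that stays within max_cps for this duration."""
    # The small epsilon guards against floating-point rounding just below a whole number.
    return math.floor(duration_s * max_cps + 1e-9)


def extra_time_needed(char_count: int, duration_s: float, max_cps: float) -> float:
    """Seconds to add to the cue so the existing text fits within max_cps."""
    return max(0.0, char_count / max_cps - duration_s)


MAX_CPS = 17  # hypothetical limit; the real limit is configured in the project

text_length = 60  # characters in the translated cue
duration = 3.0    # cue duration in seconds

print("Current CPS:", text_length / duration)
print("Character budget:", max_characters(duration, MAX_CPS))
print("Extra seconds needed:", round(extra_time_needed(text_length, duration, MAX_CPS), 2))
```

If the extra time is not available without overlapping the next cue, shortening the text to the character budget is the remaining option.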
Step 8 — Finalize the dubbing task
Once you are satisfied with all segments:
Review all cues one final time. Scroll through the entire cue list and verify that every segment has been reviewed, the correct voice is assigned, and the audio sounds acceptable.
Check for flagged issues. Look for any remaining CPS warnings, unassigned speakers, or segments you have not yet regenerated after editing.
Mark cues as processed. As you confirm each cue, mark it as processed (checkmark or confirmation action) to track your progress through the task.
Complete the task. Once all cues are processed and you are satisfied with the output, mark the task as complete. This signals to the project owner that the dubbing work is finished and ready for their review.
⚠️ Marking a file as processed is final from your side. Once submitted, you cannot reopen the task yourself. If you need to make corrections after submitting, contact the project owner and ask them to reopen the task.