Subtitle Editor Guide for Marketplace Suppliers

Overview

The Subtitle Editor is Smartcat's dedicated environment for editing video and audio subtitles. As a Marketplace supplier (freelance linguist), you use the Subtitle Editor for two primary workflows: translating subtitles and working with AI dubbing. This guide covers both scenarios.

When to use it

Use this guide when you receive a task invitation for a video translation project. The Subtitle Editor opens automatically when you click on a subtitle file (SRT or VTT format) in the Files section of your project.

Key concepts and terminology

| Term | Definition |
| --- | --- |
| Cue | A single subtitle segment with a timecode (start and end time) and text content |
| CPS (Characters Per Second) | A measure of reading speed that indicates whether the subtitle text fits within the available time |
| Timeline | The visual panel showing subtitle placement and duration over time |
| TTS (Text-to-Speech) | The AI technology that generates spoken audio from translated text in dubbing projects |
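
As a rough illustration of how these terms relate, the sketch below models a cue and computes its CPS as text length divided by duration. The `Cue` class and its field names are hypothetical, and the counting rule (every character except line breaks) is an assumption; the editor's own indicator may count differently.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical model of one subtitle cue (not Smartcat's data format)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # subtitle text; may contain line breaks

    @property
    def duration(self) -> float:
        return self.end - self.start

    def cps(self) -> float:
        """Characters per second, counting everything except line breaks (an assumption)."""
        if self.duration <= 0:
            raise ValueError("cue must end after it starts")
        chars = len(self.text.replace("\n", ""))
        return chars / self.duration


cue = Cue(start=1.0, end=3.0, text="Welcome to the product tour.")
print(cue.cps())  # 28 characters over 2 seconds -> 14.0
```

The same text in a 1-second cue would give 28 CPS, which is why shortening the text and lengthening the cue are the two ways to bring the value down.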

Requirements and limitations

  • You must accept a task invitation to access the Subtitle Editor

⚠️ Cues are divided by timing, not by full sentences. This optimizes subtitles for readability and synchronization.

Part 1: Understanding the Editor Layout

The Subtitle Editor interface consists of three main areas:

Left Panel — Subtitle Segments (Cues)

  • Each row represents one subtitle cue with a timecode (start and end time) and text content

  • Source language cues are displayed on the left; your target translation is on the right

  • Cues are divided by timing, not by full sentences

Right Panel — Video Player and Preview

  • Displays the reference video with a graphical representation of subtitle cues

  • Line limit settings: adjust maximum line length and number of lines per cue

  • The video player automatically scrolls to the subtitle you are currently editing

Bottom Panel — Timeline

  • Visual representation of how subtitles are distributed over time

  • Playback controls: pause/play, adjustable speed and volume

  • Shows subtitle placement and duration

  • Supports multiple tracks: Subtitles, Voiceover, Background audio, and Original Audio

  • You can adjust cue timing directly on the timeline with granular control down to fractions of a second

Part 2: Translating Subtitles

This workflow covers standard subtitle translation where you translate subtitle text from one language to another, review timing, and confirm your work.

How it works

Step 1 — Review the source subtitles

Before translating, review the source subtitles while watching the video:

  1. Play the video to understand the context, tone, and pacing

  2. Note any segments where the source text may be unclear or where timing seems off

  3. Check if the client has provided any glossaries or translation memories

Step 2 — Edit the translation

  1. Click on any cue in the target column to begin editing

  2. Type your translation directly in the target text field

  3. Changes are saved automatically (with a brief delay for debouncing)

  4. Press Enter within a cue to create multiple lines (for multi-line subtitles)

  5. Monitor the CPS (Characters Per Second) indicator for each cue — this tells you whether the reading speed is appropriate for the subtitle duration

⚠️ If CPS is too high, the text may be too long for the available time. Consider shortening the translation or adjusting timing.
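
To make the trade-off concrete: the character budget for a cue is roughly the CPS limit multiplied by the cue duration. The sketch below is a worked example under an assumed limit of 20 CPS (an illustrative value; use the limit configured for your project). The function names are hypothetical.

```python
import math


def max_characters(duration_seconds: float, cps_limit: float) -> int:
    """Largest character count that keeps a cue at or under the CPS limit."""
    # The small epsilon guards against floating-point rounding just below a whole number.
    return math.floor(duration_seconds * cps_limit + 1e-9)


def extra_seconds_needed(char_count: int, duration_seconds: float, cps_limit: float) -> float:
    """How much longer the cue would have to be to fit the text as written."""
    required = char_count / cps_limit
    return max(0.0, required - duration_seconds)


# A 2.5-second cue at an assumed limit of 20 CPS has room for 50 characters.
print(max_characters(2.5, 20))                      # 50
# A 62-character translation in the same cue needs about 0.6 s more time,
# or about 12 fewer characters.
print(round(extra_seconds_needed(62, 2.5, 20), 2))  # 0.6
```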

Step 3 — Manage cues (segments)

You have several operations available for managing cues:

  • Split by lines : If a cue contains multiple lines, click the Split by line button to divide it into separate cues

  • Merge cues : Select two adjacent cues and merge them into one (useful when a sentence spans two short cues)

  • Delete a cue : Remove an unnecessary cue via the three-dot menu

  • Insert a new cue : Hover over the border between two cues and click the "Insert cue" button, or use the three-dot menu to add a cue before or after the current one
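
The split and merge operations can be pictured as simple transformations on a cue's timecodes and text. The sketch below is an illustration, not the editor's implementation: it represents a cue as a hypothetical `(start, end, text)` tuple, and it assumes a split shares out the original time span in proportion to each line's length, which may differ from how the editor assigns timing.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # hypothetical (start_seconds, end_seconds, text)


def split_by_lines(cue: Cue) -> List[Cue]:
    """Turn each line of a multi-line cue into its own cue.

    Assumption: the time span is shared out in proportion to line length.
    """
    start, end, text = cue
    lines = [line for line in text.split("\n") if line.strip()]
    if len(lines) < 2:
        return [cue]
    total_chars = sum(len(line) for line in lines)
    pieces, cursor = [], start
    for i, line in enumerate(lines):
        is_last = i == len(lines) - 1
        # The last piece ends exactly at the original end time.
        piece_end = end if is_last else cursor + (end - start) * len(line) / total_chars
        pieces.append((cursor, piece_end, line))
        cursor = piece_end
    return pieces


def merge(first: Cue, second: Cue) -> Cue:
    """Merge two adjacent cues: start of the first, end of the second, text on separate lines."""
    return (first[0], second[1], first[2] + "\n" + second[2])


two_line = (10.0, 14.0, "Hello there.\nHow are you?")
parts = split_by_lines(two_line)
print(parts)          # [(10.0, 12.0, 'Hello there.'), (12.0, 14.0, 'How are you?')]
print(merge(*parts))  # (10.0, 14.0, 'Hello there.\nHow are you?')
```

After a real split or merge in the editor, check the resulting timecodes against the video rather than relying on any automatic distribution.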

Step 4 — Adjust timing

Timing adjustments ensure subtitles appear and disappear at the right moments:

  1. Click on the timecode fields (start time and end time) to modify them directly

  2. Use the timeline panel at the bottom to drag cue boundaries visually

  3. Check for overlaps — the editor flags cases where one cue's end time exceeds the next cue's start time

  4. Ensure subtitle duration matches the spoken content in the video
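
The overlap rule described above (one cue's end time exceeding the next cue's start time) can be sketched as a comparison of adjacent cues. This is a minimal illustration with a hypothetical `(start, end, text)` tuple per cue; it is not the editor's own check.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # hypothetical (start_seconds, end_seconds, text)


def find_overlaps(cues: List[Cue]) -> List[Tuple[int, int, float]]:
    """Return (index, next_index, overlap_seconds) for each overlapping adjacent pair."""
    # Compare cues in order of start time, keeping their original positions.
    ordered = sorted(range(len(cues)), key=lambda i: cues[i][0])
    overlaps = []
    for a, b in zip(ordered, ordered[1:]):
        overlap = cues[a][1] - cues[b][0]
        if overlap > 0:
            overlaps.append((a, b, round(overlap, 3)))
    return overlaps


cues = [
    (0.0, 2.5, "First cue"),
    (2.2, 4.0, "Second cue starts too early"),
    (4.0, 6.0, "Third cue"),
]
print(find_overlaps(cues))  # [(0, 1, 0.3)]
```

Here the first cue ends 0.3 seconds after the second begins, so either the first end time or the second start time needs to move. Cues that merely touch (4.0 and 4.0) are not counted as overlapping.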

Step 5 — Confirm and mark as processed

As you complete each cue:

  1. Mark individual cues as processed when you are satisfied with the translation

  2. You can continue editing cues even after marking them as processed (until you finalize the file)

  3. Click the Progress button to view your completion status

  4. When all cues are reviewed, click Mark Processed in the control bar to finalize your work on the subtitle file

⚠️ You can use the "Mark Processed" button even if you have not individually marked every cue. Marking a file as processed is final from your side. Once submitted, you cannot reopen the task yourself. If you need to make corrections after submitting, contact the project owner and ask them to reopen the task.

Quality checks for subtitle translation

| Check | What to Look For |
| --- | --- |
| CPS (Characters Per Second) | Keep within the configured limit (typically 15-25 CPS). High CPS means the text is too long for the available time. |
| Line length | Subtitles should not exceed the maximum line length configured by the client. The editor shows line length limits. |
| Number of lines per cue | Most subtitles should have 1-2 lines. Three or more lines can be hard to read on screen. |
| Timing accuracy | Subtitles should appear when the speaker starts talking and disappear shortly after they stop. |
| Overlap detection | The editor flags overlapping cues. Resolve all overlaps before finalizing. |
| Glossary compliance | If the client has provided a glossary, ensure terminology is consistent. The AI translation now applies glossary terms automatically, but you should verify. |
| Translation memory | Translation memory matches are applied during automatic translation. Review and adjust as needed. |
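
The line-length and line-count checks can be expressed as a short routine. The sketch below is illustrative only: the defaults of 42 characters per line and 2 lines per cue are assumed example values, not Smartcat settings, and the limits configured by the client always take precedence.

```python
from typing import List


def check_lines(text: str, max_line_length: int = 42, max_lines: int = 2) -> List[str]:
    """Return human-readable warnings for one cue's text (empty list if it passes)."""
    warnings = []
    lines = text.split("\n")
    if len(lines) > max_lines:
        warnings.append(f"{len(lines)} lines (limit {max_lines})")
    for number, line in enumerate(lines, start=1):
        if len(line) > max_line_length:
            warnings.append(
                f"line {number} has {len(line)} characters (limit {max_line_length})"
            )
    return warnings


print(check_lines("Short line\nAnother short line"))  # []
print(check_lines("One\nTwo\nThree"))                 # ['3 lines (limit 2)']
```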

Downloading your work

After completing translation, you can download the subtitle file in two formats:

  • SRT (SubRip Subtitle) — the most widely used subtitle format

  • VTT (WebVTT) — commonly used for web-based video players

Click the Download button and select your preferred format.
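
The two formats carry the same cues and differ mainly in syntax: SRT numbers each cue and writes milliseconds after a comma, while WebVTT begins with a `WEBVTT` header line and writes milliseconds after a period. The sketch below illustrates only these differences, using hypothetical `(start, end, text)` tuples; files exported by the editor may contain additional details.

```python
def format_timestamp(seconds: float, fmt: str) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def render(cues, fmt: str) -> str:
    """Render (start, end, text) tuples as an SRT or WebVTT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, fmt)} --> {format_timestamp(end, fmt)}"
        # SRT puts a numeric counter before each cue; in WebVTT it is optional and omitted here.
        blocks.append(f"{index}\n{timing}\n{text}" if fmt == "srt" else f"{timing}\n{text}")
    header = "WEBVTT\n\n" if fmt == "vtt" else ""
    return header + "\n\n".join(blocks) + "\n"


cues = [(1.0, 3.5, "Hello"), (4.0, 6.25, "Welcome back")]
print(render(cues, "srt"))
print(render(cues, "vtt"))
```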

Part 3: AI Dubbing

AI Dubbing is the process of generating synthetic voice-over audio from translated subtitles. In this scenario, you work with the Subtitle Editor to review and refine AI-generated dubbed audio, adjust voice settings, manage multi-speaker assignments, and ensure the dubbed output sounds natural.

What is AI Dubbing in Smartcat?

When a client creates a video translation project with AI dubbing enabled:

  1. The video is transcribed into subtitles (automatically or from an uploaded subtitle file)

  2. The subtitles are translated into the target language (using AI translation with glossary and TM support)

  3. AI voices generate dubbed audio for each translated subtitle cue

  4. You, as the supplier, review and refine both the translated text and the dubbed audio output

How it works

Step 1 — Review the AI translation

Start by reviewing the AI-translated subtitle text:

  1. Play the video to understand the original content

  2. Check each translated cue for accuracy, natural phrasing, and appropriate length

  3. Remember that dubbed audio must fit within the original timing. Overly long translations will result in unnaturally fast speech or CPS errors.

Step 2 — Understand CPS for dubbing

CPS is critical in AI dubbing because it directly controls the speaking rate of the AI voice:

  • Each cue has a CPS value calculated from the text length and the cue duration

  • High CPS = the AI voice speaks faster (potentially unintelligible)

  • Low CPS = the AI voice speaks slower (may sound unnatural or leave gaps)

  • Adjust CPS by either shortening/lengthening the text or adjusting the cue timing

  • The editor highlights cues where CPS exceeds the configured maximum

Step 3 — Select and configure AI voices

  1. Select a voice from the Audio (AI dubbing) dropdown

  2. In the pop-up menu, review the preferred voice library and the suggested voices, or filter voices by parameters

  3. Select the voice that best matches the content and speaker

Step 4 — Work with multi-speaker content

Many videos feature more than one speaker. The Subtitle Editor supports multi-speaker dubbing workflows:

  • Speaker labels : Each subtitle cue can be assigned to a specific speaker. Look for the speaker label indicator on each cue in the left panel.

  • Manual speaker assignment : Currently, speaker detection is manual. Review each cue and assign the correct speaker label. Click on the speaker indicator for a cue and select or create a speaker from the dropdown.

  • Consistent voice mapping : Once you assign a voice to a speaker, use that voice consistently for all cues belonging to that speaker. This keeps the dubbed output natural and coherent.

  • Switching between speakers : Use the timeline view (bottom panel) to visually identify speaker transitions. Different speakers may appear on different audio tracks, making it easier to spot where speaker changes occur.

⚠️ Assign speakers correctly before generating audio. Incorrect speaker assignments result in voice inconsistencies in the final output.
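
One way to think about consistent voice mapping is as a lookup table with exactly one voice per speaker, plus a check for anything still unassigned. The sketch below is a mental model only; the data shapes, speaker labels, and voice names are hypothetical and do not reflect how the editor stores assignments.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: (cue_number, speaker_label or None if not yet assigned)
CueSpeaker = Tuple[int, Optional[str]]


def audit_speakers(cues: List[CueSpeaker], voices: Dict[str, str]) -> Dict[str, list]:
    """List cues that have no speaker and speakers that have no voice."""
    unassigned = [number for number, speaker in cues if speaker is None]
    speakers = {speaker for _, speaker in cues if speaker is not None}
    without_voice = sorted(speakers - set(voices))
    return {"cues_without_speaker": unassigned, "speakers_without_voice": without_voice}


cues = [(1, "Host"), (2, "Guest"), (3, None), (4, "Host")]
voices = {"Host": "Voice A"}  # one voice per speaker keeps the output consistent
print(audit_speakers(cues, voices))
# {'cues_without_speaker': [3], 'speakers_without_voice': ['Guest']}
```

Both lists should be empty before you generate audio.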

Step 5 — Regenerate segments

After the initial AI-generated dubbing, you may need to regenerate specific segments:

  • Edit the translation text : Modify the translated text in a cue to improve phrasing, fix errors, or adjust for natural speech flow. Shorter, more natural phrases tend to produce better TTS output.

  • Regenerate audio : After editing the text, click the regenerate button on the cue to produce a new audio rendering with the updated text. The system uses the currently selected voice for that cue.

  • Try different voices : If a voice does not sound right for a segment, change the voice selection and regenerate. Compare different renditions to find the best fit.

  • Iterative refinement : Dubbing is an iterative process. You may need to regenerate a segment multiple times — adjusting text, voice, or both — until the result sounds natural.

⚠️ Each regeneration produces a fresh TTS rendering and replaces the previous version. Listen carefully before moving on.

Step 6 — Preview dubbed audio

Before finalizing, preview the dubbed output:

  • Play individual cues : Click the Play button on any cue to hear its dubbed audio in isolation. This helps you evaluate voice quality, pronunciation, and pacing for that specific segment.

  • Play continuous preview : Use the video player in the right panel to play the video with the dubbed audio track. This gives you the full experience of how the dubbing sounds in context with the video.

  • Check synchronization : Confirm the dubbed audio aligns with the visual content. Lip movements will not match perfectly (this is expected with AI dubbing), but the audio should start and end at approximately the right times relative to the visual action.

  • Listen for transitions : Pay special attention to transitions between cues and between different speakers. Abrupt changes in volume, tone, or pacing between adjacent segments can be jarring for the viewer.

Use the timeline view to scrub through the video and spot-check specific sections.

Step 7 — Adjust timing for natural flow

Timing is critical for dubbing quality:

  • CPS impact : The CPS value directly affects the speech rate of the generated audio. A higher CPS means the TTS engine must speak faster to fit the text within the cue's time window. If the CPS is too high, the speech sounds rushed and unnatural.

  • Recommended CPS range : Keep CPS within the acceptable range indicated by the editor (typically highlighted in green). Cues that exceed the recommended CPS are flagged with yellow or red indicators.

  • Shortening text to reduce CPS : If a cue has a high CPS, shorten the translated text. Use more concise phrasing, remove filler words, or simplify sentence structures, then regenerate the audio.

  • Adjusting cue boundaries : Adjust the start and end times of cues to give more time for the audio. Drag the cue boundaries in the timeline, or edit the timecodes directly. Avoid overlapping with adjacent cues.

  • Splitting long cues : If a cue contains too much text, split it into two shorter cues. Splitting alone does not add time, but each new cue can then be timed separately, so you can extend its boundaries where the timeline has room and lower the CPS of each cue.

The goal is to produce dubbed audio that sounds natural and unhurried while staying synchronized with the video content.
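
The interplay between cue boundaries and CPS can be sketched as a small calculation: work out how long the cue would need to be at a target CPS, then extend the end time only as far as the next cue allows. The function name and the target of 20 CPS are illustrative assumptions, not editor settings.

```python
from typing import Optional


def suggest_end_time(
    start: float,
    end: float,
    char_count: int,
    target_cps: float,
    next_start: Optional[float] = None,
) -> float:
    """Suggest an end time that brings the cue down to target_cps where possible.

    The end time is never moved earlier and is never extended past the next cue's start.
    """
    needed_end = start + char_count / target_cps
    limit = needed_end if next_start is None else min(needed_end, next_start)
    return max(end, limit)


# 60 characters in a 2-second cue is 30 CPS; at an assumed target of 20 CPS it needs 3 seconds.
print(suggest_end_time(10.0, 12.0, 60, 20))                   # 13.0
# If the next cue starts at 12.5, the end can only move that far (60 / 2.5 = 24 CPS),
# so the text also needs shortening to about 50 characters.
print(suggest_end_time(10.0, 12.0, 60, 20, next_start=12.5))  # 12.5
```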

Step 8 — Finalize the dubbing task

Once you are satisfied with all segments:

  1. Review all cues one final time. Scroll through the entire cue list and verify that every segment has been reviewed, the correct voice is assigned, and the audio sounds acceptable.

  2. Check for flagged issues. Look for any remaining CPS warnings, unassigned speakers, or segments you have not yet regenerated after editing.

  3. Mark cues as processed. As you confirm each cue, mark it as processed (checkmark or confirmation action) to track your progress through the task. 

  4. Complete the task. Once all cues are processed and you are satisfied with the output, mark the task as complete. This signals to the project owner that the dubbing work is finished and ready for their review.

⚠️ Marking a file as processed is final from your side. Once submitted, you cannot reopen the task yourself. If you need to make corrections after submitting, contact the project owner and ask them to reopen the task.