Translate marketing images with AI (Image Translation 2.0)

Use Smartcat's AI-driven image translation workflow to create commercial-quality, translated images that require zero design adjustments. 

Overview

Image Translation 2.0 uses generative AI to produce commercial-quality translated images that need no design adjustments. The traditional OCR-based workflow requires manual editing of fonts, positioning, and layout. The generative pipeline instead preserves your original style, fonts, and visual identity automatically.

The generative pipeline processes images through three stages:

  1. Text extraction: AI analyzes the image and extracts text blocks

  2. Translation: Text is translated using your glossaries and Translation Memory

  3. Image generation: AI generates the final image while preserving the original brand style
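
For readers who think in code, the three stages above can be pictured as a chain of functions. This is a conceptual sketch only, not Smartcat's API: every name in it (TextBlock, extract_text_blocks, translate_blocks, generate_image, run_pipeline) is hypothetical, and each stage is replaced by a trivial stand-in that works on plain strings rather than pixels.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    source: str       # text found in the image
    target: str = ""  # filled in by the translation stage


def extract_text_blocks(image_text: list[str]) -> list[TextBlock]:
    # Stage 1 stand-in: a real system would detect text in the image itself.
    return [TextBlock(source=text) for text in image_text]


def translate_blocks(blocks: list[TextBlock], glossary: dict[str, str]) -> list[TextBlock]:
    # Stage 2 stand-in: glossary lookup only; unknown text passes through unchanged.
    for block in blocks:
        block.target = glossary.get(block.source, block.source)
    return blocks


def generate_image(blocks: list[TextBlock]) -> list[str]:
    # Stage 3 stand-in: returns the strings that would be rendered into the image.
    return [block.target for block in blocks]


def run_pipeline(image_text: list[str], glossary: dict[str, str]) -> list[str]:
    # Each stage consumes the previous stage's output, in order.
    return generate_image(translate_blocks(extract_text_blocks(image_text), glossary))
```

The point of the sketch is the ordering: translation only sees what extraction found, and generation only renders what translation produced, which is why edits are made to the text segments rather than to the image.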

————————————————————————————————————————

When to use it

Use the generative image translation pipeline when you need to:

  • Translate marketing banners and ad creatives for international campaigns

  • Localize social media graphics for Instagram, Facebook, or LinkedIn

  • Translate images embedded in e-learning courses and training materials

  • Process product packaging graphics and labels

  • Translate infographics and visual content while maintaining brand consistency

💡 This pipeline works best for marketing and visual content where preserving the original design style is important. For document scans and screenshots, use the traditional OCR pipeline instead.

————————————————————————————————————————

How to use it

Select the generative pipeline during project creation

  1. Create a new project in your Smartcat workspace

  2. Navigate to the image upload wizard

  3. In the What kind of images to translate? dropdown, select Marketing & Visuals. This activates the generative AI pipeline

⚠️ Selecting Documents & Scans uses the traditional OCR pipeline

  4. Upload your images (banners, graphics, marketing materials)

  5. Configure your translation settings, including target languages, glossaries, and Translation Memory

  6. Submit the project for processing

The system processes your images using generative AI to extract, translate, and regenerate the content.

Review and edit translations

  1. Open the translated image in the CAT editor

  2. Note that the image preview shows the original image at first

  3. Review the translated text in the segment grid

⚠️ Layer editing tools (Move, Resize, Font Select) are not available with the generative pipeline

  4. Edit text directly in the segment grid if needed. The system displays a soft warning if translated text is significantly longer than the original

  5. Once all segments are confirmed, click Update to regenerate the preview with your changes

The preview updates to show the AI-generated translated image.

Download the final image

  1. Make sure all segments are confirmed

  2. Click Done and Download to generate the final image

⚠️ Final generation may take up to 30 seconds

  3. If the AI output contains errors, click Regenerate to create a new version

Your translated image downloads with the original style, fonts, and layout preserved.

Requirements

  • Project must use the Marketing & Visuals image type selection

  • Glossaries and Translation Memory are recommended for consistent terminology

  • All segments must be confirmed before generating the final image
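
If you track projects in your own scripts, the two hard requirements above can be expressed as a pre-flight check. This is a minimal sketch under stated assumptions: the function name and its inputs are hypothetical and do not correspond to any Smartcat API. Only the "Marketing & Visuals" label comes from this article.

```python
MARKETING = "Marketing & Visuals"  # image type label used in this article


def missing_requirements(image_type: str, segments_confirmed: list[bool]) -> list[str]:
    """Return the unmet requirements; an empty list means ready to generate."""
    problems = []
    if image_type != MARKETING:
        problems.append("image type is not " + MARKETING)
    if not all(segments_confirmed):
        problems.append("unconfirmed segments remain")
    return problems
```

Glossaries and Translation Memory are left out of the check on purpose, since they are recommended rather than required.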

FAQs

What's the difference between Marketing & Visuals and Documents & Scans?

Marketing & Visuals uses the new generative AI pipeline, which creates complete translated images while preserving your brand style. Documents & Scans uses the traditional OCR pipeline with manual layer editing tools, which is better suited to screenshots and scanned documents.

Can I edit the font, position, or size of text in the generative pipeline?

No. The generative pipeline does not support manual layer editing. The AI automatically handles font matching, positioning, and sizing to match the original image style. If you need manual control over these elements, use the Documents & Scans pipeline instead.

What happens if the translated text is too long?

The system displays a soft warning when translated text is significantly longer than the source. The AI attempts to fit the text appropriately, but you may need to shorten the translation or use the Regenerate button if the result is not satisfactory.
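
To catch overlong translations before you open the editor, you can run a similar check on your own text. The sketch below is an illustration, not Smartcat's actual rule: the article does not say how "significantly longer" is measured, so the character-count comparison and the 1.5 ratio are assumptions, and the function name is hypothetical.

```python
def is_much_longer(source: str, target: str, ratio: float = 1.5) -> bool:
    """Return True when the target is longer than the source by more than `ratio`."""
    if not source:
        # Any translated text is "longer" than an empty source.
        return bool(target)
    return len(target) / len(source) > ratio
```

For example, is_much_longer("Sale", "Sonderangebot") flags the German text, which is more than three times the length of the source, while is_much_longer("Shop now", "Acheter") does not.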

How long does final image generation take?

Final image generation can take up to 30 seconds. Clicking Done and Download triggers the AI to create the final high-quality output.

Can I use glossaries and Translation Memory with this feature?

Yes. Glossaries and Translation Memory are applied during the translation stage to ensure consistent terminology across your images.