There's plenty to learn about how Smartcat calculates cost, so we will try to cover some core features including the weight of different TM matches, net rates, and cost calculation discrepancies. To get a wider picture, please check our article about statistics in Smartcat.
TM matches and repetitions
Translation memory match percentages represent the similarity level between a source segment in the Editor and a translation memory unit. The minimum threshold for fuzzy matches is 50%.
Here’s an example:
|TM segment||Source segment||Match percentage|
|Main Settings||Main Settings||100%|
|Home and office||Home & office||70%|
|Research and development||Institute for research and development||62%|
|User interface design and development||Institute for research and development||37% (no match)|
There are different types of TM matches:
- 50-99% (Fuzzy match): The text of a source segment differs to some extent from the text of a TM unit.
- 100% (Exact text match): The text of a source segment is the same as the text of a TM unit.
- 101% (Context match): The text of a source segment is the same as the text of a TM unit. The text of either adjacent segment is the same as the text recorded in the TM unit property "x-context-pre" or "x-context-post".
- 102% (Full context match): The text of a source segment is the same as the text of a TM unit. The texts of both adjacent segments are the same as the texts recorded in the TM unit properties "x-context-pre" and "x-context-post".
- 103% (ID context match): The text of a source segment is the same as the text of a TM unit, and the "x-context" property of the TM unit matches the segment context (an entry under the segment text).
- Repetition: The text of a source segment appears several times throughout the document.
- Cross-file repetition: Generally speaking, the same as an ordinary repetition, but it appears within multiple documents of the same project.
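As a rough illustration, the percentage bands listed above can be turned into labels with a few comparisons. This is only a sketch: the function name `match_type` and the "No match" label for values below the threshold are assumptions made for this example, not part of Smartcat.

```python
def match_type(percent: int) -> str:
    """Return the TM match type for a match percentage (illustrative only)."""
    if not 0 <= percent <= 103:
        raise ValueError("match percentage must be between 0 and 103")
    if percent == 103:
        return "ID context match"
    if percent == 102:
        return "Full context match"
    if percent == 101:
        return "Context match"
    if percent == 100:
        return "Exact text match"
    if percent >= 50:
        return "Fuzzy match"
    # Anything below the 50% threshold is not treated as a match.
    return "No match"
```

For the table above, `match_type(70)` gives "Fuzzy match", while `match_type(37)` falls below the threshold and gives "No match".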
Types of units used in Smartcat for calculating statistics
The following units are supported for statistics:
- Asian characters
- Pages (one page equals 250 words or Asian characters)
- Characters with spaces
- Characters without spaces
Please let us know if you would like the system to count one page as 1,800 characters without spaces instead of 250 words in your account. We will customize this setting at your request.
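The page conversion is a simple division, sketched below. The function name `pages` and its parameters are hypothetical, and the sketch does not round the result, since this article does not say whether or how Smartcat rounds page counts.

```python
def pages(unit_count: int, units_per_page: int = 250) -> float:
    """Convert a count of words or characters into pages (no rounding applied)."""
    if units_per_page <= 0:
        raise ValueError("units_per_page must be positive")
    return unit_count / units_per_page


# Default setting: one page equals 250 words (or Asian characters).
default_pages = pages(500)

# Customized setting: one page equals 1,800 characters without spaces.
custom_pages = pages(3600, units_per_page=1800)
```

Both calls yield two pages under their respective settings.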
Net rate schemes
There are two types of net rate schemes that should not be mixed up:
Client net rate schemes. They are mostly used for calculating project statistics to find out the number of weighted words that a particular client should pay for. You can customize net rate schemes and link them to clients. By selecting a client in your project, you automatically assign the client's net rate scheme to the project.
Supplier net rate schemes. Payable weighted words are calculated based on these net rates, which correspond to the project stages. Supplier net rate schemes comprise rates for different translation memory matches and repetitions. They can be set in the corporate account: Payments section -> Settings & balance -> Payment settings.
Important: for any stage after the first, matches in a scheme must be paid at the full rate (100%). Once segments are translated at the first stage, they are recorded in the project's translation memory as 102% matches. Weighted words, and therefore the cost of a job, are calculated by comparing source segments with translation memory units.
Supplier net rate schemes are fixed at the moment the project is created, so if you need to change the match rates, please create a new project.
Project cost calculation
The cost of work is calculated automatically by multiplying the number of weighted (payable) words in the original documents by the linguist's per-word rate or, for Asian characters, per-character rate, according to the net rate scheme applied.
TM matches and repetitions are calculated with discounts. By default, however, discounts do not apply to the work of editors, proofreaders, and post-editors.
Standard net rate for translators
|TM matches and repetitions||Percentage of the full per-word rate|
|New words||100% (full rate)|
|50–74%||100% (full rate)|
|Repetitions, including cross-file repetitions||Unpaid|
Let’s take a document containing 33 words, for example. Of these, 14 words are new, 12 words are fuzzy matches paid at 40% of the full rate, and 7 words are repetitions.
Doing some math magic, we get the following number of weighted words: (14 × 1) + (12 × 0.4) + (7 × 0) = 18.8.
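The same arithmetic can be sketched in a few lines of Python. The function names, the category keys, and the per-word rate of 0.05 are assumptions made for this illustration, and the 40% weight for fuzzy matches is taken from the formula above. Percentages are kept as whole numbers to avoid floating-point noise in the sum.

```python
def weighted_words(counts: dict, percent_of_full_rate: dict) -> float:
    """Sum word counts weighted by the percentage paid for each category."""
    total = sum(counts[cat] * percent_of_full_rate[cat] for cat in counts)
    return total / 100


def job_cost(weighted: float, rate_per_word: float) -> float:
    """Multiply weighted words by the linguist's per-word rate."""
    return weighted * rate_per_word


# Word counts from the example document.
counts = {"new": 14, "fuzzy": 12, "repetition": 7}
# Percentage of the full rate paid per category in the example.
scheme = {"new": 100, "fuzzy": 40, "repetition": 0}

result = weighted_words(counts, scheme)
print(result)  # 18.8

# Hypothetical per-word rate, used only to show the final multiplication.
cost = job_cost(result, 0.05)
```

Multiplying the 18.8 weighted words by the linguist's rate then gives the cost of the job.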
Important: Only the first linguist who confirms a segment at a particular stage is paid for that segment. This also applies to the system and the project manager: if, say, the project manager runs MT pretranslation that confirms segments at the translation stage, the assigned translator will not get any weighted words for editing and confirming those segments again. You can add one more workflow stage to avoid this.
If you have the opposite case, where translators are supposed to revise their work but no longer have access to the document, cancel the previous assignment and assign them to the last stage of the document.
The system counts the segments translated (confirmed) by each linguist and creates a job for each document. Net rate schemes work in real time, so the system checks whether a translation memory unit has been applied to each confirmed segment. This is especially important when multiple linguists are working at the same stage.
For example, Josh has confirmed a segment at the translation stage and has thereby recorded it in the TM applied to the project. Later, working on the same document but on another range of segments, his peer Andy translates a similar segment (a fuzzy match to Josh's) and gets paid for the words in that segment at a discount. This happens because Smartcat currently does not detect similarity between segments until some of them are confirmed, so the system considered both segments new before the linguists began their work.
Important: a job that includes the weighted words from segments an assignee has already confirmed will be created in the Payments section if the document has the status Confirmed or Canceled. If a client deletes a document or the whole project, the job will be created in the Payments section as well.
The customer can see the cost of assignees' work under the Team tab, while the assignees can see how much they have earned on the project page in their personal workspace.
A linguist's rate is set at the moment the linguist accepts the invitation to a project. However, the client can ask the Smartcat support team to change the total cost of a job by mutual agreement with the linguist. In that case, the support team will require a screenshot from the Chat in which the linguist confirms the new rate or cost.
Why the estimated and final costs may differ
The final project cost may not be the same as the estimated cost that you see right after making an assignment. Here are possible reasons for this:
- Because Smartcat currently does not detect similar segments until some of them are confirmed, discounts for TM matches are applied during translation, which results in a lower final cost.
- Linguists are assigned to a stage where repetitions are paid and the document is split between them (or there are many documents with repetitions). A linguist who confirms a repetition within his or her range receives the full payment for every other occurrence of the confirmed segment, even those outside the linguist's assigned range.
- Assigned tasks have been changed. Any changes in the project concerning assignments and roles may affect the calculation.
- New documents or translation memories have been added. By adding a new translation memory to the project, more TM matches might appear, while assigning new or updated documents will lead to an increased word count.