The translation industry. Part one: translation tests
May 22nd, 2020
It is quite common nowadays for translation agencies to open processes to recruit new staff for internal positions or new translators to extend their databases. Not because they do not have enough translators for the main language pairs, but for many other reasons: a deal with a new client whose content might be very specific, a huge project whose work needs to be divided among several translators, or simply because, despite having quite a large pool, they only work with a few trusted translators and want to look for new talent. These are all very reasonable and legitimate reasons to look for new vendors.
So, agencies publish a job on a platform (LinkedIn, Proz, Translator’s Café, or any other), professionals apply, and the company then contacts some of the translators and requests an unpaid translation test that they need to pass to move forward with the process (despite the fact that the professionals have a degree listed on their CV and proven experience in the required fields), to make sure that they can provide high-quality translations. The test is a requirement to move forward, so either you take it or you may lose the opportunity to get new projects.
Before we continue: this is not a complaint about tests themselves, but about how they are used and evaluated. Tests are helpful, but only if used properly. So, should translators be tested? Yes. And while I understand that testing translators helps agencies narrow down candidates, that is not the real purpose of a test. This article is based on my personal experience as a translator, proofreader and project manager, and reflects my personal opinion on the subject. Its purpose is not to call out any particular agency, but to open a debate and rethink the process of translation tests.
If you work for a translation agency, you will probably think that these tests serve to measure the skills of the translator and the quality of their work. However, that is not what a test actually does. The real purpose of a test is to evaluate the quality of a translation, not the translator’s performance or skills. Think about it: you want to buy a painting, and you ask the artist to produce a painting similar to the one in your living room before you reveal what kind of painting you really want to buy. Despite the differences between translation tests and the painting analogy, the case is the same: you end up judging a single result, not the skills of the professional.
But how can that be? Tests come with specific instructions, and translators are expected to provide excellent quality with no typos or mistranslations. The translation should perfectly convey the source’s meaning and also be creative. Does this sound familiar? In my humble opinion, you cannot expect excellence from a test, because it is a one-time “job”, and because the texts are carefully chosen to force the translator to do a lot of research and make decisions “blindly” (more on this later). Of course, this does not mean that translators with a degree will not make mistakes: they inevitably will. But they are more likely to make mistakes in a one-time test than in a real project.
So let’s walk through the “standard” process. A test is under way, which is always a new opportunity, whether you are an experienced translator or not: a new collaboration, a new client… basically, work. You (the translator) have been exchanging emails with a contact person (we will call them CP) at the new agency about the opportunity and the type of clients they work with, which will define the content of the translation, the competitors you can use as references, and so on. You receive the test: a bilingual table in Word with only the source text to be translated, an Excel file with perhaps some images for context (only for some specific tests), or a Word file with randomly chosen fragments of text. There are several possible formats. You check the file, ask some questions before starting the test, agree on the deadline (it is a timed test in many cases, usually 48 hours for around 500 words), and the CP replies kindly and encourages you to do your best.
In the best possible scenario, you are allowed to ask questions (but not during the 48-hour turnaround) and to leave comments for the reviewer. Which is odd from the start: you are allowed to leave comments (as in a real project), but they will not be answered before you deliver (unlike in a real project). The comments will be checked and evaluated by the reviewer once you deliver (remember this). So, you are on your own now: you, your knowledge, your experience, your resources and… your personal decisions. Although this should not be much of a problem if you are a good translator with experience in the content you are translating.
The real purpose of a test is to evaluate the quality of a translation, not the translator’s performance or skills.
You start working and notice that the text was carefully chosen to make you take some critical decisions: formal versus informal tone of voice, this word or that one, capitalization, quotation marks, use of italics or underlined text, terminology research, transcreation, symbols… And then you ask yourself: how many people may have passed this test? It is not that you find it difficult; you have your resources and enough experience to come up with the right solutions, but you identify parts that require much more “dedication” than others, simply because you do not have enough information about the context. That is fine: you do your research and reach maybe two or three valid solutions, but you can apply only one. You choose an option and (if you are allowed) leave comments for the reviewer. If you are not allowed to leave comments, you will have to trust your good judgement.
After the research, the translation, the QA validation, running the spell check and leaving the comments you deem necessary, you prepare the file for delivery. You make sure no typos are left, send it back to the CP and wait for the result, which can take as long as a week, even though you were given between 3 hours (it happened to me once) and two days.
At this point, a few things can happen:
- You never receive any feedback from the CP or the agency. They just disappear. It is not very common, but it happens sometimes.
- You receive an e-mail telling you whether you have passed, but no feedback (positive or negative) about your translation. This is probably the most common outcome. So if you fail, you will most probably think that the test is just a way to get free translations. In my experience, this is not true, but it is understandably one of the first thoughts that might come to mind.
- Only if you have failed might you receive the result of your performance along with the evaluation criteria. Nevertheless, not all tests have unified evaluation criteria, which leaves it open to the reviewer’s interpretation whether you have passed or not. Each agency has its own evaluation criteria and may penalize errors differently.
Let’s think about the purpose of the test. As said before, tests serve to evaluate the quality of a translation (in the singular: one specific translation) of between 300 and 600 words, which, under evaluation metrics such as the LISA QA model, narrows the margin of error to 2%-3%. This means that one typo can make you fail the test. And maybe that typo is not spotted by the spell checker (know/now, for instance). The LISA QA model also scores each error according to its severity, and it is very detailed and strict (it is probably the most accurate model in the industry). However, some aspects remain open to the reviewer’s interpretation.
For example, let’s look at this sentence: “An ultra lightweight, extremely breathable cap constructed with GORE-TEX SHAKEDRY material.” Let’s focus on the verb “constructed with”. In Spanish we have a precise verb, “confeccionada en”, but we also have a more generic option, “hecha con/de”. I will not go into detail about the verbs, their meanings and the prepositions they take, but the first verb is better suited to fashion. The second, more generic verb is just as accurate, but less elegant. So, while the LISA QA model evaluates accuracy and style, this is not a mistake: it is a style issue. This kind of issue should not even be penalised; it should be a recommendation.
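To make the threshold arithmetic concrete, here is a small sketch of how a LISA-style error-point score plays out on a short test. The severity weights below are illustrative assumptions, not the official LISA values; the 97.5% pass mark is the figure mentioned in this article, and individual agencies may use different numbers.

```python
# Rough sketch of a LISA-style QA score on a short translation test.
# Severity weights and the pass threshold are illustrative assumptions,
# not the official LISA QA model values.

SEVERITY_POINTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights
PASS_THRESHOLD = 97.5  # the pass mark discussed in this article

def qa_score(word_count, errors):
    """Return the quality score as a percentage for a test of `word_count`
    words, where `errors` is a list of severity labels, e.g. ["minor"]."""
    penalty = sum(SEVERITY_POINTS[sev] for sev in errors)
    return max(0.0, 100.0 * (1 - penalty / word_count))

# One overlooked minor typo on a 400-word test:
print(f"{qa_score(400, ['minor']):.2f}%")            # 99.75% -- still a pass

# One major error (e.g. a mistranslation) plus one minor, same test:
print(f"{qa_score(400, ['major', 'minor']):.2f}%")   # 98.50%

# The same two errors on a 300-word test weigh more:
print(f"{qa_score(300, ['major', 'minor']):.2f}%")   # 98.00% -- a thinner margin
```

The point of the sketch is simply that the shorter the test, the more each individual decision costs: on a few hundred words, a couple of debatable severity calls by the reviewer can be the difference between passing and failing.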
Tests come with specific instructions, and translators are expected to provide excellent quality with no typos or mistranslations. The translation should perfectly convey the source’s meaning and also be creative. Does this sound familiar?
Remember we said that when completing a test, the person doing it has to do it blindly, without context. Even if you find the text on the client’s website, you do not know whether the test is for that specific client, or whether the agency chose that text to test you for another client. The reviewers, though, have all the information about the tests. They were probably chosen before you received the test and thus have all the information regarding context, client preferences, style guides, etc. Besides, if the reviewer is on the agency’s internal team of linguists, they already know all the key points to look at. For example, in a localization test I did recently there was a broken placeholder that I did not notice. Since I did not flag it in the comments… I failed. But the reviewer did know about the broken placeholder. It was a “trap” to see whether translators could spot it.
Speaking of comments, earlier I said that your comments would be evaluated once you deliver. Comments should not be evaluated at all. They are there to let the reviewer know the choices the translator made and the process they followed. They do not belong to the translation. Receiving feedback on one’s comments like “Too many explanations for such a poor style” is offensive and does not fall in line with the purpose of the test.
When completing a test, the person doing it has to do it blindly, without context. Even if you find the text on the client’s website, you do not know whether the test is for that specific client, or whether the agency chose that text to test you for another client
We are seeking excellence in a subjective industry. I am not saying that metrics should be changed; rather, objectivity should prevail. In my opinion, style issues should not be penalised but flagged as recommendations: “The translator used this verb, but this other one is more accurate for this content.” Tests are meant to evaluate texts, not translators. I have seen very inconsistent work from translators who passed tests, and there may be very good translators who did not pass. This does not mean there are no good or bad translators: there are plenty of both. But the test is not meant to evaluate the translator’s skills; it evaluates the translation itself. You are supposed to score above 97.5% in the LISA QA model (on a test of between 300 and 600 words), yet there are, objectively, decisions that are not major but minor, or that should not even be considered errors but preferences.
If you are asked to review a translation test, be objective and provide structured, detailed feedback on objective issues. Forget that you are used to working for a client and know all the tricks and preferences for that client; the person doing the test does not. As a reviewer, you need to provide the same excellent work that is required of the translator. For translation tests, this does not mean rewriting the translation; it means identifying the errors and applying the evaluation criteria correctly, with detailed explanations. It also means that if the translator did something outstanding, like a nice, creative translation, you should leave a positive comment too; everybody likes to know they are doing things well.
I think the best way to test translators is over a period of time, with consistent, real, paid work (say 5-10 jobs). It is harder to maintain, because you would need the same proofreader for all the jobs, but this way you can truly evaluate consistency and progress over time. It is like going to a restaurant and ordering a dish you did not like: the service was nice, so you may give it another try. If translators constantly make critical mistakes or do not pay attention to instructions, you penalize them and end the collaboration. It is like finding a hair in your dish at a restaurant: you will most likely never go back.
In conclusion, I would simply suggest rethinking the way translation tests are used. I have passed and failed a lot of tests, and I see vastly different criteria from one agency to another when evaluating them (even when they use the same model, like LISA’s). If you are a translator completing a test, make sure you have time to focus on it. Clear your schedule, take all the time you have been given, let the translation rest for a while and then review it with fresh eyes to provide the excellent translation that is expected of you. And, of course, do not forget the basics: spell check and QA analysis (always!). I hope this article has given you a bit of insight into test processes. If you work for an agency, we would love to hear your experience with translation tests. Let us know if you think the process can be improved. And you, translator? Do you have any anecdotes from tests you have taken? We would love to hear from all sides.
- Proz Forum (2004). Reference material for translation assessment. Accessed: 10/05/2020. Available at: https://bit.ly/2TuSkIz
- Proz Forum (2005). ISO Certification. Accessed: 10/05/2020. Available at: https://bit.ly/2ASiw9L
- Proz Forum (2005). Price for quality assurance. Accessed: 10/05/2020. Available at: https://bit.ly/3cWYn0j
- SDL Trados. LISA QA Metric. Accessed: 10/05/2020. Available at: https://bit.ly/3gdqbj1