
Hi everyone,

A French–English translation job of mine was recently reviewed. It was a long text of over a thousand words, and according to the reviewer, I made "a few errors." But for a total of just three errors in a 1,000-word text, I was rated 3/10. This is a shockingly unfair rating, but it seems like I can't do anything about it.

I have already written to the senior editor to explain my disagreement with the two errors cited, but I am not sure he or she is going to help. (I pointed out to the senior editor that he or she was insisting on a literal translation, and that my translation communicated the meaning even if it wasn't a word-for-word translation.)

I mean, I am honestly shocked. Here I am, looking at a review of a long text with two errors that I could argue with, and it has a rating of 3 that I can't do anything at all about. (The reviewer did not cite the third error.) This is a translation that is mostly error-free, even in the grading editor's own words, but it has a rating that says "Your translation is horrible enough that you deserve to be kicked out of the Gengo translator community."

How can I have trust in Gengo if a mostly error-free translation gets a "failure" rating of 3, and there's no way to review the scoring? I just imagine the horrifying possibility of translating another 1,000-word text and getting rated 2/10 for two or three mistakes.

Have any of you had a similar experience? Do you have any advice for me?

Thank you,

Gregory

16 comments

  • 0
    Alexander

    Hi Gregory,

    When I started with Gengo, I failed the Standard test because I did too much rewriting. Next, I tried the Pro test without rewriting anything at all, and passed. Apparently, Gengo wants us to adhere as closely as possible to the original text.

    So even though you feel there is nothing wrong with your way of translating, I would advise you to accept as a fact of life that Gengo wants it their way. Try to translate word for word, and only make minimal changes to make your sentences sound natural.

    Your quality score is based on the 5 most recent reviews. Once you have survived the next 5 reviews (and I bet you will, if only you are willing to learn from the ST's feedback), the 3/10 low rating will be "pushed out" and can't hurt you any more.

  • 0
    Blenheim

    Very late to the party here, but I'd also add that occasionally an unqualified reviewer will find his or her way into the system.  I once got a 0 rating, for example, because the reviewer had very clearly gotten confused as to the document being graded; none of the words or phrases he cited in the review were present in the document or translation being scored.  While I agree with Alexander above that you should try to take advice given in your reviews into account, "Your translation is horrible enough that you deserve to be kicked out of the Gengo translator community" is an unprofessional comment, and I'd question whether the person making it has anything to teach.  Put in a ticket, mention the issue to Gengo, and keep working; your next review will most probably be done by someone more experienced.

  • 0
    Megan Waters

    @Blenheim: All our Senior Translators are very, very carefully chosen and vetted, so I think saying they are "unqualified" is incorrect. In this situation, it sounds like there was a bug with GoCheck, which does sometimes happen and is a much more likely explanation. If you had written in to support about this, and it was indeed a bug, your review would have been amended.

    Please could you give me your language pair and the job number of this review and we will double check it?

  • 0
    Blenheim

    This happened a couple years ago, so I no longer have the job review number.  Thank you for the offer, though.

    I've gone through a lot of reviews, and I've had no issue with the great majority of them; I'm not indicting the system.  No matter how careful a selection process, though, bad apples can slip in—or at least apples who engage in occasional bad-apple behavior.  I would still say that the behavior of the reviewer I mentioned and the comment Gregory says was made to him were unprofessional.

  • 0
    Megan Waters

    @Blenheim:

    I don't think Gregory actually meant that the reviewer said that comment to him, I think he meant that's what the score implied. He said: This is a translation that is mostly error-free, even in the grading editor's own words, but it has a rating that says "Your translation is horrible enough that you deserve to be kicked out of the Gengo translator community."

    I would like to remind you that reviewers don't just decide a score by themselves randomly; they use a special programme we built to help score each review, which is based on a set of strict grading rules. Of course, they do sometimes make mistakes, and you can dispute a score if you really do feel that it was graded unfairly. We monitor our reviewers very closely and take any complaint against them very seriously, so please do let us know if you really do think a reviewer is being in any way unprofessional, and we will look into it.

  • 0
    Blenheim

    I'm not saying there isn't a well-considered system in place. I do stand by my previous comment, though.

  • 0
    gregb1007

    Megan is right about what I meant. Also, the issue has since then been resolved. 

  • 0
    Rosea C.

    I just got a "very" shocking score recently too. I am a translator in a Thai language pair. I know I made more mistakes than usual this time, and I admit it. But my work was not copied from Google Translate; I didn't just carelessly copy, paste and submit the job, yet the senior scored me 0.22/10 :(

    In the 3 years I have been working with Gengo, I have never gotten a shameful score like this before, and my quality score always stayed at 9-10.

    Because the senior scored me 0.22 this time, my job quality has fallen to 6.9 and is suddenly in the 'red' :(

    At least I did it with my own brain and typed it with my own fingers. This is too cruel for me, and I don't know what the standard of their measuring is.

    I mean, you know, people are different... someone may be too strict and someone may be too kind, so what is the standard of job review?

  • 0
    hwemudua57

    "they use a special programme we built to help score each review and is based on a set of strict grading rules." In other words, the rating system is computerized, yet we are completely penalized for using machine translation. Human translation for us, but senior translators get to use a special program? How does that make sense? The GoCheck system should be abolished because it is imperfect. Instead, your blessed "Senior Translators" should take the time to painstakingly review each translation without an automated system, just as we have to painstakingly translate each sentence. They shouldn't be able to click a few buttons, say there are one or two errors, and give us a poor rating. Your system is flawed; revise it!

  • 0
    Alexander

    From https://support.gengo.com/entries/23692441-How-Gengo-measures-quality I understand (from the text below 'Job Reviews') that the ST painstakingly spots any error and determines what type of error it is.

    GoCheck does not judge the translation, it just takes the number and type of errors detected by the ST as input, as well as the units count, and outputs a score based on these numbers only.

    If the ST were to manually calculate the score, the same translation could get a different score depending on which ST rates it, or on the mood of the ST. Either that, or the ST would have to do the calculation based on a standardized set of rules specifying which penalty to apply to which type of error - which is exactly what GoCheck does. You may disagree about the details of those rules, but the ultimate basis for your score is the errors found in your translation by a human, not by a machine.

    GoCheck is not an automated grammar and spellchecker where the ST gets paid for hitting an OK button to deem GoCheck's work correct.
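    To make the division of labor concrete, here is a rough Python sketch of how such a rule-based calculation might work. The penalty weights, the per-100-units normalization, and the function itself are purely my own assumptions for illustration; Gengo has not published GoCheck's actual formula.

```python
# Illustrative sketch only -- the real GoCheck rules are not public.
# Assumed model: each error severity carries a fixed penalty weight,
# and the score scales the total penalty by the size of the job.

PENALTY = {"minor": 1.0, "major": 3.0, "critical": 6.0}  # hypothetical weights

def gocheck_style_score(errors, units, max_score=10.0):
    """Turn human-assigned errors plus a unit count into a 0-10 score.

    errors: dict mapping severity -> number of errors found by the reviewer
    units:  number of words/characters in the job
    """
    total_penalty = sum(PENALTY[sev] * n for sev, n in errors.items())
    # Normalize by job length: the same errors hurt a short job more.
    score = max_score * (1.0 - total_penalty / (units / 100.0))
    return max(0.0, round(score, 1))
```

    The point is that the machine only does this last arithmetic step; which errors exist, and how severe they are, is decided by the human reviewer beforehand. Under these made-up weights, three minor errors in a 1,000-word job would come out as 7.0, not 3.0.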

  • 0
    juan.garcia.heredero

    I agree with Alexander, and I want to add that usually the reviews are correct and the quality score is more or less accurate. I'm puzzled sometimes when I have a low quality score (say, a 5 or 6) which is below the expected minimum of 7 (Standard) or 8 (Pro), and the ST says that 'You did a good translation but made some mistakes that lowered your score'. I mean, if it's a good translation, I think the quality score should reflect that.

    I've found this is mainly down to which ST reviewed the job. The same mistake could be classified as a slight mistake by one ST and as a critical mistake by another. I personally found that a particular ST is far more severe than others. But, at the same time, I have learnt a lot from him/her, so no complaints. Sometimes I disagree with the reviews and use the form to ask for clarifications, and sometimes I acknowledge my mistakes and try to learn from them, but I think that, all in all, the review process is fair enough to me, especially since they made the changes to take into account the last 20 reviews (instead of 5), with larger jobs having more impact than smaller ones.

  • 1
    Blenheim

    See, I don't think that the new system of weighting larger jobs more heavily makes sense at all. A translation of, say, a 500-character job with one minor error is not of the same quality as one of a 5,000-character job with one minor error. The ratio of errors to work is significantly different—and yet both jobs are treated as being of the same quality by the system. Now, suppose that 5,000-character job has two minor errors. It's still of a higher quality than the 500-character job, as there are fewer errors over a given length (every 500 characters, say)—yet it's treated as of lower quality by the system, and it counts more heavily against your score.

    Yes, the standard should, of course, be no errors—but, as detailed here, the reviewers are fallible in what they deem to be errors, particularly when it comes to jobs in specialties outside their purview or...well, when they're in a bad mood.  If they're under a misconception or marking poorly for a certain reason—like the reviewer who took issue with me using "Americanisms" in my translation, overlooking the fact that American English is standard for JA-EN—then those misconceptions are going to be compounded over the course of a longer job, leading to a lower score that counts more heavily against the translator.  Factor in that a review of a longer job by sheer length affords more opportunities for reviewer...peccadilloes (particularly in JA-EN, which seems to attract a lot of "my specific wording is the ONLY way to translate this passage, and anything else is WRONG!" folks—not exclusively in Gengo; just at large), and translators are almost guaranteed a low score that'll count more heavily against them in the long run when longer jobs are evaluated.

    (I'm not, by the way, claiming that the system is completely broken or that translators never make genuine mistakes that deserve to be marked down.  Certainly, we've all made mistakes.  What I'm saying, though, is that the treatment of longer jobs exacerbates & amplifies the component of human error inherent in the review system.)

    I think it'd make more sense for the great majority of the jobs evaluated to be around a certain standard length, or at least have a length cap—with a set number of evaluations of longer jobs per translator, just so that those jobs don't escape review. Do away with the length-based weighting entirely. It wouldn't be perfect, but it would make review scores (both individual job scores and overall translator scores) fairer and more comparable.

  • 0
    juan.garcia.heredero

    If I understand you well, you mean that, when translating bigger jobs, we are more likely to make translation mistakes. So, let's assume we have to translate 1,000 words and they are grouped this way:

    1) A job with 999 words and a job with 1 word.

    2) Two jobs of 500 words each.

    (Both scenarios have exactly the same texts, but the words are grouped differently.) Do you think that, in the long run, and assuming you pay the same attention and spend the same time in both cases, you will make more mistakes in the first one?

    I think I won't. I think I make an average number of mistakes per 1000 translated words, so it does not matter how the words are grouped. And I learn over time (usually not making the same mistake twice), so in the long run I try to lower that average number of mistakes and raise my quality score.

  • 0
    mirko

    @Alexander "If the ST were to manually calculate the score, the same translation could get a different score depending on which ST rates it or on the mood of the ST. Either that, or the ST had to do the calculation based on a standardized set of rules which penalty to apply to which type of error - which is exactly what GoCheck does. You may disagree about the details of those rules, but the ultimate basis for your score are the errors found in your translation by a human, not by a machine." - But the very same can be said about the severity level assigned by STs to the errors they found (as Juan was saying)... I've had reviewers (not on Gengo) who considered some things as "minor" errors or even just preferential changes (such as typos or a particular wording/term/expression instead of another), while others marked them as more serious errors. Since, AFAIK, GoCheck doesn't automatically assign a severity level to errors, I'm pretty sure the same level of subjectivity applies to its scores.

    @Blenheim "See, I don't think that the new system of weighting larger jobs more heavily makes sense at all.  A translation of, say, a 500-character job with one minor error is not of the same quality as one of a 5,000-character job with one minor error." - Although I definitely agree with you about this, I don't think that's what they mean with "To give you a more accurate score, it more heavily measures scores of larger and more recent jobs than that of older and smaller jobs". Actually, I believe what is meant here is that the score you get on a 5,000 units job will weigh more toward the "overall quality score" than the score you get on a 500 units job. That said, I've always thought (and my experience with past scores seems to support it) that the score you get on individual jobs does take into account the number of units vs. severity/number of errors (which is how it should be).
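    As a toy model of the interpretation I just described (the weighting scheme here is entirely my own guess, not anything Gengo has documented): each job keeps its own score, and the overall quality score is a weighted average where bigger and more recent jobs count more.

```python
# Illustrative only: overall score as a weighted average of per-job
# review scores, where larger and more recent jobs weigh more.
# The "units * recency" weight is a made-up stand-in, not Gengo's formula.

def overall_quality(reviews):
    """reviews: list of (score, units) tuples, ordered oldest -> newest."""
    weighted_sum = 0.0
    weight_total = 0.0
    for recency, (score, units) in enumerate(reviews, start=1):
        weight = units * recency  # bigger and newer jobs weigh more
        weighted_sum += score * weight
        weight_total += weight
    return round(weighted_sum / weight_total, 2)
```

    Under a scheme like this, a single low score on a 5,000-unit job drags the overall average down much harder than the same score on a 500-unit job, which is the behavior we're both describing.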

  • 0
    Alexander

    @mirko - The basic point is that the role of the machine is limited to doing the math AFTER a human has spotted any errors and decided on their severity. It's true that there is still some ambiguity left, like whether any given error is 'minor' or 'major', but it's not true that GoCheck reduces the quality of the quality score.

  • 2
    mkirali

    I just ran into this problem myself.

    A 4,000-word collection split into 500 jobs. I had 3 jobs where half of the translation was missing, possibly due to the constant connection errors with Gengo at that time, and one job where a word was mistranslated. The customer rated these 1/5, GoCheck was triggered, and the result was four scores I am too embarrassed to say. My quality score went red. I understand and accept that I deserve a bad score for these mistakes, but four times for the same collection? At least pat me on the back for the other 496 jobs.

    What is the message of this system? Don't take large collections?
