I'm having a problem with the way my GoCheck score is being displayed, and I can't get Gengo to sort it out, so I'm wondering if anyone else here has had the same issue or could offer suggestions.
The problem is that my latest evaluation result, which is a very good one, is not being taken into account in the calculation of my GoCheck score. My GoCheck score was standing at 6.5. I then got a 250-word job evaluated which got 10/10. That should have brought my GoCheck score up quite significantly, but it didn't: the score has stubbornly remained at 6.5.
This problem started about 3 weeks ago. I raised a support ticket with Gengo. I then received a response saying that the issue would be passed on to the engineers, but then nothing happened for a while, so I chased the issue at the beginning of this week and I was told that it would be chased with the engineers - but nothing further has happened, the problem has not been solved. However, I've received an email asking how satisfied I am with the support I've received, and also a request to take part in a survey about my satisfaction with Gengo as a whole, and that doesn't feel right!
I'm raising the issue on this forum because I don't know what to do about it now. I have asked whom I could escalate it to, but the person who sent me the support emails hasn't answered that.
So, has anyone else experienced a similar problem? Do you have any suggestion as to what to do?
To confuse matters, my score shows different values in different places: it shows 6.5 on my dashboard but 8.4 on my profile. I think the 8.4 value is correct, but it seems that the incorrect lower value is the one being taken into account because I'm hardly getting any jobs at all.
So, any suggestions for solving this problem?
"To confuse matters, my score shows different values in different places: it shows 6.5 on my dashboard but 8.4 on my profile. I think the 8.4 value is correct, but it seems that the incorrect lower value is the one being taken into account because I'm hardly getting any jobs at all."
This is a normal situation; the scores on my dashboard and in my profile also differ. What you see in your public profile is just a weighted average of your last 10 GoCheck scores. This score is for customers. The score on your dashboard is for you - it's calculated differently, taking into account all of your previous GoCheck scores except for the worst one. You can read about it here: https://support.gengo.com/hc/en-us/articles/231441287-How-Gengo-measures-quality
So I may assume that your last score just didn't influence the score on your dashboard as much as you expected because of a different formula used for its calculation.
The last 10 GoChecks should be weighted on your dashboard score as well. If your last 10 GoChecks BEFORE your 10/10 were something like "8, 6, 7, 6, 7, 6, 8, 6, 5, 10", and then your latest 10 AFTER the 10/10 were "10, 8, 6, 7, 6, 7, 6, 8, 6, 5", the 10 on the end would get moved out of your latest 10 reviews, resulting in the weighted part of your score being unchanged.
It's not uncommon to not have your score change even after a 10/10 review in my experience.
On the other hand, if you go by my example above, if you get another 10/10 to push the 5/10 score out of the recent reviews, you should get a decent score bump.
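The example above can be sketched in a few lines of Python. This is only an illustration: it uses a plain (unweighted) average because Gengo's real weights are not public, and `push_review` is a made-up helper name.

```python
from statistics import mean


def push_review(window, new_score, size=10):
    """Return the most-recent-first window after a new review arrives."""
    return ([new_score] + list(window))[:size]


# Most recent first, taken from the example above.
before = [8, 6, 7, 6, 7, 6, 8, 6, 5, 10]

# A new 10/10 arrives and the old 10 at the end drops out of the window.
after = push_review(before, 10)

# One more 10/10 pushes the 5 out instead.
after_second = push_review(after, 10)

print(after)
print(mean(before), mean(after), mean(after_second))
```

With a plain average the first new 10 changes nothing, because it replaces another 10, while the second one replaces the 5 and lifts the average. With real weights the first step would shift a little, since each remaining score moves one place back.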
Thanks to both of you for clarifying the matter. I wasn't aware that the scores on the dashboard and on the profile are calculated using different formulae. I am not aware that Gengo have ever informed us of this either and, Flash-Ost, it is not mentioned in the link that you supply. If it is correct though, I hope that the score shown to customers remains higher than the one on the dashboard, as is the case for me now!
Also, Colin, thanks for your explanations, which seem to fit my situation quite well. My last 10/10 pushed another 10/10 out of the list of my last ten evaluation results - the other 10/10 is the eleventh score from last, that's the one that was pushed out. So maybe that explains why my score hasn't changed, because a 10/10 replaced another one, but it's the first time this has happened to me! Also, why doesn't a 10/10 on a long job count for more than one on a short job?
I don’t know if the rule is applied to all translators, but this is my experience for the last year:
10 reviews x 10/10 = 1 tenth. That is to say, if my score until then was 7.1, after the 10 reviews mentioned (all of them 10/10) it changes to 7.2, with no change at all during the 10-review period.
If, during the 10 reviews, one of them has been marked with an error (whether real or not), at least 1 tenth of the score is subtracted (if the score showed 7.2, after the review it shows 7.1).
In my case, to add more unfairness to the situation, 1 tenth was subtracted from my score with no review or notification in between, and nobody at Gengo wanted to take responsibility for the error or rectify it.
In conclusion, if the jobs of all Gengo’s translators are reviewed under the same policies, I assume that within a year their scorecards (however good they are as translators) must have gone down to a ridiculous score.
Ballmar - Yes, I've also noticed that the score comes down quicker than it goes up. Also, with regard to your score changing out of the blue, with no review or notification, it could be due to Gengo changing their calculation formula recently.
Daniel, as far as I know, Gengo hasn't recently changed its calculation system; the last change was more than a year ago. Regarding my issue, although the people I contacted at Gengo said they couldn't clarify or justify the reason, no rectification was made.
As other translators have already explained, the score that you see in your translator profile is merely the weighted average of your last 10 GoChecks -- this is shown to customers only. The score that you see on your dashboard is calculated in two steps: the weighted average of your last 10 GoChecks, minus your standard deviation (calculated based on your entire GoCheck history, excluding your lowest score ever).
I can see that you gained qualifications with us in early 2019. By then, our formula was already as described above, and we haven't made any recent changes.
As for this question:
Also, why doesn't a 10/10 on a long job count for more than one on a short job?
As explained before, your average (the first step of the calculation) is weighted. As such, each score will weigh differently on the average based not only on length but also on recency of the job. Namely, I am not confirming that a 10/10 on a long job counts less than one on a short job, but rather highlighting the fact that there are other factors at play here, and we're talking about 10 scores weighing differently, so you can't assume or assign the value of each -- this is calculated based on our proprietary formula.
With every new score, not only your weighted average changes, but also your standard deviation (the amount that we deduct from your weighted average to calculate the consistency of your quality) changes slightly. If you'd like to see a more detailed breakdown of the fluctuations in your standard deviation, please write to us at email@example.com.
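For readers who want to play with the two-step calculation described in this post, here is a rough sketch. Everything specific in it is an assumption: the recency weights are invented (the real ones are proprietary and also depend on job length), the population standard deviation is a guess (the sample version would also fit the description), and the function names are made up.

```python
from statistics import pstdev

# Hypothetical recency weights, most recent review first. The real weights
# are proprietary and also depend on job length, which is ignored here.
ASSUMED_WEIGHTS = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]


def weighted_average(scores, weights):
    """Weighted mean of scores; weights are truncated to the number of scores."""
    used = weights[:len(scores)]
    return sum(s * w for s, w in zip(scores, used)) / sum(used)


def profile_score(history, weights=ASSUMED_WEIGHTS):
    """Step 1 only: weighted average of the 10 most recent GoChecks."""
    return weighted_average(history[:10], weights)


def dashboard_score(history, weights=ASSUMED_WEIGHTS):
    """Step 1 minus the deviation of the whole history without its lowest score."""
    trimmed = sorted(history)[1:]  # drop the single lowest score ever
    deviation = pstdev(trimmed) if trimmed else 0.0  # population std dev is an assumption
    return profile_score(history, weights) - deviation


# Made-up history, most recent first.
history = [10, 8, 6, 7, 6, 7, 6, 8, 6, 5, 10, 9, 3]
print(round(profile_score(history), 1), round(dashboard_score(history), 1))
```

Whatever the real weights are, the structure explains why the dashboard figure can never be higher than the profile figure: a standard deviation is never negative, so step two can only subtract.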
Lara - Thanks for clarifying how the calculations on my dashboard and on my profile differ. I now understand why the score on the dashboard is lower than the one on the profile - and also that this is bound to remain the case forever, given that the standard deviation is deducted to calculate the score on the dashboard.
However, I still can't understand the situation regarding weighting for length and recency of the jobs. After my last evaluation, an old 10/10 on a short job (the eleventh evaluation from last) got pushed out of my last ten marks, and it was replaced by the result of the newest evaluation, a 10/10 on a much longer job. Because the new 10/10 concerned a job that was both much more recent and much longer than the old one, I can't understand why my overall score didn't increase. There seem to be factors at play other than just weighting for length and recency. Maybe a pro job counts for more than a standard one? That could explain the situation too.
Hi Daniel - as your most recent evaluation pushed out an old score, it also reshuffled how much each score accounts for. Namely, the jobs that become older (moved back) also weigh less now on the actual calculation, as we take not only length but also recency into account. And even if your most recent score is a 10, if in the recency reshuffle there are jobs with lower scores with a heavy weight, that would explain your current score. Take into account also that for your overall score to show an increase, the increase in the first step of the calculation (weighted average) has to be significant enough to show after the deduction of the standard deviation. Sometimes there is indeed an increase in the decimals, but you wouldn't see it in your dashboard, as we only show the first decimal, if that makes sense. Regardless, for specific details on your score history, so that you can see the evolution of the score and standard deviation, please do write in to our Support team and they'll be able to assist :)
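A tiny sketch of the "only the first decimal is shown" point, with made-up numbers. Whether the dashboard rounds or truncates is not stated, so rounding is an assumption here, and `displayed` is a hypothetical helper.

```python
def displayed(score):
    """Format a score the way a one-decimal display would (rounding is an assumption)."""
    return f"{score:.1f}"


# Made-up figures: weighted average minus standard deviation,
# before and after a new 10/10 comes in.
before = 7.32 - 0.84
after = 7.36 - 0.85

print(displayed(before), displayed(after))
```

In this made-up case the underlying score does go up slightly, but both values show as the same one-decimal figure.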
Lara - Thanks for clarifying. I can see that a number of my very good older scores have been moved back and now weigh less for the calculation, so I think that explains the situation.
I have also had this happen to me multiple times, where I get a 10/10 score and see no fluctuation in my GoCheck score, and yet, if I do get a review score of less than 10, say a 9 or 8, it immediately goes down some points. I wonder how I am supposed to increase my level if getting 10s doesn't make a difference. It seems like you have to keep getting nothing but consecutive 10s in order to see an increase of 0.1 in your score.
I have just sent this to Support:
"Well, I have long been trying to refrain from commenting on the evaluation system, since my score is well over the expected minimum, so it is not crucial for me. There was a point when getting a 10/10 didn't affect my average at all. Just recently I got 2 more 10/10s that increased my score from 7.5 to 7.7. Fine. And I got (I presume) an automatic comment: "Good job! You're doing well, but a few low reviews are affecting your overall score". Fine.
I have gone back through the historical data as far as Nov. 2017. Of the 50 evaluations, only 11 were below 9 and only 20 below 9.5, and of the 10 most recent evaluations only 3 were below 9.7. I don't think the 7.7 is a real reflection of my performance. Earlier I did comment that the methodology is far from any statistical science. Warm regards to all those creating this evaluation system. I don't want any personal comment. Maybe this evaluation system disqualifies some translators who don't deserve it."
Even if some of Gengo’s staff incessantly defend the currently existing evaluation system (based on their own conviction or because they are forced to do so), the truth is that it lacks logic, clarity and common sense.
The people who "invented" this system are clearly working against Gengo and its translators, because once an atmosphere of unfairness and discomfort is created, the negative effects reach every corner of the company.
When the system was created, as I understood nothing about it, I thought that it was because my mind was not up to such a "complex and sophisticated system". Later on, I understood that the aim of the terminology created around it was in fact to make it incomprehensible.
Just wanted to clarify this one bit:
I don't think the 7.7 is a real reflection of my performance.
You are correct, because that 7.7 is not a reflection of your performance, since we don't use a simple average, as our Support articles mention and as you yourself have described. With the current formula, that 7.7 does not mean that your jobs feature a quality level averaging 7.7 -- this is where the interpretation of it is wrong :) What that 7.7 actually means is that you regularly score above that, and that we can expect your quality to consistently be higher than that.
Hope this helps shift the perspective :)
As others already mentioned, the problem with the current formula is that your score can only go down and down with time... unless you get dozens of 10s to see the weighted average score go up only 0.1pt, which is ridiculous. I've got 6 "red" GoCheck scores out of 35 reviews, and I'm sorry for that, but I don't deserve to be already on the verge of getting killed by the formula, since every 10 to come won't make the score go up anyway.
Each mistake weighs heavily, each good piece of work doesn't make a difference... what's the point? Especially on collections that are worth under $1, which is 90% of what I can get my hands on, given how quickly every collection gets snatched in the EN/FR pair.
Along the same lines, I also wanted to point out the difference in work environment/rhythm between language pairs. Some pairs like JP/FR are not as competitive and you have enough time to accept collections (sometimes within hours, and even get a +20% incentive), work on them calmly, and keep delivering good translations. But when collections get snatched in less than a second, I don't feel like I can focus and take my time: the more time I take, the more other collections get snatched meanwhile (plus the fact that Gengo doesn't pay "rush fees", which should be the norm in the industry so that workers are not mere slaves).
(Unrelated, but speaking about slaves, I noticed a recurring customer, #605705, who has been posting standard-level jobs in the JP/FR pair for a tourism website for a few years now, involving historical research and toponym research to properly transliterate Japanese: this should be Pro level, with appropriate rates for the research. I don't know what to do: flag the jobs? Contact the client in the comments section? They will probably keep posting these jobs at Standard level, though, which is unacceptable on an hourly rate basis.)
Thank you for your time and effort to change my perspective. As long as the figure is over 7 (?) there is no problem with whatever score.
And Claudius is right: some small mistakes (e.g. a punctuation or spelling issue) that bring a review below 10 can pull the GoCheck score down, while numerous tens hardly make the score go up.
Mathematically, the reason is that the more 10s there are, the higher the lifetime deviation, since the lifetime average goes higher and hence the scores under 10 show a bigger difference from the average. So even in the extreme case of all 10s in the most recent scores, the calculated score can still go down, or at least not increase.
Adding to what Claudius explained: sometimes I would rather reject small jobs when there is no context to be precise in the translation, or when there is no unequivocal solution. The alternative is to spend my own time "for free" on the research to find the proper solution, or to contact the client to clarify it. For smaller jobs, below $1, sometimes it simply isn't worth the effort, not only because of the very small hourly rate but because of the threat of a low GoCheck score.
I promise these are my last words on this issue.
Kind regards, Gyuri
Thank you Gyuri, and Lara, for the clarification.
I understand that giving a customer some kind of guarantee (which is the main point here) that the delivered quality will be up to a par specified by the GoCheck score figure has some importance for Gengo. If I can humbly suggest something, what seems to me wrong with the current formula, is that it is a "lifetime" one. Everyone has a few bad days, and I'm not asking here for benevolence, since the worst score you get in the last 10-streak of scores is already not taken into account, from the resources I've already read about that issue.
But what I wanted to say is that a lifetime formula does not consider a critical factor: the margin of progression for each of us, especially if you try to take into account your reviewers' comments (NB: in JP/FR it is always the same person, and I think they know my style, so I'd better listen to their advice and apply it to later translations; the person is pretty sharp and they'll notice at once those who are not listening to their kind feedback). A solution could be a multi-GoCheck score, showing data for the last 3 months, the last 6 months, the last year, and finally a "lifetime" one. This way, translators would be able to clearly check if they have made progress, or if they're going the wrong way and need to address problems quickly by changing their ways of working. For instance: I lost two points yesterday on my lifetime score, which indicates that the work I did last week while on vacation, not as focused as usual, was clearly a questionable choice that I should avoid from now on. But sometimes one can have real excuses for not delivering the "best" translation possible: health problems, the loss of a loved one, or other issues that will make one less efficient than usual. The multi-GoCheck scores would help identify such "down" periods and help the translators work through them. It would also help Gengo gain humanity as a company in the translation and IT industry. I guess this could be a win-win solution, since as you can see, many translators are really bothered by the new formula for its "unfair" aspect. Don't get me wrong about my few complaints: I like working for Gengo, and the user-friendly UI and flexible aspects of the workbench make it a great place, better than other sites I've translated for, not to name a few rivals. But I feel like we could go the extra mile on the most cruel aspect of the system.
Have a nice day, all.
Hi Gyuri & Claudius - my apologies for the late response, as I was out of office :) I appreciate your feedback on our Quality Score system, and I will make sure to pass it along to the team. Please do note, however, that I can't guarantee that this will bring about any changes in the near future, but at least the team will be aware of our translators' voice and opinions when planning further implementations in the future.
Separate from that, I'd like to once again clarify the calculation, just to make sure that we're all on the same page :)
I'd like to make sure that the understanding regarding the calculation is clear here :) The lowest score is not removed from your last 10 scores (the ones that make up your weighted average, which is the base of the calculation). The lowest score is removed from your standard deviation (the one that is calculated based on your entire GoCheck history). So if you had an odd "bad day", as it were, sometime in the past, we will simply remove it so it doesn't affect your score forever :) If you have several "bad days", though, we could argue that this is not a one-off thing, and that it happens more often than we (and probably you) would like, and there's always something to learn from that :)
While this is of course true, and we're all human, the responsibility for these types of situations should work both ways. Not only should Gengo understand that our translators are human and sometimes may have something going on in their life that affects their work, but our translators should also try to objectively assess their capabilities when deciding when and which jobs to pick up. As you mention in the example of working with less focus than usual during your vacation, calling it a questionable choice, I think deep down most of us know when we're distracted and producing subpar quality. Please don't get me wrong, I've made this mistake myself as a Gengo translator in the past, so I can fully empathize... and I can also say that I learned my lesson the hard way. When we realize that perhaps we're not fully present and not doing a good job, we can always assess whether it'd be better to decline the job and reopen it to someone else or, if we deem it's a matter of time, we can ask for a deadline extension :)
I do like this idea! I don't necessarily think it serves the purpose of addressing problems quickly (GoCheck reviews are more immediate and should already be serving this purpose), but I think it would provide a nice visual way for a translator to track their progress over time, instead of seeing only the final score. Again, while I can't promise we will be implementing anything like this anytime soon, I will be sure to pass this feedback along :)
Dear Lara, thank you for explaining all of this once again, since I guess this is not the first, nor the last, time someone doesn't fully understand the cryptic deviation factor. It must get boring at some point. My apologies for being... slow at maths!
Also, I absolutely agree with you that the Gengo/translator relationship has to be a reciprocal one. Working on improving my score one job at a time and compensating for one mistake I've made is one thing, but I thought all over again about how to stay on par in such a competitive environment as some language pairs can be, while keeping some, you know, "quality ethics". Since you've got less than a second to hit the "start translating" button and, let's be honest, can't properly read the contents to translate beforehand... well, I think the best thing to do from now on is to take the job, but this time not hesitate to decline it once I've had time to read the contents, if I feel subpar on the topic or "at risk". It would be a pity to jeopardize months of effort and a licence I'm proud of on one "random" little task I had not 100% mastered, so I will stick to this code of conduct now.
(Also, I took a few other measures and edited my feed notifier filters so as not to be distracted by the smallest fish, and to properly concentrate on medium-sized jobs. Fewer notifications, less stress, more focus. That's the idea.)
Concerning your comment:
So if you had an odd "bad day", as it were, sometime in the past, we will simply remove it so it doesn't affect your score forever :) If you have several "bad days", though, we could argue that this is not a one-off thing, and that it happens more often than we (and probably you) would like, and there's always something to learn from that :)
It would be relevant to take into account the time factor.
Of course, many bad days say something about consistency, but it should be calculated over a given time period, or number of jobs.
One bad day over one year is consistent with 5 bad days over 5 years,
or in other words, it would be fairer to remove lower scores as a function of the number of GoCheck reviews that a translator has received... One bad score removed every 20 or 50 GoCheck reviews, for example.
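As a rough illustration of this suggestion (not anything Gengo does today), removing one lowest score per block of reviews could look like this. The function name, the block sizes and the history are all made up.

```python
def drop_lowest_per_block(scores, block=20):
    """Remove one lowest score for every full `block` of reviews received."""
    n_drop = len(scores) // block
    return sorted(scores)[n_drop:]


# Made-up history of 72 reviews with three low outliers.
scores = [10] * 69 + [3, 4, 5]

kept_20 = drop_lowest_per_block(scores, block=20)  # 72 // 20 = 3 scores dropped
kept_50 = drop_lowest_per_block(scores, block=50)  # 72 // 50 = 1 score dropped

print(len(kept_20), min(kept_20))
print(len(kept_50), min(kept_50))
```

A translator with fewer reviews than one full block would have nothing removed under this rule, which is one of the trade-offs of tying it to review count rather than to time.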
That's an interesting approach, and I will pass your feedback along :)
However, allow me to bring up something, as you seem to be mixing "time factor" with "number of GoChecks" and this doesn't always correlate with the experience of the Gengo translator. Please note that at Gengo, like with any other crowdsourced/freelance platform, people have different levels of activity (unlike a job where you clock in for a certain amount of hours every day). Some people translate several jobs daily, some people translate occasionally, once a week or even less.
The more active a translator is, the more GoChecks they receive, and therefore they may accumulate several GoChecks pretty quickly, without allowing for the level of growth that might happen over time (time factor), so there's always the risk that removing jobs every X number of GoChecks might accidentally blind us to a person's inconsistencies if they happen too close in time.
Hope it makes sense!
Yes, exactly! I was thinking along those lines when I started off by stating the time factor (for example one year). But then I thought that the occasional Gengo translator would probably have only a few GoCheck reviews over a year (fewer than half a dozen?) and so removing one would make too much of a difference.
That is why I shifted to 20-50 GoCheck reviews. I based this number on my own activity, which fits your description of "several jobs daily", and as a proud "twice-over" Gengo Wordsmith, I guess I'm part of the upper tier of your active translators.
I have had 20 GoCheck reviews over the past year in my most active language pair, so the number of 20-50 GoCheck reviews was based on that (one low score every 1-2 years, which leaves time for growth, which in my opinion is not only the result of time but also of activity/practice :-) ).
Thank you Lara again for passing on our feedback.
Just my ten cents. As of today I have 72 GoCheck scores, going back to 2014. In Sept. 2014 (six years ago) I got a 2.8 evaluation. I wonder, what relevance does it have today?
Hi Gyuri — is that your lowest score ever? If it is, the answer to your question is “none” :)
Your lowest score ever is removed from the calculation of your standard deviation :)
Hi Lara - No, it isn't. The lowest score ever is 1.2 in October 2019, where, for some reason, the whole translation was missing.
I was reflecting on the idea of the time factor raised by Alex: whether a six-year-old score has any relevance to my present (and expected) performance.
Hi Gyuri -- got it! If that's your only lower score out of 72 GoChecks, and you're otherwise always pretty consistent, it really should only affect your standard deviation to the extent that is indeed relevant within your history, i.e. 1 job out of 72, so it wouldn't significantly increase your standard deviation.
If it's one more of several other low-scoring jobs scattered throughout your history, however, it's relevant to the point that it illustrates that very fact and track record, but it won't be affecting your standard deviation in and of itself as a single job. Of course, it does have an effect, but it is not isolated -- meaning that the impact of a single low-scoring job within a history of dozens of GoChecks is limited in this scenario. If there are more low-scoring jobs, though, that's a different story, and each of them will subsequently impact the calculation. Namely, what makes the most impact on the standard deviation calculation is not merely a one-off, but repeated fluctuations.
It is simple enough to check how much that GoCheck score of 2.8 from 6 years ago impacts your current score with Gengo.
You can simply copy/paste your history of GoChecks into a spreadsheet and get the answer in minutes. I was curious too, so I will share my result with everybody. I used all 57 scores from my second most active language pair, removed the worst one, and changed a GoCheck of 10 from 5 years ago to a 2.8 evaluation (your question).
Well, the result is not as limited as suggested by Lara: my standard deviation jumped 0.5 points (it doubled in fact which to me is a significant increase)! This can make a huge difference when you're on shaky ground (around 7)! So I think we have a fair question here for Gengo. Why would an old bad score dating five years back have such an impact on the current evaluation of our work?
Not to mention that this standard deviation is here to stay: a huge number of 10s are needed to bring it down a tiny notch... (and in my case, 200 perfect 10s were needed to bring the standard deviation back to where it was).
This formula has been the source of so much frustration. It would be great if our voices could finally get heard to improve it so that it really reflects the current quality of our work.
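For anyone without a spreadsheet to hand, the same kind of experiment can be sketched in Python. The history below is invented (it is not my real data), the population standard deviation is an assumption about how the deviation is computed, and `deviation` is just a helper name, so the exact figures will differ from the ones I quoted.

```python
from statistics import pstdev


def deviation(history):
    """Population std dev of the history with its single lowest score removed."""
    return pstdev(sorted(history)[1:])


# Hypothetical 57-review history; the lowest score (1.0) is the one dropped.
history = [10] * 40 + [9] * 10 + [8] * 6 + [1.0]

# Turn one old 10 into a 2.8, as in the experiment described above.
altered = list(history)
altered[0] = 2.8

base = deviation(history)
bumped = deviation(altered)

# Count how many extra 10s it takes to bring the deviation back down.
extra_tens = 0
padded = list(altered)
while deviation(padded) > base:
    padded.append(10)
    extra_tens += 1

print(round(base, 2), round(bumped, 2), extra_tens)
```

How large the jump is depends heavily on the rest of the history, so it is worth pasting in your own scores rather than relying on this made-up list.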
Thank you for doing the calculation for me ( :-) ) Since my score is 7.8 at the moment, I don't feel in danger.
I also said somewhere that the more tens you have, the higher the deviation. (Since the average is increased by the tens, the figures lower than ten will show a bigger difference from it, hence the final result will be a higher deviation.) Maybe 200 tens would help. You (we) have to provide perfect work for about 10 years or so :-) No problem....
Honestly I gave up arguing, but sometimes I cannot resist saying a word. Regards to all of you
@AlexF -- While I appreciate your experiment, I believe your and Gyuri’s track records are vastly different, and therefore inserting his score into your own history will show the fluctuation relative to your usual scores, not to his :)