The last untranslatable topics


In last month’s translation industry updates, we wrote about what makes certain elements of a language particularly difficult to translate. This month, we’ll be looking at even more translation challenges, and we’ll also be thinking about some of the ways in which natural human speech could potentially create obstacles for automated translation. Read on to discover more. 


Novel information


One obvious limitation of any AI solution, and one that is often raised in relation to the new generative AI tools, is that it cannot reliably handle what it has not seen before. This makes it poorly suited to creating up-to-the-minute content about rapidly evolving industries and topics, particularly when this means handling industry-specific terminology such as abbreviations that are specific to certain companies (and most companies tend to have a lot of these).


Any kind of generative AI will work best on topics for which there is a large volume of training data available, which means that niche fields are more difficult to cover without human involvement. This reliance on previously generated content also accounts for some of the much-discussed potential for bias in AI output, since models are typically trained on scenarios that have already been widely written about, usually in widely spoken European languages such as English.


Non-standard ways of writing


Typing errors and poor punctuation can in some cases be real stumbling blocks even for human readers and translators. A missing or misplaced comma can change the entire meaning of a sentence, and the resulting ambiguity is carried over if the sentence is translated uncritically. Online content and much of the text created for mass communication also tend to follow slightly different grammar and stylistic rules from more professional content (with one example in English being the current social media style of writing sentences with all of the verbs in the -ing form, such as ‘Everybody fixing their bodies, nobody fixing their souls’). Whether, or how, to translate this type of non-standard speech into a completely different language is tricky, which is why user-generated content such as social media commentary can sometimes be harder to translate than topics that appear more complex.


Ambiguous capitalisation and names


Lack of capitalisation can also make sentences less readable and translatable even for humans, particularly when it comes to names of people and places that actually mean something in their language of origin, such as Brazilian place names like ‘Belo Horizonte’ (‘Beautiful Horizon’) or ‘Minas Gerais’ (‘General Mines’). If you capitalise ‘Sonic Youth’ then it’s obvious that you’re talking about an American rock band, but without the capitalisation, or without sufficient context or knowledge, it could be read as a slightly more poetic way of referring to today’s actual youth. Names that can refer to both people and places (such as the woman’s name ‘India’ in English) or to both people and months or seasons (such as ‘Avril’ in French) can also create confusion. So can the fact that many English words can function as either nouns or verbs, which makes them difficult to interpret in a fragmented sentence that lacks proper context.


Any kind of culturally or geographically specific information


Depending on your audience, it might be well known that ‘Scotland Yard’ refers to the London Metropolitan Police, but it’s clearly not possible to translate the name literally into another language, and leaving the name in English without any additional explanation might not help an audience that isn’t aware of the significance of the term. The same question of deeper meaning is raised whenever the name of a capital city is used as shorthand for a country’s government, such as saying ‘Ankara decided’ to mean that ‘the government of Turkey decided’.


References to smaller organisations, such as local football teams that supporters commonly call by nicknames or other alternative expressions, can also be tricky to pick up on and render in another language. And while many major global institutions such as the United Nations have widely used equivalent abbreviations, such as ‘ONU’ in Spanish, the name of, say, a local police force or military command might resist easy translation. This then raises the question of how to handle it for a completely different target audience.


Pop culture references that are source language specific


References to local advertising slogans, particularly ones that have passed into popular speech, are difficult to pick up on without having the cultural knowledge to recognise them, or indeed without coming from a generation that remembers them. For example, in British English it’s not uncommon to say ‘I’m like Marmite’ to indicate that you’re similar to the opinion-splitting yeast-based spread in the sense that people either love you or they hate you, as the iconic advertising slogan ran. But those three words would mean virtually nothing without that knowledge.


Song lyrics or catchphrases from TV shows are also prime examples of text that has an additional significance over and above its literal content, and in some cases these can influence the way in which a substantial proportion of a population speaks, as was the case with some of the phrases that the TV series ‘Friends’ spawned in the nineties. Yet the cultural resonance of such expressions might be lost if they were translated word-for-word. A related question is how to handle foreign words that are used for stylistic reasons, or because they reflect a word or a concept that’s more resonant in its original language, such as the culinary term ‘al dente’, which is now widely used outside of Italian.


Potentially offensive or inappropriate content 


Our final example, and one which we touched on in last month’s article, is the challenge of translating idiomatic content and speech that is potentially sensitive. For example, euphemisms by their very nature tend to be quite opaque because they are designed to soften difficult topics, with one example being the use of the everyday verb ‘to lose’ as a polite way of saying ‘to be bereaved of’. This means that in English there is more than one way of understanding the sentence ‘Sadly, we lost our beloved dog’: the dog may have died, or it may simply have gone missing.


On a similar note, language is often at its most nuanced and tricky when it relates to potentially sensitive topics. What is deemed to be politically correct terminology tends to change quite quickly, and the penalties for using outdated or inappropriate language are often very high. All of this means that it’s important to exercise appropriate judgement to capture the different layers of meaning in a statement, and this can often create challenges for any translator.


We hope that in this article we’ve succeeded in covering some of the interesting nuances of language that can thwart easy translation solutions. In the coming months’ updates, we’ll be thinking some more about how language influences technology and vice versa. 


Stay tuned for our next update, and until the next time! 






