After transcreation, is subtitling now stealing the headlines? Is this the new skill where humans can outshine AI? It certainly is a long way from 1:1 translation.
Not every consumer understands Chinese or English, or Korean. Voice-over is one way of translating spoken video or film content into other languages. Subtitling is another. Subtitles are becoming increasingly popular, as they are also useful when people watch videos without sound, perhaps because they don’t have their headphones with them. Whatever the reason, subtitles help to spread content to a wider audience.
There is software that performs subtitling automatically. YouTube and Facebook use such software, at least for English. TV channels are using AI to directly render news bulletins, panel discussions or interviews into other languages, i.e. for interlingual subtitling. Although these tools are quite successful, I would think they are not about to replace human subtitlers for a while.
As this area of linguistic activity is rapidly gaining prominence, everyone should be aware of the rather special requirements and skills that are involved. From where I stand, it is an admirable skill – and the process is a long way from straightforward. Subtitling is in fact a highly specialist form of translation, with certain aspects differing considerably from “regular” translation. Most notable is perhaps the necessity for compacting and reducing, which plays a central role in the process – a form of abstraction which is imposed by temporal and spatial constraints.
Despite these constraints it is essential that the functional aspects be preserved in the subtitles. Message, intention, emotions etc. must remain intact; the effect on the viewer must be the same as if they were listening to the audio track. In addition to this, differences in syntactical structure pose major problems: on the one hand you are trying to “synchronize” your subtitles as closely as possible to what is being said, on the other you must ensure the language is natural and idiomatic, and easy to read, understand – and memorize.
In addition, you must remember that, in contrast to voice-over, where you translate L1 speech into L2 speech, subtitling introduces another translation axis. Perhaps we should call it Transformation, as you are changing from the auditory, i.e. the spoken medium, to the visual, i.e. the written medium. So while your starting point (L1) is oral, your final output is written. And we all know that spoken language is a lot more spontaneous and often more convoluted than written language. I believe that this extra transformation requires an even greater mental effort than “regular” translation.
This is particularly true of course when you are subtitling interactions that have not been rehearsed and refined, i.e. when you are not subtitling a carefully constructed film dialog, but an informal, rather less sophisticated oral event. You will soon realize that many speakers or subject-matter experts do not speak in beautifully formed, logically constructed sentences. They are not priests giving a well-rehearsed Sunday sermon. Instead, they stop and start, forget what they were going to say, leave out a crucial “not”, or use the wrong verb or the wrong tense. There are interruptions and words that “hang in the air”. All too often, the utterances lack coherence, leaving the subtitler baffled.
To the extent possible, such idiosyncrasies will need to be ironed out. At the same time, it is important to realise that subtitles do not obey the same strict rules as written language. They are expected to stay somewhat closer to the spoken language. They aim at easy comprehension and at syntactic structures that are quick to grasp and retain, because, in contrast to the reader of a written text, the video viewer does not have the chance to go back and re-read a sentence or two. Once they have read it, it’s gone!
Equally important is focus. Where is the focus, what is the emphasis? Perhaps you are also familiar with the linguistic terms “theme and rheme”, where theme refers to what is already known, and rheme is what is new, i.e. what the utterance or message is adding to the conversation. With subtitling, this distinction takes on a particular significance: the subtitler has to be aware of it all the time, and know how to handle it in their own language.
I think it is quite clear from the above observations that subtitling requires an extremely good and quick grasp firstly of the spoken source, followed by an equally quick processing of how to turn the extracted meaning into the target, then condensing it. This means filtering out the most pertinent elements of every utterance, and compacting the most relevant elements with as little loss of information (and sentiment) as possible. It requires reformulating complex utterances by breaking them down into smaller, logical units that can be grasped easily, leaving out anything that is pure noise. In other words, it is a form of abstracting.
The real skill here, I would say, is to be able to simplify complex messages and to know exactly what is redundant and what is not. Leaving out too much, or omitting important cues or information, would obviously disrupt the communicative context and defeat the purpose.
As I believe the process involved is rather complex, it might help to break it down into discrete steps, bearing in mind the differences from “regular” translation.
The first step is taking in the context. Unlike regular translation, video subtitling does not just consist of reading a text on paper or on the screen, but of actually playing the video, listening to the speaker/s, considering the setting, etc. Is someone reading out a script or are they talking off-the-cuff? Are they all fluent speakers? Who is the intended audience? Is it a formal, planned event or an informal setting? What’s the speed of the speaker/s? What is the exact communicative setting: one speaker addressing an audience, or a teacher interacting with students? Or a panel of peers?
Depending on what the subtitler is given, they may be doing the transcript and spotting themselves, or they may be working from a “template” or .srt file that already contains the time codes.
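To make the idea of a template with time codes more concrete, here is a minimal Python sketch that reads SubRip-style (.srt) cues and reports how long each one stays on screen. It assumes the common .srt layout (a cue number, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then the text, with a blank line between cues). The names `Cue` and `parse_srt` and the sample timings are hypothetical and only serve as illustration; they do not describe any particular subtitling tool.

```python
import re
from dataclasses import dataclass
from typing import List

# Timing line as commonly found in .srt files: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    """One subtitle cue (hypothetical helper type for this sketch)."""
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str

    @property
    def duration(self) -> float:
        return self.end - self.start


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> List[Cue]:
    """Split the file into blank-line-separated blocks and parse each one."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # not a complete cue
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue  # skip blocks without a valid timing line
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=_seconds(*parts[:4]),
                end=_seconds(*parts[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


# Hypothetical sample: the timings are invented for illustration.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Dann wollen wir anfangen.

2
00:00:03,600 --> 00:00:08,100
Javier, vielleicht führst du uns kurz
durch die Terminologie?
"""

cues = parse_srt(SAMPLE)
for cue in cues:
    print(cue.index, f"{cue.duration:.1f}s", repr(cue.text))
```

The on-screen duration of each cue is what imposes the temporal constraint: whatever the subtitler writes has to be readable within that window.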
The second step then is the actual translation from L1 into L2. Here, you pay special attention to any exclamations, fillers, hesitations, “stutters” (in the sense of unnecessary repetitions) and false starts, with a view to deleting them. Then follows translation proper into a kind of rough target.
The third and final step involves working within the target (L2) only. This is concerned with transforming the spoken feel into an output that resembles written language rather more. It involves reduction, compacting, finding the best and shortest formulation, and refining of stylistic elements, including rhetoric, emphasis, best synonyms, etc.
The 3-column table below shows this process in schematic form. It is simplified of course. Very experienced practitioners will probably skip step 2, and more than one pass might be needed in step 3. But it is meant to illustrate the progression from the source (col. 1) via an in-between translation (col. 2) to the final, condensed target, the subtitle (col. 3). The (slightly abridged) samples are taken from webinars for sales people, and are in my view quite typical of the kind of rambling discussions that have been going on during the pandemic. They are not meant to be the worst examples we have seen; they are simply typical. Nor is my aim to present the perfect, ideal subtitle in column 3, but to give an idea of just how much processing goes into the creation of suitably condensed output.
| Source (spoken English) | Step 2: rough translation (German) | Step 3: condensed subtitle (German) |
|---|---|---|
| So with that, why don’t we go ahead and get started, and for us to do that, Javier, why don’t you start by leading us, giving us some of the main terminology that is used around the cloud and cloud computing so we can all have a general understanding of that terminology. | Also, warum fangen wir nicht gleich an? Damit wir das tun können, Javier, warum führst du uns nicht zunächst in die wichtigsten Begriffe ein, die im Zusammenhang mit der Cloud und dem Cloud Computing verwendet werden, damit wir alle ein allgemeines Verständnis für diese Terminologie haben. (46 w) | Dann wollen wir anfangen. Javier, vielleicht führst du uns kurz durch die Terminologie im Zusammenhang mit Cloud Computing, damit wir ein allgemeines Verständnis der relevanten Begriffe bekommen? (27 w) |
| So, I think what’s interesting is we see a lot of people impressed in cloud and SaaS, so we see customers moving towards it, so maybe let’s discuss that a bit and hear from Peter on what are some of the motivating factors that we see from customers on why your people move on these cloud or SaaS model. | Ich denke, es ist interessant, dass wir eine Menge von Menschen sehen, die von Cloud und SaaS beeindruckt sind, also sehen wir, dass Kunden sich dahin bewegen. Lassen Sie uns das vielleicht ein wenig diskutieren und von Peter hören, was einige der Motivationsfaktoren sind, die wir von Kunden sehen, warum Ihre Leute auf diese Cloud- oder SaaS-Modelle umsteigen. (58 w) | Wir sehen sehr viel Interesse an Cloud und SaaS. Die Kunden bewegen sich in diese Richtung. Deshalb wollen wir das diskutieren und von Peter einige der Gründe und Motive erfahren, weshalb Kunden in die Cloud oder auf SaaS-Modelle wechseln. (39 w) |
| So, I mean I think at the minute customer experience is a very hot topic, and, it’s kind of on everyone’s agenda, everyone’s talking about it and everyone kind of wants it, but I don’t feel like it’s fully defined at the minute, which is why we need to get down to that before we even start. | Ich denke also, im Moment ist die so genannte Kundenerfahrung ein sehr aktuelles Thema, das auf jeder Agenda steht in aller Munde ist, über das jeder spricht und das jeder haben möchte, aber ich habe das Gefühl, dass es derzeit noch nicht vollständig definiert ist, weshalb wir das erst einmal tun müssen, ehe wir anfangen. (55 w) | Heute wollen wir uns also die sogenannte Customer Experience anschauen. Ein derzeit heißes Thema, das alle bewegt. Aber ich meine, es ist noch nicht wirklich definiert, und deshalb sollten wir das erst einmal tun, bevor wir anfangen. (37 w) |
It would appear that during step 3, the main aspect is reduction (see the word counts). This gives some idea of just how much mental effort the subtitler has to put into really understanding what can be left out without losing any relevant information, and why the art of abstracting is extremely useful in this context. I believe it shows that subtitling forces a lot more decisions on the translator and is a long way from a 1:1 translation.
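To put a number on that reduction, the word counts quoted for the three samples (46 to 27, 58 to 39 and 55 to 37 words) can be turned into percentages. The short Python sketch below does just that; the function name `reduction_percent` and the sample labels are my own, chosen for illustration only.

```python
# Word counts as given for the rough translation (step 2)
# and the condensed subtitle (step 3) of the three samples.
samples = [
    ("Sample 1", 46, 27),
    ("Sample 2", 58, 39),
    ("Sample 3", 55, 37),
]


def reduction_percent(rough_words: int, final_words: int) -> float:
    """Share of words removed between the rough and the final version."""
    return (rough_words - final_words) / rough_words * 100


for label, rough, final in samples:
    pct = reduction_percent(rough, final)
    print(f"{label}: {rough} -> {final} words ({pct:.1f}% shorter)")
```

For these three samples the subtitle comes out roughly a third to two fifths shorter than the rough translation (about 41%, 33% and 33%). Word counts are of course only a crude proxy, since the constraints that really matter are reading time and the space available on screen.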