The book series centres around human and machine translation, with a special emphasis on empirical studies. This includes computational, corpus linguistic and cognitive aspects of translation.
By its nature, the topic of translation is interdisciplinary in the sense that it involves many of the classical linguistic sub-disciplines such as computational linguistics, corpus linguistics, morphology, syntax, semantics, pragmatics, text linguistics, lexicography, psycholinguistics, neurolinguistics, applied linguistics and others. However, all book submissions need to have a clear focus on the translation aspect, and a special emphasis is laid on empirical studies. The aim of the book series is to bring these different perspectives closer together by offering a forum for all different approaches to the empirical study of translation. The series welcomes in particular studies investigating corpus data and/or experimental findings, preferably in – but not limited to – a quantitative perspective. Possible topics are:
The purpose of this book is to showcase a diverse set of directions in empirical research on mediated discourse, reflecting on the state of the art and the increasing intersection between Corpus-based Interpreting Studies (CBIS) and Corpus-based Translation Studies (CBTS). Undeniably, data from the European Parliament (EP) offer a great opportunity for such research. Not only does the institution provide a sizeable sample of oral debates held at the EP together with their simultaneous interpretations into all languages of the European Union; it also makes available written verbatim reports of the original speeches, which used to be translated. From a methodological perspective, EP materials thus guarantee a great degree of homogeneity, which is particularly valuable in corpus studies, where data comparability is frequently a challenge.
In this volume, progress is visible in both CBIS and CBTS. In interpreting, it manifests itself notably in the availability of comprehensive transcription, annotation and alignment systems. In translation, datasets are becoming substantially richer in metadata, which allow for increasingly refined multi-factorial analysis. At the crossroads between the two fields, intermodal investigations bring to the fore what these mediation modes have in common and how they differ. The volume is thus aimed in particular at Interpreting and Translation scholars looking for new descriptive insights and methodological approaches in the investigation of mediated discourse, but it may also be of interest to (corpus) linguists analysing parliamentary discourse in general.
Examining the general impact of Controlled Language rules in the context of Machine Translation has been an area of research for many years. The present study focuses on the following question: How do the Controlled Language (CL) rules impact the Machine Translation (MT) output individually? Analyzing a German corpus-based test suite of technical texts that have been translated into English by different MT systems, the study endeavors to answer this question at different levels: the general impact of CL rules (rule- and system-independent), their impact at rule level (system-independent), their impact at system level (rule-independent), and at rule and system level. The results of five MT systems (a rule-based system, a statistical system, two differently constructed hybrid systems, and a neural system) are analyzed and contrasted. For this, a mixed-methods triangulation approach that includes error annotation, human evaluation, and automatic evaluation was applied. The data were analyzed both qualitatively and quantitatively based on the following parameters: number and type of MT errors, style and content quality, and scores from two automatic evaluation metrics. In line with many studies, the results show a general positive impact of the applied CL rules on the MT output. However, at rule level, only four rules proved to have positive effects on all parameters; three rules had negative effects on the parameters; and two rules did not show any significant impact. At rule and system level, the rules affected the MT systems differently, as expected. Some rules that had a positive impact on earlier MT approaches did not show the same impact on the neural MT approach.
Furthermore, the neural MT system delivered distinctly better results than earlier MT approaches, namely the highest error-free, style and content quality rates both before and after the application of the rules. This indicates that neural MT offers a promising solution that no longer requires CL rules for improving the MT output, which in turn allows for a more natural style.
Language learning and translation have always been complementary pillars of multilingualism in the European Union. Both have been affected by the increasing availability of machine translation (MT): language learners now make use of free online MT to help them both understand and produce texts in a second language, but there are fears that uninformed use of the technology could undermine effective language learning. At the same time, MT is promoted as a technology that will change the face of professional translation, but the technical opacity of contemporary approaches, and the legal and ethical issues they raise, can make the participation of human translators in contemporary MT workflows particularly complicated. Against this background, this book attempts to promote teaching and learning about MT among a broad range of readers, including language learners, language teachers, trainee translators, translation teachers, and professional translators. It presents a rationale for learning about MT, and provides both a basic introduction to contemporary machine-learning based MT, and a more advanced discussion of neural MT. It explores the ethical issues that increased use of MT raises, and provides advice on its application in language learning. It also shows how users can make the most of MT through pre-editing, post-editing and customization of the technology.
Synopsis: This study is devoted to information integration in machine-translated, multilingual text chats, using the Skype Translator in the language pair Catalan-German as an example. Text chats of this configuration have so far received little research attention. The study therefore pursues an initially exploratory research question: how do people perceive machine-translated text chat communication when they do not speak their interlocutor's language? This goes hand in hand with an investigation of information extraction and processing between messages written in one's own language and the machine translation output.
To capture usage behaviour with Skype and the Skype Translator, an online survey was sent to students throughout Germany. In a two-part, naturalistically oriented pilot study using an eye tracker, the communication behaviour of students with German as their native language was examined, on the one hand, in chats with native speakers of Catalan that were machine-translated by the Skype Translator and, on the other hand, as a reference, in monolingual, German-only chats without the Skype Translator. The participants in these studies formed two independent groups. Both groups also completed questionnaires on their usage behaviour and on their impressions of the Skype Translator.
Surely the most surprising result of the study is that the participants spend a substantial part of the chat communication on the MT output in both languages involved. The analysis of saccades and regressions points to abrupt switching between the original message and the MT. The focus of attention lies consistently on the most recent messages. It can therefore be assumed that the participants actively incorporate the MT output into the communication and try to compare essential information between the original and the MT.
Artificial intelligence is changing and will continue to change the world we live in. These changes are also influencing the translation market. Machine translation (MT) systems automatically translate from one language into another within seconds. However, MT systems are very often still not capable of producing perfect translations. To achieve high-quality translations, the MT output first has to be corrected by a professional translator. This procedure is called post-editing (PE). PE has become an established task on the professional translation market. The aim of this textbook is to provide basic knowledge about the most relevant topics in professional PE. The textbook comprises ten chapters on both theoretical and practical aspects including topics like MT approaches and development, guidelines, integration into CAT tools, risks in PE, data security, practical decisions in the PE process, competences for PE, and new job profiles.
Cognitive aspects of the translation process have become central in Translation and Interpreting Studies in recent years, further establishing the field of Cognitive Translatology. Empirical and interdisciplinary studies investigating translation and interpreting processes promise a hitherto unprecedented predictive and explanatory power. This collection contains such studies which observe behaviour during translation and interpreting. The contributions cover a vast area and investigate behaviour during translation and interpreting – with a focus on training of future professionals, on language processing more generally, on the role of technology in the practice of translation and interpreting, on translation of multimodal media texts, on aspects of ergonomics and usability, on emotions, self-concept and psychological factors, and finally also on revision and post-editing. For the present publication, we selected a number of contributions presented at the Second International Congress on Translation, Interpreting and Cognition hosted by the Tra&Co Lab at the Johannes Gutenberg University of Mainz.
The present volume seeks to contribute studies to the subfield of Empirical Translation Studies, thereby helping to extend its reach within translation studies, to make our discipline more rigorous and to foster a reproducible research culture. The Translation in Transition conference series, across its editions in Copenhagen (2013), Germersheim (2015) and Ghent (2017), has been a major meeting point for scholars working with these aims in mind, and the conference in Barcelona (2019) has continued this tradition of expanding the sub-field of empirical translation studies to other paradigms within translation studies. This book is a collection of selected papers presented at that fourth Translation in Transition conference, held at the Universitat Pompeu Fabra in Barcelona on 19–20 September 2019.
Although the notion of meaning has always been at the core of translation, the invariance of meaning has, partly due to practical constraints, rarely been challenged in Corpus-based Translation Studies. In answer to this, the aim of this book is to question the invariance of meaning in translated texts: if translation scholars agree on the fact that translated language is different from non-translated language with respect to a number of grammatical and lexical aspects, would it be possible to identify differences between translated and non-translated language on the semantic level too? More specifically, this book tries to formulate an answer to the following three questions: (i) how can semantic differences in translated vs non-translated language be investigated in a corpus-based study?, (ii) are there any differences on the semantic level between translated and non-translated language? and (iii) if there are differences on the semantic level, can we ascribe them to any of the (universal) tendencies of translation? In this book, I establish a way to visually explore semantic similarity on the basis of representations of translated and non-translated semantic fields. A technique for the comparison of semantic fields of translated and non-translated language called SMM++ (based on Helge Dyvik’s Semantic Mirrors method) is developed, yielding statistics-based visualizations of semantic fields. The SMM++ is presented via the case of inchoativity in Dutch (beginnen [to begin]). By comparing the visualizations of the semantic fields on different levels (translated Dutch with French as a source language, with English as a source language and non-translated Dutch) I further explore whether the differences between translated and non-translated fields of inchoativity in Dutch can be linked to any of the well-known universals of translation.
The main results of this study are explained on the basis of two cognitively inspired frameworks: Halverson’s Gravitational Pull Hypothesis and Paradis’ neurolinguistic theory of bilingualism.
Companies and organisations are increasingly using machine translation to improve efficiency and cost-effectiveness, and then edit the machine translated output to create a fluent text that adheres to given text conventions. This procedure is known as post-editing.
Translation and post-editing can often be categorised as problem-solving activities. When the translation of a source text unit is not immediately obvious to the translator, or in other words, if there is a hurdle between the source item and the target item, the translation process can be considered problematic. Conversely, if there is no hurdle between the source and target texts, the translation process can be considered a task-solving activity and not a problem-solving activity.
This study investigates whether machine translated output influences problem-solving effort in internet research, syntax, and other problem indicators and whether the effort can be linked to expertise. A total of 24 translators (twelve professionals and twelve semi-professionals) produced translations from scratch from English into German, and (monolingually) post-edited machine translation output for this study. The study is part of the CRITT TPR-DB database. The translation and (monolingual) post-editing sessions were recorded with an eye-tracker and a keylogging program. The participants were all given the same six texts (two texts per task).
Different approaches were used to identify problematic translation units. First, internet research behaviour was considered, as research is a distinct indicator of problematic translation units. Then, the focus was placed on syntactical structures in the MT output that do not adhere to the rules of the target language, as I assumed that they would cause problems in the (monolingual) post-editing tasks that would not occur in the translation from scratch task. Finally, problem indicators were identified via different parameters like Munit, which indicates how often the participants created and modified one translation unit, or the inefficiency (InEff) value of translation units, i.e. the number of produced and deleted tokens divided by the final length of the translation. In closing, the study highlights how these parameters can be used to identify problems in the translation process data using mere keylogging data.
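The inefficiency (InEff) value described above can be illustrated with a minimal sketch. It is not the study's or the CRITT TPR-DB's implementation: the function name, the parameter names and the sample counts are hypothetical, and the sketch assumes that "produced and deleted tokens" means the sum of the two counts and that the final length is measured in the same unit.

```python
def inefficiency(produced_tokens: int, deleted_tokens: int, final_length: int) -> float:
    """Illustrative InEff value for one translation unit.

    Assumption (not confirmed by the source): produced and deleted tokens
    are summed, then divided by the final length of the translation.
    """
    if final_length <= 0:
        raise ValueError("final_length must be positive")
    return (produced_tokens + deleted_tokens) / final_length


# Hypothetical keylogging counts: (produced, deleted, final length).
units = {
    "unit_a": (5, 0, 5),   # typed once, nothing deleted
    "unit_b": (12, 6, 6),  # heavily revised before the final version
}

for name, (produced, deleted, length) in units.items():
    print(name, inefficiency(produced, deleted, length))
```

Under these assumptions, a unit typed once without deletions yields a value of 1, and higher values point to more revision effort relative to the final text.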
Unlike in other professions, the impact of information and communication technology on interpreting has been moderate so far. However, recent advances in the areas of remote, computer-assisted, and, most recently, machine interpreting, are gaining the interest of both researchers and practitioners. This volume aims at exploring key issues, approaches and challenges to the interplay of interpreting and technology, an area that is still underrepresented in the field of Interpreting Studies. The contributions to this volume cover topics in the area of computer-assisted and remote interpreting, both in the conference and in the court setting, and report on experimental studies.
This text is a practical guide for linguists and programmers who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection between the Unicode Standard and the International Phonetic Alphabet. Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with a consistent computational architecture needed to process, publish and analyze lexical data from the world's languages. Thus we bring to light common, but not always transparent, pitfalls which researchers face when working with Unicode and IPA. Having identified and overcome these pitfalls involved in making writing systems and character encodings syntactically and semantically interoperable (to the extent that they can be), we created a suite of open-source Python and R tools to work with languages using orthography profiles that describe author- or document-specific orthographic conventions. In this cookbook we describe a formal specification of orthography profiles and provide recipes using open source tools to show how users can segment text, analyze it, identify errors, and transform it into different written forms for comparative linguistics research.
This volume of the series “Translation and Multilingual Natural Language Processing” includes most of the papers presented at the Workshop “Language Technology for a Multilingual Europe”, held at the University of Hamburg on September 27, 2011 in the framework of the conference GSCL 2011 with the topic “Multilingual Resources and Multilingual Applications”, along with several additional contributions. In addition to an overview article on Machine Translation and two contributions on the European initiatives META-NET and Multilingual Web, the volume includes six full research articles. Our intention with this workshop was to bring together various groups concerned with the umbrella topics of multilingualism and language technology, especially multilingual technologies. This encompassed, on the one hand, representatives from research and development in the field of language technologies, and, on the other hand, users from diverse areas such as, among others, industry, administration and funding agencies. The Workshop “Language Technology for a Multilingual Europe” was co-organised by the two GSCL working groups “Text Technology” and “Machine Translation” (http://gscl.info) as well as by META-NET (http://www.meta-net.eu).
Eyetracking has become a powerful tool in scientific research and has finally found its way into disciplines such as applied linguistics and translation studies, paving the way for new insights and challenges in these fields. The aim of the first International Conference on Eyetracking and Applied Linguistics (ICEAL) was to bring together researchers who use eyetracking to empirically answer their research questions. It was intended to bridge the gaps between applied linguistics, translation studies, cognitive science and computational linguistics on the one hand and to further encourage innovative research methodologies and data triangulation on the other hand. These challenges are also addressed in this proceedings volume: While the studies described in the volume deal with a wide range of topics, they all agree on eyetracking as an appropriate methodology in empirical research.
Contrastive Linguistics (CL), Translation Studies (TS) and Machine Translation (MT) have common ground: They all work at the crossroads where two or more languages meet. Despite their inherent relatedness, methodological exchange between the three disciplines is rare. This special issue touches upon areas where the three fields converge. It results directly from a workshop at the 2011 German Association for Language Technology and Computational Linguistics (GSCL) conference in Hamburg where researchers from the three fields presented and discussed their interdisciplinary work. While the studies contained in this volume draw from a wide variety of objectives and methods, and various areas of overlap between CL, TS and MT are addressed, the volume is by no means exhaustive with regard to this topic. Further cross-fertilisation is not only desirable, but almost mandatory in order to tackle future tasks and endeavours, and this volume is committed to bringing these three fields even closer together.
Historically a dubbing country, Germany is not well-known for subtitled productions. But while dubbing is predominant in Germany, more and more German viewers prefer original and subtitled versions of their favourite shows and films. Conventional subtitling, however, can be seen as a strong intrusion into the original image that can not only disrupt but also destroy the director’s intended shot composition and focus points. Long eye movements between focus points and subtitles decrease the viewer’s information intake, and especially German audiences, who are often not used to subtitles, seem to prefer to wait for the next subtitle instead of looking back up again. Furthermore, not only the placement, but also the overall design of conventional subtitles can disturb the image composition – for instance titles with a weak contrast, inappropriate typeface or irritating colour system. So should it not, despite the translation process, be possible to preserve both image and sound as far as possible, especially given today’s numerous artistic and technical possibilities and the huge amount of work that goes into the visual aspects of a film, taking into account not only special effects, but also typefaces, opening credits and text-image compositions? A further development of existing subtitling guidelines would express respect not only towards the original film version but also towards the translator’s work. The presented study shows how integrated titles can increase information intake while maintaining the intended image composition and focus points as well as the aesthetics of the shot compositions. During a three-stage experiment, the integrated titles created specifically for this purpose in the documentary “Joining the Dots” by director Pablo Romero-Fresco were analysed with the help of eye movement data from more than 45 participants.
Titles were placed based on the gaze behaviour of English native speakers and then rated by German viewers dependent on a German translation. The results show that a reduction of the distance between intended focus points and titles allows the viewers more time to explore the image and connect the titles to the plot. The integrated titles were rated as more aesthetically pleasing and reading durations were shorter than with conventional subtitles. Based on the analysis of graphic design and filmmaking rules as well as conventional subtitling standards, a first workflow and set of placement strategies for integrated titles were created in order to allow a more respectful handling of film material as well as the preservation of the original image composition and typographic film identity.
The contributions to this volume investigate relations of cohesion and coherence as well as instantiations of discourse phenomena and their interaction with information structure in multilingual contexts. Some contributions concentrate on procedures to analyze cohesion and coherence from a corpus-linguistic perspective. Others have a particular focus on textual cohesion in parallel corpora that include both originals and translated texts. Additionally, the papers in the volume discuss the nature of cohesion and coherence with implications for human and machine translation. The contributors are experts on discourse phenomena and textuality who address these issues from an empirical perspective. The chapters in this volume are grounded in the latest research, making this book useful to both experts of discourse studies and computational linguistics, as well as advanced students with an interest in these disciplines. We hope that this volume will serve as a catalyst to other researchers and will facilitate further advances in the development of cost-effective annotation procedures, the application of statistical techniques for the analysis of linguistic phenomena and the elaboration of new methods for data interpretation in multilingual corpus linguistics and machine translation.
The purpose of this volume is to explore key issues, approaches and challenges to quality in institutional translation by confronting academics’ and practitioners’ perspectives. What the reader will find in this book is an interplay of two approaches: academic contributions providing the conceptual and theoretical background for discussing quality on the one hand, and chapters exploring selected aspects of quality and case studies from both academics and practitioners on the other. Our aim is to present these two approaches as a breeding ground for testing one vis-à-vis the other.
This book studies institutional translation mostly through the lens of the European Union (EU) reality, and, more specifically, of EU institutions and bodies, due to the unprecedented scale of their multilingual operations and the legal and political importance of translation. Thus, it is concerned with the supranational (international) level, deliberately leaving national and other contexts aside. Quality in supranational institutions is explored both in terms of translation processes and their products – the translated texts.
Empirical research is carried out in a cyclic way: approaching a research area bottom-up, data lead to interpretations and ideally to the abstraction of laws, on the basis of which a theory can be derived. Deductive research is based on a theory, on the basis of which hypotheses can be formulated and tested against the background of empirical data. Looking at the state of the art in translation studies, either theories and models are designed or empirical data are collected and interpreted. However, the final step is still lacking: so far, empirical data have not led to the formulation of theories or models, and existing theories and models have not yet been comprehensively tested with empirical methods.
This publication addresses these issues from several perspectives: multi-method product- as well as process-based research may gain insights into translation as well as interpreting phenomena. These phenomena may include cognitive and organizational processes, procedures and strategies, competence and performance, translation properties and universals, etc. Empirical findings about the deeper structures of translation and interpreting will reduce the gap between translation and interpreting practice and model and theory building. Furthermore, the availability of more large-scale empirical testing triggers the development of models and theories concerning translation and interpreting phenomena and behavior based on quantifiable, replicable and transparent data.
Exchange between the translation studies and the computational linguistics communities has traditionally not been very intense. Among other things, this is reflected by the different views on parallel corpora. While computational linguistics does not always strictly pay attention to the translation direction (e.g. when translation rules are extracted from (sub)corpora which actually only consist of translations), translation studies are amongst other things concerned with exactly comparing source and target texts (e.g. to draw conclusions on interference and standardization effects). However, there has recently been more exchange between the two fields – especially when it comes to the annotation of parallel corpora. This special issue brings together the different research perspectives. Its contributions show – from both perspectives – how the communities have come to interact in recent years.
Corpus-based translation studies has become a major paradigm and research methodology and has investigated a wide variety of topics in the last two decades. The contributions to this volume add to the range of corpus-based studies by providing examples of some less explored applications of corpus analysis methods to translation research. They show that the area keeps evolving as it constantly opens up to different frameworks and approaches, from appraisal theory to process-oriented analysis, and encompasses multiple translation settings, including (indirect) literary translation, machine(-assisted) translation and the practical work of professional legal translators. The studies included in the volume also expand the range of corpus applications in terms of the tools used to accomplish the research tasks outlined.