A New Approach to the Study of Translationese: Machine-learning the Difference between Original and Translated Text
By: Baroni, Marco; Bernardini, Silvia
Type: Article from Journal - e-Journal
In collection: Literary and Linguistic Computing vol. 21 no. 3 (Sep. 2006), pp. 259-274.
Abstract
In this article we describe an approach to the identification of ‘translationese’ based on monolingual comparable corpora and machine learning techniques for text categorization. The article reports on experiments in which support vector machines (SVMs) are employed to recognize translated text in a corpus of Italian articles from the geopolitical domain. An ensemble of SVMs reaches 86.7% accuracy with 89.3% precision and 83.3% recall on this task. A preliminary analysis of the features used by the SVMs suggests that the distribution of function words and morphosyntactic categories in general, and personal pronouns and adverbs in particular, are among the cues used by the SVMs to perform the discrimination task. A follow-up experiment shows that the performance attained by SVMs is well above the average performance of ten human subjects, including five professional translators, on the same task. Our results offer solid evidence supporting the translationese hypothesis, and our method seems to have promising applications in translation studies and in quantitative style analysis in general. Implications for the machine learning/text categorization community are equally important, both because this is a novel application and especially because we provide explicit evidence that a relatively knowledge-poor machine learning algorithm can outperform human beings in a text classification task.
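As a rough illustration of two ideas in the abstract, the sketch below (Python, standard library only) shows how a text might be reduced to relative frequencies of function words, and how accuracy, precision and recall are computed for the translated-versus-original decision. Everything in it is an assumption made for illustration: the word list FUNCTION_WORDS, the helper names function_word_profile and binary_metrics, and the label counts are hypothetical and are not taken from the article. The counts were chosen only because they round to the same percentages as those reported; the article's actual test-set composition is not stated in this record. No SVM is trained here.

```python
from collections import Counter

# Hypothetical, very small list of Italian function words.
# This is NOT the feature set used in the article.
FUNCTION_WORDS = ("il", "la", "di", "che", "e", "non", "io", "lui", "lei")


def function_word_profile(text, function_words=FUNCTION_WORDS):
    """Return the relative frequency of each function word among all tokens."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in function_words]


def binary_metrics(gold, predicted, positive="translated"):
    """Compute accuracy, precision and recall for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    tn = sum(g != positive and p != positive for g, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Feature vector for a toy sentence (five tokens, "il" occurs twice).
profile = function_word_profile("Il gatto e il cane")

# Hypothetical labels: 30 translated and 30 original texts,
# with 25 translated texts found and 3 originals wrongly flagged.
gold = ["translated"] * 30 + ["original"] * 30
predicted = (
    ["translated"] * 25 + ["original"] * 5      # predictions for translated texts
    + ["translated"] * 3 + ["original"] * 27    # predictions for original texts
)
metrics = binary_metrics(gold, predicted)
rounded = {name: round(100 * value, 1) for name, value in metrics.items()}
print(rounded)
```

In a setup like the one the abstract describes, vectors such as profile would be the input to a classifier, and metrics of this kind would be computed on held-out texts.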