Editor's note: The Eighth Asia-Pacific Translation and Interpreting Forum (APTIF) will be held in Xi'an, China on June 17-18, 2016. The theme of this year's conference is "Translation and Interpreting in Tomorrow's Asia-Pacific Region." The following is a summary of the research paper submitted by May Li of Tongji University.
Title: How far can post-editing go in MT?
Abstract:
At a time when big data has become a buzzword in our lives, machine translation (MT) seems better known as CAT, which is mostly data-driven. Statistics-based translation memory (TM) has indeed greatly improved translation quality through data analysis and deep learning. That is why free online translation engines such as Google Translate, Bing Translate, Baidu Translate and Youdao Translate have gained increasing popularity among language users. For the time being, rule-based MT seems to have lost ground to statistics-based TM, because the latter appears to produce better-quality translation. However, no matter how "good" the quality is, post-editing is almost inevitable for all serious translation, which in turn dilutes the effectiveness of "cloud"-supported TM. To improve the quality of rule-based MT, an alternative approach was proposed: automating the post-editing process instead of the conventional way of working on the MT system itself. Drawing on five years of research on post-editing along these lines, this paper presents a linguistic effort to achieve efficiency in rule-based MT. A corpus of 100,000 English-Chinese sentence pairs was analyzed in an attempt to identify error patterns through a comparison between MT output and human translation. Statistical analyses of the error patterns reveal that an overwhelming majority of machine errors occur at the lexical level, while far fewer occur at the syntactic level. If rules from syntactic theory are integrated into the development of post-editing automation software, rule-based MT may significantly improve in quality, reduce labor costs and enhance efficiency.
About the author:
May Li, Professor, School of Foreign Languages, Tongji University
Research interests: machine translation, syntax