
Implementation of Pre-normalized Transformer in Urdu-English Machine Translation


【Author】 GAO Wei; CHEN Zi-xiang; LI Da-zhou; LI Yao-song

【Affiliation】 School of Computer Science and Technology, Shenyang University of Chemical Technology

【Abstract】 With the rapid development of artificial intelligence technology, neural-network-based machine translation has attracted increasing attention. However, constrained by limited data resources, this approach does not yet translate low-resource languages well. Urdu is widely used as an official language of India and Pakistan, so building a translation model between Urdu and English is of practical significance. Based on the encoder-decoder framework, this paper proposes a pre-normalized Transformer model for Urdu-English machine translation. The model adds a pre-normalization layer to the baseline Transformer, keeping the data distribution consistent across sub-layers while avoiding vanishing gradients. BLEU (Bilingual Evaluation Understudy) is used as the evaluation metric. Experiments show that, trained on a small Urdu-English parallel corpus, the proposed pre-normalized Transformer model achieves good results and improves the BLEU score over the baseline Transformer model.
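The abstract describes the pre-normalization idea only at a high level. As a minimal sketch (not the authors' code), a pre-normalized ("Pre-LN") Transformer encoder layer can be written in PyTorch as follows: LayerNorm is applied before each sub-layer and the residual connection is added afterwards, which keeps the input distribution of every sub-layer consistent and helps avoid vanishing gradients. The class name PreNormEncoderLayer and the hyperparameters (d_model=512, n_heads=8, d_ff=2048, dropout=0.1) are common Transformer-base defaults assumed for illustration, not values taken from the paper.

import torch
import torch.nn as nn


class PreNormEncoderLayer(nn.Module):
    """Illustrative Pre-LN encoder layer: x + Sublayer(LayerNorm(x))."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)   # pre-normalization before self-attention
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)   # pre-normalization before the feed-forward net
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(),
            nn.Dropout(dropout), nn.Linear(d_ff, d_model),
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Sub-layer 1: x + Attention(LayerNorm(x))
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, key_padding_mask=key_padding_mask)
        x = x + self.dropout(attn_out)
        # Sub-layer 2: x + FFN(LayerNorm(x))
        x = x + self.dropout(self.ffn(self.norm2(x)))
        return x


# Quick shape check on a toy batch of 2 sentences, 10 tokens each.
layer = PreNormEncoderLayer()
out = layer(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 512])

The baseline (post-normalized) Transformer would instead compute LayerNorm(x + Sublayer(x)); moving the normalization in front of each sub-layer is the single structural change this sketch illustrates.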
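The abstract names BLEU as the evaluation metric but does not say which implementation was used. A minimal scoring example with the sacrebleu package is sketched below; the package choice and the toy sentences are assumptions made here for illustration only.

import sacrebleu  # assumed scorer; any standard BLEU implementation would serve

# One hypothesis translation and one reference stream (toy data, not from the paper).
hypotheses = ["the weather is nice today"]
references = [["the weather is good today"]]

print(sacrebleu.corpus_bleu(hypotheses, references).score)  # corpus-level BLEU in [0, 100]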

【Fund】 Supported by the Scientific Research Fund of the Education Department of Liaoning Province (LQ2017008), the Science and Technology Research Fund of the Education Department of Liaoning Province (L2016011), and the Doctoral Start-up Fund of Liaoning Province (201601196)
  • 【Source】 Journal of Chinese Computer Systems (小型微型计算机系统), No. 11, 2020
  • 【CLC Number】 TP391.2
  • 【Downloads】 113