基于神经网络学习的统计机器翻译研究

Neural Network Learning for Statistical Machine Translation

【Author】 Yang Nan (杨南)

【Advisor】 Yu Nenghai (俞能海)

【Author Information】 University of Science and Technology of China, Signal and Information Processing, 2014, PhD

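【Summary】 In recent years, research on statistical machine translation (SMT) has flourished, and translation quality has improved considerably. However, SMT research has also run into difficulties such as insufficient bilingual data and the lack of effective feature representations. These hinder further improvement of key modules such as word alignment, reordering, and translation modeling, and translation quality remains unsatisfactory. Meanwhile, deep learning, as a new machine learning method, can automatically learn abstract feature representations and build complex mappings between input and output signals, offering new ideas for SMT research. This doctoral thesis explores how to use deep neural networks to learn, for the key problems in SMT, representations that better describe translation phenomena, and thereby improve SMT performance. Specifically, the main work and contributions of this thesis are as follows:
· A word alignment method based on deep neural networks. Our model combines a multilayer neural network with an undirected probabilistic graphical model, effectively exploiting lexical similarity and context information to model word alignment more accurately. We examine semi-supervised and unsupervised training methods on monolingual data and bilingual parallel corpora. Large-scale Chinese-to-English word alignment experiments show that the proposed model significantly improves alignment quality over the baseline system.
· A neural-network-based pre-reordering model for SMT. The method uses neural network dimensionality reduction to learn low-dimensional vector representations of arbitrary reordering features from unlabeled data. A multilayer neural network then combines these low-dimensional feature representations with other features and integrates them into a linear-ordering reordering model. Chinese-to-English and Japanese-to-English machine translation experiments show that, compared with the baseline system, the proposed pre-reordering model significantly improves translation system performance.
· A new recursive recurrent neural network for modeling the translation decoding process. The recursive recurrent neural network combines recursive neural networks and recurrent neural networks. It can not only use global features to characterize translation correspondences, but also dynamically generate abstract representations of the translation derivation tree during decoding. We apply this model to the machine translation decoding process and propose a three-step semi-supervised training method for it. In addition, we explore representations of translation phrase pairs and propose a phrase pair representation based on translation confidence. Chinese-to-English translation evaluation experiments show that the method yields a clear improvement in translation performance.
This thesis has investigated the use of neural network learning to improve three main aspects of statistical machine translation. For each specific problem, we designed a dedicated neural network structure and learned task-specific abstract representations of the relevant features. In future research, we hope to consolidate these abstract representations and use neural network and SMT techniques to explore a universal language representation that can help other natural language processing tasks.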

【Abstract】 Research on statistical machine translation (SMT) has grown rapidly in recent years, leading to substantial improvements in translation quality. However, the limited amount of bilingual training data, together with the lack of effective features, has impeded further progress in key components such as word alignment, reordering, and translation modeling. Meanwhile, deep learning, as an emerging machine learning method, can automatically extract abstract feature representations and model complex mappings between input and output signals. This powerful technique opens up new avenues for SMT research. In this thesis, we explore how to leverage deep neural networks to learn better representations for translation modeling. Specifically, this work consists of the following three parts:
● We propose a new deep neural network for word alignment modeling. We combine a multilayer neural network with an undirected probabilistic graphical model, modeling word alignment more accurately by automatically exploiting lexical similarity and context similarity. We explore both semi-supervised and unsupervised training methods for the word alignment model. Large-scale experiments on a Chinese-English alignment task confirm the effectiveness of our method.
● We propose a neural-network-based reordering model for SMT. Using a neural-network-based dimension reduction technique, we learn low-dimensional embeddings for arbitrary reordering features; through a multilayer network, these feature embeddings are integrated with word embedding features into a linear-ordering reordering model. Experiments on Chinese-English and Japanese-English translation show that the proposed method significantly improves over strong baseline systems.
● We propose a new network structure, the recursive recurrent neural network, for translation modeling. The recursive recurrent neural network combines the strengths of recursive and recurrent neural networks: it can not only leverage arbitrary global features, but also dynamically generate abstract representations of the translation derivation tree. We apply this model to translation decoding for SMT and propose a three-step training method for it. Furthermore, we investigate methods for translation pair embedding and propose a method based on translation confidence. Experiments on a Chinese-English translation task show a strong improvement from our method.
In short, this work has investigated neural network learning for three main tasks in statistical machine translation. For each task, we have designed a special neural network structure and learned task-specific feature representations. In the future, we hope to merge all the representations into a unified abstract feature representation by exploiting neural network and SMT resources, and to apply the learned features to other natural language processing tasks.
