Abstract—Machine translators have become increasingly popular
and have achieved reliable results since the advent of deep learning.
English-Vietnamese machine translation (MT) still has
limitations because Vietnamese contains words with many
different meanings, which lowers the accuracy of
automatic MT systems. Our study applied a Named Entity
Recognition (NER) tool to Vietnamese sentences to determine
the category of words in an English-Vietnamese parallel corpus
of over 900K sentence pairs. We then performed
experiments to assess the effect of NER on English-Vietnamese
MT systems. The results showed that NER had a positive effect
on MT, with average gains of 1.24 Bi-Lingual Evaluation Understudy
(BLEU) points and 1.8 Translation Error Rate (TER)
points compared with data that did not use NER.
Index Terms—English-Vietnamese machine translation,
neural machine translation, named entity recognition,
English-Vietnamese bilingual corpus.
Van-Hai Vu and Cheol-Young Ock are with the University of Ulsan,
Ulsan, Republic of Korea (corresponding author: Cheol-Young Ock; e-mail:
haivv279@gmail.com, okcy@ulsan.ac.kr).
Quang-Phuoc Nguyen was with FPT Korea, Seoul, Republic of Korea
(corresponding author; e-mail: phuocnq@fsoft.com.vn).
Pum-Mo Ryu is with the Busan University of Foreign Studies, Busan,
Republic of Korea (e-mail: 20156029@bufs.ac.kr).
Cite: Van-Hai Vu, Quang-Phuoc Nguyen, Pum-Mo Ryu, and Cheol-Young Ock, "Effect of Named Entity Recognition on English-Vietnamese Neural Machine Translation," International Journal of Machine Learning and Computing, vol. 12, no. 2, pp. 51-56, 2022.
Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).