IJMLC 2022 Vol.12(2): 51-56 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2022.12.2.1078

Effect of Named Entity Recognition on English-Vietnamese Neural Machine Translation

Van-Hai Vu, Quang-Phuoc Nguyen, Pum-Mo Ryu, and Cheol-Young Ock

Abstract—Machine translation systems have become increasingly popular and have achieved reliable results since the advent of deep learning. English-Vietnamese machine translation (MT) still has limitations because Vietnamese contains many words with multiple meanings, which lowers the accuracy of automatic MT systems. Our study applied a Named Entity Recognition (NER) tool to Vietnamese sentences to determine the category of words in an English-Vietnamese parallel corpus of over 900K sentence pairs. We then performed experiments to assess the effect of NER on English-Vietnamese MT systems. The results showed that NER had a positive effect on MT, improving scores by an average of 1.24 Bi-Lingual Evaluation Understudy (BLEU) points and 1.8 Translation Error Rate (TER) points compared with data without NER.

Index Terms—English-Vietnamese machine translation, neural machine translation, named entity recognition, English-Vietnamese bilingual corpus.

Van-Hai Vu and Cheol-Young Ock are with the University of Ulsan, Ulsan, Republic of Korea (corresponding author: Cheol-Young Ock; e-mail: haivv279@gmail.com, okcy@ulsan.ac.kr).
Quang-Phuoc Nguyen was with FPT Korea, Seoul, Republic of Korea (corresponding author; e-mail: phuocnq@fsoft.com.vn).
Pum-Mo Ryu is with the Busan University of Foreign Studies, Busan, Republic of Korea (e-mail: 20156029@bufs.ac.kr).

Cite: Van-Hai Vu, Quang-Phuoc Nguyen, Pum-Mo Ryu, and Cheol-Young Ock, "Effect of Named Entity Recognition on English-Vietnamese Neural Machine Translation," International Journal of Machine Learning and Computing, vol. 12, no. 2, pp. 51-56, 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
