Abstract—Textual reasoning and abstraction, both of which take in a long text and generate a short digest, are widely studied in the field of Natural Language Processing (NLP).
See et al. pioneered the Seq2Seq and Pointer Generation Network architecture to address the summarisation task. Later, the Transformer model, a successor of Seq2Seq, was developed.
However, research on the impact of word frequency on text-generation tasks remains inadequate. In this paper, we propose two methods to evaluate the effect of word frequency: Smooth Embedding and Word Sampling. The experiments show an improvement in performance from Smooth Embedding. By contrast, Word Sampling fails to meet our expectations: it increases sensitivity to noise, which is a symptom of over-fitting.
Index Terms—Smooth embedding, text generation, sequence to sequence model, pointer generation network, attention mechanism.
Meiwei Zhang and Yichang Wu are with Athlone Institute of Technology, Athlone, Ireland (e-mail: zhangmwchris@gmail.com, w.yichang@research.ait.ie).
Yihui Pang is with Chinese Academy of Agricultural Science, Beijing, China (e-mail: YihuiPang_543@outlook.com).
Hao Wen and Jiaxin Wang are with the University of Electronic Science and Technology of China, Chengdu, China (e-mail: wh2015uestc@gmail.com, wangjiaxin966@gmail.com).
Cite: Meiwei Zhang, Yichang Wu, Yihui Pang, Hao Wen, and Jiaxin Wang, "Smooth Embedding and Word Sampling Research Based on Transformer Pointer Generation Network," International Journal of Machine Learning and Computing, vol. 11, no. 3, pp. 262-266, 2021.
Copyright © 2021 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).