Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel

Funding: Special Innovation Fund of the Guangdong Provincial Department of Education (2015KTSCX033); National Social Science Fund of China (17BGL068)

Abstract:

This paper addresses the automatic recognition of Indonesian compound noun phrases. It proposes a method that combines a neural network using a Self-Attention mechanism and n-gram convolution kernels with a statistical model, with the aim of improving existing multi-word expression extraction models. Building on the existing SHOMA model, the method adds a multi-layer CNN and a Self-Attention mechanism. Comparative experiments on compound noun phrase recognition were carried out on the Indonesian data released by Universal Dependencies. The results show that the TextCNN+Self-Attention+CRF model achieves an F1 of 32.20 for multi-word phrase recognition and an F1 of 32.34 for single-word recognition, which are 4.93% and 3.04% higher, respectively, than those of the SHOMA model.
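The abstract names three components: n-gram convolution kernels, Self-Attention, and a CRF. The following is a minimal sketch of how such a pipeline can be wired together, not the authors' implementation. The toy dimensions, the random weights, the filter widths (2 and 3), the identity projections in the attention step (Q = K = V), the three-tag label set, and all function names are assumptions made for illustration; the CRF is reduced to Viterbi decoding over fixed scores, with no training.

```python
import math
import random


def ngram_conv(x, kernels):
    """1-D convolution over the token axis with zero padding.

    x: list of T embedding vectors, each of length d.
    kernels: list of filters; a filter of width n is a list of n weight
    vectors of length d, so it responds to one n-gram at a time.
    Returns T feature vectors with one ReLU feature per filter.
    """
    T = len(x)
    out = []
    for t in range(T):
        feats = []
        for k in kernels:
            n = len(k)
            left = (n - 1) // 2  # tokens taken to the left of position t
            s = 0.0
            for j in range(n):
                pos = t - left + j
                if 0 <= pos < T:  # positions outside the sentence count as zeros
                    s += sum(w * v for w, v in zip(k[j], x[pos]))
            feats.append(max(0.0, s))
        out.append(feats)
    return out


def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    z = sum(e)
    return [v / z for v in e]


def self_attention(h):
    """Scaled dot-product self-attention with Q = K = V = h (an assumption)."""
    d = len(h[0])
    out, weights = [], []
    for q in h:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in h]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(w[i] * h[i][c] for i in range(len(h))) for c in range(d)])
    return out, weights


def viterbi(emissions, transitions):
    """Best tag sequence under per-token emission and tag-to-tag transition scores."""
    n_tags = len(transitions)
    score = list(emissions[0])
    back = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            best = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best)
            new_score.append(score[best] + transitions[best][j] + emissions[t][j])
        score = new_score
        back.append(ptr)
    tag = max(range(n_tags), key=lambda j: score[j])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


# Toy run: 5 tokens, 4-dimensional embeddings, 3 tags (all hypothetical).
random.seed(0)
T, d, n_tags = 5, 4, 3
x = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
kernels = [[[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
           for n in (2, 2, 3, 3)]  # two bigram and two trigram filters
h = ngram_conv(x, kernels)
ctx, attn = self_attention(h)
W = [[random.uniform(-1, 1) for _ in range(n_tags)] for _ in range(len(kernels))]
emissions = [[sum(c[f] * W[f][k] for f in range(len(kernels))) for k in range(n_tags)]
             for c in ctx]
transitions = [[0.0] * n_tags for _ in range(n_tags)]  # untrained: no tag-order preference
tags = viterbi(emissions, transitions)
print(tags)
```

In a trained model the embeddings, filters, output projection, and transition scores would all be learned, and the transition matrix is what lets the CRF layer penalize invalid tag sequences rather than scoring each token independently.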

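Cite this article

QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao, XIAO Lixian (丘心颖, 陈汉武, 陈源, 谭立聪, 张皓, 肖莉娴). Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel [J]. Journal of Hunan University of Technology (湖南工业大学学报), 2020, 34(3): 1-9.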

History

Received: 2020-03-29
Published online: 2020-05-26
