Named Entity Recognition of Electronic Medical Records Based on BERT (基于BERT的医疗电子病历命名实体识别)

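Funding:

Major Project of the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" program (2018AAA0100400); National Natural Science Foundation of China (61702177); Natural Science Foundation of Hunan Province (2018JJ2098, 2020JJ6089); Key Project of the Hunan Provincial Department of Education (19A133)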


    Abstract:

    In named entity recognition for Chinese electronic medical records (EMRs), traditional character or word vectors represent contextual semantics poorly, and traditional RNNs have limited capacity for parallel computation. To address these problems, this paper proposes a BERT-based named entity recognition model for medical EMRs. In the model, the BERT pre-trained language model better represents the contextual semantics of EMR sentences; the iterated dilated convolutional neural network (IDCNN) encodes local entities by convolution, which improves their recognition; and multi-head attention (MHA) repeatedly computes the attention probability between each character and all other characters to capture long-distance dependencies within EMR sentences. Experimental results show that the BERT-IDCNN-MHA-CRF model recognizes medical entities in EMRs well: compared with the baseline model, its precision, recall, and F1 score improve by 1.80%, 0.41%, and 1.11%, respectively.
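As a rough illustration of why stacked dilated convolutions can cover a wider context than plain convolutions while every position is still computed independently (and so in parallel), the sketch below applies a small stack of 1-D dilated convolutions to an impulse and counts how many positions it reaches. This is pure Python and is not the authors' implementation: the kernel width of 3, the dilation schedule (1, 1, 2), and the function names `dilated_conv1d`, `stack`, and `receptive_field` are all assumptions chosen for illustration.

```python
def dilated_conv1d(x, kernel, dilation):
    """1-D convolution with zero padding; the output has the same length as x.

    Each kernel tap is spaced `dilation` positions apart, so a width-3 kernel
    with dilation 2 looks at positions i-2, i, and i+2.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j, w in enumerate(kernel):
            pos = i + (j - center) * dilation
            if 0 <= pos < len(x):  # positions outside the sequence count as zero
                total += w * x[pos]
        out.append(total)
    return out


def stack(x, dilations, kernel=(1.0, 1.0, 1.0)):
    """Apply one dilated convolution per entry in `dilations`, in order."""
    for d in dilations:
        x = dilated_conv1d(x, kernel, d)
    return x


def receptive_field(dilations, kernel_width=3):
    """Number of input positions that can influence one output position."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


if __name__ == "__main__":
    # A single non-zero value in the middle of an 11-position sequence.
    impulse = [0.0] * 5 + [1.0] + [0.0] * 5
    out = stack(impulse, (1, 1, 2))
    reached = sum(1 for v in out if v != 0)
    print("positions reached:", reached)
    print("receptive field:", receptive_field((1, 1, 2)))
```

Under these assumptions, three undilated layers of width 3 would cover 7 positions, while the (1, 1, 2) schedule covers 9 with the same number of layers and weights; larger dilations widen the span further.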

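Cite this article

Liang Wentong (梁文桐), Zhu Yanhui (朱艳辉), Zhan Fei (詹飞), Ji Xiangbing (冀相冰). Named Entity Recognition of Electronic Medical Records Based on BERT [J]. Journal of Hunan University of Technology (湖南工业大学学报), 2020, 34(4): 54-62.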

History
  • Received: 2019-10-09
  • Published online: 2020-07-10