Abstract: Character-level named entity recognition (NER) for Chinese electronic medical records loses sequence information, and existing methods that introduce external lexicon resources suffer from low computational efficiency. To address both problems, a model based on SoftLexicon is proposed. First, each character in the sequence is mapped to a dense vector; next, an external lexicon is used to construct SoftLexicon features for each character, which are added to the corresponding character vector representation; then, these enhanced character representations are fed into Bi-LSTM and CRF layers to obtain the final recognition result. The model effectively captures the features of the sentence sequence and extracts the dependencies between contexts, so that each label is predicted consistently with its neighbouring labels. With the electronic medical record data provided by the CCKS-2020 medical named entity recognition evaluation task as the experimental data set, the proposed method significantly improves entity recognition performance and efficiency compared with the traditional character-level NER method.
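The SoftLexicon feature construction step can be illustrated with a minimal sketch. It assumes the usual formulation in which every lexicon word matched in the sentence is assigned to the B (begin), M (middle), E (end) or S (single) set of each character it covers. The function name `softlexicon_sets`, the `max_len` parameter and the toy lexicon are hypothetical and chosen only for illustration; the frequency-weighted pooling of each set into a vector, and the Bi-LSTM and CRF layers, are omitted.

```python
from typing import Dict, List, Set


def softlexicon_sets(
    sentence: str, lexicon: Set[str], max_len: int = 6
) -> List[Dict[str, Set[str]]]:
    """Collect, for every character, the lexicon words that cover it.

    Each matched word is placed in the B/M/E/S set of a character according
    to where that character sits inside the word (hypothetical helper).
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Enumerate every substring of at most max_len characters.
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)      # single-character word
            else:
                sets[start]["B"].add(word)      # character begins the word
                sets[end - 1]["E"].add(word)    # character ends the word
                for mid in range(start + 1, end - 1):
                    sets[mid]["M"].add(word)    # character is inside the word
    return sets


# Toy lexicon (an assumption, not taken from the CCKS-2020 data).
toy_lexicon = {"急性", "阑尾", "阑尾炎", "急性阑尾炎", "炎"}
for char, groups in zip("急性阑尾炎", softlexicon_sets("急性阑尾炎", toy_lexicon)):
    print(char, {tag: sorted(words) for tag, words in groups.items()})
```

In a full model, each of the four sets would be pooled into a fixed-size vector and concatenated with the character vector before the sequence is passed to the Bi-LSTM layer.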