ChinaXiv.org: Chinese Academy of Sciences Scientific Paper Preprint Platform

3 results in total.

Your conditions: Department of Computer Science and Technology, Nanjing University, Nanjing 210093

1. ChinaXiv:202308.00643

Research of Automatic Extraction of Entities of Data Science Recruitment and Analysis Based on Deep Learning

Subjects: Library Science, Information Science >> Library Science
Submitted: 2023-08-27
Cooperative journal: 《图书情报工作》 (Library and Information Service)

Wang Dongbo, Hu Haotian, Zhou Xin, Zhu Danhao

Abstract: [Purpose/significance] Data science is emerging as a new interdisciplinary field that combines many disciplines. Extracting entity knowledge from data science recruitment announcements can not only help in understanding the development of data science from a market perspective, but also help improve the content of data science teaching. [Method/process] Recruitment announcements were collected from a recruitment website; using the data collection, annotation and organization methods of information science, a data science corpus was constructed and the corresponding entities were extracted from it. [Result/conclusion] On the existing annotated corpus of 11,000 data science recruitment announcements, this paper compared the entity extraction performance of the Bi-LSTM-CRF, CRF and Bi-LSTM models, selected the final automatic extraction model for data science recruitment entities, designed an automatic extraction platform for these entities, and built a data science recruitment entity network.

Hits 415 Downloads 149 Comment 0
2. ChinaXiv:202308.00259

A Comparative Study of Model Performances Facing Abstract Structure Function

Subjects: Library Science, Information Science >> Library Science
Submitted: 2023-08-26
Cooperative journal: 《图书情报工作》 (Library and Information Service)

Wang Dongbo, Lu Haoxiang, Zhou Xin, Zhu Danhao

Abstract: [Purpose/significance] An abstract concisely states the research purpose, the research methods and the final conclusions of a paper, so it is of high value for exploration. [Method/process] In this paper, four models (long short-term memory, support vector machine, LSTM-CRF and CNN-CRF) were selected to recognize the functional structure of the abstracts of 3,672 journal articles from the CNKI database. [Result/conclusion] The highest F value reached by the long short-term memory network model is 69.15%, the highest F value of the LSTM-CRF neural network model is 88.76%, and the highest F value of the RNN-CRF model is 89.10%. The highest macro F value of the support vector machine classifier is 72.04%. The experimental results offer a useful reference for selecting models in experiments on the functional structure of academic dissertations in the field of library and information science.

Hits 581 Downloads 199 Comment 0
3. ChinaXiv:201711.02006

Chinese Organization Name Recognition Based on Deep Learning: A Character-Level Recurrent Neural Network Method

Subjects: Library Science, Information Science >> Information Science
Submitted: 2017-11-08
Cooperative journal: 《数据分析与知识发现》 (Data Analysis and Knowledge Discovery)

Zhu Danhao, Yang Lei, Wang Dongbo

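Abstract: [Objective] Chinese organization names have complex structures and contain many rare words, which makes them hard to recognize. Recognizing them correctly matters greatly for downstream tasks in information science such as information extraction, information retrieval, knowledge mining and the evaluation of institutional research. [Methods] Building on the recurrent neural network (RNN) method from deep learning and on the characteristics of Chinese characters and words, this paper redefines the input and output of organization name tagging and proposes a character-level recurrent network tagging model. [Results] With the word-level RNN method as the baseline, the proposed character-level model clearly improves precision, recall and F value in Chinese organization name recognition, with the F value rising by 1.54%. The gain is larger when rare words are included, where the F value rises by 11.05%. [Limitations] Decoding directly uses a greedy strategy, which is prone to local optima; modeling with a conditional random field might yield the globally optimal result. [Conclusions] The method has a simple architecture, can exploit character-level features for modeling, and achieves better results than using word features alone.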

Hits 3437 Downloads 2381 Comment 0
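
The limitation noted in the abstract above (greedy decoding can settle on a locally best tag sequence, while a CRF-style decoder searches for the globally best one) can be illustrated with a minimal sketch. This is not the authors' code: the tag set (O, B, I), the emission and transition scores, and the function names are all hypothetical values chosen for illustration, and the greedy decoder here is assumed to condition only on the previously chosen tag.

```python
# Hypothetical tags: 0 = O (outside), 1 = B (begin name), 2 = I (inside name).
# EMISSIONS[t][j]: invented score of tag j for the character at position t.
EMISSIONS = [
    [1.0, 0.9, -5.0],
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 2.0],
]
# TRANSITIONS[i][j]: invented score of moving from tag i to tag j.
# O -> I is heavily penalized because a name cannot start with I.
TRANSITIONS = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]


def path_score(path, emissions, transitions):
    """Sum the emission and transition scores along one tag sequence."""
    total = emissions[0][path[0]]
    for t in range(1, len(path)):
        total += transitions[path[t - 1]][path[t]] + emissions[t][path[t]]
    return total


def greedy_decode(emissions, transitions):
    """Commit to the best tag at each position, given the previous choice."""
    n_tags = len(emissions[0])
    path = [max(range(n_tags), key=lambda j: emissions[0][j])]
    for t in range(1, len(emissions)):
        prev = path[-1]
        path.append(max(range(n_tags),
                        key=lambda j: transitions[prev][j] + emissions[t][j]))
    return path


def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence by dynamic programming."""
    n_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []               # best previous tag for each (position, tag)
    for t in range(1, len(emissions)):
        new_score, backptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            backptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
        score = new_score
        backptrs.append(backptr)
    # Follow the back-pointers from the best final tag to the start.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for backptr in reversed(backptrs):
        path.append(backptr[path[-1]])
    return path[::-1]


greedy_path = greedy_decode(EMISSIONS, TRANSITIONS)
viterbi_path = viterbi_decode(EMISSIONS, TRANSITIONS)
print(greedy_path, path_score(greedy_path, EMISSIONS, TRANSITIONS))
print(viterbi_path, path_score(viterbi_path, EMISSIONS, TRANSITIONS))
```

With these invented scores, the greedy decoder picks O for the first character because it has the slightly higher emission score, and the O -> I penalty then keeps it from ever entering a name. The dynamic-programming decoder instead accepts a slightly worse first tag (B) and ends with a higher total score.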
