ChinaXiv.org: Preprint Platform for Scientific Papers of the Chinese Academy of Sciences

Submitted Date: 2023 (8), 2017 (5)

13 results in total.

Search conditions: 陆伟 (Lu Wei)

1. ChinaXiv:202307.00467

Research on Structure Function Recognition of Academic Text Based on Multi-level Fusion

Subjects: Library Science，Information Science >> Library Science submitted time 2023-07-26 Cooperative journals: 《图书情报工作》

Wang Jiamin Lu Wei Liu Jiawei Cheng Qikai

Abstract： [Purpose/significance] The structure function of an academic text summarizes the structure of the text and the function of each section. Few existing studies consider fusing the multi-level structure of academic text, and traditional methods usually rely on manual experience to build rules or features. After analyzing the multi-level structure of academic text, we construct a structure function recognition model based on multi-level fusion. [Method/process] We use an academic text dataset from ScienceDirect for the experiments. First, we apply deep learning algorithms to identify the structure function of academic text at different levels. Then we employ a voting method to fuse the results from different levels and models. [Result/conclusion] The results show that performance improved to varying degrees after fusion. The precision, recall and F1 value of the combined results reached 86%, 84% and 84%, respectively. Compared with the traditional machine learning algorithm SVM, the deep learning algorithms perform better on the task of academic text classification. Finally, we analyze the misclassifications of the structure function of academic text and point out potential application fields and future research directions.
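
The voting step mentioned in this abstract can be illustrated with a plain majority vote. This is only a minimal sketch: the abstract does not say which voting scheme the authors used, and the function name, the three level names and the labels below are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse label lists (one per model or level) by majority vote per item.

    All lists must have the same length. On a tie, the label that
    appears first among the voters wins (an assumption of this sketch).
    """
    fused = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        fused.append(counts.most_common(1)[0][0])
    return fused

# Hypothetical outputs of three classifiers for four text units.
section_level = ["introduction", "method", "result", "conclusion"]
paragraph_level = ["introduction", "method", "method", "conclusion"]
sentence_level = ["introduction", "result", "result", "conclusion"]

fused = majority_vote([section_level, paragraph_level, sentence_level])
print(fused)
```

Each position is decided independently, so a mistake by one level can be outvoted by the other two.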

Hits 298 Downloads 143 Comment 0
2. ChinaXiv:202307.00643

The Discovery of Subject Basic Vocabulary from the Perspective of Keyword Co-occurrence Network

Subjects: Library Science，Information Science >> Library Science submitted time 2023-07-26 Cooperative journals: 《图书情报工作》

Yu Fengchang Lu Wei

Abstract： [Purpose/significance] Subject basic vocabulary is a cornerstone of subject knowledge. It is important for understanding the composition of a discipline's knowledge system, clarifying the discipline's knowledge context and promoting discipline education. However, it has long relied mainly on manual summarization and could not be mined automatically within a given discipline. [Method/process] This paper proposes a method that uses the keyword co-occurrence network to discover basic vocabulary within a discipline. The method exploits the relatively low word frequency of basic vocabulary and its relatively high degree centrality in the network to obtain the subject basic vocabulary automatically from a subject keyword dataset. [Result/conclusion] The validity of the method is verified on keyword datasets from ACM papers published from 1969 to 2012 in three fields: computer (full dataset), user interfaces, and information search and retrieval. Moreover, the method can discover the global basic vocabulary in the dataset using simpler steps.
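
The two signals named in this abstract, keyword frequency and degree centrality in the co-occurrence network, can be computed as in the sketch below. It is an illustration only: the abstract does not say how the authors combine the two signals, so no ranking rule is shown, and the sample keyword lists are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def cooccurrence_stats(papers):
    """Return (frequency, degree centrality) for every keyword.

    papers: list of keyword lists, one per paper. Two keywords are
    linked when they appear in the same paper. Degree centrality is
    the number of distinct neighbours divided by (n - 1).
    """
    freq = Counter()
    neighbors = defaultdict(set)
    for keywords in papers:
        unique = set(keywords)
        freq.update(unique)
        for a, b in combinations(sorted(unique), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(freq)
    centrality = {
        k: (len(neighbors[k]) / (n - 1) if n > 1 else 0.0) for k in freq
    }
    return freq, centrality

# Hypothetical keyword lists for four papers.
papers = [
    ["information retrieval", "ranking", "evaluation"],
    ["information retrieval", "query log"],
    ["ranking", "evaluation"],
    ["ranking", "evaluation", "user study"],
]
freq, centrality = cooccurrence_stats(papers)
for keyword in sorted(freq):
    print(keyword, freq[keyword], centrality[keyword])
```

In this toy data, "information retrieval" appears in fewer papers than "ranking" yet has the same degree centrality, which is the kind of pattern the abstract attributes to basic vocabulary.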

Hits 280 Downloads 157 Comment 0
3. ChinaXiv:202304.00450

Adopting the Past and Absorbing the New, Carrying Forward the Cause and Forging Ahead into the Future: A Book Review of Taxonomy of Research Methods and Technologies in Intelligence Studies

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Lu Wei

Abstract： [Purpose/significance] This review covers the book Taxonomy of Research Methods and Technologies in Intelligence Studies. It aims to help readers understand the fundamental research methods, and the construction process and results of the taxonomy of research methods and technologies in intelligence studies. [Method/process] Theories and technologies from information organization, natural language processing and machine learning were used together to construct the taxonomy of research methods and technologies in intelligence studies, develop a knowledge base and retrieval system for research methods in intelligence studies, and explore the taxonomy of research methods in particular scenarios of intelligence studies. [Result/conclusion] The book offers a unique perspective and makes innovative use of machine learning to construct a research methods taxonomy for a specific discipline. It plays an important role in promoting innovation in the research methods of intelligence studies and the construction of the academic system of intelligence studies. It also provides a key to solving practical problems in disciplines and industries.

Hits 240 Downloads 122 Comment 0
4. ChinaXiv:202304.00549

Novelty Measurement and Innovation Type Identification of Scientific Literature Based on Question-Method Combination

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Qian Jiajia Luo Zhuoran Lu Wei

Abstract： [Purpose/significance] Novelty measurement is an important part of scientific achievement evaluation. This paper proposes a method for novelty measurement and innovation type identification of scientific papers based on the combination of question and method. [Method/process] Based on the word frequency principle, this paper calculates the question novelty, method novelty and question-method combination novelty separately, and then calculates the overall novelty of a paper by weight assignment. In addition, based on the theory of combination innovation, this study proposes four types of innovation from the perspective of question-method combination in scientific papers, together with a method to identify the innovation type from the novelty values. [Result/conclusion] Finally, this paper conducts an empirical study on more than 200,000 ACM papers from 1951 to 2018, and demonstrates that the proposed novelty measurement method and innovation type identification method are scientific, reasonable and feasible.
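
The weighted combination described in this abstract can be sketched as follows. Everything specific here is an assumption made for illustration: the abstract gives neither the frequency-to-novelty formula nor the weights, so the inverse-frequency form `1 / (1 + count)`, the weights `(0.25, 0.25, 0.5)` and the sample counts are all hypothetical.

```python
from collections import Counter

def term_novelty(term, prior_counts):
    """Assumed form: novelty shrinks as the prior frequency grows."""
    return 1.0 / (1.0 + prior_counts.get(term, 0))

def paper_novelty(question, method, q_counts, m_counts, pair_counts,
                  weights=(0.25, 0.25, 0.5)):
    """Weighted sum of question, method and combination novelty.

    The weights are hypothetical placeholders that sum to 1.
    """
    w_q, w_m, w_c = weights
    return (w_q * term_novelty(question, q_counts)
            + w_m * term_novelty(method, m_counts)
            + w_c * term_novelty((question, method), pair_counts))

# Hypothetical counts from earlier papers.
q_counts = Counter({"entity linking": 3})
m_counts = Counter({"BERT": 1})
pair_counts = Counter()  # this question-method pair has not been seen before

score = paper_novelty("entity linking", "BERT", q_counts, m_counts, pair_counts)
print(score)
```

A paper that pairs a known question with a known method for the first time still scores well, because the combination term is unseen.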

Hits 168 Downloads 94 Comment 0
5. ChinaXiv:202304.00579

Research on the Recognition of Innovative Contribution Sentences of Academic Papers

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Luo Zhuoran Cai Le Qian Jiajia Lu Wei

Abstract： [Purpose/significance] Contribution sentences of academic papers reflect the novelty and academic value of the papers. This study takes the full text of academic papers and MeSH terms as data sources and uses natural language processing and deep learning techniques to recognize contribution sentences in academic papers. It lays the foundation for fine-grained mining of the innovative content of academic texts, which is important for evaluating academic papers based on cognitive computing. [Method/process] First, full-text PubMed papers were used as the data source for element analysis and feature extraction of contribution sentences. Second, a semi-automatic approach was used to annotate the data. Finally, contribution sentences were recognized automatically with the ALBERT deep learning model. [Result/conclusion] A data consistency test confirms the plausibility of the labeled training data, and the experimental results show that the model trained in this paper identifies contribution sentences in academic papers more effectively than other deep learning models.

Hits 222 Downloads 148 Comment 0
6. ChinaXiv:202304.00608

Research on Keyword Semantic Function Recognition Based on Multi-feature Fusion

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Zhang Guobiao Li Pengcheng Lu Wei Cheng Qikai

Abstract： [Purpose/significance] Keywords are words or terms that reveal the subject and core content of a text. Identifying their functions provides underlying index support for fast and accurate acquisition of knowledge and documents. [Method/process] Existing studies are mostly limited to the text-level semantic representation of symbols when modeling vocabulary context. To address this, this paper proposes a lexical function recognition model based on multi-feature fusion. After capturing the context-dependent features of keywords with the BERT model, the model fuses the position information of keywords in the keyword list and the full text with prior knowledge of vocabulary functions, and then uses an attention mechanism and a feed-forward neural network to identify problem and method keywords. [Result/conclusion] The experimental results show that both the position information and the prior knowledge of keywords improve word function recognition, and that prior knowledge contributes more.

Hits 207 Downloads 99 Comment 0
7. ChinaXiv:202304.00698

The Construction and Analysis of Academic Query Intent Taxonomy: An Empirical Study of Baidu's Academic Search Query Log

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Wang Ruixue Fang Jing Li Xin Lu Wei Zhang Xian

Abstract： [Purpose/significance] In academic search, understanding, analyzing and identifying the information needs expressed by users is the first step toward optimizing query results and improving the user experience of academic search engines. In this paper, we call this the academic query intent, which refers to the user's expressed information needs and the potential information conveyed through the query. Summarizing a taxonomy of academic query intent helps with identifying academic query intent and presenting search result pages. [Method/process] Based on A. Broder's taxonomy of query intent, this study used Baidu's academic search query log to construct a taxonomy of query intent in academic search. On this basis, the paper manually identified the different academic query categories and analyzed the characteristics of query intent in different types of academic queries. [Result/conclusion] Users' academic query intents are divided into five categories: academic literature, academic entities, academic exploration, knowledge quiz and non-academic literature. For each type of academic query intent, the study reports the approximate proportion and describes the characteristics, scenarios and result pages of the queries.

Hits 217 Downloads 135 Comment 0
8. ChinaXiv:202304.00711

Based on Deep Learning Algorithm to Construct the Classifier of Academic Query Intent

Subjects: Library Science，Information Science >> Information Science submitted time 2023-04-01 Cooperative journals: 《图书情报工作》

Wang Ruixue Fang Jing Gui Sisi Lu Wei Zhang Xian

Abstract： [Purpose/significance] To find solutions for automatically identifying search query intent and to improve the efficiency of academic search engines. [Method/process] Combining the characteristics of query intent and academic search, we constructed features from four aspects: basic descriptive statistics, special keywords, entity information and frequency. In the experiments, we examined four types of classifiers (Naive Bayes, Logistic regression, SVM and Random Forest) and calculated precision, recall and F-measure. We then propose a method that extends the academic query intent predictions of the Logistic regression algorithm to large-scale datasets and extracts "keyword type" features, in order to construct a two-layer classifier based on a deep learning algorithm for academic query intent recognition. [Result/conclusion] The macro-average F1 value of the two-layer classifier is 0.651, which is superior to the other algorithms. This method can effectively balance the precision and recall of different academic query intents. The final second-layer prediction model achieves the best classification performance, with an F1 score of 0.783.
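
For readers unfamiliar with the macro-average F1 value reported in this abstract, the sketch below shows how it is typically computed: F1 is calculated per class and then averaged with equal weight per class. The labels and predictions are hypothetical and are not taken from the paper.

```python
def per_class_prf(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treating it as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_prf(y_true, y_pred, l)[2] for l in labels) / len(labels)

# Hypothetical intent labels for four queries.
y_true = ["literature", "literature", "entity", "quiz"]
y_pred = ["literature", "entity", "entity", "quiz"]
print(macro_f1(y_true, y_pred))
```

Because every class counts equally, a rare intent category that is classified poorly pulls the macro average down as much as a frequent one would.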

Hits 191 Downloads 119 Comment 0
9. ChinaXiv:201711.01938

User Interest Modeling Based on Image Semantics

Subjects: Library Science，Information Science >> Information Science submitted time 2017-11-08 Cooperative journals: 《数据分析与知识发现》

Zeng Jin Lu Wei Ding Heng Chen Haihua

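Abstract： [Objective] User interest modeling in social network environments is key to friend recommendation and precision marketing. Using images shared by microblog users, this paper proposes a user interest modeling method based on image semantics, aiming to predict users' real interests more accurately. [Methods] After collecting image data from Sina Weibo users, we use the high-level semantics of images to represent user interest features and, based on these features, train an image semantic classifier with SVM for prediction. [Results] The experimental results show that the model predicts users' real interests fairly accurately: for the classification of 169 users, precision reached 97.38%, recall 98.92% and the F value 98.14%. [Limitations] Because the experimental image dataset is limited, it does not fully cover all categories of user interest. [Conclusions] The model can predict user interests fairly accurately from the images users share, which demonstrates the effectiveness of high-level image semantics and provides a theoretical and technical basis for applied research on high-level image semantics.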
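
As a quick consistency check on the figures in this abstract, the sketch below recomputes the F value from the reported precision (97.38%) and recall (98.92%). It assumes the F value is the balanced F1 measure, i.e. the harmonic mean of precision and recall, which the abstract does not state explicitly.

```python
def f_measure(precision, recall):
    """Balanced F measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values as reported in the abstract, in percent.
f_value = f_measure(97.38, 98.92)
print(round(f_value, 2))
```

Under that assumption the result rounds to the 98.14% given in the abstract.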

Hits 2304 Downloads 1270 Comment 0
10. ChinaXiv:201711.02018

A Study of the Impact of Query Specificity on Retrieval Performance

Subjects: Library Science，Information Science >> Information Science submitted time 2017-11-08 Cooperative journals: 《数据分析与知识发现》

Ren Ke Lu Wei Ding Heng

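Abstract： [Objective] To comprehensively analyze the retrieval performance of queries with different degrees of specificity, providing a reference for improving search engine performance and users' retrieval experience. [Methods] Based on TREC Web Track queries, we manually built an annotated set of query specificity and selected three models: the language model with Dirichlet smoothing, the language model with linear interpolation smoothing, and BM25. Using commonly used information retrieval evaluation metrics as the benchmark, we examined how strong or weak query specificity affects retrieval performance at different levels. [Results] The difference in retrieval performance between strongly and weakly specific queries is largest in the top few retrieved results, where strongly specific queries perform clearly better than weakly specific ones. [Limitations] Experiments were conducted only on the TREC dataset, and further validation on other datasets is needed. [Conclusions] With respect to specificity, search engines should focus on the accuracy of the top few retrieved results and use this as the entry point for improving retrieval models.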
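
The three retrieval models named in this abstract have standard textbook forms, sketched below for a single query term. This is not the authors' experimental code: the parameter defaults (`k1`, `b`, `mu`, `lam`), the function names and the particular BM25 IDF variant are assumptions chosen for illustration.

```python
import math

def bm25_term(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """BM25 contribution of one term (one common IDF variant)."""
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / norm

def dirichlet_prob(tf, doc_len, p_collection, mu=2000.0):
    """Language model with Dirichlet smoothing: p(w | d)."""
    return (tf + mu * p_collection) / (doc_len + mu)

def jelinek_mercer_prob(tf, doc_len, p_collection, lam=0.5):
    """Language model with linear interpolation smoothing: p(w | d).

    lam is the weight given to the collection model.
    """
    p_doc = tf / doc_len if doc_len else 0.0
    return (1.0 - lam) * p_doc + lam * p_collection

# Hypothetical statistics for one term in one document.
print(bm25_term(tf=2, doc_len=120, avg_doc_len=100, n_docs=1000, doc_freq=50))
print(dirichlet_prob(tf=2, doc_len=120, p_collection=0.001))
print(jelinek_mercer_prob(tf=2, doc_len=120, p_collection=0.001))
```

A full query score sums the BM25 term contributions, or the logarithms of the smoothed probabilities, over all query terms.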

Hits 2193 Downloads 1240 Comment 0
11. ChinaXiv:201711.02052

Design and Implementation of a Knowledge Service System for Standards Literature

Subjects: Library Science，Information Science >> Information Science submitted time 2017-11-08 Cooperative journals: 《数据分析与知识发现》

Ding Heng Lu Wei

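Abstract： [Objective] To build a knowledge-level service system for standards literature and advance the knowledge orientation of standards literature information services. [Context] The standards literature knowledge service system can semantically extract knowledge units from standards documents, organize them effectively according to the relationships among pieces of standards knowledge, and provide users with knowledge-level standards literature information services. [Methods] Technologies such as optical character recognition, natural language processing and information visualization are used to implement semantic organization, knowledge extraction, ontology construction, knowledge graph and ontology-based retrieval functions for standards literature. [Results] Using the system, users can obtain knowledge-level standards literature information services, including a standards knowledge graph and ontology-based standards knowledge retrieval. [Conclusions] The system can improve the user experience and meet users' needs for standards literature knowledge.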

Hits 1992 Downloads 1168 Comment 0
12. ChinaXiv:201711.01022

Effects of Supplementing Fermented Asparagus By-products on Fecal Morphology and Milk Quality of Sows

Subjects: Biology >> Zoology submitted time 2017-10-23 Cooperative journals: 《动物营养学报》

Mao Chunxia Shi Xianliang He Yuyong Lu Wei

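Abstract： This study investigated the effects of supplementing sows in late gestation and lactation with fermented asparagus by-products on fecal morphology and milk quality. Fifteen pregnant sows with similar body condition, parity and expected farrowing date were randomly assigned to groups I, II and III, with 5 replicates per group and 1 sow per replicate. Sows in groups I, II and III were supplemented with 0, 0.25 and 0.50 kg of fermented asparagus by-products per head per day, respectively. The trial ran from day 85 of gestation to day 21 postpartum. The results showed that: 1) Supplementing sows with fermented asparagus by-products improved their fecal morphology. 2) The levels of milk protein, growth hormone, insulin and immunoglobulin G in the colostrum of group III sows were significantly higher than those of group I (P<0.05), and the level of tumor necrosis factor-α was significantly lower than that of group I (P<0.05). 3) Total superoxide dismutase activity in the day-10 milk of group II and group III sows was significantly higher than that of group I (P<0.05), and in day-21 milk it was extremely significantly higher than that of group I (P<0.01); the levels of malondialdehyde, interleukin-1β, interleukin-6 and tumor necrosis factor-α in the day-21 milk of group III sows were significantly lower than those of group I (P<0.05). In conclusion, supplementation with fermented asparagus by-products can reduce the occurrence of constipation in sows during late gestation and lactation and improve sow milk quality to varying degrees.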

Hits 2036 Downloads 1210 Comment 0
13. ChinaXiv:201711.01202

Research on a Short-Text Entity Linking Method Based on Multiple Knowledge Bases: The Case of Wikipedia and Freebase

Subjects: Library Science，Information Science >> Information Science submitted time 2017-10-11 Cooperative journals: 《数据分析与知识发现》

Zhou Pengcheng Wu Chuan Lu Wei

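Abstract： [Objective] To perform entity linking based on multiple knowledge bases and address the low coverage of entity linking based on a single knowledge base. [Methods] First, n-grams of the text are generated, and candidate mentions are obtained using part of speech and multiple mention-entity dictionaries. Then mention combinations are generated, keeping those with the largest coverage that are not contained in other combinations. Next, candidate entity sequences are generated and their relatedness is computed using information from multiple knowledge bases. Finally, the entity sequence with the highest relatedness is selected as the final result. [Results] Experiments using Wikipedia and Freebase as examples show that entity linking based on Wikipedia+Freebase achieves precision, recall and F value of 71.81%, 76.86% and 74.25%, respectively. [Limitations] Filtering n-grams by part of speech lacks theoretical grounding, and the FACC1 dataset is characterized by high precision and low recall. [Conclusions] Using entity information from multiple knowledge bases can improve entity linking performance.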
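
The first two steps in this abstract, generating n-grams and looking them up in a mention-entity dictionary, can be sketched as below. This is a simplified illustration, not the authors' method: it skips part-of-speech filtering, it filters individual mentions rather than mention combinations, and the dictionary contents and function names are hypothetical.

```python
def ngrams(tokens, max_n):
    """Return (start, end, text) for every n-gram with 1 <= n <= max_n."""
    spans = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            end = start + n
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans

def candidate_mentions(tokens, mention_dict, max_n=3):
    """Keep the n-grams that appear as keys in the mention dictionary."""
    return [s for s in ngrams(tokens, max_n) if s[2].lower() in mention_dict]

def drop_contained(mentions):
    """Drop any mention whose span lies inside another mention's span."""
    kept = []
    for start, end, text in mentions:
        contained = any(
            o_start <= start and end <= o_end and (o_start, o_end) != (start, end)
            for o_start, o_end, _ in mentions
        )
        if not contained:
            kept.append((start, end, text))
    return kept

# Hypothetical mention-entity dictionary merged from two knowledge bases.
mention_dict = {
    "new york": ["New_York_City", "New_York_(state)"],
    "york": ["York"],
}
tokens = "flights to New York".split()
candidates = candidate_mentions(tokens, mention_dict)
mentions = drop_contained(candidates)
print(candidates)
print(mentions)
```

Here "York" is found as a candidate but dropped because it lies inside the longer mention "New York", mirroring the coverage rule described in the abstract.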

Hits 2631 Downloads 1755 Comment 0