Your conditions: 任珂
  • A Study of the Effect of Query Specificity on Retrieval Performance

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2017-11-08; Cooperative journals: 《数据分析与知识发现》

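    Abstract: [Objective] To comprehensively analyze the retrieval performance of queries with different degrees of specificity, as a reference for improving search engine performance and the user's search experience. [Methods] Based on TREC Web Track queries, we manually built an annotated set of query specificity labels and selected three models: the language model with Dirichlet smoothing, the language model with linear interpolation smoothing, and BM25. Using common information retrieval evaluation metrics as the benchmark, we examined how strong versus weak query specificity affects retrieval performance at different levels. [Results] The difference in retrieval performance between strong-specificity and weak-specificity queries is largest among the top few results, where strong-specificity queries perform clearly better than weak-specificity ones. [Limitations] Experiments were run only on the TREC dataset; further validation on other datasets is needed. [Conclusions] Along the specificity dimension, search engines should focus on the accuracy of the top few results and use this as the starting point for improving retrieval models.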
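The three retrieval models named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the toy corpus, the parameter values (`k1`, `b`, `mu`, `lam`), and all function names are assumptions chosen for illustration, and the BM25 IDF uses the common "+1 inside the log" variant so that it stays positive.

```python
import math
from collections import Counter

# Hypothetical toy corpus; each document is a list of tokens.
DOCS = [
    ["query", "specificity", "affects", "retrieval"],
    ["search", "engine", "retrieval", "models"],
    ["weather", "forecast", "today"],
]

DOC_TF = [Counter(d) for d in DOCS]          # per-document term counts
COLL_TF = Counter(t for d in DOCS for t in d)  # collection term counts
COLL_LEN = sum(COLL_TF.values())
AVG_LEN = COLL_LEN / len(DOCS)


def bm25(query, i, k1=1.2, b=0.75):
    """BM25 score of document i; k1 and b are assumed defaults."""
    tf, dl, score = DOC_TF[i], len(DOCS[i]), 0.0
    for t in query:
        df = sum(1 for d in DOC_TF if t in d)
        idf = math.log((len(DOCS) - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[t] + k1 * (1 - b + b * dl / AVG_LEN)
        score += idf * tf[t] * (k1 + 1) / norm
    return score


def lm_dirichlet(query, i, mu=2000.0):
    """Query log-likelihood with Dirichlet smoothing; mu is assumed."""
    tf, dl, score = DOC_TF[i], len(DOCS[i]), 0.0
    for t in query:
        p_coll = COLL_TF[t] / COLL_LEN
        if p_coll == 0:
            continue  # ignore terms that never occur in the collection
        score += math.log((tf[t] + mu * p_coll) / (dl + mu))
    return score


def lm_linear(query, i, lam=0.5):
    """Query log-likelihood with linear interpolation smoothing.

    Here lam is the weight on the collection model (an assumed convention).
    """
    tf, dl, score = DOC_TF[i], len(DOCS[i]), 0.0
    for t in query:
        p_coll = COLL_TF[t] / COLL_LEN
        if p_coll == 0:
            continue  # ignore terms that never occur in the collection
        score += math.log((1 - lam) * tf[t] / dl + lam * p_coll)
    return score


def rank(scorer, query):
    """Return document indices sorted from best to worst score."""
    return sorted(range(len(DOCS)), key=lambda i: scorer(query, i), reverse=True)


if __name__ == "__main__":
    q = ["query", "specificity"]
    for scorer in (bm25, lm_dirichlet, lm_linear):
        print(scorer.__name__, rank(scorer, q))
```

All three scorers take the same `(query, document index)` inputs, so a comparison like the one the abstract describes amounts to running each scorer over the same queries and evaluating the top-ranked results.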

  • Comparison of Three Data Mining Algorithms in Knowledge Discovery of Electronic Medical Records

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2017-10-11; Cooperative journals: 《数据分析与知识发现》

    Abstract: [Objective] To discover disease risk factors from heterogeneous electronic medical record data and provide a reference for data mining and knowledge discovery. [Methods] We selected clinical electronic medical record data with various structures and used three data mining algorithms (decision tree, logistic regression, and neural network) to build disease risk factor prediction models, then compared the three models statistically. [Results] The decision tree model achieves higher precision and recall than the logistic regression and neural network models and has the best overall performance, although the differences among the three are small. [Limitations] The attributes of the electronic medical records were not optimized. [Conclusions] The decision tree outperforms logistic regression and the neural network in discovering risk factors and predicting disease. The study also establishes a knowledge discovery framework for heterogeneous data sources based on data mining algorithms, which offers a reference for future domain knowledge discovery, knowledge base construction, and the selection of data mining algorithms.
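The precision and recall used to compare the three models can be computed as in the short sketch below. The labels and predictions are made-up placeholders, not data from the study, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical ground truth (1 = disease present) and one model's output.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(precision_recall(y_true, y_pred))
```

Running the same function on each model's predictions over a shared test set gives directly comparable precision and recall figures.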