Abstract: Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and advertisement relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 page views, 2 million API calls and 3,000 registered downloads from 50,000 visitors across 64 countries.
- Journal: DATA INTELLIGENCE
- Category: Computer Science >> Integrated Theory of Computer Science
- Citation:
ChinaXiv:202211.00454 (or this version: ChinaXiv:202211.00454V1)
DOI: 10.1162/dint_a_00013
CSTR: 32003.36.ChinaXiv.202211.00454.V1
- Recommended citation:
Ji, Lei; Wang, Yujing; Shi, Botian; Zhang, Dawei; Wang, Zhongyuan; Yan, Jun. Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding. ChinaXiv (Chinese Academy of Sciences preprint platform), https://chinaxiv.org/202211.00454 [ChinaXiv:202211.00454V1]