Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding (Postprint)

Authors: Ji, Lei ¹,²; Wang, Yujing ¹; Shi, Botian ³; Zhang, Dawei ⁴; Wang, Zhongyuan ⁵; Yan, Jun ⁶
Affiliations:

1. Microsoft Research Asia, Haidian District, Beijing 100080, China

2. Institute of Computing Technology, Chinese Academy of Sciences, Haidian District, Beijing 100049, China

3. Beijing Institute of Technology, Haidian District, Beijing 100081, China

4. MIX Labs, Haidian District, Beijing 100080, China

5. Meituan NLP Center, Chaoyang District, Beijing 100020, China

6. AI Lab of Yiducloud Inc., Huayuan North Road, Haidian District, Beijing 100089, China
Corresponding author: Ji, Lei (Email: leiji@microsoft.com)
Submitted: 2022-11-27 13:32:46

Abstract: Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and advertisement relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 page views, 2 million API calls and 3,000 registered downloads from 50,000 visitors across 64 countries.

Keywords: Knowledge extraction; Conceptualization; Text understanding

Journal: DATA INTELLIGENCE
Category: Computer Science >> Integrated Theory of Computer Science
Cite as: ChinaXiv:202211.00454 (or this version, ChinaXiv:202211.00454V1)
DOI: 10.1162/dint_a_00013
CSTR:32003.36.ChinaXiv.202211.00454.V1
Recommended citation: Ji, Lei, Wang, Yujing, Shi, Botian, Zhang, Dawei, Wang, Zhongyuan, Yan, Jun. Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding. Chinese Academy of Sciences Preprint Platform for Scientific Papers (ChinaXiv), https://chinaxiv.org/202211.00454. [ChinaXiv:202211.00454V1]

Version history

[V1] 2022-11-27 13:32:46, ChinaXiv:202211.00454V1



