• Bi-GRU Relation Extraction Model Based on Keywords Attention

    Category: Computer Science >> Integrated Theory of Computer Science Submitted: 2022-11-28 Partner journal: Data Intelligence (English)

    Abstract: Relation extraction plays an important role in natural language processing: it predicts the semantic relationships between entities in a sentence. Most current models rely on natural language processing tools to capture high-level features and use an attention mechanism to mitigate the adverse effect of sentence noise on the prediction results. However, in relation classification, these attention mechanisms do not take full advantage of the semantic information carried by keywords that express the relation in a sentence. Therefore, we propose a novel relation extraction model based on an attention mechanism with keywords, named Relation Extraction Based on Keywords Attention (REKA). In particular, the proposed model uses a bi-directional GRU (Bi-GRU) to reduce computation and obtain the sentence representation, and it extracts prior knowledge of the entity pair without any NLP tools. Besides calculating the entity-pair similarity, the keywords attention in REKA also uses a linear-chain conditional random field (CRF) that combines entity-pair features, similarity features between entity-pair features, and its hidden vectors to obtain attention weights from the marginal distribution of each word. Experiments demonstrate that the proposed approach can exploit keywords carrying relational-expression semantics without the assistance of any high-level features and achieves better performance than traditional methods.
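
The idea of taking attention weights from CRF marginals can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary latent state per word (0 = not a keyword, 1 = keyword), hand-written toy scores in place of the learned entity-pair and Bi-GRU features, and hypothetical function names (`crf_marginals`, `keyword_attention`). The forward-backward recursion itself is the standard one for linear-chain CRFs.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def crf_marginals(unary, transition):
    """Forward-backward for a linear-chain CRF in log space.

    unary[t][s]      : score of state s at word position t
    transition[a][b] : score of moving from state a to state b
    Returns marginals[t][s] = P(z_t = s | sentence).
    """
    n, k = len(unary), len(transition)
    # Forward pass: alpha[t][b] sums over all prefixes ending in state b.
    alpha = [list(unary[0])]
    for t in range(1, n):
        alpha.append([
            unary[t][b]
            + logsumexp([alpha[t - 1][a] + transition[a][b] for a in range(k)])
            for b in range(k)
        ])
    # Backward pass: beta[t][a] sums over all suffixes starting after state a.
    beta = [[0.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [
            logsumexp([transition[a][b] + unary[t + 1][b] + beta[t + 1][b]
                       for b in range(k)])
            for a in range(k)
        ]
    log_z = logsumexp(alpha[-1])  # log partition function
    return [[math.exp(alpha[t][s] + beta[t][s] - log_z) for s in range(k)]
            for t in range(n)]

def keyword_attention(unary, transition):
    """Assumed convention: the weight of word t is P(z_t = 1 | sentence)."""
    return [m[1] for m in crf_marginals(unary, transition)]

# Toy scores (assumed values): the middle word looks three times more
# keyword-like than not, and transitions are neutral.
unary = [[0.0, 0.0], [0.0, math.log(3.0)], [0.0, 0.0]]
transition = [[0.0, 0.0], [0.0, 0.0]]
weights = keyword_attention(unary, transition)  # approximately [0.5, 0.75, 0.5]
```

With neutral transitions each position is independent, so the weights reduce to a per-word softmax; non-zero transition scores let neighbouring words influence each other's weight.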

  • Data Set and Evaluation of Automated Construction of Financial Knowledge Graph

    Category: Computer Science >> Integrated Theory of Computer Science Submitted: 2022-11-27 Partner journal: Data Intelligence (English)

    Abstract: With advances in entity extraction, relation extraction, knowledge reasoning, and entity linking, research on knowledge graphs has been very active in recent years. To better promote the development of knowledge graphs, especially for the Chinese language and the financial industry, we built a high-quality data set named the financial research report knowledge graph (FR2KG) and organized an evaluation on the automated construction of financial knowledge graphs at the 2020 China Knowledge Graph and Semantic Computing Conference (CCKS2020). FR2KG consists of 17,799 entities, 26,798 relationship triples, and 1,328 attribute triples covering 10 entity types, 19 relationship types, and 6 attributes. Participants are required to develop a constructor that automatically builds a financial knowledge graph based on FR2KG. In addition, we summarize the technologies for automatically constructing knowledge graphs and introduce the methods used by the winners and the results of this evaluation.

  • Ensemble Making Few-Shot Learning Stronger

    Category: Computer Science >> Integrated Theory of Computer Science Submitted: 2022-11-28 Partner journal: Data Intelligence (English)

    Abstract: Few-shot learning has been proposed and is rapidly emerging as a viable means of completing various tasks. Many few-shot models have been widely used for relation learning tasks. However, each of these models falls short in capturing some aspect of the semantic features: for example, CNNs on long-range dependencies and Transformers on local features. It is difficult for a single model to adapt to varied relation learning tasks, which results in a high-variance problem. An ensemble strategy can be competitive in improving the accuracy of few-shot relation extraction and mitigating the risk of high variance. This paper explores an ensemble approach to reduce the variance and introduces fine-tuning and feature attention strategies to calibrate relation-level features. Results on several few-shot relation learning tasks show that our model significantly outperforms the previous state-of-the-art models.
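
A minimal sketch of why an ensemble can reduce variance: averaging the class probabilities of several models smooths out the errors of any single one. The abstract does not state the combination rule, so plain probability averaging, the function name `ensemble_predict`, and the three toy probability vectors are all assumptions for illustration.

```python
def ensemble_predict(model_probs):
    """Average per-model probabilities over the same relation labels.

    model_probs: list of probability lists, one list per model.
    Returns (index of the best label, averaged probabilities).
    """
    n_models = len(model_probs)
    n_labels = len(model_probs[0])
    avg = [sum(p[j] for p in model_probs) / n_models for j in range(n_labels)]
    best = max(range(n_labels), key=lambda j: avg[j])
    return best, avg

# Toy outputs (assumed values) of three hypothetical encoders on one query.
cnn_probs = [0.6, 0.3, 0.1]
transformer_probs = [0.2, 0.7, 0.1]
prototype_probs = [0.3, 0.5, 0.2]

best, avg = ensemble_predict([cnn_probs, transformer_probs, prototype_probs])
# Here the CNN alone would pick label 0, while the averaged vote picks label 1.
```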

  • Ensemble Making Few-Shot Learning Stronger

    Category: Computer Science >> Other Disciplines of Computer Science and Technology Submitted: 2022-11-15

    Abstract: Few-shot learning has been proposed and is rapidly emerging as a viable means of completing various tasks. Many few-shot models have been widely used for relation learning tasks. However, each of these models falls short in capturing some aspect of the semantic features: for example, CNNs on long-range dependencies and Transformers on local features. It is difficult for a single model to adapt to varied relation learning tasks, which results in a high-variance problem. An ensemble strategy can be competitive in improving the accuracy of few-shot relation extraction and mitigating the risk of high variance. This paper explores an ensemble approach to reduce the variance and introduces fine-tuning and feature attention strategies to calibrate relation-level features. Results on several few-shot relation learning tasks show that our model significantly outperforms the previous state-of-the-art models.

  • Transfer Learning for Scientific Data Chain Extraction in Small Chemical Corpus with BERT-CRF Model

    Category: Computer Science >> Natural Language Understanding and Machine Translation Submitted: 2019-05-12

    Abstract: Computational chemistry has developed quickly in recent years owing to rapid growth and breakthroughs in AI. Thanks to progress in natural language processing, researchers can extract more fine-grained knowledge from publications to stimulate the development of computational chemistry. Because existing work and corpora on chemical entity extraction have been restricted to biomedicine and the life sciences rather than chemistry itself, we build a new corpus in the chemical bond field annotated for 7 types of entities: compound, solvent, method, bond, reaction, pKa, and pKa value. This paper presents a novel BERT-CRF model that builds scientific chemical data chains by extracting the 7 chemical entities and their relations from publications. We also propose a joint model to extract the entities and relations simultaneously. Experimental results on our Chemical Special Corpus demonstrate that we achieve state-of-the-art and competitive NER performance.