ChinaXiv.org: Chinese Academy of Sciences preprint platform for scientific papers


1. ChinaXiv:202405.00025

Performance Evaluation of Chinese Universal Large Model in the Field of Humanities and Social Sciences

Subjects: Library Science, Information Science >> Information Science
Submitted: 2024-05-08

Authors: Zhao Zhixiao, Hu Die, Liu Chang, Shen Si, Wang Dongbo

Abstract: Purpose/Significance: This paper focuses on the humanities and social sciences and compares the performance of large language models on basic knowledge and academic texts in that field. It aims to provide a systematic large language model evaluation benchmark for the humanities and social sciences, for the reference of researchers in related fields. Methods/Processes: Seven evaluation tasks related to the humanities and social sciences were designed, and corresponding indicators were selected for each. Current open-source, high-performance, general-purpose Chinese large language models were then run locally to complete these domain-specific tasks in question-and-answer form, and their performance in the humanities and social sciences was evaluated quantitatively against the selected indicators. Results/Conclusions: Among the open-source models selected in this paper, Qwen performs best, followed by Baichuan2 and InternLM, and Atom is the weakest performer among both the base models and the dialog models. Moreover, in most cases the dialog model outperforms the corresponding base model.


Hits 486 Downloads 182 Comment 0
