Your conditions: 赵志枭
  • Performance Evaluation of Chinese Universal Large Model in the Field of Humanities and Social Sciences

    Subjects: Library Science, Information Science >> Information Science; Submitted: 2024-05-08

    Abstract: Purpose/Significance: This paper evaluates large language models in the humanities and social sciences, comparing their performance on basic domain knowledge and on academic texts from the field. It aims to provide a systematic evaluation benchmark for large language models in the humanities and social sciences, for the reference of researchers in related fields. Methods/Processes: Seven evaluation tasks related to the humanities and social sciences were designed, each with corresponding indicators. Open-source, high-performing general-purpose Chinese large language models were then selected and run locally to complete the domain-specific tasks in question-and-answer form, and their performance in the humanities and social sciences was quantitatively evaluated using the selected indicators. Results/Conclusions: Among the open-source models selected in this paper, Qwen performs best, followed by Baichuan2 and InternLM, while Atom performs worst for both the base model and the dialog model; moreover, in most cases the dialog model outperforms the base model.