Optimization of a prediction model of life satisfaction based on text data augmentation

Authors: Chen Jiajing^1,2, Hu Dingding^1,2, Song Rui^1,2, Tan Shiqi^1,2, Li Yuqing^1,2, Zhang Shengnan^1,2, Zhu Tingshao^1,2, Zhao Nan^1,2
Institute:

1. Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China

2. University of Chinese Academy of Sciences, Beijing, 100049, China
Corresponding authors: Zhu Tingshao (Email: tszhu@psych.ac.cn), Zhao Nan
Submit Time: 2024-02-29 13:31:45

Abstract: Objective: With the development of online big data and machine learning, a growing number of studies combine text analysis with machine learning algorithms to predict individual satisfaction. Studies that build life satisfaction prediction models often struggle to obtain large amounts of valid, labeled data. This study aims to address this problem with data augmentation and to optimize the life satisfaction prediction model. Method: The original text data were 357 life status descriptions annotated with scores on a self-rated life satisfaction scale. After preprocessing with DLUT-Emotionontology, EDA and back translation were applied, and prediction models were built using traditional machine learning algorithms. Results: (1) Prediction accuracy was greatly improved after using the adapted version of DLUT-Emotionontology; (2) only the linear regression model improved after data augmentation; (3) the ridge regression model showed the highest prediction accuracy when trained on the original data (r = 0.4131). Conclusion: Improving feature extraction accuracy can optimize the current life satisfaction prediction model, but text data augmentation methods such as back translation and EDA may not be applicable to life satisfaction prediction models based on word frequency.

Keywords: Life satisfaction; DLUT-Emotionontology; Text data augmentation; Back translation; EDA; Machine learning
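As a rough illustration of the augmentation step named in the abstract, the sketch below implements two of the operations commonly grouped under EDA (easy data augmentation): random swap and random deletion. It is a minimal sketch under stated assumptions, not the authors' pipeline. The abstract does not say which EDA operations or parameters were used, and the function names, the `p_delete` and `n_copies` parameters, and the example tokens are all hypothetical. Synonym replacement and random insertion are omitted because they need a thesaurus, and the input is assumed to be already tokenized (Chinese text would first need word segmentation). Each augmented copy inherits the life satisfaction score of its source description.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with two random positions swapped, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p; never return an empty list."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(tokens, score, n_copies=2, p_delete=0.1, seed=0):
    """Return the original (tokens, score) pair plus augmented copies.

    Every augmented copy keeps the label of the original description
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    samples = [(list(tokens), score)]
    for _ in range(n_copies):
        samples.append((random_swap(tokens, 1, rng), score))
        samples.append((random_deletion(tokens, p_delete, rng), score))
    return samples


if __name__ == "__main__":
    # Hypothetical tokenized description with a hypothetical scale score.
    example = ["work", "is", "busy", "but", "family", "is", "happy"]
    for toks, label in augment(example, score=20):
        print(label, " ".join(toks))
```

Random swap leaves bag-of-words counts unchanged, so for a model built on word frequency such a copy adds a duplicate feature vector rather than new information. This is consistent with the abstract's conclusion that these augmentation methods may not suit word-frequency-based models.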

From: Zhu Tingshao
Subject: Psychology >> Applied Psychology; Computer Science >> Computer Application Technology
Contribution: Not submitted
Cite as: ChinaXiv:202201.00007 (or this version ChinaXiv:202201.00007V2)
DOI:10.12074/202201.00007V2
CSTR:32003.36.ChinaXiv.202201.00007.V2
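Recommended citation: Chen Jiajing, Hu Dingding, Song Rui, Tan Shiqi, Li Yuqing, Zhang Shengnan, Zhu Tingshao, Zhao Nan. (2024). Optimization of a prediction model of life satisfaction based on text data augmentation. ChinaXiv (Chinese Academy of Sciences preprint platform). doi:10.12074/202201.00007V2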

Version History

[V2]	2024-02-29 13:31:45	ChinaXiv:202201.00007V2
[V1]	2022-01-04 11:07:49	ChinaXiv:202201.00007v1
