
Interpreting Nonsignificant Results: A Quantitative Investigation Based on 500 Chinese Psychological Research Papers

Abstract: Background: The p value is the most widely used statistical index for inference in science. A p value greater than .05, i.e., a nonsignificant result, however, cannot distinguish between two situations: absence of evidence and evidence of absence. Unfortunately, researchers in psychological science may not interpret p values correctly, leading to possible mistakes in statistical inference based on nonsignificant results. Indeed, Aczel et al. (2019) surveyed empirical studies published in three journals (Psychonomic Bulletin & Review, Journal of Experimental Psychology: General, and Psychological Science) and found that about 72% of nonsignificant results were misinterpreted as evidence in favor of the null hypothesis. The misinterpretation of nonsignificant results may have severe consequences. One such consequence is the dismissal of nonsignificant results as null effects, overlooking small but meaningful effects (e.g., Jia et al., 2018). More importantly, misinterpreting nonsignificant results when comparing certain traits (e.g., age, gender) in matched-group clinical trials may create a falsely "matched" group, thus rendering the estimated effect of the intervention meaningless. As psychological science keeps growing in China, it is important to estimate how nonsignificant results are interpreted in empirical studies published in Chinese journals. However, no such meta-research has been done. To fill this gap, we surveyed 500 empirical papers published in five important Chinese psychological journals to explore the following questions: (1) how often are nonsignificant results reported, i.e., how severe is the publication bias; (2) how do researchers interpret nonsignificant results in their own studies; and (3) when researchers interpret a nonsignificant result as "evidence of absence," do the empirical data provide enough support for the null effect?
Method: Following our preregistration (https://osf.io/czx6f), we randomly selected empirical research papers published in 2017 and 2018 in five prominent Chinese journals (Acta Psychologica Sinica, Psychological Science, Chinese Journal of Clinical Psychology, Psychological Development and Education, Psychological and Behavioral Studies). First, in proportion to the publication volume of each journal, we randomly selected 500 empirical research papers. Second, we screened the abstracts of the selected articles and judged whether they contained negative statements. Third, we classified each negative statement into one of four categories (Correct-frequentist; Incorrect-frequentist: whole population; Incorrect-frequentist: current sample; Difficult to judge). Finally, we calculated Bayes factors from the t values and sample sizes associated with the nonsignificant results to investigate whether the empirical data provided enough evidence in favor of the null hypothesis.
Results: Our survey revealed that: (1) out of 500 empirical research papers, 36% (n = 180) mentioned nonsignificant results in their abstracts; (2) these articles contained 236 negative statements referring to nonsignificant results in the abstracts, and 41% of these statements misinterpreted the nonsignificant results, i.e., the authors inferred that the results provided evidence for the absence of an effect; (3) only 5.1% (n = 2) of the nonsignificant results provided strong evidence in favor of the null hypothesis (BF01 > 10). Compared with the results of Aczel et al. (2019), we found that empirical papers published in Chinese journals reported more nonsignificant results (36% vs. 32%), and researchers made fewer misinterpretations of nonsignificant results (41% vs. 72%). It is worth noting that one category of statement about nonsignificant results is ambiguous in the Chinese context: "there is no significant difference between condition A and condition B." This statement has two readings: it can be interpreted as another way of saying "statistically nonsignificant," or as "there is no difference between condition A and condition B." The percentage of misinterpreted nonsignificant results rises to 61% under the second reading, compared with 41% under the first.
Conclusion: These results suggest that Chinese researchers need to improve their understanding of nonsignificant results and use more appropriate statistical methods to extract information from nonsignificant results. More precise wording should also be used in the Chinese context.
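The abstract does not specify how the Bayes factors were computed, but a Bayes factor for a t test can be obtained directly from the reported t value and sample size. A minimal sketch, assuming a one-sample or paired design and one common choice of prior, the JZS (Cauchy, scale r = 1) prior of Rouder et al. (2009); the function name `jzs_bf01` is our own, not from the paper:

```python
import numpy as np
from scipy.integrate import quad

def jzs_bf01(t, n):
    """BF01 (evidence for the null) for a one-sample or paired t test,
    computed under the JZS prior with scale r = 1 (Rouder et al., 2009)."""
    v = n - 1  # degrees of freedom

    def integrand(g):
        # Marginal likelihood under H1, integrating over the prior on g
        return ((1 + n * g) ** -0.5
                * (1 + t ** 2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * np.pi) ** -0.5
                * g ** -1.5
                * np.exp(-1 / (2 * g)))

    num, _ = quad(integrand, 0, np.inf)       # evidence for H1
    den = (1 + t ** 2 / v) ** (-(v + 1) / 2)  # likelihood under H0 (delta = 0)
    return den / num                          # BF01 = 1 / BF10
```

For example, a small t with a moderate sample (e.g., `jzs_bf01(0.5, 40)`) yields BF01 > 1, i.e., the data mildly favor the null, while a large t with the same n yields BF01 well below 1; BF01 > 10 is the "strong evidence for the null" threshold used in the abstract.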

Version History

[V2] 2020-10-17 20:40:23 ChinaXiv:202003.00056v2
[V1] 2020-03-22 20:19:13 ChinaXiv:202003.00056V1