  • Interpreting Nonsignificant Results: A Quantitative Analysis Based on 500 Empirical Studies (解读不显著结果:基于500个实证研究的量化分析)

    Subjects: Psychology >> Social Psychology · Submitted: 2023-03-28 · Cooperative journals: 《心理科学进展》

    Abstract: Background: The p-value is the most widely used statistical index for inference in science. A p-value greater than 0.05, i.e., a nonsignificant result, cannot, however, distinguish between two cases: absence of evidence and evidence of absence. Unfortunately, researchers in psychological science may not interpret p-values correctly, resulting in wrong inferences. For example, Aczel et al. (2018), after surveying 412 empirical studies published in Psychonomic Bulletin & Review, Journal of Experimental Psychology: General, and Psychological Science, found that about 72% of nonsignificant results were misinterpreted as evidence in favor of the null hypothesis. Misinterpretations of nonsignificant results may have severe consequences. One is missing potentially meaningful effects. Another arises in matched-group clinical trials, where misinterpreting nonsignificant results may produce falsely "matched" groups, threatening the validity of interventions. So far, how nonsignificant results are interpreted in the Chinese psychological literature has been unknown. Here we surveyed 500 empirical papers published in five mainstream Chinese psychological journals to address the following questions: (1) how often are nonsignificant results reported; (2) how do researchers interpret nonsignificant results in these published studies; (3) if researchers interpreted nonsignificant results as "evidence of absence," do the empirical data provide enough evidence for null effects?

    Method: Following our pre-registration (https://osf.io/czx6f), we first randomly selected 500 empirical papers from all papers published in 2017 and 2018 in five mainstream Chinese psychological journals (Acta Psychologica Sinica, Psychological Science, Chinese Journal of Clinical Psychology, Psychological Development and Education, Psychological and Behavioral Studies). Second, we screened the abstracts of these articles to check whether they contained negative statements. For studies with negative statements in their abstracts, we searched for nonsignificant statistics in their results and checked whether the corresponding interpretations were correct. More specifically, all such statements were classified into four categories (Correct-frequentist; Incorrect-frequentist: whole population; Incorrect-frequentist: current sample; Difficult to judge). Finally, we calculated Bayes factors from the available t values and sample sizes associated with those nonsignificant results. The Bayes factors estimate to what extent those results provide evidence for the absence of effects (i.e., the way researchers incorrectly interpreted nonsignificant results).

    Results: Our survey revealed that: (1) of the 500 empirical papers, 36% of the abstracts (n = 180) contained negative statements; (2) there were 236 negative statements associated with nonsignificant statistics in the selected studies, and 41% of these misinterpreted nonsignificant results, i.e., the authors inferred that the results provided evidence for the absence of effects; (3) Bayes factor analyses based on the available t values and sample sizes found that only 5.1% (n = 2) of nonsignificant results provided strong evidence for the absence of effects (BF01 > 10). Compared with the results of Aczel et al. (2018), empirical papers published in Chinese journals contained more negative statements (36% vs. 32%), and researchers made fewer misinterpretations of nonsignificant results (41% vs. 72%). It is worth noting, however, that there is a category of ambiguous interpretations of nonsignificant results in the Chinese context. More specifically, many statements corresponding to nonsignificant results read "there is no significant difference between condition A and condition B." Such statements can be understood either as "the difference is not statistically significant," which is correct, or as "there is no difference," which is incorrect. The percentage of misinterpretations rises to 64% under the second reading, in contrast to 41% under the first.

    Conclusion: Our results suggest that Chinese researchers need to improve their understanding of nonsignificant results and use more appropriate statistical methods to extract information from them. More precise wording should also be used in the Chinese context.
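The Bayes-factor step described in the Method can be illustrated with a short sketch. This is not the authors' code; it is a minimal implementation of the default JZS (Cauchy-prior) Bayes factor for a one-sample or paired t-test in the style of Rouder et al. (2009), computed from a t value and sample size. BF01 > 1 favors the null, and BF01 > 10 is the "strong evidence" cutoff used in the survey above; the function name and default prior scale are illustrative assumptions.

```python
# Sketch (not the authors' code): default JZS Bayes factor BF01 for a
# one-sample / paired t-test, from a t value and sample size n.
import numpy as np
from scipy import integrate

def jzs_bf01(t, n, r=np.sqrt(2) / 2):
    """BF01 (evidence for H0 over H1) given t statistic and sample size n."""
    nu = n - 1  # degrees of freedom
    # Marginal likelihood kernel under H0 (true effect size = 0): central t.
    m0 = (1 + t ** 2 / nu) ** (-(nu + 1) / 2)

    # Under H1 the effect size has a Cauchy(0, r) prior, which is equivalent
    # to integrating over g ~ InverseGamma(1/2, r^2/2) (the JZS prior).
    def integrand(g):
        prior = r / np.sqrt(2 * np.pi) * g ** -1.5 * np.exp(-r ** 2 / (2 * g))
        like = ((1 + n * g) ** -0.5
                * (1 + t ** 2 / ((1 + n * g) * nu)) ** (-(nu + 1) / 2))
        return like * prior

    m1, _ = integrate.quad(integrand, 0, np.inf)
    return m0 / m1
```

A clearly nonsignificant t (say t = 0.5 with n = 30) yields BF01 above 1, i.e., some support for the null, but typically well below the strong-evidence threshold of 10, which matches the survey's finding that few nonsignificant results actually license an "evidence of absence" claim.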

  • Interpreting Nonsignificant Results: A Quantitative Investigation Based on 500 Chinese Psychological Research Papers

    Subjects: Psychology >> Statistics in Psychology · Submitted: 2020-10-17

    Abstract: The p-value is the most widely used statistical index for inference in science. Unfortunately, researchers in psychological science may not interpret p-values correctly, resulting in possible mistakes in statistical inference. Our specific goal was to estimate how nonsignificant results were interpreted in empirical studies published in Chinese journals. First, we randomly selected 500 empirical research papers published in 2017 and 2018 in five prominent Chinese journals (Acta Psychologica Sinica, Psychological Science, Chinese Journal of Clinical Psychology, Psychological Development and Education, Psychological and Behavioral Studies). Second, we screened the abstracts of the selected articles and judged whether they contained negative statements. Third, we categorized each negative statement into four categories (Correct-frequentist; Incorrect-frequentist: whole population; Incorrect-frequentist: current sample; Difficult to judge). Finally, we calculated Bayes factors from the t values and sample sizes associated with the nonsignificant results to investigate whether the empirical data provided enough evidence in favor of the null hypothesis. Our survey revealed that: (1) 36% of the abstracts (n = 180) mentioned nonsignificant results; (2) the articles contained 236 negative statements referring to nonsignificant results in their abstracts, and 41% of these misinterpreted nonsignificant results; (3) only 5.1% (n = 2) of nonsignificant results provided strong evidence in favor of the null hypothesis (BF01 > 10). These results suggest that Chinese researchers need to enhance their understanding of nonsignificant results and use more appropriate statistical methods to extract information from nonsignificant results.