Journal Article

Citation

Perezgonzalez JD. Front. Psychol. 2016; 7: e1504.

Copyright

(Copyright © 2016, Frontiers Research Foundation)

DOI

10.3389/fpsyg.2016.01504

PMID

unavailable

Abstract

Dienes's (2016) article is one of the contributions to the special issue "Bayes factors for testing hypotheses in psychological research…" being published by the Journal of Mathematical Psychology. It is the article most accessible to non-Bayesians, offering a good understanding of Jeffreys's data testing approach (Bayes factors) with little in the form of mathematical expressions.

Dienes's main argument is one of best-fit-for-purpose: the Bayes factor pits the probability of the data under one hypothesis against that under another on equal ground, providing a symmetric assessment--the data may favor either hypothesis, or neither--as a continuous measure of evidence in the form of odds. For Dienes, Jeffreys's (1961) approach is, if not perfect, at least superior to Fisher's (1954) tests of significance and to Neyman and Pearson's (1933) tests of acceptance. Unlike Bayes factors, Fisher's approach tests data only under a null hypothesis, so the resulting p-value is asymmetric: it can provide evidence against that hypothesis but not in its favor. Neyman-Pearson's approach, on the other hand, uses two hypotheses and allows some ground for asserting either if the power of the test is adequate; however, it is not evidential insofar as it has little use for sample statistics such as p-values and post-hoc power. Therefore, not only is the use of Bayes factors a much better approach for testing research data, but such use will also "help solve some…of the problems leading to the credibility crisis" (p. ii) posed by the latter two approaches.
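The contrast between a symmetric Bayes factor and an asymmetric p-value can be made concrete with a minimal sketch. It is not taken from Dienes's article: it assumes two simple point hypotheses about a binomial success rate, whereas Bayes factors in practice often use a distribution over the alternative. The data (7 successes in 10 trials), the rates 0.5 and 0.7, and the function names are all hypothetical, chosen only for illustration.

```python
from math import comb


def binomial_likelihood(k, n, p):
    """Probability of exactly k successes in n trials when the success rate is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_factor(k, n, p_alt, p_null):
    """Odds of the data under the alternative relative to the null (BF10)."""
    return binomial_likelihood(k, n, p_alt) / binomial_likelihood(k, n, p_null)


def one_sided_p_value(k, n, p_null):
    """Probability, computed under the null only, of k or more successes."""
    return sum(binomial_likelihood(i, n, p_null) for i in range(k, n + 1))


# Hypothetical data and point hypotheses (assumptions for this sketch).
k, n = 7, 10
p_null, p_alt = 0.5, 0.7

bf10 = bayes_factor(k, n, p_alt, p_null)   # evidence for the alternative over the null
bf01 = 1 / bf10                            # the same evidence read in favor of the null
p_value = one_sided_p_value(k, n, p_null)  # uses the null distribution alone

print(f"BF10 = {bf10:.2f}, BF01 = {bf01:.2f}, one-sided p = {p_value:.3f}")
```

Under these assumptions the Bayes factor is a ratio that can fall above 1 (favoring the alternative), below 1 (favoring the null), or near 1 (favoring neither), which is the symmetry described above. The p-value, by contrast, is computed without any reference to the alternative, so it can count against the null but cannot by itself count in its favor.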

One concern I have with Dienes's article is its "one-size-fits-all" philosophy. Allow me to argue the point using non-research affairs, which seem more relatable. Most (if not all) of us have certainly been in the position of having to choose between valuable alternatives, pitting one against the other and selecting that which came out on top. Such positions may range from the serious--"Which cancer treatment to choose, radiotherapy or surgery?"--to the rather banal--"Coffee or tea?" However, there are times when decisions do not need, nor benefit from, such pitting among defined alternatives. "Do I have a temperature?" is a question that calls for assessing data against a known cut-off that rejects the normal hypothesis in favor of the sick hypothesis without the need to test the latter. There are also many times when decisions are based on assessing just a single model in reference to standards of its own and not in relation to alternative hypotheses, such as deciding whether we are enjoying our lunch or whether we are happy with our lives.

Furthermore, there are occasions in which any of the three methods may be used depending on how the situation comes to us. For it is possible for the same person to decide to divorce if a comparatively better person comes along one day, as it is for him or her to divorce only after high thresholds of regret between omission and commission have been breached over a long run of mulling over the possibilities, as it is to divorce for reasons other than the existence of alternatives (e.g., because the person has just been abused by her or his current partner).

Most research in psychology fits well the aims of a test of significance, especially with regard to null hypotheses being uninteresting models which serve only the purpose of offering an exact distribution against which to test the research data at hand. For if one is just interested in the significance of a treatment (as in its practical importance using small samples, Perezgonzalez, 2015a), what is to be gained from supporting the null? The situation would certainly be different if one were interested in both, for example because ineffective treatments could be used as placebos in future research projects or because the null represents a general law (Jeffreys, 1961; also Robert, 2016). In the latter cases, a null model is equally interesting and Bayes factors relevant. Thus, I find it naive that a single approach is still proposed as the one and only tool for testing data. It is true that a research question may be adapted to suit a particular tool, but this does not guarantee that such a tool will address the research question correctly.


Language: en