pg.bayesfactor_pearson(r=-0.856, n=64, alternative="greater") returns wrong BF value #427

Open
tomasdominik opened this issue Jun 20, 2024 · 2 comments
tomasdominik commented Jun 20, 2024

I have two sets of paired data (n=64). The Pearson correlation between the sets is -0.856 (precise value -0.856341390601075). Using pg.bayesfactor_pearson(r=-0.856, n=64, alternative="two-sided"), I get the correct Bayes factor of ~2.7309e+16 (checked with JASP 0.17).

Using pg.bayesfactor_pearson(r=-0.856, n=64, alternative="less"), I receive Bayes factor of ~5.4617e+16, which also corresponds with JASP.

However, pg.bayesfactor_pearson(r=-0.856, n=64, alternative="greater") returns a Bayes factor of 976, which is incorrect (JASP shows 1e-317). It cannot be the right answer in any case: the correlation is negative, so a one-sided test for a positive correlation cannot show evidence for the alternative.

Is this a bug in pingouin, or is this a case of float underflow due to the minuscule size of the correct BF?

@raphaelvallat raphaelvallat self-assigned this Jun 22, 2024
@raphaelvallat raphaelvallat added the bug 💥 Something isn't working label Jun 22, 2024
@raphaelvallat
Owner

Thanks for opening the issue. I finally found some time to dive into this. I think it is the latter, but unfortunately I have not been able to find a solution for it. For reference:

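import pingouin as pg

# This call returns only the "less" Bayes factor; the three values shown in the
# output below presumably come from temporary debug prints inside the function.
for r in [-0.95, -0.8, -0.6, -0.54, -0.5, -0.2, -0.1]:
    print(f"r = {r}, n = 200")
    pg.bayesfactor_pearson(r=r, n=200, alternative="less", method="ly")
    print()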

gives

r = -0.95, n = 200
two-tailed = nan, less = nan, greater = nan

r = -0.8, n = 200
two-tailed = 2.726641354240274e+42, less = 5.45328270848037e+42, greater = 1.82596155794594e+29

r = -0.6, n = 200
two-tailed = 8.807899473881192e+17, less = 1.76157989477619e+18, greater = 50944.0

r = -0.54, n = 200
two-tailed = 41931590197275.336, less = 83863180394547.7, greater = 3.0

r = -0.5, n = 200
two-tailed = 156109336868.3086, less = 312218673736.598, greater = 0.01953125

r = -0.2, n = 200
two-tailed = 4.839589594799137, less = 9.65632205215847, greater = 0.0228571374398054

r = -0.1, n = 200
two-tailed = 0.23705535587016244, less = 0.435957165271682, greater = 0.0381535464686424

Note the incorrect shift from a BF_greater < 1 to a BF_greater > 1 at r=-0.54. Instead, we'd expect to see BF_greater becoming smaller and smaller with a more negative r.
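
One pattern in the printed values may help explain the shift. It is a reading of the numbers, not a verified description of the implementation. In every row where all three values are finite, greater is approximately 2 * two-tailed - less. If the opposite one-sided factor is obtained by such a subtraction (an assumption), then for strongly negative r the two large terms agree in nearly all of their digits and the difference is dominated by rounding error. A minimal check, using only values copied from the output above and a hypothetical helper name:

```python
# Values copied from the output above: r -> (two_tailed, less, greater)
printed = {
    -0.54: (41931590197275.336, 83863180394547.7, 3.0),
    -0.2: (4.839589594799137, 9.65632205215847, 0.0228571374398054),
    -0.1: (0.23705535587016244, 0.435957165271682, 0.0381535464686424),
}


def greater_from_complement(two_tailed, less):
    """Hypothetical helper: opposite one-sided BF computed as 2 * BF10 - BF_less.

    Whether pingouin computes it this way is an assumption; the identity is
    only inferred from the printed numbers.
    """
    return 2 * two_tailed - less


for r, (two_tailed, less, greater) in printed.items():
    recomputed = greater_from_complement(two_tailed, less)
    print(f"r = {r}: printed greater = {greater}, 2 * two-tailed - less = {recomputed}")
```

For r = -0.2 and r = -0.1 the recomputed value matches the printed one to many digits. For r = -0.54 the two terms are of order 1e13 and their difference is about 3, which is the rounding noise of the subtraction and not a meaningful Bayes factor.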

I think a workaround could be to set BF_greater to 0 if BF_less is very large (and vice versa).
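
A minimal sketch of that workaround, written as a standalone post-processing step and not as a patch to pingouin. The function name and the default threshold are hypothetical; the threshold is an arbitrary assumption and would need tuning. Issuing a warning when a value is overwritten is one possible design choice.

```python
import warnings


def clamp_opposite_bf(bf_less, bf_greater, threshold=1e10):
    """Hypothetical helper: zero out the one-sided BF opposite to a very large one.

    `threshold` is an arbitrary assumption. NaN inputs fail both comparisons
    and are returned unchanged.
    """
    if bf_less > threshold:
        warnings.warn("BF_less is very large; setting BF_greater to 0.")
        return bf_less, 0.0
    if bf_greater > threshold:
        warnings.warn("BF_greater is very large; setting BF_less to 0.")
        return 0.0, bf_greater
    return bf_less, bf_greater


# Values taken from the r = -0.54 and r = -0.2 rows of the output above.
print(clamp_opposite_bf(83863180394547.7, 3.0))
print(clamp_opposite_bf(9.65632205215847, 0.0228571374398054))
```

With these inputs the first pair has its greater value replaced by 0 and triggers a warning, while the second pair is returned unchanged.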

@tomasdominik
Author

I agree, that's exactly the workaround I ended up using. There probably is a more correct way to deal with the problem, but applying what you suggest (perhaps with a warning for the user) might be good enough. Thank you for looking into it.
