Skip to content

WIP: remove Rmath dependence #138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 22 commits into from
Closed

WIP: remove Rmath dependence #138

wants to merge 22 commits into from

Conversation

simonbyrne
Copy link
Member

I've created a branch to remove the Rmath dependence. I've done some of the easy distributions, but we're going to need some more special functions before we can finish.

  • Beta
  • Binomial
  • Cauchy
  • Chisq
  • FDist
  • Gamma
  • Geometric
  • HyperGeometric
  • Logistic
  • LogNormal
  • NegativeBinomial
  • NoncentralBeta
  • NoncentralChisq
  • NoncentralF
  • NoncentralT
  • Normal
  • Poisson
  • TDist
  • Uniform
  • Weibull

Other tasks

  • HyperGeometric still needs a proper rand, but I don't have access to the Kachitvichyanukul & Schmeiser paper.
  • Lots of log-precision functions are still missing. In particular, it would be really useful to have the log of rcomp and brcomp, which are used in a couple of pdfs.
  • Make the NSWC special functions more usable, by renaming, throwing appropriate errors, and sorting out dispatch (I've made a start on this).
  • The Noncentral distributions
  • More tests.

@lindahua
Copy link
Contributor

I would be nice to have a benchmark to keep track of the performance change when replacing Rmath functions with our own implementations.

@simonbyrne
Copy link
Member Author

@lindahua Agreed, that's why I didn't want to merge it in right away. I would also like some more stringent accuracy checks, particularly for the extreme tails: BigFloats could come in useful here.

@andreasnoack
Copy link
Member

I have translated special functions from the NSWC library to Julian which could cover at least Beta, Gamma and Binomial except for random number generation. Maybe other distributions would also be covered from these special functions. If we are willing to consider NSWC public domain (as discussed several times) I can contribute them.

@simonbyrne
Copy link
Member Author

@andreasnoackjensen That would be great,

@johnmyleswhite
Copy link
Member

@andreasnoackjensen, please do merge in the NWSC code.

For accuracy checking, let's start by making a table of BigFloat results from Mathematica for all the functions we replace.

@StefanKarpinski
Copy link

This is great stuff. Getting rid of the Rmath dependency will be great on multiple levels – liberal licensing + having pure Julia implementations of all of these basic stats functions.

@ViralBShah
Copy link
Contributor

This really is amazing work guys!

@ViralBShah
Copy link
Contributor

This also means that we can avoid distributing Rmath, which is currently part of julia base, and a big GPL dependency.

@dmbates
Copy link
Contributor

dmbates commented Sep 12, 2013

I did get a chance to talk with @mmaechler (Martin Maechler) this summer about the implementation of the incormplete beta function in Rmath (src/toms708.c). Martin has made considerable effort to ensure accuracy of the bratio function defined there across a wide range of input values. It would be wise to solicit Martin's opinion on implementations of the incomplete beta function and others derived from it.

@andreasnoack
Copy link
Member

It would be good to hear about any experiences from the translation of bratio. My translation is almost a literal translation of the Fortran source, except when Julia already had implementation of the functions used, e.g. erf. I have tried to test the function on many different inputs and compared with the version from Rmath. When they differed, I compared to Mathematica and with BigFloat input. I couldn't find values where it did worse than Rmath, but there could be special combinations that I haven't tried.

@ViralBShah
Copy link
Contributor

Does it make sense to merge this PR and replace Rmath functionality piecewise over time?

@StefanKarpinski
Copy link

That seems like it will be less chaotic and get more eyeballs on the replacement implementations that we already have.

@simonbyrne
Copy link
Member Author

Although I'm happy to leave the noncentral distributions for the time being, it would be nice to get the rest completed before merging.

However I don't want to lose Rmath altogether, as it would be a useful point of comparison (both for performance and accuracy). Also, although I've ticked some of the above as completed, we're still lacking log-precision (i.e. logpdf, logcdf) functions for some of these distributions (@andreasnoackjensen, any interest in a log-incomplete beta function?)

From what I can tell, Rmath is no longer used in Base: would it be possible to split it off into a separate package?

@StefanKarpinski
Copy link

From what I can tell, Rmath is no longer used in Base: would it be possible to split it off into a separate package?

That's a good idea. It might have a tricky build process, but it ought to be doable.

@johnmyleswhite
Copy link
Member

If this doesn't depend on the NWSC code, I'd be happy to see it merged whenever @simonbyrne feels ready for that.

@simonbyrne
Copy link
Member Author

I've merged in #148, so this does include the @andreasnoackjensen's NSWC-derived functions.

@jiahao
Copy link
Member

jiahao commented Nov 25, 2014

One possibility is to implement special functions with ApproxFun.jl.

cc: @dlfivefifty

@simonbyrne
Copy link
Member Author

As an update on this, Caltech/JPL have now released their software library (based on the NSWF code) under an MIT-style licence: http://netlib.org/math/

@StefanKarpinski
Copy link

It's very hard not to read that as NSFW. I was wondering if that was an oblique way of saying that it has a license that means you shouldn't look at the code.

@ViralBShah
Copy link
Contributor

Strange that people still release stuff on netlib even today!

@matbesancon
Copy link
Member

@simonbyrne maybe we can fix and merge this one and open another issue for the remaining R dependencies?

@StefanKarpinski
Copy link

Why not merge what's already complete and then continue to try to eliminate the Rmath dependency?

@simonbyrne
Copy link
Member Author

Honestly, because I don't trust past self's code.

@ViralBShah
Copy link
Contributor

Should we make it a GSoC project to try and get high quality tests and make this happen?

@simonbyrne
Copy link
Member Author

Yes. We had a special functions one this year, but it didn't pan out.

@matbesancon
Copy link
Member

Fixes #770 (I think)
@simonbyrne if we don't trust your past self code, at least we can trust your past self tests I suppose

@StefanKarpinski
Copy link

Aw, I trust past @simonbyrne. He's always seems like such a stand up guy!

@ararslan
Copy link
Member

I don't know what happened since then :trollface:

@matbesancon
Copy link
Member

ok so what should be done about this one? Closing for now and waiting for a GSoC project?

@ViralBShah
Copy link
Contributor

ViralBShah commented Nov 4, 2018

I say we just leave it open for a GSoC project, if @simonbyrne can mentor. Leaving it open will lead to people discovering what needs to be done.

@ViralBShah
Copy link
Contributor

Perhaps we can merge the parts that work well, and slowly chip away at the dependence on Rmath as opposed to trying to remove it all in one go.

@StefanKarpinski
Copy link

StefanKarpinski commented Nov 4, 2018

I have to say that I feel like this PR is a perfect example of why one doesn't make a single huge PR with an enormous checklist of things to finish. That kind of PR never gets done and always goes stale. Instead, it's much better to make incremental changes and merge them when they're ready. Sure, they may be imperfect but they'll get fixed if they're merged and used. It's fine to use a checklist to keep track of what remains to be done but work should merged when it's ready.

@matbesancon
Copy link
Member

One point to keep in mind is how old this PR is, the conflicts to resolve will just be ugly.
Given @ViralBShah's proposal, we can close this, the PR is simply archived and a future GSoC student can check it to build their own

@ararslan
Copy link
Member

ararslan commented Nov 4, 2018

I don't think we should close this; closed PRs are not discoverable. Folks can also cherry pick bits and pieces from this to modernize and incorporate into the repo rather than attempting to rebase it all at once.

@matbesancon matbesancon added the wont-merge PR will not be merged, kept for archiving purpose label Feb 10, 2019
@richardreeve
Copy link
Contributor

I’ve fixed quite a few of the Rfunction issues (hopefully!) as part of making random number generation multithreading-compatible in #830.

@richardreeve
Copy link
Contributor

I have now taken all of the RNG code from here and put it into #830. Like this PR, it doesn't fix all of the RFunctions dependencies, but that isn't the purpose of the PR.

@andreasnoack
Copy link
Member

I believe almost all of this has been implemented in https://github.com/JuliaStats/StatsFuns.jl at this point so I will go ahead and close it.

@andreasnoack andreasnoack deleted the rm-rmath branch May 27, 2025 18:52
@ViralBShah
Copy link
Contributor

Does this mean we no longer need to worry about Rmath_jll maintenance, since it isn't used by Distributions.jl anymore - which probably was its largest user?

@andreasnoack
Copy link
Member

Unfortunately not yet. There are a few missing functions which are also unchecked in the list here. The progress is tracked in JuliaStats/StatsFuns.jl#37. It is mostly the non-central distributions. We should maybe try to set up a GSOC project for translating these from https://iandjmsmith.wordpress.com/. @devmotion and I might be able to help with that but we missed the boat this year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal wont-merge PR will not be merged, kept for archiving purpose
Projects
None yet
Development

Successfully merging this pull request may close these issues.