@cgeoga I was thinking some more about how we compute `besselk` in the region between the asymptotic expansions (power series, uniform expansions, and large argument), which corresponds to roughly `0 < nu < 25` and `2 < x < 30` ...
Right now, we are using the Wronskian relation to `besseli`. The reason we can do this is that the power series of `besseli` is convergent in this zone, so using it along with the continued fraction and the Wronskian we can get `besselk` pretty accurately. Now there are a couple of drawbacks to that:

- We need `besseli(nu, x)` and `besseli(nu + 1, x)`, and then also need to compute `besselk(nu+1,x)/besselk(nu,x)`.
- Power series are slower to converge as `x` gets bigger.
- The power series of `besseli(nu, x)` suffers from cancellation for complex arguments, so it can't be used as uniformly.
The first point was taken care of by using SIMD instructions to compute both `besseli(nu, x)` and `besseli(nu + 1, x)`, so that part is pretty optimized (though we pay an extra penalty for complex numbers). But points 2 and 3 are pretty much unsolvable: power series will always be slow for larger arguments and can't be used for much of the complex plane.
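
For reference, the Wronskian relation being used here is the standard identity (a rearrangement of DLMF 10.28.2):

```math
I_{\nu}(x)\,K_{\nu+1}(x) + I_{\nu+1}(x)\,K_{\nu}(x) = \frac{1}{x},
```

so with the continued-fraction ratio $H = K_{\nu+1}(x)/K_{\nu}(x)$, dividing through by $K_{\nu}(x)$ gives

```math
K_{\nu}(x) = \frac{1}{x \left( I_{\nu}(x)\,H + I_{\nu+1}(x) \right)}.
```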
I was toying with an alternative which links a few ideas. The continued fraction approach to compute `besselk(nu+1,x)/besselk(nu,x)` converges more quickly as `x` gets bigger, so it has an inverse relation to the power series (i.e., the power series converges faster for small `x` while the continued fraction converges faster for big `x`). So what if there was a good way to use the uniform expansions to link with the smaller orders? Now, Miller's algorithm uses stable backward recurrence from higher orders with a normalization factor calculated at low orders. That is fine, but usually we do not know the lower orders except for integer arguments (a minimal sketch of that classical scheme is below).
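
For concreteness, here is a minimal sketch of that classical Miller scheme for `besseli` at integer orders; the function name `miller_besseli`, the start-order offset `n + 20`, and the choice to normalize against `besseli0` are mine for illustration, and this is not how Bessels.jl actually implements `besseli`:

```julia
using Bessels

# Hypothetical sketch of Miller's algorithm for besseli(n, x) at integer n:
# recur downward from arbitrary trial values at a high order, then rescale
# using the known value at order 0. No overflow guards; illustration only.
function miller_besseli(n::Integer, x; nstart = n + 20)
    ip1, i = 0.0, 1e-30          # trial start values at orders nstart + 1 and nstart
    inu = 0.0
    for m in nstart:-1:1
        # backward recurrence: I_{m-1}(x) = (2m / x) * I_m(x) + I_{m+1}(x)
        ip1, i = i, (2m / x) * i + ip1
        m - 1 == n && (inu = i)  # save the (unnormalized) value at the target order
    end
    # i now holds the unnormalized I_0(x); rescale with the true besseli0
    return inu * Bessels.besseli0(x) / i
end
```

The normalization step is exactly where the non-integer case breaks down: there is no cheap, known low-order value to rescale against, which motivates flipping the scheme around for `besselk`.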
I'm sure this idea has been described before, but I can't seem to find a description of it. There appears to be a way to link stable forward recurrence with a normalization factor calculated at higher orders to compute `besselk` at lower orders. Instead of considering trial start values (0 and epsilon) as in the Miller backwards scheme, it uses the continued fraction approach to generate start values in the forward scheme. So the algorithm would look something like:

- Generate start values `1.0` and `knup1/knu`.
- Use upward recurrence up to an order where the asymptotic expansion can be used (`nu_ceil`).
- Compute `besselk(nu_ceil, x)` using the asymptotic expansion.
- Normalize the starting value (see the note just after this list).
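
To spell out why the normalization works (my notation, not from the original discussion): a solution of the three-term recurrence

```math
\tilde{k}_{\mu+1} = \frac{2\mu}{x}\,\tilde{k}_{\mu} + \tilde{k}_{\mu-1}
```

is determined by two consecutive values, and the start values $(1, H)$ with $H = K_{\nu+1}(x)/K_{\nu}(x)$ are exactly $(K_{\nu}(x), K_{\nu+1}(x))$ scaled by $1/K_{\nu}(x)$, so every generated term satisfies $\tilde{k}_{\mu} = K_{\mu}(x)/K_{\nu}(x)$. Since $K_{\mu}$ is the dominant solution in the direction of increasing order, this forward recurrence is the stable one. Recurring up to $\nu_{\mathrm{ceil}}$ then lets us solve for the target value:

```math
K_{\nu}(x) = \frac{K_{\nu_{\mathrm{ceil}}}(x)}{\tilde{k}_{\nu_{\mathrm{ceil}}}}.
```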
It may seem complicated, but instead of having to compute `besseli(nu, x)`, `besseli(nu + 1, x)`, and `besselk(nu+1,x)/besselk(nu,x)`, we just need `besselk(nu_ceil, x)` and `besselk(nu+1,x)/besselk(nu,x)`. There are a few extra steps of course to generate the recurrence, but that cost is pretty small.
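
For context on the constant-time claim: the uniform large-order expansion (DLMF 10.41.4, which I believe is what `besselk_large_orders` evaluates, or something close to it) uses a fixed number of terms independent of `x`:

```math
K_{\nu}(\nu z) \sim \sqrt{\frac{\pi}{2\nu}}\;\frac{e^{-\nu \eta}}{(1+z^2)^{1/4}} \sum_{k=0}^{\infty} (-1)^k \frac{U_k(p)}{\nu^{k}},
\qquad
p = \frac{1}{\sqrt{1+z^2}}, \quad
\eta = \sqrt{1+z^2} + \log\frac{z}{1+\sqrt{1+z^2}},
```

where the $U_k$ are fixed polynomials in $p$, so evaluating it at `nu_ceil` costs roughly the same wherever `x` falls in this region.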
So now we have an algorithm that computes `besselk(nu_ceil, x)` in constant time using the uniform asymptotic expansion, and an algorithm to compute `besselk(nu+1,x)/besselk(nu,x)` which converges faster as `x` increases. A sample implementation of the algorithm would look like:
```julia
using Bessels

function test(nu, x)
    # target a high order with the same fractional part as nu so that
    # upward recurrence by integer steps lands exactly on nu_ceil
    fpart, _ = modf(nu)
    nu_ceil = 25.0 + fpart
    # constant-time uniform asymptotic expansion at the high order
    knu_ceil = Bessels.besselk_large_orders(nu_ceil, x)
    # continued fraction for the ratio besselk(nu+1, x) / besselk(nu, x)
    knup1 = Bessels.besselk_ratio_knu_knup1(nu, x)
    # recur the scaled sequence (1.0, knup1) up to nu_ceil, then normalize
    return knu_ceil / Bessels.besselk_up_recurrence(x, knup1, 1.0, nu + 1, nu_ceil)[1]
end
```
Speed tests compared to the other approach:
```julia
## high precision calculation
julia> 2.308550810254649099632E-14
2.308550810254649e-14

julia> test(2.2, 30.0)
2.3085508102546522e-14

julia> Bessels.besselk_continued_fraction(2.2, 30.0)
2.3085508102546462e-14

julia> @benchmark test(2.2, x) setup = (x = 30.0 + rand())
BenchmarkTools.Trial: 10000 samples with 968 evaluations.
 Range (min … max):  79.846 ns … 120.610 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     79.976 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   80.832 ns ±   2.255 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▆█▃ ▂▃ ▃▅▁                                                 ▁
  ███▆▅▅▄▃▄▄▄▄▆▄▄▃▁▁▄▃▄▁▆██▆▅▃▃▃▄▄▄▅▄▆███▅▄▅▃▃▁▃▁▄▃▁▄▃▃▁▃▃▁▁▃█ █
  79.8 ns       Histogram: log(frequency) by time      87.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark Bessels.besselk_continued_fraction(2.2, x) setup = (x = 30.0 + rand())
BenchmarkTools.Trial: 10000 samples with 898 evaluations.
 Range (min … max):  127.180 ns … 178.360 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     129.315 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   128.840 ns ±   1.969 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   █▇▂             █▇
  ▃███▃▂▂▂▂▂▂▂▂▁▂▂▂███▄▃▂▂▂▂▂▃▃▃▄▃▂▂▂▂▂▂▁▂▂▂▂▃▃▂▂▂▂▂▁▁▂▂▂▂▂▂▂▂▂ ▃
  127 ns          Histogram: frequency by time          134 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```
That is about 40% faster for bigger arguments. For smallish arguments it's about 10% faster:
```julia
julia> @benchmark Bessels.besselk_continued_fraction(2.2, x) setup = (x = 10.0 + rand())
BenchmarkTools.Trial: 10000 samples with 940 evaluations.
 Range (min … max):  102.305 ns … 145.612 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     104.122 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   104.488 ns ±   2.483 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅▆ ▄█▅▂ ▁  ▅▄▃▁ ▁                                            ▂
  ██▇▁▁▁▁████▆▄▄█▇▅▅▄████▇▅▅▄▇█▇▆▆▇▇▅▄▁▃▄▅▅▆▅▆▅▆▆▅▄▅▄▄▃▄▄▅▅▃▄▄▄ █
  102 ns        Histogram: log(frequency) by time        115 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark test(2.2, x) setup = (x = 10.0 + rand())
BenchmarkTools.Trial: 10000 samples with 954 evaluations.
 Range (min … max):  93.203 ns … 134.565 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     97.615 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   96.402 ns ±   2.687 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇▅ ▁   █▅ ▁              ▁▃                                 ▂
  ██▇▅▄▅▄▅▅▄▄▁▁▃▄▅█▆▅▄▁▁▁███▇▅▅▅▅▇▆▆▅▁▄▁▃▇█▇▆▅▅▄▃▅▁▅██▆▃▄▁▄▁▄▃ █
  93.2 ns       Histogram: log(frequency) by time       104 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```
And for much smaller arguments it's hard to beat the power series, so this approach is slower:
```julia
julia> @benchmark test(2.2, x) setup = (x = 4.0 + rand())
BenchmarkTools.Trial: 10000 samples with 918 evaluations.
 Range (min … max):  115.423 ns … 157.588 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     121.642 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   122.518 ns ±   4.119 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▆    █        ▂█    ▆      ▃                                 ▂
  █▇▃▄▄▆▃▃▁▁▃▁▁▁▇██▅▃▆█▆▄▄▄▃▃▄▁███▄▁▅█▆▄▃▄▁▃▃▄▅██▃▄▄▅▅▃▁▃▁▄▁▃▁█ █
  115 ns        Histogram: log(frequency) by time        133 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark Bessels.besselk_continued_fraction(2.2, x) setup = (x = 4.0 + rand())
BenchmarkTools.Trial: 10000 samples with 942 evaluations.
 Range (min … max):  100.849 ns … 158.882 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     108.678 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   108.736 ns ±   4.641 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄ ▅ █▂ ▅▂ ▂▆ ▇▄ ▃ ▂▅ ▅▃ ▁  ▃  ▃                              ▂
  █▄▁▄▇█▅▁▆▃▃▄██▆▆██▄▄███▅██▆▆██▆▄██▇▆██▅▆▅█▆▅██▇▅▆█▅▅▄▇▄▄▄██▁█ █
  101 ns        Histogram: log(frequency) by time        123 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```
So there are of course things to consider, but this approach has the advantage of being applicable to a larger parameter set (including complex numbers) and has the potential for large speed gains over a fairly big domain. My initial implementation could probably be improved to speed things up a bit, but it does serve as a good way to link directly to the asymptotic expansions.