Randomized qmc #57
Conversation
This looks amazing--thank you so much for all the hard work!
I think it's nice having separate functions for randomization as well (in case users want to randomize themselves), but do you think you could include code that directly samples a randomized sequence? Generating randomized sequences is sometimes a lot faster--see this paper, for example, which gets speedups of several orders of magnitude. You don't have to add code for generating these in-place, though; we can get that done in a future PR. For now, we can just include some kind of randomized constructor like:
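For instance, something along these lines (a hypothetical sketch; the sampler name and the `R` keyword are illustrative placeholders, not a final API):

```julia
# Hypothetical: a sampler that carries its randomization method, so a
# single `sample` call returns an already-randomized point set.
x = sample(n, d, FaureSample(R = Shift()))
```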
And have this generate the deterministic point set, before randomizing with the methods you've added. We can work on higher-performance implementations later.
Mostly this is because of what users will need when using a point set. Users typically iterate over an array one point at a time, rather than one dimension at a time, so we want it to be in an efficient format for that. Memory layout is probably not going to be the bottleneck for any realistic application, though.
Hmm, possibly, but I'm not 100% sure. There are easy gains from using either …
It's column-major for performance, assuming you're using slices/views of points.
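For instance (a sketch of the access pattern; `rand` stands in for a real QMC point set):

```julia
d, n = 3, 128
x = rand(d, n)  # stand-in for a d×n point set from `sample`

# Column-major storage makes each point a contiguous column, so
# iterating one point at a time is cache-friendly:
for i in 1:size(x, 2)
    point = @view x[:, i]  # contiguous view of the i-th point
    # ... evaluate the integrand on `point` ...
end
```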
Yes, this is very important, because remember that the vast majority of this kind of feature is used transitively, not directly: the vast majority of users are not going to be calling it themselves. We need to rebase this so it gets the update that makes it run the downstream tests, though currently, as noted in #58, the downstream tests are broken, as a prior change seems to have broken Surrogates.jl. It's a similar point to the one above: the interfaces need to be solid, because the package is mostly going to be used transitively through other packages, so we really need to be sure that there is a single entry point that can be guaranteed and controlled simply through passing type information.
Thanks for the feedback. I'll change the current code to include that.

Ok, I'll add a field `R` with default `NoRand()`, and then do as @ParadaCarleton says. However, I'll also export the randomization methods directly; it should not cause issues or interference with other packages.
I am not sure what you mean by rebase.
Thanks! Can you change the name from … to …? One nice way to let users randomize directly might be to use function objects.
Could you illustrate a bit more what you mean? I have a hard time understanding what difference it would make.
I feel like some changes mentioned in #44 should be implemented directly in this PR, where the randomizer is different for each method (maybe related to your previous comment, @ParadaCarleton?). Methods which randomize in place, i.e. as they generate the sample, will be done separately later; they will have specialized …. Doing all that should remove plenty of repeated code.
It wouldn't make a huge difference; it's just one way you could set up an interface that I think would look nice, since it wouldn't require creating a randomization function separate from each object. Defining a randomizer like this would let you use this syntax:
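A minimal sketch of the idea, assuming the `randomize` function from this PR (the abstract type name is a placeholder):

```julia
abstract type RandomizationMethod end
struct Shift <: RandomizationMethod end

# Making randomizers callable turns them into function objects,
# so they can be applied directly to a point set:
(r::RandomizationMethod)(points) = randomize(points, r)

x_shift = Shift()(x_faure)  # equivalent to randomize(x_faure, Shift())
```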
I just find this aesthetically pleasing, but there's not a big need for it if you think otherwise.
I think it makes sense to avoid some of these if they're not directly related, e.g. renaming ….
I gave @ParadaCarleton maintainer rights for this repo, so he'll be able to merge when he thinks it's ready. I agree with his interface ideas here, and he's deeper into this field than I am, so I'll let him take this PR from here. Though it's a big change, so I wouldn't aim for perfectionism: let's try to get something agreeable in, and then start fixing it in small bits.
This kind of stuff should be follow-up PRs. When there's a giant PR like this, everyone is a bit reluctant to make other changes, since then this needs to be rebased onto the new master all of the time. So let's try to find a good stopping point here, get it in, and then the smaller details can be nibbled at by everyone until we're happy. Then do the high-level interface overhaul to the new names and such, and that sounds like a good v1.0.
Git works by tracking all the changes (the history) of a file, rather than by tracking its current state. If you know all the changes that have been made to a file, you can reconstruct what it looks like now, but this representation makes it easier for multiple people to work on the same code. Currently, your changes are "based" on an old version of QMC that you pulled from the server about a week ago. While you were making these changes, we made a separate bug fix to correct a problem with our implementation of Kronecker sequences. "Rebase" means you take your current branch, then change the base (from the old version you started with, to the most recently updated version). That way, your changes are applied on top of the newest version, instead of being applied on top of the old one. You can find a tutorial on how to do this in GitKraken (a git GUI) here, or you can use some other git GUI like GitHub Desktop (which should have its own tutorial).
@ChrisRackauckas @vikram-s-narayan, could you explain the use case for `generate_design_matrices`?
I found two mentions of `generate_design_matrices`. In both cases, each matrix should be i.i.d., I guess. I believe this can only be achieved cleanly with multiple independent randomizations of the original QMC sequence. It seems to me like the current result of `generate_design_matrices` is "kinda wrong". If the default randomization was set to something other than `NoRand`, the matrices would come out independent. BTW, maybe we should also consider an in-place version (or iterator) for it?
Right, I see. I think all of those are good ideas; @ChrisRackauckas, it looks like some of these packages need much more rigorous testing to make sure the programs are returning correct answers. It’s a huge problem that none of this was caught when reviewing the relevant PRs.
They are passing their convergence tests. Downstream, GlobalSensitivity.jl has a bunch of regression tests against other GSA libraries (like sensitivity from R): https://github.com/SciML/GlobalSensitivity.jl/blob/master/test/sobol_method.jl, and the NeuralPDE library has a ton of convergence testing against analytical solutions: https://github.com/SciML/NeuralPDE.jl/blob/master/test/NNPDE_tests.jl#L55. This is why the downstream tests were added. Downstream usage and applications have much more rigorous tests of the results, though they stick to a subset of samplers that we know work.
Yes, this one. We know that this one was created pretty hastily to get the interface together, but the internals need more than a little bit of work. That said, all applications right now stick to the two samplers, without randomization or restarting, so it works fine, though it is of course not optimal. Basically, we needed the interfaces to exist so that downstream development could start, but other than the two wrapped algorithms called in full, the rest was left pretty WIP in QMC. The docs (https://docs.sciml.ai/QuasiMonteCarlo/dev/) pretty much scream "this is still work in progress", though, so it's at least understood that this generalization is still in progress. BTW, we should probably write up a synopsis for a QMC GSoC. Would you be interested in helping to mentor someone on this?
It should have an API for that, but making it simple to get actual matrices is still very necessary, since it can be difficult to use that lazy generator, for example, in GPU kernels, which is where this tends to be used downstream the most (NeuralPDE and GSA generally try to make use of GPUs).
Just to clarify, I said the current implementation is "kinda wrong" in the sense that it is not pure QMC; however, it might still be better than MC. Hence, all the tests mentioned by @ChrisRackauckas can still pass. About the QMC GSoC: I have no previous experience, but I would be happy to mentor. About the in-place iterator: I do not feel competent! Besides that, for this PR I am kind of done in terms of the features I wanted to add.
We should add a downstream test to GlobalSensitivity.jl though; that's a good point, because it is one of the big consumers of this package right now. However, I would be heavily surprised if GSA or physics-informed neural network tests were able to see a major difference due to this effect: it probably improves downstream results, but the more general tests won't be sensitive enough to detect that. So definitely one of the next things we need to do is get QMC.jl a much more robust test set that exercises some of the finer details of the properties. I 100% agree on that.
As we tend to always do, we can merge, open an issue, and get that added later. Open source is an ecosystem and a community; no one needs to solve the whole problem.
In my example ….

Indeed, if ….
README.md
Outdated
```julia
x_digital_shift = randomize(x_faure, DigitalShift(base = b, pad = pad))
x_shift = randomize(x_faure, Shift())
x_uniform = rand(d, N) # plain i.i.d. uniform
```
```julia
using Plots
# Settings I like for plotting
default(fontfamily="Computer Modern", linewidth=1, label=nothing, grid=true, framestyle=:default)
```
Plot the resulting sequences along dimensions `1` and `3`.

```julia
d1 = 1
d2 = 3
sequences = [x_uniform, x_faure, x_shift, x_digital_shift, x_lms, x_nus]
names = ["Uniform", "Faure (unrandomized)", "Shift", "Digital Shift", "Linear Matrix Scrambling", "Nested Uniform Scrambling"]
p = [plot(thickness_scaling=1.5, aspect_ratio=:equal) for i in sequences]
for (i, x) in enumerate(sequences)
    scatter!(p[i], x[d1, :], x[d2, :], ms=0.9, c=1, grid=false)
    title!(names[i])
```
These have an extra tab in them
README.md
Outdated
```julia
begin
    d1 = 1
    d2 = 3
    x = x_nus
    p = [plot(thickness_scaling=1.5, aspect_ratio=:equal) for i in 0:m]
    for i in 0:m
        j = m - i
        xᵢ = range(0, 1, step=1 / b^(i))
        xⱼ = range(0, 1, step=1 / b^(j))
        scatter!(p[i+1], x[d1, :], x[d2, :], ms=2, c=1, grid=false)
        xlims!(p[i+1], (0, 1.01))
        ylims!(p[i+1], (0, 1.01))
        yticks!(p[i+1], [0, 1])
        xticks!(p[i+1], [0, 1])
        hline!(p[i+1], xᵢ, c=:gray, alpha=0.2)
        vline!(p[i+1], xⱼ, c=:gray, alpha=0.2)
    end
    plot(p..., size=(1500, 900))
end
```
Suggested change: drop the `begin`/`end` wrapper:

```julia
d1 = 1
d2 = 3
x = x_nus
p = [plot(thickness_scaling=1.5, aspect_ratio=:equal) for i in 0:m]
for i in 0:m
    j = m - i
    xᵢ = range(0, 1, step=1 / b^(i))
    xⱼ = range(0, 1, step=1 / b^(j))
    scatter!(p[i+1], x[d1, :], x[d2, :], ms=2, c=1, grid=false)
    xlims!(p[i+1], (0, 1.01))
    ylims!(p[i+1], (0, 1.01))
    yticks!(p[i+1], [0, 1])
    xticks!(p[i+1], [0, 1])
    hline!(p[i+1], xᵢ, c=:gray, alpha=0.2)
    vline!(p[i+1], xⱼ, c=:gray, alpha=0.2)
end
plot(p..., size=(1500, 900))
```
This looks good to me, though I don't know about that test function. If you put a reference to it, I'll take a look. I'll let @ParadaCarleton go through this with a finer comb, though.
The test is effectively the same as my old test for ….
So then, is this good to merge? Do we call this 1.0?
Still haven’t had a chance to review this, but I will soon. I’d definitely hold off on 1.0 for now, though (there’s still quite a bit left before that).
@dmetivie Do you think this is ready for review? |
Also, could you move the information you've added to the README to a new page in the docs? The README should really just be a brief description of the package and what it does, rather than comprehensive documentation about all the different methods. |
Yes! I put all the text I wanted for now in the docs and simplified the README. Thanks for the review!
Looks good, just a few minor fixes missing!
Project.toml
Outdated
```toml
ConcreteStructs = "2569d6c7-a4a2-43d3-a901-331e8e4be471"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
IntervalArithmetic = "d1acc4aa-44c8-5952-acd4-ba5d80a2a253"
```
Is this addition necessary? It looks like it's only a test dependency.
If we keep `istms` in `scr`, as I suggested in another comment, we need to keep that one.
> If we keep `istms` in `scr`, as I suggested in another comment, we need to keep that one.

I think `istms` is neat, but we don’t want to add new dependencies for it, especially heavy ones like IntervalArithmetic. I’d prefer moving `istms` to the tests.
I am not too concerned with dependencies in general, so of course do as you see fit for this package. Maybe some day we could do a QuasiMonteCarloUI.jl package with this function and other discrepancy measurements: something practitioners would not care too much about, but theoreticians would like.
```diff
@@ -1,5 +1,5 @@
 """
-    KroneckerSample(generator::AbstractVector) <: SamplingAlgorithm
+    KroneckerSample(generator::AbstractVector) <: DeterministicSamplingAlgorithm
```
What's the reason for introducing the "Deterministic Sampling Algorithm" type? I'm not sure we need different types for the two kinds of algorithms, given you can turn a deterministic algorithm into a randomized one quite easily.
I agree; I should document that at some point to clarify.

Most classical QMC sequences are deterministic (with the `NoRand()` option). Of course, they can all be randomized, but at their core they are deterministic, whereas other random sequences like LHS and uniform are random at their core.

The distinction becomes important when using `generate_design_matrices`. If you want independent realizations of a random algorithm, it's easy: just sample several times (see the code); there is no need for `randomize`. For a deterministic algorithm, the independent matrices will be made using `randomize`.

If we were to implement fast scrambled Sobol' like in the paper you mentioned, then we would classify it as a RandomAlgo, as the randomization is done while creating the sequence.
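A minimal sketch of the split described above (type and function names are illustrative; `sample` and `randomize` stand in for the package API, and `S.R` is the randomization field discussed in this PR):

```julia
abstract type SamplingAlgorithm end
abstract type DeterministicSamplingAlgorithm <: SamplingAlgorithm end
abstract type RandomSamplingAlgorithm <: SamplingAlgorithm end

# Random at the core: independent design matrices are just repeated draws.
design_matrices(n, d, S::RandomSamplingAlgorithm, num_mats) =
    [sample(n, d, S) for _ in 1:num_mats]

# Deterministic at the core: generate the point set once, then apply
# independent randomizations on top of it.
function design_matrices(n, d, S::DeterministicSamplingAlgorithm, num_mats)
    x = sample(n, d, S)
    return [randomize(x, S.R) for _ in 1:num_mats]
end
```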
Isn’t the same algorithm applicable to both sets of matrices? In both cases we can take …. I know the current implementation doesn’t randomize while creating the sequence, but creating multiple sequences with a randomizer other than `NoRand` should generate independent matrices.
I first introduced that to keep the default with `NoRand` as it is now, i.e. generating different matrices. If we used the same method as for RandomAlgo, then with `NoRand` it would produce identical matrices, which would probably cause downstream bugs.

I do agree that for `NoRand`, it is not too crazy to consider generating multiple matrices useless, as it should always generate the same matrix. However, it seems that some people use `[out[(j * d + 1):((j + 1) * d), :] for j in 0:(num_mats - 1)]`.

The second reason is that, the way I coded it, DeterministicAlgo:
- produces the `NoRand` sequence,
- then randomizes it in place several times.

This is faster, especially for scrambling, whereas `[sample(n, d, sampler(R = RandMethod), T) for j in 1:num_mats]` takes longer.

Third (but again, this is personal taste), I find it nice to classify algorithms by their random or deterministic nature, especially for people not very familiar with QMC.
I think that makes sense. Looks good!
So this is my first PR :).
Thanks to @ParadaCarleton and @ChrisRackauckas for encouraging that.
The purpose of this PR is to add randomization methods by merging in RandomizedQuasiMonteCarlo.jl (which is now meant to be deprecated).
In particular, there are scrambling methods such as Nested Uniform Scrambling, Linear Matrix Scrambling, and Digital Shift, in arbitrary base `b`. These methods are used to randomize deterministic quasi-Monte Carlo sequences. In general, RQMC has better performance than QMC, plus it allows error estimation.
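As a sketch of the error-estimation point (using `randomize`, `Shift`, and the `x_faure` point set from the README example; the integrand `f` is just a toy):

```julia
using Statistics

f(u) = prod(u)  # toy integrand on [0, 1]^d
num_rand = 30

# Each independent randomization gives an unbiased estimate of the
# integral; the spread across randomizations gives a standard error,
# which a deterministic QMC point set cannot provide.
estimates = map(1:num_rand) do _
    xr = randomize(x_faure, Shift())
    mean(f(@view xr[:, i]) for i in 1:size(xr, 2))
end
estimate = mean(estimates)
std_err = std(estimates) / sqrt(num_rand)
```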
Compared to other packages like SciPy or QMCPy, or pieces of code you can find on the internet, the randomization here is done independently of the sequence. In other APIs, you choose at generation time whether you want a randomized version or not (see my post).
I have multiple questions and comments attached to this PR:
— As for now, my arrays `x` are of the form `n×dim`, while `sample` produces a `dim×n` array. I wondered why the latter convention was chosen. Is it because Distributions.jl does it that way? If so, can someone explain this choice? In the scrambling code, the opposite seems more consistent with column-major computing (randomization is done independently for each dimension).

— About the API, I tend to want to separate RQMC methods from deterministic QMC methods, e.g. `rand(ShiftMethod(), x)`. However, I understand one would want something like `sample(m, SamplingMethod(rand = ShiftMethod()))`. So far, I just provide the RQMC methods as functions of a point set (see the README):
- `nested_uniform_scramble(x, b; M = M)`
- `linear_matrix_scramble(x, base; M = M)`
- `digital_shift(x, base; M = M)`
- `shift(x)`

where `b` is the base used to scramble and `M` is the number of bits in base `b` used to represent digits.

— Julia question: would it be relevant (and more efficient) to create a type for "expansions in base `b ≥ 2`" and related operations (like `xor`)? In the code, I have a lot of `AbstractArray`s of `<:Integer`; in base 2, this should simplify to `<:Bool`. I am asking about some kind of type `Bool{b}` which allows only integers from 0 to b-1. When `b = 2`, it would be the good old `Bool`, most likely with a performance increase.
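A rough sketch of what such a type could look like (purely illustrative; not part of this PR):

```julia
# A digit in base b, restricted to the values 0:b-1.
struct Digit{b} <: Integer
    d::UInt8
    function Digit{b}(d::Integer) where {b}
        0 <= d < b || throw(ArgumentError("digit must be in 0:$(b - 1)"))
        return new{b}(UInt8(d))
    end
end

# Digit-wise addition modulo b: the base-b analogue of `xor`,
# to which it reduces when b == 2.
Base.:+(x::Digit{b}, y::Digit{b}) where {b} = Digit{b}((x.d + y.d) % b)

Digit{3}(2) + Digit{3}(2)  # Digit{3}(1)
```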