Releases: rapidsai/cuml
v21.12.00
🚨 Breaking Changes
- Fix indexing of PCA to use safer types (#4255) @lowener
- RF: Add Gamma and Inverse Gaussian loss criteria (#4216) @venkywonka
- update RF docs (#4138) @venkywonka
🐛 Bug Fixes
- Update conda recipe to have explicit libcusolver (#4392) @dantegd
- Restore FIL convention of inlining code (#4366) @levsnv
- Fix SVR intercept AttributeError (#4358) @lowener
- Fix `is_stable_build` logic for CI scripts (#4350) @ajschmidt8
- Temporarily disable rmm devicebuffer in array.py (#4333) @dantegd
- Fix categorical test in python (#4326) @levsnv
- Revert "Merge pull request #4319 from AyodeAwe/branch-21.12" (#4325) @ajschmidt8
- Preserve indexing in methods when applied to DataFrame and Series objects (#4317) @dantegd
- Fix potential CUDA context poison when negative (invalid) categories provided to FIL model (#4314) @levsnv
- Using sparse expanded distances where possible (#4310) @cjnolet
- Fix for `mean_squared_error` (#4287) @viclafargue
- Fix for Categorical Naive Bayes sparse handling (#4277) @lowener
- Throw an explicit exception if the input array is empty in DBSCAN.fit #4273 (#4275) @viktorkovesd
- Fix KernelExplainer returning TypeError for certain input (#4272) @Nanthini10
- Remove most warnings from pytest suite (#4196) @dantegd
📖 Documentation
- Add experimental GPUTreeSHAP to API doc (#4398) @hcho3
- Fix GLM typo on device/host pointer (#4320) @lowener
- update RF docs (#4138) @venkywonka
🚀 New Features
- Add GPUTreeSHAP to cuML explainer module (experimental) (#4351) @hcho3
- Enable training single GPU cuML models using Dask DataFrames and Series (#4300) @ChrisJar
- LinearSVM using QN solvers (#4268) @achirkin
- Add support for exogenous variables to ARIMA (#4221) @Nyrio
- Use opt-in shared memory carveout for FIL (#3759) @levsnv
- Symbolic Regression/Classification C/C++ (#3638) @vimarsh6739
🛠️ Improvements
- Fix Changelog Merge Conflicts for `branch-21.12` (#4393) @ajschmidt8
- Pin max `dask` and `distributed` to `2012.11.2` (#4390) @galipremsagar
- Fix forward merge #4349 (#4374) @dantegd
- Upgrade `clang` to `11.1.0` (#4372) @galipremsagar
- Update clang-format version in docs; allow unanchored version string (#4365) @zbjornson
- Add CUDA 11.5 developer environment (#4364) @dantegd
- Fix aliasing violation in t-SNE (#4363) @zbjornson
- Promote FITSNE from experimental (#4361) @lowener
- Fix unnecessary f32/f64 conversions in t-SNE KL calc (#4331) @zbjornson
- Update rapids-cmake version (#4330) @dantegd
- rapids-cmake version update to 21.12 (#4327) @dantegd
- Use compute-sanitizer instead of cuda-memcheck (#4324) @teju85
- Ability to pass fp64 type to cuml benchmarks (#4323) @teju85
- Split treelite fil import from `forest` object definition (#4306) @levsnv
- update xgboost version (#4301) @msadang
- Accounting for RAFT updates to matrix, stats, and random implementations in detail (#4294) @divyegala
- Update cudf matrix calls for to_numpy and to_cupy (#4293) @dantegd
- Update `conda` recipes for Enhanced Compatibility effort (#4288) @ajschmidt8
- Increase parallelism from 4 to 8 jobs in CI (#4286) @dantegd
- RAFT distance prims public API update (#4280) @cjnolet
- Update to UCX-Py 0.23 (#4274) @pentschev
- In FIL, clip blocks_per_sm to one wave instead of asserting (#4271) @levsnv
- Update of "Gracefully accept 'n_jobs', a common sklearn parameter, in NearestNeighbors Estimator" (#4267) @NV-jpt
- Improve numerical stability of the Kalman filter for ARIMA (#4259) @Nyrio
- Fix indexing of PCA to use safer types (#4255) @lowener
- Change calculation of ARIMA confidence intervals (#4248) @Nyrio
- Unpin `dask` & `distributed` in CI (#4235) @galipremsagar
- RF: Add Gamma and Inverse Gaussian loss criteria (#4216) @venkywonka
- Exposing KL divergence in TSNE (#4208) @viclafargue
- Unify template parameter dispatch for FIL inference and shared memory footprint estimation (#4013) @levsnv
v21.10.02
v21.10.01
v21.08.03
v21.10.00
🚨 Breaking Changes
- RF: python api behaviour refactor (#4207) @venkywonka
- Implement vector leaf for random forest (#4191) @RAMitchell
- Random forest refactoring (#4166) @RAMitchell
- RF: Add Poisson deviance impurity criterion (#4156) @venkywonka
- avoid paramsSolver::{n_rows,n_cols} shadowing their base class counterparts (#4130) @yitao-li
- Apply modifications to account for RAFT changes (#4077) @viclafargue
🐛 Bug Fixes
- Update scikit-learn version in conda dev envs to 0.24 (#4241) @dantegd
- Using pinned host memory for Random Forest and DBSCAN (#4215) @divyegala
- Make sure we keep the rapids-cmake and cuml cal version in sync (#4213) @robertmaynard
- Add thrust_create_target to install export in CMakeLists (#4209) @dantegd
- Change the error type to match sklearn. (#4198) @achirkin
- Fixing remaining hdbscan bug (#4179) @cjnolet
- Fix for cuDF changes to cudf.core (#4168) @dantegd
- Fixing UMAP reproducibility pytest failures in 11.4 by using random init for now (#4152) @cjnolet
- avoid paramsSolver::{n_rows,n_cols} shadowing their base class counterparts (#4130) @yitao-li
- Use the new RAPIDS.cmake to fetch rapids-cmake (#4102) @robertmaynard
📖 Documentation
- Expose train_test_split in API doc (#4234) @hcho3
- Adding docs for `.get_feature_names()` inside `TfidfVectorizer` (#4226) @mayankanand007
- Removing experimental flag from hdbscan description in docs (#4211) @cjnolet
- updated build instructions (#4200) @shaneding
- Forward-merge branch-21.08 to branch-21.10 (#4171) @jakirkham
🚀 New Features
- Experimental option to build libcuml++ only with FIL (#4225) @dantegd
- FIL to import categorical models from treelite (#4173) @levsnv
- Add hamming, jensen-shannon, kl-divergence, correlation and russellrao distance metrics (#4155) @mdoijade
- Add Categorical Naive Bayes (#4150) @lowener
- FIL to infer categorical forests and generate them in C++ tests (#4092) @levsnv
- Add Gaussian Naive Bayes (#4079) @lowener
- ARIMA - Add support for missing observations and padding (#4058) @Nyrio
🛠️ Improvements
- Pin max `dask` and `distributed` versions to 2021.09.1 (#4229) @galipremsagar
- Fea/umap refine (#4228) @AjayThorve
- Upgrade Treelite to 2.1.0 (#4220) @hcho3
- Add option to clone RAFT even if it is in the environment (#4217) @dantegd
- RF: python api behaviour refactor (#4207) @venkywonka
- Pytest updates for Scikit-learn 0.24 (#4205) @dantegd
- Faster glm ols-via-eigendecomposition algorithm (#4201) @achirkin
- Implement vector leaf for random forest (#4191) @RAMitchell
- Refactor kmeans sampling code (#4190) @Nanthini10
- Gracefully accept 'n_jobs', a common sklearn parameter, in NearestNeighbors Estimator (#4178) @NV-jpt
- Update with rapids cmake new features (#4175) @robertmaynard
- Update to UCX-Py 0.22 (#4174) @pentschev
- Random forest refactoring (#4166) @RAMitchell
- Fix log level for dask tree_reduce (#4163) @lowener
- Add CUDA 11.4 development environment (#4160) @dantegd
- RF: Add Poisson deviance impurity criterion (#4156) @venkywonka
- Split FIL infer_k into phases to speed up compilation (when a patch is applied) (#4148) @levsnv
- RF node queue rewrite (#4125) @RAMitchell
- Remove max version pin for `dask` & `distributed` on development branch (#4118) @galipremsagar
- Correct name of a cmake function in get_spdlog.cmake (#4106) @robertmaynard
- Apply modifications to account for RAFT changes (#4077) @viclafargue
- Warnings are errors (#4075) @harrism
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#4065) @dillon-cullinan
- Changes to NearestNeighbors to call 2d random ball cover (#4003) @cjnolet
- support space in workspace (#3752) @jolorunyomi
v21.08.02
v21.08.01
v21.08.00
🚨 Breaking Changes
- Remove deprecated target_weights in UMAP (#4081) @lowener
- Upgrade Treelite to 2.0.0 (#4072) @hcho3
- RF/DT cleanup (#4005) @venkywonka
- RF: memset and batch size optimization for computing splits (#4001) @venkywonka
- Remove old RF backend (#3868) @RAMitchell
- Enable warp-per-tree inference in FIL for regression and binary classification (#3760) @levsnv
🐛 Bug Fixes
- Disabling umap reproducibility tests for cuda 11.4 (#4128) @cjnolet
- Fix for crash in RF when `max_leaves` parameter is specified (#4126) @vinaydes
- Running umap mnmg test twice (#4112) @cjnolet
- Minimal fix for `SparseRandomProjection` (#4100) @viclafargue
- Creating copy of `components` in PCA transform and inverse transform (#4099) @divyegala
- Fix SVM model parameter handling in case n_support=0 (#4097) @tfeher
- Fix set_params for linear models (#4096) @lowener
- Fix train test split pytest comparison (#4062) @dantegd
- Fix fit_transform on KMeans (#4055) @lowener
- Fixing -1 key access in 1nn reduce op in HDBSCAN (#4052) @divyegala
- Disable installing gbench to avoid container permission issues (#4049) @dantegd
- Fix double fit crash in preprocessing models (#4040) @viclafargue
- Always add `faiss` library alias if it's missing (#4028) @trxcllnt
- Fixing intermittent HDBSCAN pytest failure in CI (#4025) @divyegala
- HDBSCAN bug on A100 (#4024) @divyegala
- Add treelite include paths to treelite targets (#4023) @trxcllnt
- Add Treelite_BINARY_DIR include to `cuml++` build interface include paths (#4018) @trxcllnt
- Small ARIMA-related bug fixes in Hessenberg reduction and make_arima (#4017) @Nyrio
- Update setup.py (#4015) @ajschmidt8
- Update `treelite` version in `get_treelite.cmake` (#4014) @ajschmidt8
- Fix build with latest RAFT branch-21.08 (#4012) @trxcllnt
- Skipping hdbscan pytests when gpu is a100 (#4007) @cjnolet
- Using 64-bit array lengths to increase scale of pca & tsvd (#3983) @cjnolet
- Fix MNMG test in Dask RF (#3964) @hcho3
- Use nested include in destination of install headers to avoid docker permission issues (#3962) @dantegd
- Fix automerge #3939 (#3952) @dantegd
- Update UCX-Py version to 0.21 (#3950) @pentschev
- Fix kernel and line info in cmake (#3941) @dantegd
- Fix for multi GPU PCA compute failing bug after transform and added error handling when n_components is not passed (#3912) @akaanirban
- Tolerate QN linesearch failures when it's harmless (#3791) @achirkin
📖 Documentation
- Improve docstrings for silhouette score metrics. (#4026) @bdice
- Update CHANGELOG.md link (#3956) @Salonijain27
- Update documentation build examples to be generator agnostic (#3909) @robertmaynard
- Improve FIL code readability and documentation (#3056) @levsnv
🚀 New Features
- Add Multinomial and Bernoulli Naive Bayes variants (#4053) @lowener
- Add weighted K-Means sampling for SHAP (#4051) @Nanthini10
- Use chebyshev, canberra, hellinger and minkowski distance metrics (#3990) @mdoijade
- Implement vector leaf prediction for fil. (#3917) @RAMitchell
- change TargetEncoder's smooth argument from ratio to count (#3876) @daxiongshu
- Enable warp-per-tree inference in FIL for regression and binary classification (#3760) @levsnv
🛠️ Improvements
- Remove clang/clang-tools from conda recipe (#4109) @dantegd
- Pin dask version (#4108) @galipremsagar
- ANN warnings/tests updates (#4101) @viclafargue
- Removing local memory operations from computeSplitKernel and other optimizations (#4083) @vinaydes
- Fix libfaiss dependency to not expressly depend on conda-forge (#4082) @Ethyling
- Remove deprecated target_weights in UMAP (#4081) @lowener
- Upgrade Treelite to 2.0.0 (#4072) @hcho3
- Optimize dtype conversion for FIL (#4070) @dantegd
- Adding quick notes to HDBSCAN public API docs as to why discrepancies may occur between cpu and gpu impls. (#4061) @cjnolet
- Update `conda` environment name for CI (#4039) @ajschmidt8
- Rewrite random forest gtests (#4038) @RAMitchell
- Updating Clang Version to 11.0.0 (#4029) @codereport
- Raise ARIMA parameter limits from 4 to 8 (#4022) @Nyrio
- Testing extract clusters in HDBSCAN (#4009) @divyegala
- ARIMA - Kalman loop rewrite: single megakernel instead of host loop (#4006) @Nyrio
- RF/DT cleanup (#4005) @venkywonka
- Exposing condensed hierarchy through cython for easier unit-level testing (#4004) @cjnolet
- Use the 21.08 branch of rapids-cmake as rmm requires it (#4002) @robertmaynard
- RF: memset and batch size optimization for computing splits (#4001) @venkywonka
- Reducing cluster size to number of selected clusters. Returning stability scores (#3987) @cjnolet
- HDBSCAN: Lazy-loading (and caching) condensed & single-linkage tree objects (#3986) @cjnolet
- Fix `21.08` forward-merge conflicts (#3982) @ajschmidt8
- Update Dask/Distributed version (#3978) @pentschev
- Use clang-tools on x86 only (#3969) @jakirkham
- Promote `trustworthiness_score` to public header, add missing includes, update dependencies (#3968) @trxcllnt
- Moving FAISS ANN wrapper to raft (#3963) @cjnolet
- Add MG weighted k-means (#3959) @lowener
- Remove unused code in UMAP. (#3931) @trivialfis
- Fix automerge #3900 and correct package versions in meta packages (#3918) @dantegd
- Adaptive stress tests when GPU memory capacity is insufficient (#3916) @lowener
- Fix merge conflicts (#3892) @ajschmidt8
- Remove old RF backend (#3868) @RAMitchell
- Refactor to extract random forest objectives (#3854) @RAMitchell