Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tests warnings #1628

Merged
merged 4 commits into from
Nov 14, 2024
Merged

Fix tests warnings #1628

merged 4 commits into from
Nov 14, 2024

Conversation

e10e3
Copy link
Contributor

@e10e3 e10e3 commented Nov 13, 2024

River's test suite raises an important quantity of warnings, and their number may hide defects in new code. Most warnings are created by a few pieces of code run many times. This PR aims to greatly reduce the number of warnings in the tests.

Before: 209139 warnings
After : 17 warnings

Most warnings left are caused by libraries like NumPy and Scikit-learn.

Four of the remaining warnings are a deprecation : pickle, copy, and deepcopy support will be removed from itertools in Python 3.14. This may warrant an action.

This PR also fixes a semi-bug:
In naives_bayes.BernoulliNB.joint_log_likelihood_many(), the if hasattr(X, "sparse") path was never used (at least in the tests) because sparse dataframes were completed with non-sparse values, becoming non-sparse.

Thankfully, both types of dataframes have the same behaviour so nothing bad happened and only a warning was issued by Pandas.
In removing the warning, I restored the intended behaviour (or the one I believe was intended).

The function scipy.special.btdtri was deprecated in SciPy 1.12, and will
be removed in SciPy 1.14.

scipy.special.betaincinv should be a drop-in replacement.
datetime.datetime.utcnow() is deprecated because it is too easy of a
footgun.

Since we don't need timezone-aware objects, now() has the same behaviour
(to the difference of the local time difference to UTC).

The exact time used (or timezone-equivalent) has no impact on this test,
the only requirements is for both calls to have the same timestamp.
Concatenation of empty dataframes causes a FutureWarning in pandas.
None values in concatenation are ignored, as long as not all objects to
concatenate are None.
In a sparse dataframe, all elements must have a sparse Dtype.
Values added after the fact need to be converted.
@MaxHalford
Copy link
Member

Nice PR! Good job

@MaxHalford
Copy link
Member

Maybe it's worth reactivating the pytest warnings?

@e10e3
Copy link
Contributor Author

e10e3 commented Nov 13, 2024

Maybe it's worth reactivating the pytest warnings?

Sorry, I don't understand what you mean by this. The warnings are active and displayed at the end of every Pytest run, this is how I realised there were so many.
Is your question related to the CI?

@MaxHalford
Copy link
Member

No my bad. They're deactivated for me by default, which is why I didn't notice them 👍

@MaxHalford MaxHalford merged commit ada5ada into online-ml:main Nov 14, 2024
4 checks passed
@e10e3 e10e3 deleted the fix-tests-warnings branch November 14, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants