Use units based on powers of 2 #1583

Merged
merged 2 commits into from
Jul 25, 2024
2 changes: 2 additions & 0 deletions docs/releases/unreleased.md
@@ -1 +1,3 @@
# Unreleased

+- The units used in River have been corrected to be based on powers of 2 (KiB, MiB). This only changes the display; the behaviour is unchanged.
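To make the release note concrete, the following minimal sketch compares decimal (SI) and binary (IEC) units. The constant and function names are illustrative and are not part of River.

```python
# Sizes of decimal (SI) and binary (IEC) units, in bytes.
KB = 10**3    # kilobyte
KIB = 2**10   # kibibyte
MB = 10**6    # megabyte
MIB = 2**20   # mebibyte


def relative_gap(binary_unit: int, decimal_unit: int) -> float:
    """Return how much larger the binary unit is, as a fraction of the decimal one."""
    return (binary_unit - decimal_unit) / decimal_unit


# A value computed by dividing by 2**20 is a MiB count, not an MB count.
n_bytes = 50 * MIB
size_in_mib = n_bytes / MIB   # 50.0
size_in_mb = n_bytes / MB     # 52.4288

print(f"KiB vs kB: {relative_gap(KIB, KB):.2%}")
print(f"MiB vs MB: {relative_gap(MIB, MB):.2%}")
```

The gap grows with each prefix (2.4% at the kilo level, roughly 4.9% at the mega level), which is why a mislabeled unit becomes more misleading for larger sizes.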
4 changes: 2 additions & 2 deletions river/forest/adaptive_random_forest.py
@@ -534,7 +534,7 @@ class ARFClassifier(BaseForest, base.Classifier):
in the majority class is smaller than this parameter value. This parameter avoids
performing splits when most of the data belongs to a single class.
max_size
-[*Tree parameter*] Maximum memory (MB) consumed by the tree.
+[*Tree parameter*] Maximum memory (MiB) consumed by the tree.
memory_estimate_period
[*Tree parameter*] Number of instances between memory consumption checks.
stop_mem_management
@@ -808,7 +808,7 @@ class ARFRegressor(BaseForest, base.Regressor):
binary_split
[*Tree parameter*] If True, only allow binary splits.
max_size
-[*Tree parameter*] Maximum memory (MB) consumed by the tree.
+[*Tree parameter*] Maximum memory (MiB) consumed by the tree.
memory_estimate_period
[*Tree parameter*] Number of instances between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/forest/online_extra_trees.py
@@ -583,7 +583,7 @@ class OXTRegressor(ExtraTrees, base.Regressor):
binary_split
[*Tree parameter*] If True, only allow binary splits.
max_size
-[*Tree parameter*] Maximum memory (MB) consumed by the tree.
+[*Tree parameter*] Maximum memory (MiB) consumed by the tree.
memory_estimate_period
[*Tree parameter*] Number of instances between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/stream/twitch_chat_stream.py
@@ -46,7 +46,7 @@ class TwitchChatStream:
channels
A list of channel names like `["asmongold", "shroud"]` you want to collect messages from.
buffer_size
-Size of buffer in bytes used for receiving responses from Twitch with IRC (default 2 kB).
+Size of buffer in bytes used for receiving responses from Twitch with IRC (default 2 KiB).
timeout
A timeout value in seconds for waiting response from Twitch (default 60s). It can be useful if all requested channels are offline or chat is not active enough.

2 changes: 1 addition & 1 deletion river/tree/extremely_fast_decision_tree.py
@@ -80,7 +80,7 @@ class ExtremelyFastDecisionTreeClassifier(HoeffdingTreeClassifier):
smaller than this parameter value. This parameter avoids performing splits when most
of the data belongs to a single class.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/tree/hoeffding_adaptive_tree_classifier.py
@@ -77,7 +77,7 @@ class HoeffdingAdaptiveTreeClassifier(HoeffdingTreeClassifier):
smaller than this parameter value. This parameter avoids performing splits when most
of the data belongs to a single class.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/tree/hoeffding_adaptive_tree_regressor.py
@@ -84,7 +84,7 @@ class HoeffdingAdaptiveTreeRegressor(HoeffdingTreeRegressor):
binary_split
If True, only allow binary splits.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
4 changes: 2 additions & 2 deletions river/tree/hoeffding_tree.py
@@ -34,7 +34,7 @@ class HoeffdingTree(ABC):
binary_split
If True, only allow binary splits.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
@@ -111,7 +111,7 @@ def _hoeffding_bound(range_val, confidence, n):

 @property
 def max_size(self):
-    """Max allowed size tree can reach (in MB)."""
+    """Max allowed size tree can reach (in MiB)."""
     return self._max_size

 @max_size.setter
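As a rough illustration of what a limit expressed in MiB implies, the sketch below converts `max_size` to a byte budget by multiplying by 2**20. `SizeLimitedModel`, `_max_byte_size`, and `exceeds_limit` are hypothetical names. The setter body is elided in this diff, so this is an assumption about how such a limit could be applied, not River's implementation.

```python
class SizeLimitedModel:
    """Hypothetical stand-in for a model with a memory cap given in MiB."""

    def __init__(self, max_size: float = 100.0):
        # Goes through the property setter below.
        self.max_size = max_size

    @property
    def max_size(self) -> float:
        """Max allowed size the model can reach (in MiB)."""
        return self._max_size

    @max_size.setter
    def max_size(self, size: float) -> None:
        self._max_size = size
        # Assumption: keep a byte budget alongside, using 1 MiB = 2**20 bytes.
        self._max_byte_size = size * 2**20

    def exceeds_limit(self, current_bytes: int) -> bool:
        """Return True when a measured size in bytes is above the budget."""
        return current_bytes > self._max_byte_size


model = SizeLimitedModel(max_size=1.5)
print(model.exceeds_limit(1_500_000))  # below 1.5 MiB (1,572,864 bytes)
print(model.exceeds_limit(1_600_000))  # above 1.5 MiB
```

Note that 1,500,000 bytes would already be at the limit if the cap were read as 1.5 MB, which is the ambiguity the corrected docstring removes.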
2 changes: 1 addition & 1 deletion river/tree/hoeffding_tree_classifier.py
@@ -58,7 +58,7 @@ class HoeffdingTreeClassifier(HoeffdingTree, base.Classifier):
smaller than this parameter value. This parameter avoids performing splits when most
of the data belongs to a single class.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/tree/hoeffding_tree_regressor.py
@@ -57,7 +57,7 @@ class HoeffdingTreeRegressor(HoeffdingTree, base.Regressor):
binary_split
If True, only allow binary splits.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
2 changes: 1 addition & 1 deletion river/tree/isoup_tree_regressor.py
@@ -62,7 +62,7 @@ class iSOUPTreeRegressor(tree.HoeffdingTreeRegressor, base.MultiTargetRegressor)
binary_split
If True, only allow binary splits.
max_size
-The max size of the tree, in Megabytes (MB).
+The max size of the tree, in mebibytes (MiB).
memory_estimate_period
Interval (number of processed instances) between memory consumption checks.
stop_mem_management
6 changes: 3 additions & 3 deletions river/tree/utils.py
@@ -251,7 +251,7 @@ def calculate_object_size(obj: typing.Any, unit: str = "byte") -> int:
Object to evaluate.
unit
The unit in which the accounted value is going to be returned.
-Values: 'byte', 'kB', 'MB' (Default: 'byte').
+Values: 'byte', 'KiB', 'MiB' (Default: 'byte').

Returns
-------
@@ -295,9 +295,9 @@ def calculate_object_size(obj: typing.Any, unit: str = "byte") -> int:
for i in obj:
to_visit.append(i)

-if unit == "kB":
+if unit == "KiB":
     final_size = byte_size / 1024
-elif unit == "MB":
+elif unit == "MiB":
     final_size = byte_size / (2**20)
 else:
     final_size = byte_size
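In the hunk above, any `unit` string other than `"KiB"` or `"MiB"` falls through to the `else` branch and yields bytes, so a caller still passing the old `"kB"` or `"MB"` strings would appear to get an unscaled value. The following sketch shows one alternative that rejects unknown units instead. `convert_size` and `_DIVISORS` are hypothetical names, and raising an error is a design choice of this sketch, not what the PR does.

```python
# Bytes per unit, using binary (IEC) prefixes.
_DIVISORS = {"byte": 1, "KiB": 2**10, "MiB": 2**20}


def convert_size(byte_size: int, unit: str = "byte") -> float:
    """Convert a size in bytes to the requested unit.

    Unlike a silent fall-through, unknown units raise ValueError so that
    outdated strings such as 'kB' or 'MB' are noticed by the caller.
    """
    try:
        divisor = _DIVISORS[unit]
    except KeyError:
        valid = ", ".join(repr(u) for u in _DIVISORS)
        raise ValueError(f"Unknown unit {unit!r}; expected one of {valid}") from None
    return byte_size / divisor


print(convert_size(2048, "KiB"))         # 2.0
print(convert_size(3 * 2**20, "MiB"))    # 3.0
```

A lookup table keeps the set of accepted units in one place, which also makes it straightforward to extend to larger prefixes if needed.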
2 changes: 1 addition & 1 deletion river/utils/pretty.py
@@ -68,7 +68,7 @@ def humanize_bytes(n_bytes: int):
n_bytes

"""
-suffixes = ["B", "KB", "MB", "GB", "TB", "PB"]
+suffixes = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
human = float(n_bytes)
rank = 0
if n_bytes != 0:
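The rest of `humanize_bytes` is truncated in this diff, so the following is only a minimal sketch of binary-prefix formatting that reuses the new suffix list. `format_binary_size` is a hypothetical name, and details such as rounding and how the rank is computed may differ from River's function.

```python
def format_binary_size(n_bytes: int) -> str:
    """Format a byte count using IEC binary prefixes (powers of 1024)."""
    suffixes = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    human = float(n_bytes)
    rank = 0
    # Divide by 1024 until the value fits the current suffix,
    # stopping at the largest suffix available.
    while human >= 1024 and rank < len(suffixes) - 1:
        human /= 1024
        rank += 1
    return f"{human:.2f} {suffixes[rank]}"


print(format_binary_size(0))        # 0.00 B
print(format_binary_size(1536))     # 1.50 KiB
print(format_binary_size(2**20))    # 1.00 MiB
```

Because each step divides by 1024, the printed suffix matches the arithmetic, which is the consistency the suffix rename is after.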