Commit 704aec3

2. add base_encode, base_decode, gen_id, timeti to utils
1 parent 51d234c commit 704aec3

4 files changed: +385 -3 lines changed
CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ### 1.2.1 (2025-03-03)
 1. add `utils.get_size` to get size of objects recursively.
-2.
+2. add `base_encode, base_decode, gen_id, timeti` to `utils`


 ### 1.2.0 (2025-02-16)

README.md

Lines changed: 14 additions & 0 deletions
@@ -115,6 +115,20 @@ print(morebuiltins.__file__)

 1.31 `b2i` - Convert a byte sequence to an integer.

+1.32 `get_hash_int` - Generates a int hash(like docid) from the given input bytes.
+
+1.33 `iter_weights` - Generates an element sequence based on weights.
+
+1.34 `get_size` - Recursively get size of objects.
+
+1.35 `base_encode` - Encode a number to a base-N string.
+
+1.36 `base_decode` - Decode a base-N string to a number.
+
+1.37 `gen_id` - Generate a unique ID based on the current time and random bytes
+
+1.38 `timeti` - Return the number of iterations per second for a given statement.
+

 ## 2. morebuiltins.date

doc.md

Lines changed: 243 additions & 1 deletion
@@ -431,8 +431,26 @@
 ...     tuple_obj: tuple
 ...     set_obj: set
 ...     dict_obj: dict
->>> default_dict(Demo, bytes_obj=b'1')
+>>> item = default_dict(Demo, bytes_obj=b'1')
+>>> item
 {'int_obj': 0, 'float_obj': 0.0, 'bytes_obj': b'1', 'str_obj': '', 'list_obj': [], 'tuple_obj': (), 'set_obj': set(), 'dict_obj': {}}
+>>> type(item)
+<class 'morebuiltins.utils.Demo'>
+>>> from typing import TypedDict
+>>> class Demo(TypedDict):
+...     int_obj: int
+...     float_obj: float
+...     bytes_obj: bytes
+...     str_obj: str
+...     list_obj: list
+...     tuple_obj: tuple
+...     set_obj: set
+...     dict_obj: dict
+>>> item = default_dict(Demo, bytes_obj=b'1')
+>>> item
+{'int_obj': 0, 'float_obj': 0.0, 'bytes_obj': b'1', 'str_obj': '', 'list_obj': [], 'tuple_obj': (), 'set_obj': set(), 'dict_obj': {}}
+>>> type(item)
+<class 'dict'>

 ```

@@ -973,6 +991,230 @@
 ---


+
+1.32 `get_hash_int` - Generates a int hash(like docid) from the given input bytes.
+
+
+```python
+
+>>> get_hash_int(1)
+2035485573088411
+>>> get_hash_int("string")
+1418352543534881
+>>> get_hash_int(b'123456', 16)
+4524183350839358
+>>> get_hash_int(b'123', 10)
+5024125808
+>>> get_hash_int(b'123', 13, func=hashlib.sha256)
+1787542395619
+>>> get_hash_int(b'123', 13, func=hashlib.sha512)
+3045057537218
+>>> get_hash_int(b'123', 13, func=hashlib.sha1)
+5537183137519
+
+```
+
+
+---
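
Reviewer note: the `get_hash_int` examples suggest that the second positional argument bounds the number of decimal digits and that `func` selects the hash constructor. A minimal sketch of one way to get that behavior, assuming the digest is read as a big integer and reduced modulo `10**n`. The name `hash_to_int`, its parameter names, and the bytes conversion rule are hypothetical, and the sketch is not expected to reproduce the exact values listed in the diff.

```python
import hashlib


def hash_to_int(data, n=16, func=hashlib.md5):
    """Hypothetical helper: map input to a non-negative int below 10**n."""
    # Assumption: non-bytes input is converted through its str() form.
    if not isinstance(data, bytes):
        data = str(data).encode("utf-8")
    digest = func(data).digest()
    # Read the digest as a big-endian integer, then keep at most n decimal digits.
    return int.from_bytes(digest, "big") % (10 ** n)
```

Reducing modulo a power of ten keeps the result bounded regardless of which hash constructor is passed in.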
+
+
+
+1.33 `iter_weights` - Generates an element sequence based on weights.
+
+
+```python
+
+This function produces a sequence of elements where each element's frequency of occurrence
+is proportional to its weight from the provided dictionary. Elements with higher weights
+appear more frequently in the sequence. The total cycle length can be adjusted via the
+`loop_length` parameter. The `round_int` parameter allows customization of the rounding
+function to control the precision of weight calculations.
+
+Keys with weights greater than 0 will be yielded.
+
+Parameters:
+- weight_dict: A dictionary where keys are elements and values are their respective weights.
+- loop_length: An integer defining the total length of the repeating cycle, defaulting to 100.
+- round_int: A function used for rounding, defaulting to Python's built-in `round`.
+
+Yields:
+A generator that yields a sequence of elements distributed according to their weights.
+
+Examples:
+>>> list(iter_weights({"6": 6, "3": 3, "1": 0.4}, 10))
+['6', '3', '6', '3', '6', '3', '6', '6', '6']
+>>> list(iter_weights({"6": 6, "3": 3, "1": 0.9}, 10))
+['6', '3', '1', '6', '3', '6', '3', '6', '6', '6']
+>>> list(iter_weights({"6": 6, "3": 3, "1": 0.9}, 10, round_int=int))
+['6', '3', '6', '3', '6', '3', '6', '6', '6']
+>>> from itertools import cycle
+>>> c = cycle(iter_weights({"6": 6, "3": 3, "1": 1}, loop_length=10))
+>>> [next(c) for i in range(20)]
+['6', '3', '1', '6', '3', '6', '3', '6', '6', '6', '6', '3', '1', '6', '3', '6', '3', '6', '6', '6']
+
+```
+
+
+---
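
Reviewer note: the `iter_weights` description and examples are consistent with a two-step scheme: scale each positive weight to a share of `loop_length`, round it, then emit keys round-robin until every share is used up. A minimal sketch under that assumption follows. `iter_weights_sketch` is a hypothetical name, and the library's actual implementation may differ.

```python
def iter_weights_sketch(weight_dict, loop_length=100, round_int=round):
    """Hypothetical sketch: yield keys with frequency proportional to weight."""
    # Only keys with a positive weight take part.
    positive = {key: w for key, w in weight_dict.items() if w > 0}
    total = sum(positive.values())
    if total <= 0:
        return
    # Scale each weight to its share of the cycle and round to a whole count.
    counts = {key: round_int(w * loop_length / total) for key, w in positive.items()}
    # Round-robin over the keys, skipping those whose share is used up.
    while True:
        emitted = False
        for key in counts:
            if counts[key] > 0:
                counts[key] -= 1
                emitted = True
                yield key
        if not emitted:
            break
```

Because each share is rounded independently, the cycle can come out shorter than `loop_length`, as in the first documented example, where the key with weight 0.4 rounds to zero occurrences.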
+
+
+
+1.34 `get_size` - Recursively get size of objects.
+
+
+```python
+
+Args:
+obj: object of any type
+seen (set): set of ids of objects already seen
+iterate_unsafe (bool, optional): whether to iterate through generators/iterators. Defaults to False.
+
+Returns:
+int: size of object in bytes
+
+Examples:
+>>> get_size("") > 0
+True
+>>> get_size([]) > 0
+True
+>>> def gen():
+...     for i in range(10):
+...         yield i
+>>> g = gen()
+>>> get_size(g) > 0
+True
+>>> next(g)
+0
+>>> get_size(g, iterate_unsafe=True) > 0
+True
+>>> try:
+...     next(g)
+... except StopIteration:
+...     "StopIteration"
+'StopIteration'
+
+```
+
+
+---
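
Reviewer note: the `seen` argument in the `get_size` docstring points to the usual pattern of summing `sys.getsizeof` over a container's members while tracking object ids, so shared or cyclic references are counted once. A minimal sketch of that pattern, assuming only built-in containers need to be traversed. `deep_size_sketch` is a hypothetical name, and unlike the documented `iterate_unsafe=True` mode it never consumes generators or iterators.

```python
import sys


def deep_size_sketch(obj, seen=None):
    """Hypothetical helper: approximate the total size of obj in bytes."""
    if seen is None:
        seen = set()
    # Count each object once, which also stops infinite recursion on cycles.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_size_sketch(key, seen) + deep_size_sketch(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_size_sketch(item, seen)
    return size
```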
+
+
+
+1.35 `base_encode` - Encode a number to a base-N string.
+
+
+```python
+
+Args:
+num (int): The number to encode.
+alphabet (str, optional): Defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
+
+Returns:
+str: The encoded string.
+
+Examples:
+>>> base_encode(0)
+'0'
+>>> base_encode(1)
+'1'
+>>> base_encode(10000000000000)
+'2Q3rKTOE'
+>>> base_encode(10000000000000, "0123456789")
+'10000000000000'
+
+```
+
+
+---
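
Reviewer note: the documented `base_encode` outputs match plain positional notation in which the base equals the alphabet length and the most significant digit comes first. A minimal sketch of that technique follows. `encode_sketch` is a hypothetical name, and rejecting negative numbers is an assumption of the sketch rather than documented behavior.

```python
import string

# Same 62-character default alphabet as listed in the docstring.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_sketch(num, alphabet=ALPHABET):
    """Hypothetical helper: encode a non-negative int using the given alphabet."""
    if num < 0:
        raise ValueError("this sketch handles only non-negative integers")
    base = len(alphabet)
    if num == 0:
        return alphabet[0]
    digits = []
    while num:
        # Peel off the least significant digit on each pass.
        num, remainder = divmod(num, base)
        digits.append(alphabet[remainder])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))
```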
+
+
+
+1.36 `base_decode` - Decode a base-N string to a number.
+
+
+```python
+
+Args:
+string (str): The string to decode.
+alphabet (str, optional): Defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
+
+Returns:
+int: The decoded number.
+
+Examples:
+>>> base_decode("0")
+0
+>>> base_decode("1")
+1
+>>> base_decode("2Q3rKTOE")
+10000000000000
+>>> base_decode("10000000000000", "0123456789")
+10000000000000
+
+```
+
+
+---
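
Reviewer note: decoding is the inverse walk: read characters from the most significant end, multiplying the running total by the base and adding each character's index in the alphabet. A minimal sketch, with `decode_sketch` as a hypothetical name. Raising `KeyError` for characters outside the alphabet is a property of this sketch, not documented behavior.

```python
import string

# Same 62-character default alphabet as listed in the docstring.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def decode_sketch(text, alphabet=ALPHABET):
    """Hypothetical helper: decode a base-N string into a non-negative int."""
    base = len(alphabet)
    # Map each character to its digit value once, instead of searching per character.
    index = {char: value for value, char in enumerate(alphabet)}
    num = 0
    for char in text:
        num = num * base + index[char]  # KeyError if char is not in the alphabet
    return num
```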
+
+
+
+1.37 `gen_id` - Generate a unique ID based on the current time and random bytes
+
+
+```python
+
+Args:
+rand_len (int, optional): Defaults to 4.
+
+Returns:
+str: The generated ID.
+
+Examples:
+>>> a, b = gen_id(), gen_id()
+>>> a != b
+True
+>>> import time
+>>> ids = [time.sleep(0.000001) or gen_id() for _ in range(1000)]
+>>> len(set(ids))
+1000
+>>> ids = [gen_id(rand_len=1) for _ in range(10000)]
+>>> len(set(ids)) < 10000
+True
+
+```
+
+
+---
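
Reviewer note: the diff does not show the ID layout, so the following is only a sketch of the general idea of pairing a timestamp with a random suffix. The name `gen_id_sketch`, the hex nanosecond timestamp, the dash separator, and reading `rand_len` as a byte count are all assumptions.

```python
import secrets
import time


def gen_id_sketch(rand_len=4):
    """Hypothetical helper: timestamp prefix plus a random suffix."""
    # Assumed layout: hex nanosecond timestamp, a dash, then random hex.
    timestamp = format(time.time_ns(), "x")
    # rand_len random bytes render as 2 * rand_len hex characters.
    suffix = secrets.token_hex(rand_len)
    return f"{timestamp}-{suffix}"
```

A shorter suffix leaves fewer random values to separate IDs created at the same timestamp, which is the effect the `rand_len=1` doctest relies on.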
+
+
+
+1.38 `timeti` - Return the number of iterations per second for a given statement.
+
+
+```python
+
+Args:
+stmt (str, optional): Defaults to "pass".
+setup (str, optional): Defaults to "pass".
+timer (optional): Defaults to timeit.default_timer.
+number (int, optional): Defaults to 1000000.
+globals (dict, optional): Defaults to None.
+
+Returns:
+int: The number of iterations per second.
+
+Examples:
+>>> timeti("1 / 1") > 1000000
+True
+>>> timeti(lambda : 1 + 1, number=100000) > 100000
+True
+
+```
+
+
+---
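
Reviewer note: the `timeti` parameters mirror those of `timeit.timeit`, which returns the elapsed seconds for `number` executions, so an iterations-per-second figure can be derived by dividing `number` by that time. A minimal sketch under that assumption; `iterations_per_second` is a hypothetical name, and the zero-elapsed guard is an addition of the sketch.

```python
import timeit


def iterations_per_second(stmt="pass", setup="pass", timer=timeit.default_timer,
                          number=1000000, globals=None):
    """Hypothetical helper: run stmt `number` times and report the rate."""
    # timeit.timeit accepts either a source string or a zero-argument callable.
    elapsed = timeit.timeit(stmt, setup=setup, timer=timer,
                            number=number, globals=globals)
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return int(number / elapsed)
```

The resulting rate depends on the machine and its current load, so comparisons are only meaningful between runs on the same system.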
+
+
 ## 2. morebuiltins.date

