
Commit 2a69019

v0.1.2
Slight re-design, added multiaddr tests
1 parent fa46bf4 commit 2a69019

35 files changed: +3360 -1308 lines changed

README.md

Lines changed: 91 additions & 20 deletions
@@ -61,22 +61,65 @@ Multicodec(name='identity', tag='multihash', code=0,
6161
status='permanent', description='raw binary')
6262
```
6363

64-
The `exists` and `get` functions can be used to check whether a multicodec with given name or code is known, and if so to get the corresponding object:
64+
Core functionality is provided by the `get`, `exists`, `wrap` and `unwrap` functions.
65+
The `get` and `exists` functions can be used to check whether a multicodec with given name or code is known,
66+
and if so to get the corresponding object:
6567

6668
```py
67-
>>> from multiformats import multicodec
6869
>>> multicodec.exists("identity")
6970
True
70-
>>> multicodec.exists(0x01)
71+
>>> multicodec.exists(code=0x01)
7172
True
7273
>>> multicodec.get("identity")
7374
Multicodec(name='identity', tag='multihash', code=0,
7475
status='permanent', description='raw binary')
75-
>>> multicodec.get(0x01)
76-
Multicodec(name='cidv1', tag='ipld', code=1,
76+
>>> multicodec.get(code=0x01)
77+
Multicodec(name='cidv1', tag='cid', code=1,
7778
status='permanent', description='CIDv1')
7879
```
7980

81+
The `wrap` and `unwrap` functions can be used to wrap raw binary data into multicodec data
82+
(prepending the varint-encoded multicodec code) and to unwrap multicodec data into a pair
83+
of multicodec code and raw binary data:
84+
85+
```py
86+
>>> raw_data = bytes([192, 168, 0, 254])
87+
>>> multicodec_data = wrap("ip4", raw_data)
88+
>>> raw_data.hex()
89+
'c0a800fe'
90+
>>> multicodec_data.hex()
91+
'04c0a800fe'
92+
>>> varint.encode(0x04).hex()
93+
'04' # 0x04 ^^^^ is the multicodec code for 'ip4'
94+
>>> codec, raw_data = unwrap(multicodec_data)
95+
>>> raw_data.hex()
96+
'c0a800fe'
97+
>>> codec
98+
Multicodec(name='ip4', tag='multiaddr', code='0x04', status='permanent', description='')
99+
```
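
The byte layout in the example above can be reproduced without the library. The following is a minimal sketch, assuming the unsigned varint encoding (7 payload bits per byte, high bit as continuation flag) that the README refers to as "varint-encoded". The names `encode_varint`, `decode_varint`, `wrap_sketch` and `unwrap_sketch` are hypothetical helpers for illustration, not part of the `multiformats` API:

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F              # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode a leading varint, returning (value, number of bytes consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated varint")


def wrap_sketch(code: int, raw_data: bytes) -> bytes:
    """Prepend the varint-encoded code to the raw data (hypothetical helper)."""
    return encode_varint(code) + raw_data


def unwrap_sketch(data: bytes) -> Tuple[int, bytes]:
    """Split data into (code, raw data); returns the integer code only."""
    code, consumed = decode_varint(data)
    return code, data[consumed:]
```

Unlike the library's `unwrap`, which returns a `Multicodec` object, this sketch returns only the integer code, since looking the code up requires the multicodec table.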
100+
101+
The `Multicodec.wrap` and `Multicodec.unwrap` methods perform analogous functionality
102+
with an object-oriented API, additionally enforcing that the unwrapped code is actually
103+
the code of the multicodec being used:
104+
105+
```py
106+
>>> ip4 = multicodec.get("ip4")
107+
>>> ip4
108+
Multicodec(name='ip4', tag='multiaddr', code='0x04', status='permanent', description='')
109+
>>> raw_data = bytes([192, 168, 0, 254])
110+
>>> multicodec_data = ip4.wrap(raw_data)
111+
>>> raw_data.hex()
112+
'c0a800fe'
113+
>>> multicodec_data.hex()
114+
'04c0a800fe'
115+
>>> varint.encode(0x04).hex()
116+
'04' # 0x04 ^^^^ is the multicodec code for 'ip4'
117+
>>> ip4.unwrap(multicodec_data).hex()
118+
'c0a800fe'
119+
>>> ip4.unwrap(bytes.fromhex('00c0a800fe')) # 'identity' multicodec data
120+
multiformats.multicodec.err.ValueError: Found code 0x00 when unwrapping data, expected code 0x04.
121+
```
122+
80123
The `table` function can be used to iterate through known multicodecs, optionally restricting to one or more tags and/or statuses:
81124

82125
```py
@@ -143,44 +186,72 @@ For advanced usage, see the [API documentation](https://hashberg-io.github.io/mu
143186
### Multihash
144187

145188
The `multihash` module implements the [multihash spec](https://github.com/multiformats/multihash).
146-
The `exists` and `get` functions can be used to check whether a multihash multicodec with given name or code is known, and if so to get the corresponding object:
147-
148189

149-
Core functionality is provided by the `digest`, `encode`, `decode` functions.
150-
The `digest` function can be used to create a multihash digest directly from data:
190+
Core functionality is provided by the `digest`, `wrap` and `unwrap` functions, or by the correspondingly named methods of the `Multihash` class.
191+
The `digest` function and `Multihash.digest` method can be used to create a multihash digest directly from data:
151192

152193
```py
153194
>>> data = b"Hello world!"
154-
>>> multihash_digest = multihash.digest(data, "sha2-256")
155-
>>> multihash_digest.hex()
195+
>>> digest = multihash.digest(data, "sha2-256")
196+
>>> digest.hex()
197+
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
198+
```
199+
200+
```py
201+
>>> sha2_256 = multihash.get("sha2-256")
202+
>>> digest = sha2_256.digest(data)
203+
>>> digest.hex()
156204
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
157205
```
206+
158207
By default, the full digest produced by the hash function is used.
159208
Optionally, a smaller digest size can be specified to produce truncated hashes:
160209

161210
```py
162-
>>> multihash_digest = multihash.digest(data, "sha2-256", size=20)
163-
# optional truncated hash size, in bytes ^^^^^^^
211+
>>> digest = multihash.digest(data, "sha2-256", size=20)
212+
# optional truncated hash size, in bytes ^^^^^^^
164213
>>> digest.hex()
165214
'1214c0535e4be2b79ffd93291305436bf889314e4a3f' # 20-bytes truncated hash
166215
```
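
The hex strings above follow a simple layout: the multihash code, then the digest length in bytes, then the (possibly truncated) raw digest. The following is a minimal sketch using only `hashlib`, assuming the sha2-256 code `0x12` (18, as shown elsewhere in this README). The function name `sha2_256_multihash_sketch` and its size bounds are hypothetical, not the library's implementation:

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash code for sha2-256 (18 in decimal)


def sha2_256_multihash_sketch(data: bytes, size: int = 32) -> bytes:
    """Build <code><size><digest[:size]> for sha2-256 (illustrative only)."""
    if not 1 <= size <= 32:
        # Assumption of this sketch: only sizes up to the full 32 bytes.
        raise ValueError("size must be between 1 and 32 bytes for sha2-256")
    raw_digest = hashlib.sha256(data).digest()[:size]
    # Both the code (0x12) and the size (<= 32) are below 128, so each
    # varint fits in a single byte and can be written directly.
    return bytes([SHA2_256_CODE, size]) + raw_digest
```

With `data = b"Hello world!"`, the default call starts with `1220` (code `0x12`, length `0x20`), while `size=20` starts with `1214` (length `0x14`), matching the two examples above.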
167216

168-
The `decode` function can be used to extract the raw hash digest from a multihash digest:
217+
The `unwrap` function can be used to extract the raw digest from a multihash digest:
169218

170219
```py
171-
>>> multihash_digest.hex()
220+
>>> digest.hex()
172221
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
173-
>>> hash_digest = multihash.decode(multihash_digest)
174-
>>> hash_digest.hex()
222+
>>> raw_digest = multihash.unwrap(digest)
223+
>>> raw_digest.hex()
224+
'c0535e4be2b79ffd93291305436bf889314e4a3f'
225+
```
226+
227+
The `Multihash.unwrap` method performs the same functionality, but additionally checks
228+
that the multihash digest is valid for the multihash:
229+
230+
```py
231+
>>> raw_digest = sha2_256.unwrap(digest)
232+
>>> raw_digest.hex()
175233
'c0535e4be2b79ffd93291305436bf889314e4a3f'
176234
```
177235

178-
The `encode` function can be used to encode a raw hash digest into a multihash digest:
236+
```py
237+
>>> sha1 = multihash.get("sha1")
238+
>>> (sha2_256.code, sha1.code)
239+
(18, 17)
240+
>>> sha1.unwrap(digest)
241+
err.ValueError: Decoded code 18 differs from multihash code 17.
242+
```
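
The difference between plain unwrapping and code-checked unwrapping can be sketched by hand. This is an illustration only, assuming single-byte varints for the code and length (true for `sha1` and `sha2-256`, whose codes are 17 and 18). The names `unwrap_multihash_sketch` and `unwrap_expecting_sketch`, and the error messages, are hypothetical:

```python
from typing import Tuple


def unwrap_multihash_sketch(digest: bytes) -> Tuple[int, bytes]:
    """Split a multihash digest into (code, raw digest), checking the length."""
    if len(digest) < 2:
        raise ValueError("multihash digest is too short")
    code, size = digest[0], digest[1]
    if code >= 0x80 or size >= 0x80:
        # Simplification of this sketch: multi-byte varints are not decoded.
        raise ValueError("multi-byte varints are not handled by this sketch")
    raw_digest = digest[2:]
    if len(raw_digest) != size:
        raise ValueError(
            f"declared size {size} differs from actual size {len(raw_digest)}"
        )
    return code, raw_digest


def unwrap_expecting_sketch(digest: bytes, expected_code: int) -> bytes:
    """Unwrap as above, additionally enforcing the expected multihash code."""
    code, raw_digest = unwrap_multihash_sketch(digest)
    if code != expected_code:
        raise ValueError(
            f"decoded code {code} differs from expected code {expected_code}"
        )
    return raw_digest
```

Calling `unwrap_expecting_sketch` with code 17 on a digest that starts with `12` raises `ValueError`, which is the same kind of mismatch reported in the `sha1.unwrap(digest)` example above.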
243+
244+
The `wrap` function and `Multihash.wrap` method can be used to wrap a raw digest into a multihash digest:
179245

180246
```py
181-
>>> hash_digest.hex()
247+
>>> raw_digest.hex()
182248
'c0535e4be2b79ffd93291305436bf889314e4a3f'
183-
>>> multihash.encode(hash_digest, "sha2-256").hex()
249+
>>> multihash.wrap(raw_digest, "sha2-256").hex()
250+
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
251+
```
252+
253+
```py
254+
>>> sha2_256.wrap(raw_digest).hex()
184255
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
185256
```
186257

docs/multiformats/cid.html

Lines changed: 16 additions & 16 deletions
@@ -191,7 +191,7 @@ <h1 class="title">Module <code>multiformats.cid</code></h1>
191191
if isinstance(digest, str):
192192
digest = bytes.fromhex(digest)
193193
raw_digest: BytesLike
194-
code, raw_digest = multihash.decode_raw(digest)
194+
code, raw_digest = multihash.unwrap_raw(digest)
195195
hashfun = _CID_validate_multihash(code)
196196
raw_digest = _CID_validate_raw_digest(raw_digest, hashfun)
197197
return hashfun, raw_digest
@@ -326,7 +326,7 @@ <h1 class="title">Module <code>multiformats.cid</code></h1>
326326
if not isinstance(raw_digest, bytes):
327327
raw_digest = bytes(raw_digest)
328328
assert _hashfun == hashfun, &#34;You passed different multihashes to a _new_instance call with digest as a pair.&#34;
329-
instance._digest = hashfun.encode(raw_digest)
329+
instance._digest = hashfun.wrap(raw_digest)
330330
return instance
331331

332332
@property
@@ -430,7 +430,7 @@ <h1 class="title">Module <code>multiformats.cid</code></h1>
430430
&#39;6e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95&#39;
431431
```
432432
&#34;&#34;&#34;
433-
return multihash.decode(self._digest)
433+
return multihash.unwrap(self._digest)
434434

435435
@property
436436
def human_readable(self) -&gt; str:
@@ -452,7 +452,7 @@ <h1 class="title">Module <code>multiformats.cid</code></h1>
452452

453453
def encode(self, base: Union[None, str, Multibase] = None) -&gt; str:
454454
&#34;&#34;&#34;
455-
Encodes the CID using a given multibase. If no multibase is give,
455+
Encodes the CID using a given multibase. If no multibase is given,
456456
the CID&#39;s own multibase is used by default.
457457

458458
Example usage:
@@ -641,10 +641,10 @@ <h1 class="title">Module <code>multiformats.cid</code></h1>
641641
raise ValueError(&#34;CID versions 2 and 3 are reserved for future use.&#34;)
642642
if v != 1:
643643
raise ValueError(f&#34;CIDv{v} is currently not supported.&#34;)
644-
mc_code, _, cid = varint.decode_raw(cid) # multicodec
644+
mc_code, _, cid = multicodec.unwrap_raw(cid) # multicodec
645645
digest = cid # multihash digest is what&#39;s left
646646
mc = multicodec.get(code=mc_code)
647-
mh_code, _ = multihash.decode_raw(digest)
647+
mh_code, _ = multihash.unwrap_raw(digest)
648648
mh = multihash.get(code=mh_code)
649649
return CID._new_instance(CID, mb, v, mc, mh, digest)
650650

@@ -930,7 +930,7 @@ <h2 class="section-title" id="header-classes">Classes</h2>
930930
if not isinstance(raw_digest, bytes):
931931
raw_digest = bytes(raw_digest)
932932
assert _hashfun == hashfun, &#34;You passed different multihashes to a _new_instance call with digest as a pair.&#34;
933-
instance._digest = hashfun.encode(raw_digest)
933+
instance._digest = hashfun.wrap(raw_digest)
934934
return instance
935935

936936
@property
@@ -1034,7 +1034,7 @@ <h2 class="section-title" id="header-classes">Classes</h2>
10341034
&#39;6e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95&#39;
10351035
```
10361036
&#34;&#34;&#34;
1037-
return multihash.decode(self._digest)
1037+
return multihash.unwrap(self._digest)
10381038

10391039
@property
10401040
def human_readable(self) -&gt; str:
@@ -1056,7 +1056,7 @@ <h2 class="section-title" id="header-classes">Classes</h2>
10561056

10571057
def encode(self, base: Union[None, str, Multibase] = None) -&gt; str:
10581058
&#34;&#34;&#34;
1059-
Encodes the CID using a given multibase. If no multibase is give,
1059+
Encodes the CID using a given multibase. If no multibase is given,
10601060
the CID&#39;s own multibase is used by default.
10611061

10621062
Example usage:
@@ -1245,10 +1245,10 @@ <h2 class="section-title" id="header-classes">Classes</h2>
12451245
raise ValueError(&#34;CID versions 2 and 3 are reserved for future use.&#34;)
12461246
if v != 1:
12471247
raise ValueError(f&#34;CIDv{v} is currently not supported.&#34;)
1248-
mc_code, _, cid = varint.decode_raw(cid) # multicodec
1248+
mc_code, _, cid = multicodec.unwrap_raw(cid) # multicodec
12491249
digest = cid # multihash digest is what&#39;s left
12501250
mc = multicodec.get(code=mc_code)
1251-
mh_code, _ = multihash.decode_raw(digest)
1251+
mh_code, _ = multihash.unwrap_raw(digest)
12521252
mh = multihash.get(code=mh_code)
12531253
return CID._new_instance(CID, mb, v, mc, mh, digest)
12541254

@@ -1467,10 +1467,10 @@ <h3>Static methods</h3>
14671467
raise ValueError(&#34;CID versions 2 and 3 are reserved for future use.&#34;)
14681468
if v != 1:
14691469
raise ValueError(f&#34;CIDv{v} is currently not supported.&#34;)
1470-
mc_code, _, cid = varint.decode_raw(cid) # multicodec
1470+
mc_code, _, cid = multicodec.unwrap_raw(cid) # multicodec
14711471
digest = cid # multihash digest is what&#39;s left
14721472
mc = multicodec.get(code=mc_code)
1473-
mh_code, _ = multihash.decode_raw(digest)
1473+
mh_code, _ = multihash.unwrap_raw(digest)
14741474
mh = multihash.get(code=mh_code)
14751475
return CID._new_instance(CID, mb, v, mc, mh, digest)</code></pre>
14761476
</details>
@@ -1868,7 +1868,7 @@ <h3>Instance variables</h3>
18681868
&#39;6e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95&#39;
18691869
```
18701870
&#34;&#34;&#34;
1871-
return multihash.decode(self._digest)</code></pre>
1871+
return multihash.unwrap(self._digest)</code></pre>
18721872
</details>
18731873
</dd>
18741874
<dt id="multiformats.cid.CID.version"><code class="name">var <span class="ident">version</span> : Literal[0, 1]</code></dt>
@@ -1908,7 +1908,7 @@ <h3>Methods</h3>
19081908
<span>def <span class="ident">encode</span></span>(<span>self, base: Union[ForwardRef(None), str, <a title="multiformats.multibase.Multibase" href="multibase/index.html#multiformats.multibase.Multibase">Multibase</a>] = None) ‑> str</span>
19091909
</code></dt>
19101910
<dd>
1911-
<div class="desc"><p>Encodes the CID using a given multibase. If no multibase is give,
1911+
<div class="desc"><p>Encodes the CID using a given multibase. If no multibase is given,
19121912
the CID's own multibase is used by default.</p>
19131913
<p>Example usage:</p>
19141914
<pre><code class="language-py">&gt;&gt;&gt; s = &quot;zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA&quot;
@@ -1924,7 +1924,7 @@ <h3>Methods</h3>
19241924
</summary>
19251925
<pre><code class="python">def encode(self, base: Union[None, str, Multibase] = None) -&gt; str:
19261926
&#34;&#34;&#34;
1927-
Encodes the CID using a given multibase. If no multibase is give,
1927+
Encodes the CID using a given multibase. If no multibase is given,
19281928
the CID&#39;s own multibase is used by default.
19291929

19301930
Example usage:
