README.md (+41 -20)
@@ -14,6 +14,9 @@ SharpToken is a C# library that serves as a port of the Python [tiktoken](https:
 It provides functionality for encoding and decoding tokens using GPT-based encodings. This library is built for .NET 6, .NET 8
 and .NET Standard 2.0, making it compatible with a wide range of frameworks.
 
+> [!Important]
+> The functionality in `SharpToken` has been added to [`Microsoft.ML.Tokenizers`](https://www.nuget.org/packages/Microsoft.ML.Tokenizers). `Microsoft.ML.Tokenizers` is a tokenizer library being developed by the .NET team and going forward, the central place for tokenizer development in .NET. By using `Microsoft.ML.Tokenizers`, you should see improved performance over existing tokenizer library implementations, including `SharpToken`. A stable release of `Microsoft.ML.Tokenizers` is expected alongside the .NET 9.0 release (November 2024). Instructions for migration can be found at https://github.com/dotnet/machinelearning/blob/main/docs/code/microsoft-ml-tokenizers-migration-guide.md.
+
 ## Installation
 
 To install SharpToken, use the NuGet package manager:
@@ -200,6 +203,7 @@ public class CompareBenchmark
     private GptEncoding _sharpToken;
     private TikToken _tikToken;
     private ITokenizer _tokenizer;
+    private Tokenizer _mlTokenizer;
     private string _kLongText;
 
     [GlobalSetup]
@@ -252,35 +256,52 @@ public class CompareBenchmark
 
         return sum;
     }
+
+    [Benchmark]
+    public int MLTokenizers()
+    {
+        var sum = 0;
+        for (var i = 0; i < 10000; i++)
+        {
+            var encoded = _mlTokenizer.EncodeToIds(_kLongText);
+            var decoded = _mlTokenizer.Decode(encoded);
+            sum += decoded.Length;
+        }
+
+        return sum;
+    }
 }
 ```
 
 </details>
 
 ```
-BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
_kLongText = "King Lear, one of Shakespeare's darkest and most savage plays, tells the story of the foolish and Job-like Lear, who divides his kingdom, as he does his affections, according to vanity and whim. Lear’s failure as a father engulfs himself and his world in turmoil and tragedy.";