Other packages (e.g. scikit-llm) promise to provide at least similar functionality, so why should you choose stormtrooper instead?

1. Fine-grained control over your pipeline.
    - Variety: stormtrooper allows you to use virtually all canonical approaches for zero- and few-shot classification, including NLI, Seq2Seq and generative open-access models from Transformers, SetFit, and even OpenAI's large language models.
    - Prompt engineering: you can adjust prompt templates to your heart's content.
2. Performance
    - Easy inference on GPU if you have access to one.
    - Interfaces with HuggingFace's Text Generation Inference API, the most efficient way to host models locally.
    - Async interaction with external APIs, which can speed up inference with OpenAI's models quite drastically.
    - Thorough API reference and loads of examples to get you started.
3. Battle-hardened
    - We at the Center for Humanities Computing make extensive use of this package, so you can rest assured that it works under real-world pressure. As such, you can expect regular updates and maintenance.
4. Simple
    - We opted for as bare-bones an implementation with as little coupling as possible. The library works at the lowest level of abstraction possible, and we hope our code will be easy for others to understand and contribute to.

### `Trooper`

The brand-new `Trooper` interface means you no longer have to specify which model type you wish to use: stormtrooper automatically detects the model type from the specified name.

```python
from stormtrooper import Trooper

# This loads a SetFit model
model = Trooper("all-MiniLM-L6-v2")

# This loads an OpenAI model
model = Trooper("gpt-4")

# This loads a Text2Text model
model = Trooper("google/flan-t5-base")
```
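To illustrate how name-based detection can work, here is a simplified sketch of dispatching on recognisable name patterns. The function `guess_model_type` and its rules are illustrative assumptions, not stormtrooper's actual implementation:

```python
def guess_model_type(name: str) -> str:
    """Hypothetical sketch of name-based model-type detection.

    Illustrative only -- stormtrooper's real detection logic may differ.
    """
    lowered = name.lower()
    if lowered.startswith("gpt-"):
        return "openai"     # e.g. "gpt-4"
    if "nli" in lowered:
        return "nli"        # e.g. "facebook/bart-large-mnli"
    if "t5" in lowered:
        return "text2text"  # e.g. "google/flan-t5-base"
    if "minilm" in lowered or "mpnet" in lowered:
        return "setfit"     # e.g. "all-MiniLM-L6-v2"
    return "generative"     # e.g. "HuggingFaceH4/zephyr-7b-beta"
```

Dispatching on the name like this is what lets a single `Trooper` entry point stand in for several model-specific classes.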
### Unified zero and few-shot classification

You no longer have to specify whether a model should be a zero- or few-shot classifier when initialising it.
If you do not pass any training examples, the model is automatically assumed to be zero-shot.

```python
# This is a zero-shot model
model.fit(None, ["dog", "cat"])

# This is a few-shot model
model.fit(["he was a good boy", "just lay down on my laptop"], ["dog", "cat"])
```
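The convention shown above, where `X=None` signals zero-shot and training examples signal few-shot, can be sketched as follows (a hypothetical illustration of the idea, not stormtrooper's internals):

```python
from typing import Optional, Sequence


def infer_mode(X: Optional[Sequence[str]], y: Sequence[str]) -> str:
    """Hypothetical sketch: decide zero- vs few-shot from the fit() arguments."""
    if X is None:
        # Only labels were given: the model runs zero-shot.
        return "zero-shot"
    if len(X) != len(y):
        raise ValueError("Each training example needs exactly one label.")
    # The (X, y) pairs become few-shot demonstrations.
    return "few-shot"
```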
## Model types

You can use all sorts of transformer models for few- and zero-shot classification in stormtrooper:

1. Instruction fine-tuned generative models, e.g. `Trooper("HuggingFaceH4/zephyr-7b-beta")`
2. Encoder models with SetFit, e.g. `Trooper("all-MiniLM-L6-v2")`
3. Text2Text models, e.g. `Trooper("google/flan-t5-base")`
4. OpenAI models, e.g. `Trooper("gpt-4")`
5. NLI models, e.g. `Trooper("facebook/bart-large-mnli")`
## Example usage

Find more examples in our [docs](https://centre-for-humanities-computing.github.io/stormtrooper/).
You can run a model across multiple devices, in order of device priority (`GPU -> CPU + RAM -> Disk`), by using the `device_map` argument.
Note that this only works with Text2Text and generative models.

```python
model = Trooper("HuggingFaceH4/zephyr-7b-beta", device_map="auto")
```
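The priority order can be pictured as a simple fallback chain: weights go to the fastest device that still has room for them. The function below is an illustrative sketch of that idea with made-up memory figures, not the actual placement logic used by `device_map`:

```python
def place_weights(size_gb: float, free_gpu_gb: float, free_ram_gb: float) -> str:
    """Hypothetical sketch of GPU -> CPU + RAM -> Disk fallback placement."""
    if size_gb <= free_gpu_gb:
        return "gpu"
    if size_gb <= free_ram_gb:
        return "cpu"
    # Nothing left but offloading to disk.
    return "disk"
```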