# Allow multiple OpenAI clients per Pipeline (#563)
README (new file, +5 lines):

```markdown
# Multiple LLM clients in a single Pipeline

For advanced use cases, `PipelineContext` accepts a `clients` dictionary mapping strings to OpenAI clients. The special key `"default"` sets the OpenAI client used by `LLMBlock`s when none is specified; individual `LLMBlock`s can override this via the `client` parameter in their YAML config.

See `pipeline.yaml` in this directory for an example of a Pipeline that uses a different client per `LLMBlock`.
```
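The lookup rule described above (fall back to `"default"` unless a block names its own client) can be sketched as follows. This is an illustrative sketch only: plain strings stand in for real OpenAI client objects, and `resolve_client` is a hypothetical helper, not part of the library's API.

```python
# Illustrative sketch of per-block client selection; strings stand in
# for real OpenAI client objects.
DEFAULT_CLIENT_KEY = "default"

def resolve_client(clients, block_config):
    """Return the client a block should use: the one named by its `client`
    config key if present, otherwise the special "default" entry."""
    key = block_config.get("client", DEFAULT_CLIENT_KEY)
    try:
        return clients[key]
    except KeyError:
        raise KeyError(f"no client registered under {key!r}") from None

clients = {"default": "client-main", "server_a": "client-a"}
print(resolve_client(clients, {"name": "default_client"}))              # client-main
print(resolve_client(clients, {"name": "other", "client": "server_a"}))  # client-a
```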
`llm_config.yaml` (new file, +16 lines; referenced as `config_path` by `pipeline.yaml`):

```yaml
system: You are a helpful AI assistant.

introduction: |
  Repeat the document below back to me verbatim.

principles: |
  Do not change anything.

examples: ""

generation: |
  Document:
  {{document}}

start_tags: [""]
end_tags: [""]
```
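To show how the sections of this config fit together, here is a simplified sketch that assembles them into a single prompt. The library's actual templating is not shown in this diff; the `{{document}}` placeholder is filled with a plain string replacement here, and `build_prompt` is a hypothetical helper.

```python
# Inline copy of the config fields above; empty sections (examples) are
# dropped when the prompt is assembled.
config = {
    "system": "You are a helpful AI assistant.",
    "introduction": "Repeat the document below back to me verbatim.",
    "principles": "Do not change anything.",
    "examples": "",
    "generation": "Document:\n{{document}}",
}

def build_prompt(cfg, document):
    # Fill the {{document}} placeholder with plain string replacement
    # (the real library's templating may differ).
    parts = [
        cfg["system"],
        cfg["introduction"],
        cfg["principles"],
        cfg["examples"],
        cfg["generation"].replace("{{document}}", document),
    ]
    return "\n".join(p for p in parts if p)

print(build_prompt(config, "Hello, world."))
```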
`pipeline.yaml` (new file, +44 lines):

```yaml
version: "1.0"
blocks:
  # This uses the default client, since we don't specify one
  - name: default_client
    type: LLMBlock
    config:
      model_family: mixtral
      model_id: Mixtral-8x7B-Instruct-v0.1
      config_path: llm_config.yaml
      output_cols:
        - column_one

  # We can also explicitly specify the default client
  - name: also_default_client
    type: LLMBlock
    config:
      client: default
      model_family: mixtral
      model_id: Mixtral-8x7B-Instruct-v0.1
      config_path: llm_config.yaml
      output_cols:
        - column_two

  # This uses the "server_a" client explicitly
  - name: server_a_client
    type: LLMBlock
    config:
      client: server_a
      model_family: granite
      model_id: granite-7b-lab
      config_path: llm_config.yaml
      output_cols:
        - column_three

  # This uses the "server_b" client explicitly
  - name: server_b_client
    type: LLMBlock
    config:
      client: server_b
      model_family: granite
      model_id: granite-7b-lab
      config_path: llm_config.yaml
      output_cols:
        - column_four
```
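Walking the four blocks above through the default-plus-override rule gives the following client assignments. This is a plain-Python sketch with strings standing in for real OpenAI client objects; the inline dicts mirror the YAML blocks.

```python
# The four blocks from pipeline.yaml, reduced to the fields that matter
# for client selection; strings stand in for real OpenAI client objects.
clients = {
    "default": "default-client",
    "server_a": "a-client",
    "server_b": "b-client",
}

blocks = [
    {"name": "default_client", "config": {}},
    {"name": "also_default_client", "config": {"client": "default"}},
    {"name": "server_a_client", "config": {"client": "server_a"}},
    {"name": "server_b_client", "config": {"client": "server_b"}},
]

# A block with no `client` key falls back to the "default" entry.
resolved = {b["name"]: clients[b["config"].get("client", "default")] for b in blocks}
for name, client in resolved.items():
    print(f"{name} -> {client}")
```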
Changes to `PipelineContext` (Python source):

```diff
@@ -60,7 +60,10 @@ class PipelineContext:  # pylint: disable=too-many-instance-attributes
     # on individual datasets
     DEFAULT_DATASET_NUM_PROCS = 8
 
-    client: OpenAI
+    # The key of our default client
+    DEFAULT_CLIENT_KEY = "default"
+
+    client: Optional[OpenAI] = None
     model_family: Optional[str] = None
     model_id: Optional[str] = None
     num_instructions_to_generate: Optional[int] = None
@@ -70,6 +73,9 @@ class PipelineContext:  # pylint: disable=too-many-instance-attributes
     max_num_tokens: Optional[int] = llmblock.DEFAULT_MAX_NUM_TOKENS
     batch_size: int = DEFAULT_BATCH_SIZE
     batch_num_workers: Optional[int] = None
+    clients: Optional[Dict[str, OpenAI]] = None
+
+    _clients = None
 
     @property
     def batching_enabled(self) -> bool:
@@ -78,6 +84,33 @@ def batching_enabled(self) -> bool:
         """
         return self.batch_size > 0 and self.batch_num_workers != 1
 
+    @property  # type: ignore
+    def client(self):
+        return self.clients.get(self.DEFAULT_CLIENT_KEY, None)
+
+    @client.setter
+    def client(self, value):
+        if isinstance(value, property):
+            # No default value
+            value = None
+        self.clients[self.DEFAULT_CLIENT_KEY] = value
+
+    @property  # type: ignore
+    def clients(self):
+        if self._clients is None:
+            self._clients = {}
+        return self._clients
+
+    @clients.setter
+    def clients(self, value):
+        if isinstance(value, property):
+            # Empty hash default value
+            value = {}
+        if value:
+            # Only set _clients if passed in a value, so we don't
+            # override it with the default of None from the @dataclass
+            self._clients = value
+
 
 # This is part of the public API.
 class PipelineBlockError(Exception):
```

Review comments:

> **Review comment** (on `clients: Optional[Dict[str, OpenAI]] = None`): Great changes and excellent thoughts on supporting backwards compatibility. One small nit: it seems there's no mandate in […] so maybe we should have a check either in […]?

> **Review comment** (on the `client` getter): A few questions: […]

> **Review comment** (on the `clients` setter): Something like this could be added to make sure the `'default'` key is explicitly mentioned, so the blocks know how to behave during the fallback path(?). Just a suggestion; please feel free to disregard or consider alternate implementations. Suggested change: […]
> **Review comment** (on the diff above): Should we include type-hints here?
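The property/setter pattern in the diff relies on a dataclass quirk: when the class attribute behind an annotated field is a property, that property object itself becomes the field's `__init__` default, so when no value is passed the setter receives the property and detects it via `isinstance(value, property)`. A runnable reduction of that pattern, with plain strings standing in for OpenAI clients:

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Minimal reproduction of the diff's pattern: properties redefine the
# dataclass fields `client` and `clients`, so the class-level "default"
# seen by __init__ is the property object itself.
@dataclass
class Ctx:
    DEFAULT_CLIENT_KEY = "default"

    client: Optional[str] = None
    clients: Optional[Dict[str, str]] = None

    _clients = None  # backing store, not a dataclass field (no annotation)

    @property  # type: ignore
    def client(self):
        return self.clients.get(self.DEFAULT_CLIENT_KEY, None)

    @client.setter
    def client(self, value):
        if isinstance(value, property):
            value = None  # no value was passed to __init__
        self.clients[self.DEFAULT_CLIENT_KEY] = value

    @property  # type: ignore
    def clients(self):
        if self._clients is None:
            self._clients = {}
        return self._clients

    @clients.setter
    def clients(self, value):
        if isinstance(value, property):
            value = {}  # no value was passed to __init__
        if value:
            # Only override the backing store when given a real mapping
            self._clients = value

ctx = Ctx(client="main")
print(ctx.client)   # main
print(ctx.clients)  # {'default': 'main'}

ctx2 = Ctx(clients={"default": "d", "server_a": "a"})
print(ctx2.client)  # d
```

One consequence worth noting: `Ctx()` with no arguments still ends up with a `{"default": None}` entry, because the `client` setter runs during `__init__` even for the defaulted field.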