
SpeziLLMOpenAI: Replace MacPaw/OpenAI With Generated API Calls #64

Open: wants to merge 39 commits into main
Conversation

@paulhdk (Contributor) commented Aug 27, 2024

SpeziLLMOpenAI: Replace MacPaw/OpenAI With Generated API Calls

♻️ Current situation & Problem

This PR replaces the MacPaw/OpenAI package with generated API calls by the swift-openapi-generator package.
Calls are generated from OpenAI's official OpenAPI spec.
As discussed with @PSchmiedmayer, this marks the first step in adding the ability to send local image content to the OpenAI API.

This PR does not add any new features but simply replicates the existing feature set with the generated API calls.
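For reference, swift-openapi-generator is driven by a small configuration file placed next to the OpenAPI document in each target; a minimal sketch per the generator's documentation (the exact options used in this PR may differ):

```yaml
# openapi-generator-config.yaml, placed next to openapi.yaml in the target
generate:
  - types
  - client
```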

I've tried my best to keep track of any known issues in-code with FIXMEs as well as in the following list.

Current Issues

  • Sources/SpeziLLMOpenAI/LLMOpenAISession+Generation.swift does not handle the "[DONE]" message the API sends to conclude a stream. There is currently a hacky workaround that catches the error thrown in that case. I'm not quite sure yet how to handle that case elegantly.
  • There are several do ... catch blocks that catch OpenAI package-specific errors, which I had to comment out. I have not yet found a semantically equivalent solution for the generated API calls.
  • The LLMFunctionParameterItemSchema type does not use a generated type yet.
  • The convenience initialisers in SpeziLLMOpenAI/FunctionCalling should be refactored if possible, as they currently contain many optional bindings.
  • Error handling still needs to be corrected throughout.
  • Currently, swift-openapi-generator expects an openapi.yaml and a configuration file in the TestApp, which is why there are duplicate OpenAPI specs and configuration files in this PR. I'm not quite sure why it expects them in the TestApp, but I suspect it has something to do with the generated types being used in the TestApp's model selection mechanism.
  • The SpeziLLMTests are currently not passing. Because the test errors are related to the issues above, I'll update the tests once I've addressed all of them.
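On the first issue above, one option is to check for the stream terminator before decoding rather than catching the resulting decoding error. A minimal sketch, with `StreamChunk` as a hypothetical stand-in for the generated chunk type:

```swift
import Foundation

/// Hypothetical stand-in for the generated chat-completion chunk type.
struct StreamChunk: Decodable {
    let id: String
}

/// Decodes one SSE line into a chunk, returning nil for the "[DONE]" sentinel
/// that OpenAI sends to conclude a chat-completion stream.
func decodeChunk(from line: String) throws -> StreamChunk? {
    // SSE data lines arrive as "data: <payload>"; strip the prefix if present.
    let payload = line.hasPrefix("data: ") ? String(line.dropFirst(6)) : line
    guard payload != "[DONE]" else {
        return nil   // end of stream: nothing to decode
    }
    return try JSONDecoder().decode(StreamChunk.self, from: Data(payload.utf8))
}
```

The caller can then treat a `nil` result as the end of the stream instead of relying on a thrown decoding error.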

⚙️ Release Notes

  • Replace the MacPaw/OpenAI package with Apple/swift-openapi-generator, which is able to generate API calls directly from OpenAI's official OpenAPI spec.

📚 Documentation

As no new functionality is added, nothing should change here (unless I missed something).

✅ Testing

This PR passes the existing tests. Since no new functionality has been added, I believe this should suffice.

📝 Code of Conduct & Contributing Guidelines

By creating this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

@paulhdk paulhdk marked this pull request as ready for review September 13, 2024 17:49

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 52.01342% with 286 lines in your changes missing coverage. Please review.

Project coverage is 31.24%. Comparing base (e53bc15) to head (974449b).

Files with missing lines Patch % Lines
...peziLLMOpenAI/LLMOpenAISession+Configuration.swift 0.00% 77 Missing ⚠️
...ources/SpeziLLMOpenAI/LLMOpenAISession+Setup.swift 0.00% 51 Missing ⚠️
...s/SpeziLLMOpenAI/LLMOpenAISession+Generation.swift 0.00% 39 Missing ⚠️
Sources/SpeziLLMOpenAI/LLMOpenAIError.swift 0.00% 22 Missing ⚠️
...tionCalling/LLMFunctionParameterWrapper+Enum.swift 70.59% 20 Missing ⚠️
...ng/LLMFunctionParameterWrapper+OptionalTypes.swift 86.56% 16 Missing ⚠️
...SpeziLLMOpenAI/Helpers/LLMOpenAIStreamResult.swift 0.00% 16 Missing ⚠️
...nCalling/LLMFunctionParameterSchemaCollector.swift 0.00% 11 Missing ⚠️
...lling/LLMFunctionParameterWrapper+ArrayTypes.swift 88.74% 8 Missing ⚠️
...g/LLMFunctionParameterWrapper+PrimitiveTypes.swift 83.68% 8 Missing ⚠️
... and 5 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #64      +/-   ##
==========================================
+ Coverage   31.18%   31.24%   +0.07%     
==========================================
  Files          67       68       +1     
  Lines        3012     3198     +186     
==========================================
+ Hits          939      999      +60     
- Misses       2073     2199     +126     
Files with missing lines Coverage Δ
...penAI/Configuration/LLMOpenAIModelParameters.swift 100.00% <100.00%> (ø)
...iLLMOpenAI/Configuration/LLMOpenAIParameters.swift 100.00% <ø> (ø)
Sources/SpeziLLMOpenAI/Helpers/Chat+OpenAI.swift 0.00% <ø> (ø)
...I/Onboarding/LLMOpenAIAPITokenOnboardingStep.swift 100.00% <ø> (ø)
...enAI/Onboarding/LLMOpenAIModelOnboardingStep.swift 97.88% <100.00%> (-2.12%) ⬇️
Sources/SpeziLLM/Models/LLMContextEntity.swift 33.34% <0.00%> (-1.66%) ⬇️
.../FunctionCalling/LLMFunctionParameterWrapper.swift 60.00% <33.34%> (+22.50%) ⬆️
...ling/LLMFunctionParameterWrapper+CustomTypes.swift 92.00% <91.31%> (-8.00%) ⬇️
Sources/SpeziLLMOpenAI/LLMOpenAISession.swift 22.86% <0.00%> (ø)
...urces/SpeziLLMOpenAI/LLMOpenAIAuthMiddleware.swift 0.00% <0.00%> (ø)
... and 10 more

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e53bc15...974449b. Read the comment docs.

@PSchmiedmayer (Member) left a comment:
Thank you for all the work here @paulhdk; very important to improve this setup and build on the OpenAPI specification!

It would be amazing to get a first insight from @philippzagar to get a good round of feedback.

Package.swift (outdated review comment, resolved)
@@ -51,7 +50,7 @@ public struct LLMOpenAIModelParameters: Sendable {
/// - logitBias: Alters specific token's likelihood in completion.
/// - user: Unique identifier for the end-user, aiding in abuse monitoring.
public init(
responseFormat: ChatQuery.ResponseFormat? = nil,
responseFormat: Components.Schemas.CreateChatCompletionRequest.response_formatPayload? = nil,
Member:
I am wondering if we should add compact type aliases for this?

Contributor (Author):
I’ve added an LLMOpenAIRequestType alias. Does that work for you?

Should we also introduce an alias for Components.Schemas in general? This won’t make the types shorter, but something like LLMOpenAIGeneratedTypes could improve readability, maybe?

Member:
I think we can introduce well defined and named typealias for the specific types that we use in our API surface; we should see if we can make them compact and focus on them.
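A sketch of such aliases: the first, `LLMOpenAIRequestType`, is the one mentioned above; the second is purely illustrative, and both assume the generator's `Components.Schemas` namespace:

```swift
// Compact alias for the generated chat-completion request type,
// as introduced in this PR.
typealias LLMOpenAIRequestType = Components.Schemas.CreateChatCompletionRequest

// Illustrative only: a similar alias for the response-format payload
// used in LLMOpenAIModelParameters.
typealias LLMOpenAIResponseFormat =
    Components.Schemas.CreateChatCompletionRequest.response_formatPayload
```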

Comment on lines +31 to +34
/// "firstName": [
/// "type": "string",
/// "description": "The first name of the person")
/// ],
Member:
I am wondering if we can add a nicely typed type for this instead of a dictionary; it can always map to a dictionary under the hood. Would be cool to avoid losing that type safety?

Contributor (Author):
Previously, SpeziLLMOpenAI wrapped around the Swift types provided by the OpenAI package, which would then eventually be passed to the API.
With the OpenAI OpenAPI spec, such types aren't generated, but the JSON schemas are instead validated for correctness as they're being encoded in the OpenAPIObjectContainer type.

Introducing such wrapper types again would require precise alignment with the OpenAI API, which, I imagine, would make them harder to maintain over time.
I suspect that's one reason why the official OpenAI Python package, which is also generated from the OpenAI OpenAPI specification, does not offer wrapper types either, AFAICT.

What do you think?

Member:
I think adding an extension initializer/function that takes well-typed arguments, if one wants to use them, would be beneficial and would avoid issues with string keys that are incorrect or malformed. Still allowing a dictionary to be passed in might be an escape hatch we can provide. The OpenAPI surface is quite stable, and if we use, e.g., an enum for the type of the parameter, it can also have an `other` case with an associated string value.
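A minimal sketch of that idea; all names here are illustrative, not from the PR:

```swift
/// Illustrative JSON-schema "type" values with an escape hatch,
/// along the lines suggested above.
enum SchemaType {
    case string, number, integer, boolean, array, object
    /// Escape hatch for type strings not covered by the enum cases.
    case other(String)

    var rawJSONValue: String {
        switch self {
        case .string: return "string"
        case .number: return "number"
        case .integer: return "integer"
        case .boolean: return "boolean"
        case .array: return "array"
        case .object: return "object"
        case .other(let value): return value
        }
    }
}

/// Maps the well-typed arguments down to the dictionary representation
/// that the schema encoding ultimately expects.
func parameterSchema(type: SchemaType, description: String) -> [String: String] {
    ["type": type.rawJSONValue, "description": description]
}
```

This keeps the dictionary representation as the underlying storage while catching malformed type strings at compile time.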

Sources/SpeziLLMOpenAI/LLMOpenAIError.swift (review comment resolved)
Sources/SpeziLLMOpenAI/LLMOpenAISession.swift (review comment resolved)
@paulhdk paulhdk changed the title SpeziLLMOpenAI: Repalce MacPaw/OpenAI With Generated API Calls S Oct 24, 2024
@paulhdk paulhdk changed the title S SpeziLLMOpenAI: Repalce MacPaw/OpenAI With Generated API Calls Oct 24, 2024
@PSchmiedmayer (Member) left a comment:
Thank you for continuing to work on this @paulhdk!

I had a quick sync with @philippzagar and he will take a closer look at the PR to provide insights on the different changes; would be great to update the PR to the latest version of main to resolve the conflicts; I think after the feedback from @philippzagar we should be ready to get this merged 🚀

Comment on lines +38 to +56
.Input(body: .json(LLMOpenAIRequestType(
messages: openAIContext,
model: schema.parameters.modelType,
frequency_penalty: schema.modelParameters.frequencyPenalty,
logit_bias: schema.modelParameters.logitBias.additionalProperties.isEmpty ? nil : schema
.modelParameters
.logitBias,
max_tokens: schema.modelParameters.maxOutputLength,
n: schema.modelParameters.completionsPerOutput,
presence_penalty: schema.modelParameters.presencePenalty,
response_format: schema.modelParameters.responseFormat,
seed: schema.modelParameters.seed,
stop: LLMOpenAIRequestType.stopPayload.case2(schema.modelParameters.stopSequence),
stream: true,
temperature: schema.modelParameters.temperature,
top_p: schema.modelParameters.topP,
tools: functions.isEmpty ? nil : functions,
user: schema.modelParameters.user
)))
Member:
Might be nice to format this similar to our other code bases; might be applicable to other parts as well:

Suggested change
.Input(body:
.json(
LLMOpenAIRequestType(
messages: openAIContext,
model: schema.parameters.modelType,
frequency_penalty: schema.modelParameters.frequencyPenalty,
logit_bias: schema.modelParameters.logitBias.additionalProperties.isEmpty ? nil : schema
.modelParameters
.logitBias,
max_tokens: schema.modelParameters.maxOutputLength,
n: schema.modelParameters.completionsPerOutput,
presence_penalty: schema.modelParameters.presencePenalty,
response_format: schema.modelParameters.responseFormat,
seed: schema.modelParameters.seed,
stop: LLMOpenAIRequestType.stopPayload.case2(schema.modelParameters.stopSequence),
stream: true,
temperature: schema.modelParameters.temperature,
top_p: schema.modelParameters.topP,
tools: functions.isEmpty ? nil : functions,
user: schema.modelParameters.user
)
)
)

Labels
enhancement New feature or request
Projects
Status: In Progress