
Improve Acceleration Framework Integration #205

Open
fabianlim opened this issue Jun 20, 2024 · 0 comments
Is your feature request related to a problem? Please describe.

@Ssukriti has some suggestions to improve the integration that was completed in #157

Remaining work for subsequent PRs after this PR is merged:

- CI/CD: we need to ensure that all the tests run regularly in CI/CD and are not skipped. That means all dependencies should be installed so our tests can run regularly. The purpose is to ensure that all tests pass with every release.
- Unit tests: the additional unit tests added are good, thank you. I also want to ensure the model produced after GPTQ-LoRA tuning is in the correct format and can be loaded and inferred correctly. We have had issues in the past where something changed and the model format produced was no longer correct; we should have tests that capture this so we have full confidence (will DM about this).
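As a sketch of the kind of post-tuning format check described above (the expected file names and the `save_dir` layout here are illustrative assumptions, not the project's actual output format):

```python
import json
import os
import tempfile

# Hypothetical artifacts a tuned adapter checkpoint directory might contain;
# the real names depend on how tuning saves the model.
EXPECTED_FILES = {"adapter_config.json", "adapter_model.safetensors"}

def check_checkpoint_format(save_dir: str) -> bool:
    """Return True if the checkpoint directory looks structurally valid."""
    present = set(os.listdir(save_dir))
    if not EXPECTED_FILES.issubset(present):
        return False
    # The adapter config should at least be valid JSON with a peft_type field.
    with open(os.path.join(save_dir, "adapter_config.json")) as f:
        config = json.load(f)
    return "peft_type" in config

# Simulate a checkpoint directory to exercise the check.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "adapter_config.json"), "w") as f:
        json.dump({"peft_type": "LORA"}, f)
    open(os.path.join(d, "adapter_model.safetensors"), "w").close()
    print(check_checkpoint_format(d))  # True for this simulated layout
```

A real test would go further and actually load the tuned model and run a forward pass, but even a structural check like this would have caught a silently changed output format.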

Describe the solution you'd like

To enable these unit tests, we need to enable CUDA in the GH workflows, because the quantized kernels can only run on a GPU.
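Until CUDA-capable runners exist, GPU-only tests are typically gated so they are skipped rather than failed on CPU-only machines; the point above is that CI should then surface skipped tests instead of silently passing. A minimal sketch of such a gate using stdlib `unittest` and a crude `nvidia-smi` probe (the probe and the test body are illustrative assumptions, not the repo's actual test code):

```python
import shutil
import unittest

def cuda_available() -> bool:
    # Crude stand-in for torch.cuda.is_available(): assume a GPU is
    # usable when the nvidia-smi binary is on PATH.
    return shutil.which("nvidia-smi") is not None

class TestQuantizedKernels(unittest.TestCase):
    @unittest.skipUnless(cuda_available(), "quantized kernels need a GPU")
    def test_gptq_lora_tuning(self):
        # Placeholder for a real GPU-only tuning test body.
        self.assertTrue(True)

# Run the suite explicitly; on a CPU-only runner the GPU test is
# reported as skipped, not failed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestQuantizedKernels)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The CI workflow would then need to treat "required GPU tests were skipped" as a failure, which is exactly why the workflows need CUDA enabled.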

We may also need to change the inference script to incorporate the AccelerationFramework there as well.

