feat(rag): secret vault injection #64

Open · wwwehr wants to merge 11 commits into main

Conversation

@wwwehr (Member) commented Feb 21, 2025:

this is not yet ready for review

Commits:

- repaired nilql pip import
- updating to nilrag that pins to nilql@alpha10
- added support for git branches in pyproject requirements
- removed gpu affinity for api service
- …tadata based on arbitrary values; TODO: add retry mechanism to the LLM transform so that it retries until getting the schema format right (see sketch below)
- …at the nildb save always uses a worker/tools model
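The retry TODO above isn't implemented in this PR; a minimal sketch of such a loop, assuming a caller-supplied `call_llm` function and a JSON Schema dict (both hypothetical names, not part of this PR), could look like:

```python
# Hypothetical sketch of the TODO'd retry loop. `call_llm` and `schema`
# are placeholder names, not part of this PR.
import json
from jsonschema import Draft7Validator

def transform_with_retry(call_llm, prompt: str, schema: dict, max_retries: int = 3) -> dict:
    """Re-prompt the LLM until its output validates against the schema."""
    validator = Draft7Validator(schema)
    feedback = ""
    for _ in range(max_retries):
        raw = call_llm(prompt + feedback)
        try:
            candidate = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"\nYour previous output was not valid JSON: {exc}"
            continue
        errors = list(validator.iter_errors(candidate))
        if not errors:
            return candidate
        feedback = "\nFix these schema errors: " + "; ".join(e.message for e in errors)
    raise ValueError(f"LLM output failed schema validation after {max_retries} attempts")
```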
@wwwehr requested a review from jcabrero on March 2, 2025
@wwwehr marked this pull request as ready for review on March 2, 2025
Member commented:

Does this differ significantly in behaviour from the original Deepseek 14B docker compose that we had before? If the behaviour is better (which I assume, given your improvements), would it make sense to merge both into a single Deepseek 14B file? That would avoid having to maintain multiple source files going forward.

Member Author (@wwwehr) replied:

I committed the new one so that we can discuss the differences and decide whether we want to integrate the new points.

1. I noticed that starting two models simultaneously on the H100 often failed, so I put in the depends_on for the worker model. I propose we always have a worker model available, but it doesn't need to live on the same server...
2. I didn't include tensor-parallel-size; it's set to the default value.
3. I added the "reasoning" MODEL_ROLE flag.

wdyt?

@jcabrero (Member) commented Mar 11, 2025:

I don't really know. I have a discussion with ecosystem and product to decide which models we include by default. I wonder whether the worker model preserves privacy, and whether that's something we are willing to live with.

@jcabrero (Member) commented Mar 4, 2025:

How do we use this MODEL_ROLE in practice? Is it something standardized that LangChain or AutoGen use in any way? If so, and given that we've got it as a model parameter description, I would add it to the rest of the docker-compose files.

Member Author (@wwwehr) replied:

From the client, I needed a way to distinguish between "this is a reasoning model" and "this is a worker model". Rather than the client needing knowledge of the model itself, as a developer I simply want a "tag" that tells me, so I can automatically select the kind of model I want. wdyt?
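For illustration, the selection I have in mind could look something like this; the `role` metadata key and the model names here are just placeholders, not the actual API:

```python
# Illustrative only: the "role" key and the model names are placeholders,
# not a confirmed interface of this repo.
def pick_model(models: list[dict], role: str) -> dict:
    """Return the first model advertising the requested MODEL_ROLE tag."""
    for model in models:
        if model.get("role") == role:
            return model
    raise LookupError(f"no model with role={role!r} is available")

# e.g. select the worker model without knowing anything about the model itself
worker = pick_model(
    [{"name": "deepseek-r1-14b", "role": "reasoning"},
     {"name": "llama-3.2-1b", "role": "worker"}],
    "worker",
)
```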

Member replied:

That sounds perfect. I agree. As I commented in the review, I think we should extend it to all models, potentially even making it an enum, or a list of allowed values, in a Pydantic model.
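Something like this sketch, say, with invented class and field names:

```python
# Sketch of the enum-in-a-Pydantic-model idea; class and field names are
# invented for illustration, not taken from this PR.
from enum import Enum
from pydantic import BaseModel

class ModelRole(str, Enum):
    REASONING = "reasoning"
    WORKER = "worker"

class ModelMetadata(BaseModel):
    name: str
    role: ModelRole  # rejects anything outside the allowed values

meta = ModelMetadata(name="deepseek-r1-14b", role="reasoning")
assert meta.role is ModelRole.REASONING
```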

@jcabrero (Member) left a review comment:

I think the PR is mostly ready. However, I wonder whether we should handle the cases where functionality depends on the Watt model or a "worker" model. If neither exists and users try to use nilDB, it won't work. We should probably add a check that either disables nilDB or returns an error when no worker model is available, before even trying to connect to nilDB, run inference, etc.
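For instance (names invented, since the service interfaces aren't shown in this excerpt):

```python
# Sketch of the fail-fast guard being requested; the model-listing structure
# and the exception type are placeholders.
def ensure_worker_model(available_models: list[dict]) -> dict:
    """Fail before any nilDB connection or inference is attempted."""
    for model in available_models:
        if model.get("role") == "worker":
            return model
    raise RuntimeError("nilDB features require a worker/tools model, but none is configured")
```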

"""Build a validator to validate the candidate document against loaded schema."""
return validators.extend(Draft7Validator)

def data_reveal(self, filter: dict = {}) -> list[dict]:
Member commented:

Same comment as above for `filter: dict = {}`. Can you make it into a Pydantic model?
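For example, something like the following; the field names are invented since the real filter keys aren't visible in this diff, and a model also sidesteps the mutable-default `{}` pitfall:

```python
# Field names are invented for illustration; the real filter keys aren't
# visible in this diff context.
from pydantic import BaseModel, Field

class RevealFilter(BaseModel):
    """Typed replacement for the bare `filter: dict = {}` argument."""
    owner: str | None = None
    tags: list[str] = Field(default_factory=list)

# the signature would then become something like:
# def data_reveal(self, filter: RevealFilter | None = None) -> list[dict]: ...
```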

logger.info(f"Error retrieving records in node: {e!r}")
return []

def post(self, data_to_store: list) -> list:
Member commented:

Same comment for `data_to_store`.
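E.g., with placeholder fields:

```python
# Placeholder record shape; the real fields depend on the vault schema.
from pydantic import BaseModel

class VaultRecord(BaseModel):
    key: str
    value: str

# def post(self, data_to_store: list[VaultRecord]) -> list: ...
```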
