-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add nvidia ingest component #6333
base: main
Are you sure you want to change the base?
Conversation
Open Items:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed : transformers was getting installed as part of the dependencies.
@ogabrielluiz Looking forward to your views on it.
] | ||
|
||
outputs = [ | ||
Output(display_name="Data", name="data", method="load_file"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the component is having a list[Data]
I would suggest to add two outputs. one for Data and other for DataFrame
ref. URL.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we think that introducing this secondary dataframe output in the ingest components is cluttering the UI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is anything that is List[Data] , Dataframe is suggested, in most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jordanrfrazier Yes, but it is a migration strategy. Soon we will have more Components with DataFrame input and it we can start removing the list[Data] outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jordanrfrazier
What's the most common use case here? Just because we have the capability doesn't mean we have to include it. If most users primarily work with Data, adding a secondary DataFrame output clutters the UI and introduces [more] decision fatigue.
Is there a strong use case where exposing both by default improves usability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ogabrielluiz Isn't this more of an internal implementation detail, in that case? i.e. if the langflow.Dataframe
can be handled in the same way as a list[Data]
(and by extension, Data
, I'm assuming), shouldn't we be able to just do a single swap of output/input types with a conversion method in the langflow.Dataframe
class to Data
if still required by the component?
Side note: Is there a strong reason for the name change to Dataframe
? Does that unconsciously couple us to pandas
too much, in the scenario we decide to change the backing data type in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we consider inclusion of DataFrame as a future improvement?
Yeah.. that's not good. It didn't install pytorch, though which is the heaviest one. Maybe it isn't that bad but I'd try to avoid if possible, maybe checking extras could help. |
I moved to an optional dependency for now |
d6a54ca
to
7065d5b
Compare
@jordanrfrazier, Can you resolve the conflicts? |
ee63f60
to
1179f2d
Compare
CodSpeed Performance ReportMerging #6333 will degrade performances by 10.32%Comparing Summary
Benchmarks breakdown
|
26649e8
to
5560b0c
Compare
Adds the nv-ingest component. Adds nv-ingest-client as an optional dependency, as it introduces several large packages: