[New connector] Onelake connector #3057
base: main
…onelake connector)
💚 CLA has been signed
for path in doc_paths:
    file_name = path.name.split("/")[-1]
    field_client = await self._get_file_client(file_name)
Looking at the code, multiple clients are created for each file. Is there a way to reuse the clients between calls? I can see how this could become a problem for RAM usage.
Hello Artem, thank you so much for your feedback.
Regarding this comment: each file client represents a specific file and is initialized with the file name, so it’s not possible to reuse the client. I believe the garbage collector should remove unused clients, but that’s just an assumption.
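A minimal sketch of one possible middle ground for the concern above, assuming the per-file clients can be derived from a shared parent client and closed explicitly instead of being left to the garbage collector. `FakeFileClient`, `FakeDirectoryClient`, and `OneLakeSketch` are hypothetical stand-ins written for this illustration; they are not the connector's code or the Azure SDK, whose actual behaviour should be checked before adopting this pattern.

```python
import asyncio


class FakeFileClient:
    """Hypothetical stand-in for an SDK file client bound to one file."""

    def __init__(self, file_name):
        self.file_name = file_name
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Release the client's resources as soon as the file is processed.
        self.closed = True

    async def get_properties(self):
        return {"name": self.file_name}


class FakeDirectoryClient:
    """Hypothetical stand-in for the shared parent (directory) client."""

    def __init__(self):
        self.created = []

    def get_file_client(self, file_name):
        client = FakeFileClient(file_name)
        self.created.append(client)
        return client


class OneLakeSketch:
    def __init__(self, directory_client_factory):
        self._factory = directory_client_factory
        self._directory_client = None  # built lazily, then reused
        self.factory_calls = 0

    def _get_directory_client(self):
        if self._directory_client is None:
            self.factory_calls += 1
            self._directory_client = self._factory()
        return self._directory_client

    async def get_docs(self, doc_paths):
        directory_client = self._get_directory_client()
        docs = []
        for path in doc_paths:
            file_name = path.split("/")[-1]
            # One short-lived client per file, closed deterministically.
            async with directory_client.get_file_client(file_name) as file_client:
                docs.append(await file_client.get_properties())
        return docs


async def demo():
    source = OneLakeSketch(FakeDirectoryClient)
    await source.get_docs(["Files/a.txt", "Files/b.txt"])
    docs = await source.get_docs(["Files/c.txt"])
    return source, docs
```

In this sketch the expensive parent client is created once, while each file client lives only for the duration of its `async with` block, so memory use does not depend on when the garbage collector runs.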
connectors/sources/onelake.py
Outdated
def _get_account_url(self):
    """Get the account URL for OneLake

    Returns:
        str: Account URL
    """

    return f"https://{self.configuration['account_name']}.dfs.fabric.microsoft.com"
This can just become a field of the class that's set during init.
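A minimal sketch of this suggestion, assuming the configuration behaves like a dictionary with an `account_name` key as in the snippet under review. The class name `OneLakeDataSourceSketch` and the attribute name `account_url` are hypothetical and are not taken from the PR.

```python
class OneLakeDataSourceSketch:
    """Hypothetical, trimmed-down data source; only the URL handling is shown."""

    def __init__(self, configuration):
        self.configuration = configuration
        # Computed once at construction time instead of on every call.
        self.account_url = (
            f"https://{configuration['account_name']}.dfs.fabric.microsoft.com"
        )


# "onelake" is a placeholder account name used only for this example.
source = OneLakeDataSourceSketch({"account_name": "onelake"})
print(source.account_url)
```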
I moved it to the constructor method:
…ker-compose and connector.json
Hello! I made changes regarding the asynchrony. I converted the synchronous methods to asynchronous using asyncio, based on the implementation of the Google Drive connector.
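For readers unfamiliar with this kind of conversion, one common asyncio pattern is to run a blocking call in a thread pool so it does not stall the event loop. The sketch below illustrates that pattern only; it is an assumption, not necessarily what this PR or the Google Drive connector does. `list_paths_blocking` is a hypothetical stand-in for a synchronous SDK call.

```python
import asyncio
import time
from functools import partial


def list_paths_blocking(directory, delay=0.05):
    """Hypothetical blocking SDK call, simulated with time.sleep."""
    time.sleep(delay)
    return [f"{directory}/file_{i}.txt" for i in range(2)]


async def list_paths(directory):
    """Async wrapper: run the blocking call in the default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(list_paths_blocking, directory))


async def main():
    # Both calls overlap instead of blocking the event loop one after another.
    return await asyncio.gather(list_paths("Files/a"), list_paths("Files/b"))


results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, so `results[0]` holds the paths for `Files/a` and `results[1]` those for `Files/b`.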
TITLE: [New connector] Onelake connector
Closes #3051
Added the Onelake connector files: the connector's code, tests, requirements, and the reference to the connector in the sources list (connectors/config.py).
Checklists
Pre-Review Checklist
config.yml.example
v7.13.2, v7.14.0, v8.0.0