-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Excel to entities #56
base: main
Are you sure you want to change the base?
Conversation
…ython model entities
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! can you add some testing and some excels where to do this?
Also, I added a few minor comments on minor design this you can decide if accept or just keep doing it with your style.
Then, one open point: some code doing checks on the format is commented out. We would actually like to keep this, right? Is this the issue you were having with the logger?
@@ -38,20 +38,44 @@ def cli(): | |||
it is using the value of the `OPENBIS_URL` environment variable. | |||
""", | |||
) | |||
def fill_masterdata(url): | |||
@click.option( | |||
"--path", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe be a bit more specific, something like --excel-file
?
import re | ||
from typing import TYPE_CHECKING, Any | ||
|
||
# import logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from bam_masterdata.logger import logger
# logger = logger() | ||
|
||
|
||
def index_to_excel_column(index): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typing and description? I let you decide
return column | ||
|
||
|
||
def get_last_non_empty_row(sheet, start_index): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typing? again, you decide if whether doing it
I think is:
def get_last_non_empty_row(sheet: Worksheet, start_index: int) -> int:
"Section", | ||
"Property label", | ||
"Data type", | ||
"Vocabulary code", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add "Object code"
commented out? This is because in the excel import system is missing, but for our Python and other stuff we need this information when the Data type
is OBJECT
codes.append(cell.value) | ||
else: | ||
codes.append(None) | ||
# invalid_codes = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that I see this code, this is not dead code, but actually we should have these checks and print the messages to the logger, right?
entity_type = sheet[entity_type_position].value | ||
|
||
entity_types = [ | ||
"OBJECT_TYPE", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need both OBJECT_TYPE and SAMPLE_TYPE here?
) | ||
# return "\n".join(errors) | ||
else: | ||
if entity_type == "SAMPLE_TYPE" or entity_type == "OBJECT_TYPE": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for readability, this and the previous ifs in the terms_to_dict
could be part of a class and methods applied to each type. There you can identify common abstracted methods to be applied
# Check if we've reached the end of the sheet or found two consecutive empty rows | ||
if last_non_empty_row is None: | ||
if i == len(sheet_names) - 1: # Check if it's the last sheet | ||
print( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger.info(...)
start_time = time.time() | ||
|
||
# Define output directory | ||
output_directory = "./artifacts/tmp/" if path else DATAMODEL_DIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about exporting to os.path.join(DATAMODEL_DIR, "tmp")
instead of in artifacts? This way the one creating this temporary files can compare if there are already some python modules under DATAMODEL_DIR
Pull Request Test Coverage Report for Build 12952511664Details
💛 - Coveralls |
No description provided.