Skip to content

Excel export not import-compatible #220

@ibresslerBAM

Description

@ibresslerBAM

I am in progress of setting up a separate OpenBIS instance (with regular tear-down/setup) for testing our data models of further instruments to integrate but also for testing specific upload code against new models and adjustments during development.

For replicating the production environment as closely as possible, one idea is to use this repo and its tools as source for the masterdata already in place (ground-truth). The command bam_masterdata export_to_excel comes in handy for this. Next would be to go into the admin interface -> Tools -> Import. Turns out, the generated excel files are not compatible with OpenBIS, the version we run currently (20.10.11) does not like it:

Image

To sum it up, for reference and for those who try to attempt a similar thing, there are multiple issues. I did not propose fixes yet, since, for making it work, I changed some parts for which I do not fully understand their purpose in other parts of the project, therefore I'd just like to document & discuss, what needs changed (if there is interest to make it compatible, of course ;) ):

  1. In metadata/definitions.py I have to filter object attributes which are not known to OpenBIS server, they must not show up in the generated excel file:

    --- a/bam_masterdata/metadata/definitions.py
    +++ b/bam_masterdata/metadata/definitions.py
    @@ -162,7 +162,7 @@ class EntityDef(BaseModel):
             fields = [
                 k
                 for k in self.model_fields.keys()
    -            if k not in ["iri", "id", "row_location"]
    +            if k not in ["iri", "id", "row_location", "object_code", "internal_assignment", "unique", "main_dataset_path", "main_dataset_pattern", "url_template", "official"]
             ]
             headers: dict = {}
             for f in fields:
    
  2. The dynamic/validation scripts are not exported, e.g. DEFAULT_EXPERIMENT.date_range_validation does not exist here, right? -> Are scripts to be covered by bam-masterdata?

    It gives an error during import, saying UserFailureException: sheet: 1 line: 11 message: Object with PluginPermId = [DEFAULT_EXPERIMENT] has not been found. (Context: [setting validation script (1/1)

    If it were available, a folder scripts containing them alongside the .XLSX in a .ZIP archive should be provided for upload to OpenBIS (see also ). As quick fix, it should not be mentioned in the exported Excel file:

    --- a/bam_masterdata/cli/entities_to_excel.py
    +++ b/bam_masterdata/cli/entities_to_excel.py
    @@ -46,7 +46,11 @@ def entities_to_excel(
             excel_headers = []
             header_values = []
             for field, excel_header in obj_definitions.excel_headers_map.items():
    -            header_values.append(getattr(obj_definitions, field))
    +            value = getattr(obj_definitions, field)
    +            if "validation" in excel_header.lower() and value is not None:
    +                print(f"Skipping {obj_definitions.code}, {excel_header}, {value=}")
    +                value = None
    +            header_values.append(value)
                 excel_headers.append(excel_header)
             worksheet.append(excel_headers)
             worksheet.append(header_values)
    
  3. Datatype OBJECT can not be imported by OpenBIS, it seems that it's just not supported and thus throws an error: UserFailureException: sheet: 1 line: 39 message: No enum constant ch.ethz.sis.openbis.generic.asapi.v3.dto.property.DataType.OBJECT (Context: []).
    To make it work anyway, for testing, I replaced all occurrences by VARCHAR:

    +++ b/bam_masterdata/cli/entities_to_excel.py
    @@ -63,6 +63,8 @@ def entities_to_excel(
                     for field in prop.excel_headers_map.keys():
                         if field == "data_type":
                             val = prop.data_type.value
    +                        if "OBJECT" in val:
    +                            val = "VARCHAR"
                         else:
                             val = getattr(prop, field)
                         row.append(val)
    

    This is kind of a show-stopper for the intended purpose (restoring main masterdata). Perhaps, someone knows a workaround in the context of excel-imports? For my purpose, this points me to other methods of syncing masterdata, perhaps server-side scripts.

  4. Another hard-to-solve issue on this side and version are internal properties (beginning with $). They can be modified (and created) only by system user and not instance admin. Afaik, you can't login as system user, but only server-side scripts run with this level of privileges can modify those properties (same as previous issue).

    For example, when importing with Update if exists mode, it complains early at the $NAME property with UserFailureException: sheet: 1 line: 5 message: Authorization failure: Property assignments created by the system user for internal property types can be managed only by the system user.

    There is one new internal property defined by our masterdata here ($ANNOTATIONS_STATE) that does not exist in the default installation of OpenBIS and therefore it can not be imported by Excel files. To make it work, I filtered this property from all objects, and thus the associated functionality will be missing after import:

    --- a/bam_masterdata/cli/entities_to_excel.py
    +++ b/bam_masterdata/cli/entities_to_excel.py
    @@ -59,6 +59,8 @@ def entities_to_excel(
                     list(obj_instance.properties[0].excel_headers_map.values())
                 )
                 for prop in obj_instance.properties:
    +                if "$ANNOTATIONS_STATE" in prop.code:
    +                    continue
                     row = []
                     for field in prop.excel_headers_map.keys():
                         if field == "data_type":
    
  5. Since other already existing internal properties get exported as well and by importing the file with ignore if exists update mode, it will not complain. But of course, this does not update other (non-internal) properties that might have changed.
    Image

With those changes, I could restore masterdata partially :)
Image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions