Fails to update table meta when head is false #588

Open
@QiShao647

System information

  • Have I written custom code (no):
  • OS Platform and Distribution (Tencent tlinux 2.6):
  • FATE Flow version (2.2.0):
  • Python version (3.10.14):

Describe the current behavior
I uploaded a file with the config below:

{
    "file": "$file",
    "head": false,
    "name": "$filename",
    "namespace": "namespace",
    "extend_sid": true,
    "meta": {"delimiter": ",", "match_id_name": "id", "sample_id_name": "id"},
    "partitions": 1
}

The upload stage passes, but the dataframe_transformer component then fails with the following runtime error, even though sample_id_name is set in the meta info.

Traceback (most recent call last):
  File "/data/projects/fate/fate/python/fate/components/entrypoint/cli/component/execute_cli.py", line 147, in execute_component_from_config
    component.execute(ctx, role, **execution_io.get_kwargs())
  File "/data/projects/fate/fate/python/fate/components/core/component_desc/_component.py", line 101, in execute
    return self.callback(ctx, role, **kwargs)
  File "/data/projects/fate/fate/python/fate/components/components/dataframe_transformer.py", line 33, in dataframe_transformer
    table_reader = TableReader(
  File "/data/projects/fate/fate/python/fate/arch/dataframe/_frame_reader.py", line 66, in __init__
    self.check_params()
  File "/data/projects/fate/fate/python/fate/arch/dataframe/_frame_reader.py", line 70, in check_params
    raise ValueError("Please provide sample_id_name")
ValueError: Please provide sample_id_name

This is probably because the meta info from the upload config is not written into the table meta info when head is false.

Describe the expected behavior
FATE should handle the head: false case correctly and read the header from the meta info.

Other info / logs

In FATE-Flow/python/fate_flow/components/components/upload.py, update_schema does not call update_table_meta when head is false, so that case is not handled correctly:

    def update_schema(self, fp):
        id_index = 0
        read_status = False
        if self.parameters.head is True:
            data_head = fp.readline()
            id_index = self.update_table_meta(data_head)
            read_status = True
        else:
            pass
        return id_index, read_status
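
To make the suspected failure path concrete, here is a small toy model. It is not FATE or FATE Flow code: `update_schema_sketch` and `TableReaderSketch` are hypothetical stand-ins that only mimic the branch structure above and the error message from the traceback, under the assumption that the reader takes sample_id_name from the table meta.

```python
class TableReaderSketch:
    """Toy stand-in for the reader-side check; not the real TableReader."""

    def __init__(self, sample_id_name=None):
        self.sample_id_name = sample_id_name
        self.check_params()

    def check_params(self):
        # Same message as in the traceback; the real check may differ.
        if not self.sample_id_name:
            raise ValueError("Please provide sample_id_name")


def update_schema_sketch(head, config_meta, first_line):
    """Mimic the branches above: table meta is only filled when head is True."""
    table_meta = {}
    if head is True:
        table_meta = {"header": first_line.strip(), **config_meta}
    # head is False: nothing is copied from config_meta (the `pass` branch).
    return table_meta


config_meta = {"delimiter": ",", "match_id_name": "id", "sample_id_name": "id"}
table_meta = update_schema_sketch(False, config_meta, "1,0.5,0.7\n")

try:
    TableReaderSketch(sample_id_name=table_meta.get("sample_id_name"))
    error = None
except ValueError as exc:
    error = str(exc)
```

In this model the config carries sample_id_name, but with head set to false it never reaches the table meta, which would match the reported error.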

Contributing

  • Do you want to contribute a PR? (yes):
  • Briefly describe your candidate solution(if contributing):

Reading the header from the meta config in the else branch may solve the problem:

    def update_schema(self, fp):
        id_index = 0
        read_status = False
        if self.parameters.head is True:
            data_head = fp.readline()
            id_index = self.update_table_meta(data_head)
            read_status = True
        else:
            data_head = self.parameters.meta.header # add this
            id_index = self.update_table_meta(data_head) # add this
        return id_index, read_status
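
The proposed change can be tried outside FATE Flow with a self-contained sketch. `UploadSketch`, its `meta` dict, the `header` key, and this simplified `update_table_meta` are all hypothetical stand-ins for illustration; the real component and its parameter classes may behave differently.

```python
import io


class UploadSketch:
    """Toy stand-in for the upload component; not FATE Flow code."""

    def __init__(self, head, meta):
        self.head = head
        self.meta = dict(meta)   # meta supplied in the upload config
        self.table_meta = {}     # what would be stored with the table

    def update_table_meta(self, data_head):
        # Assumption: the header is a delimiter-separated string and the
        # sample id column is located by name.
        delimiter = self.meta.get("delimiter", ",")
        columns = [c.strip() for c in data_head.strip().split(delimiter)]
        sample_id_name = self.meta.get("sample_id_name")
        self.table_meta = {
            "header": columns,
            "sample_id_name": sample_id_name,
            "match_id_name": self.meta.get("match_id_name"),
        }
        return columns.index(sample_id_name) if sample_id_name in columns else 0

    def update_schema(self, fp):
        id_index = 0
        read_status = False
        if self.head is True:
            data_head = fp.readline()
            id_index = self.update_table_meta(data_head)
            read_status = True
        else:
            # No header line in the file: fall back to the header in meta.
            data_head = self.meta.get("header")
            if data_head is None:
                raise ValueError("head is false, so meta must provide a header")
            id_index = self.update_table_meta(data_head)
        return id_index, read_status


fp = io.StringIO("1,0.5,0.7\n2,0.1,0.9\n")
meta = {"delimiter": ",", "sample_id_name": "id", "match_id_name": "id",
        "header": "id,x0,x1"}
upload = UploadSketch(head=False, meta=meta)
id_index, read_status = upload.update_schema(fp)
```

Two details are worth checking in a real patch. The config in this report has no header entry in meta, so the sketch raises an explicit error in that case instead of guessing column names. It also leaves read_status as False in the else branch, because no line was consumed from the file and the first data row must still be read.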
