Description
System information
- Have I written custom code (no):
- OS Platform and Distribution (Tencent tlinux 2.6):
- FATE Flow version (2.2.0):
- Python version (use command: Python 3.10.14):
Describe the current behavior
Uploading a file with the config below
{
    "file": "$file",
    "head": false,
    "name": "$filename",
    "namespace": "namespace",
    "extend_sid": true,
    "meta": {"delimiter": ",", "match_id_name": "id", "sample_id_name": "id"},
    "partitions": 1
}
passes the upload stage, but the downstream dataframe_transformer component then fails at runtime with the following error, even though sample_id_name is set in meta.
Traceback (most recent call last):
  File "/data/projects/fate/fate/python/fate/components/entrypoint/cli/component/execute_cli.py", line 147, in execute_component_from_config
    component.execute(ctx, role, **execution_io.get_kwargs())
  File "/data/projects/fate/fate/python/fate/components/core/component_desc/_component.py", line 101, in execute
    return self.callback(ctx, role, **kwargs)
  File "/data/projects/fate/fate/python/fate/components/components/dataframe_transformer.py", line 33, in dataframe_transformer
    table_reader = TableReader(
  File "/data/projects/fate/fate/python/fate/arch/dataframe/_frame_reader.py", line 66, in __init__
    self.check_params()
  File "/data/projects/fate/fate/python/fate/arch/dataframe/_frame_reader.py", line 70, in check_params
    raise ValueError("Please provide sample_id_name")
ValueError: Please provide sample_id_name
This is probably because, when head is false, the meta info from the upload config is never written into the table meta info.
Describe the expected behavior
FATE should handle the head: false case correctly and take the header from the meta info.
Other info / logs
In FATE-Flow/python/fate_flow/components/components/upload.py, the head: false case is not handled: the else branch of update_schema does nothing.
def update_schema(self, fp):
    id_index = 0
    read_status = False
    if self.parameters.head is True:
        data_head = fp.readline()
        id_index = self.update_table_meta(data_head)
        read_status = True
    else:
        pass
    return id_index, read_status
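The effect of the empty else branch can be reproduced in isolation. The sketch below is a minimal stand-in and not the FATE Flow code: the `FakeUpload` class, its `table_meta` dict, and the behaviour of `update_table_meta` (record the header and treat the first column as the sample id) are assumptions made only for illustration. Only the control flow of `update_schema` mirrors the snippet above.

```python
import io


class FakeUpload:
    """Hypothetical stand-in for the upload component (not FATE Flow code)."""

    def __init__(self, head, meta):
        self.head = head
        self.meta = meta        # "meta" block from the upload config
        self.table_meta = {}    # what a downstream reader would see

    def update_table_meta(self, data_head):
        # Assumed behaviour: store the header columns and use the first
        # column as the sample id.
        columns = data_head.strip().split(self.meta["delimiter"])
        self.table_meta["header"] = columns
        self.table_meta["sample_id_name"] = columns[0]
        return 0

    def update_schema(self, fp):
        # Same control flow as update_schema in the snippet above.
        id_index = 0
        read_status = False
        if self.head is True:
            data_head = fp.readline()
            id_index = self.update_table_meta(data_head)
            read_status = True
        else:
            pass
        return id_index, read_status


meta = {"delimiter": ",", "match_id_name": "id", "sample_id_name": "id"}

with_head = FakeUpload(head=True, meta=meta)
with_head.update_schema(io.StringIO("id,x0,x1\n1,0.5,0.7\n"))

without_head = FakeUpload(head=False, meta=meta)
without_head.update_schema(io.StringIO("1,0.5,0.7\n"))

print(with_head.table_meta)
print(without_head.table_meta)
```

In this stand-in, the head: false instance ends up with an empty `table_meta`, so the `sample_id_name` from the config never reaches a downstream reader, which matches the reported symptom.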
Contributing
- Do you want to contribute a PR? (yes):
- Briefly describe your candidate solution(if contributing):
Reading the header line from the meta config when head is false would probably solve the problem.
def update_schema(self, fp):
    id_index = 0
    read_status = False
    if self.parameters.head is True:
        data_head = fp.readline()
        id_index = self.update_table_meta(data_head)
        read_status = True
    else:
        data_head = self.parameters.meta.header  # add this
        id_index = self.update_table_meta(data_head)  # add this
    return id_index, read_status
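The idea behind the proposal can be sketched as a standalone function. This is a rough illustration under stated assumptions, not a patch for FATE Flow: `schema_from_meta` is a hypothetical helper, `meta` is treated as a plain dict, and an optional `header` key holding a delimiter-joined string is assumed (the example config in this report has no such key, so the sketch also falls back to `sample_id_name` and `match_id_name` directly).

```python
def schema_from_meta(meta, head, first_line=None):
    """Hypothetical helper: build table meta for an upload.

    Returns (table_meta, read_status), where read_status says whether a
    header line was consumed from the file.
    """
    delimiter = meta.get("delimiter", ",")

    if head:
        # Header comes from the first line of the file.
        columns = first_line.strip().split(delimiter)
        return {"header": columns, "sample_id_name": columns[0]}, True

    # head is false: nothing is read from the file, so rely on the config meta.
    table_meta = {
        "sample_id_name": meta.get("sample_id_name"),
        "match_id_name": meta.get("match_id_name"),
    }
    if "header" in meta:
        # Assumed format: a delimiter-joined string such as "id,x0,x1".
        table_meta["header"] = meta["header"].split(delimiter)
    if table_meta["sample_id_name"] is None:
        # Fail at upload time instead of in a later component.
        raise ValueError("Please provide sample_id_name")
    return table_meta, False


config_meta = {"delimiter": ",", "match_id_name": "id", "sample_id_name": "id"}
table_meta, read_status = schema_from_meta(config_meta, head=False)
print(table_meta, read_status)
```

Raising at upload time when `sample_id_name` is missing is a design choice of this sketch: it would surface the misconfiguration before the job reaches a later component.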