Open
Labels: bug (Something isn't working)
Description
Bug description
The demo code in the section "Extract and save Item embeddings" raises a `ValueError: no output schema` exception.
Steps/Code to reproduce bug
Follow the demo as written. The exception is raised when running the "Extract and save Item embeddings" section.
Expected behavior
Environment details
- Merlin version:
- merlin 0.0.1
- merlin-core 0+untagged.1.g6d396aa
- merlin-dataloader 0+untagged.1.g1441a12
- merlin-hps 1.0.0
- merlin-models 0+untagged.1.geb1e541
- merlin-sok 2.0.0
- merlin-systems 0+untagged.1.ga19d311
- Platform: Linux 12ce9556ef42 5.4.0-200-generic
- Python version: 3.10.12
- PyTorch version (GPU?): N/A
- Tensorflow version (GPU?): 2.12.0+nv23.6
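For anyone trying to compare environments, a version list like the one above can be gathered with a short script. This is a minimal sketch using only the Python standard library; the distribution names in `PACKAGES` are assumptions based on the list above (plus `nvtabular`, which appears in the traceback) and may differ from how the packages are registered in a given container.

```python
from importlib import metadata
import platform

# Assumed distribution names; adjust to match the local installation.
PACKAGES = [
    "merlin-core",
    "merlin-dataloader",
    "merlin-models",
    "merlin-systems",
    "nvtabular",
    "tensorflow",
]


def collect_versions(names):
    """Return a dict mapping each distribution name to its installed version."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of failing the whole report.
            versions[name] = "not installed"
    return versions


def format_report(versions):
    """Format versions as a bullet list in the style of the issue template."""
    lines = [f"- {name} {version}" for name, version in versions.items()]
    lines.append(f"- Platform: {platform.platform()}")
    lines.append(f"- Python version: {platform.python_version()}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(collect_versions(PACKAGES)))
```

Running the script inside the same container as the demo prints one line per package, which can be pasted directly into the environment section of a report.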
Additional context
  File "/root/raid/common_models/recommend_system/merlin/aliccp/extract_item_feature.py", line 56, in main
    item_embeddings = workflow.fit_transform(Dataset(item_features)).to_ddf().compute()
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 236, in fit_transform
    self.fit(dataset)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 213, in fit
    self.executor.fit(dataset, self.graph)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 501, in fit
    ).sample_dtypes()
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset.py", line 1169, in sample_dtypes
    _real_meta = self.engine.sample_data(n=n)
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset_engine.py", line 64, in sample_data
    _head = _ddf.partitions[partition_index].head(n)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1268, in head
    return self._head(n=n, npartitions=npartitions, compute=compute, safe=safe)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1302, in _head
    result = result.compute()
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/optimization.py", line 990, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/utils.py", line 72, in apply
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 103, in transform
    transformed_data = self._execute_node(node, transformable, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 125, in _execute_node
    transform_output = self._run_node_transform(node, transform_input, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 255, in _run_node_transform
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 242, in _run_node_transform
    transformed_data = node.op.transform(selection, input_data)
  File "/usr/local/lib/python3.10/dist-packages/merlin/systems/dag/ops/workflow.py", line 107, in transform
    output = self.workflow._transform_df(transformable)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 256, in _transform_df
    raise ValueError("no output schema")
ValueError: no output schema