Issue: Dataset metadata parameter in create_dataset() not accessible via Dataset schema #1748

Open
@ashebensaadon

Issue you'd like to raise.

Summary

The create_dataset() method accepts a metadata parameter, but the created dataset doesn't expose this metadata through the Dataset schema or any retrieval methods.

Expected Behavior

When creating a dataset with metadata:

from langsmith import Client

client = Client()
dataset = client.create_dataset(
    dataset_name="Test Dataset",
    description="Test dataset with metadata",
    metadata={"version": "1.0", "author": "test", "project": "my-project"}
)

# Should be able to access metadata
print(dataset.metadata)  # Expected: {"version": "1.0", "author": "test", "project": "my-project"}

Actual Behavior

  • The create_dataset() method accepts the metadata parameter without error
  • The returned Dataset object has no metadata attribute
  • No way to retrieve the metadata that was passed during creation
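Until this is resolved, calling code can at least guard against the missing attribute. The following is a minimal sketch that assumes nothing about the SDK beyond the behavior described above: `get_dataset_metadata` is a hypothetical helper, not part of LangSmith, and `SimpleNamespace` stands in for the returned Dataset object so the snippet runs without a LangSmith connection.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_dataset_metadata(dataset: Any) -> Optional[dict]:
    """Return dataset.metadata if the attribute exists, otherwise None."""
    # getattr with a default swallows the AttributeError reported in this issue.
    return getattr(dataset, "metadata", None)


# Stand-in for the object currently returned by create_dataset():
# it carries a name and description but no metadata attribute.
dataset_without_metadata = SimpleNamespace(
    name="Test Dataset", description="Test dataset with metadata"
)

# Stand-in for what the expected behavior would look like.
dataset_with_metadata = SimpleNamespace(
    name="Test Dataset", metadata={"version": "1.0"}
)

print(get_dataset_metadata(dataset_without_metadata))
print(get_dataset_metadata(dataset_with_metadata))
```

This only avoids the crash; it does not recover the metadata that was passed at creation time.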

Dataset Schema Fields

According to the Dataset schema documentation, the Dataset class only includes:

  • created_at
  • example_count
  • id
  • inputs_schema
  • last_session_start_time
  • modified_at
  • outputs_schema
  • session_count
  • url

Missing: metadata field

Environment

  • LangSmith SDK version: 0.3.42
  • Python version: 3.12.3

Code to Reproduce

from langsmith import Client

client = Client()

# This works without error
dataset = client.create_dataset(
    dataset_name="Test Dataset Metadata",
    description="Testing metadata functionality",
    metadata={"version": "1.0", "environment": "test"}
)

# This fails - no metadata attribute
try:
    print(dataset.metadata)
except AttributeError as e:
    print(f"AttributeError: {e}")

# This also doesn't show metadata
print(dataset.__dict__)

# Re-reading the dataset also doesn't show metadata
retrieved_dataset = client.read_dataset(dataset_name="Test Dataset Metadata")
print(f"Retrieved dataset attributes: {list(vars(retrieved_dataset).keys())}")

Possible Solutions

  1. Add metadata field to Dataset schema and ensure it's returned when creating/reading datasets
  2. Remove metadata parameter from create_dataset() if it's not intended to be supported
  3. Add documentation clarifying the current status of dataset-level metadata support
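To make option 1 concrete, here is a rough sketch of what an optional metadata field on the schema could look like. It uses a plain dataclass rather than the SDK's actual schema class, and `DatasetSketch`, `from_api_response`, and the payload shape are all assumptions made for illustration; the real implementation and API response may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DatasetSketch:
    """Hypothetical, heavily simplified stand-in for the SDK's Dataset schema."""

    id: str
    name: str
    description: Optional[str] = None
    # Proposed addition: optional, so responses without metadata still parse.
    metadata: Optional[Dict[str, Any]] = None

    @classmethod
    def from_api_response(cls, payload: Dict[str, Any]) -> "DatasetSketch":
        """Build the object from a response dict, ignoring unknown keys."""
        known = {k: v for k, v in payload.items() if k in cls.__dataclass_fields__}
        return cls(**known)


# Assumed response shape; the real API payload may differ.
payload = {
    "id": "hypothetical-id",
    "name": "Test Dataset",
    "description": "Test dataset with metadata",
    "metadata": {"version": "1.0", "author": "test"},
    "unrelated_field": 123,
}

dataset = DatasetSketch.from_api_response(payload)
print(dataset.metadata)
```

Making the field optional with a `None` default keeps existing datasets, which were created without metadata, readable through the same schema.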

Additional Context

  • Example-level metadata works fine and is properly documented
  • This creates confusion as the API accepts the parameter but provides no way to retrieve it
  • Dataset description field works correctly

Documentation References

Suggestion:

No response
