Add support for multi-sample items in optimize and yielding from the __getitem__ of the StreamingDataset  #317

@tchaton

🚀 Feature

Motivation

It would be great to be able to create a batch of sub-samples from a single sample. Right now, this isn't possible.

However, we could let the user express this:

def optimize(...):

    sample = ...
    return MultiSample(sample, num_samples=X)

Under the hood, we would know that this sample can be used to generate multiple random samples.
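One way to picture the "under the hood" part is as an index expansion: each stored item says how many sub-samples it stands for, and a flat dataset index is mapped to an (item index, sample_id) pair. The sketch below is plain Python with no litdata dependency. `MultiSample` here is only a stand-in for the proposed wrapper, and `build_offsets`, `expand_index`, and the running-total layout are assumptions made for illustration, not an existing API.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate
from typing import Any, List, Tuple


@dataclass
class MultiSample:
    """Hypothetical wrapper: one stored sample standing for `num_samples` training samples."""
    sample: Any
    num_samples: int


def build_offsets(items: List[MultiSample]) -> List[int]:
    # Running totals of num_samples, e.g. counts [3, 1, 2] give [3, 4, 6].
    return list(accumulate(item.num_samples for item in items))


def expand_index(offsets: List[int], flat_index: int) -> Tuple[int, int]:
    """Map a flat index to (item index, sample_id within that item)."""
    total = offsets[-1] if offsets else 0
    if not 0 <= flat_index < total:
        raise IndexError(flat_index)
    # First item whose running total is strictly greater than flat_index.
    item_index = bisect_right(offsets, flat_index)
    start = offsets[item_index - 1] if item_index > 0 else 0
    return item_index, flat_index - start
```

With this layout the dataset length would be the sum of all `num_samples`, and the `sample_id` from `expand_index` is what a two-argument `__getitem__` could receive.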

class MyStreamingDataset(StreamingDataset):

    def __getitem__(self, index, sample_id):
        sample = super().__getitem__(index)

        # do some transformation
        data = sample
        return data

A use case would be image detection, where each image can be used to generate multiple sub-boxes that we might want to treat as separate training samples.
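To make the use case concrete, here is a minimal sketch under stated assumptions: images are nested lists, boxes are `(top, left, bottom, right)` tuples with exclusive ends, and `BoxCropDataset`, `crop`, and `num_samples` are hypothetical names rather than litdata classes. Since Python's `ds[i, j]` syntax passes a single tuple to `__getitem__`, the sketch unpacks `(index, sample_id)` from that tuple instead of taking two positional arguments.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    # Cut the rectangular region described by `box` out of `image`.
    top, left, bottom, right = box
    return [list(row[left:right]) for row in image[top:bottom]]


class BoxCropDataset:
    """Stand-in dataset: each (index, sample_id) pair yields one box crop."""

    def __init__(self, images, boxes):
        self.images = images
        self.boxes = boxes  # boxes[i] lists the boxes for images[i]

    def num_samples(self, index: int) -> int:
        # How many training samples image `index` can produce.
        return len(self.boxes[index])

    def __getitem__(self, key: Tuple[int, int]) -> List[List[int]]:
        index, sample_id = key
        return crop(self.images[index], self.boxes[index][sample_id])
```

For example, an image with two boxes would report `num_samples(0) == 2`, and `ds[0, 0]` and `ds[0, 1]` would return the two crops as separate samples.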

Pitch

Alternatives

Additional context

Metadata

    Labels

    enhancement, help wanted
