Add support for TiffStreamingRawDataset #687

@bhimrazy

🚀 Feature

Notes from @tchaton

We could integrate async-tiff (https://developmentseed.org/async-tiff/latest) into StreamingRawDataset. A rough sketch of the API:

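from litdata import StreamingRawDataset
from litdata.raw.types import TIFF  # type proposed in this issue
import torch

class TiffStreamingRawDataset(StreamingRawDataset):

    def setup(self, urls):
        # One TIFF descriptor per URL, read as 512x512 tiles with 3 channels.
        # "...." marks further arguments left unspecified in the proposal.
        return [TIFF(url, tile=(512, 512, 3), ....) for url in urls]

    def __getitem__(self, decoded_bytes: bytes):
        # dtype is a keyword-only argument of torch.frombuffer.
        return torch.frombuffer(decoded_bytes, dtype=torch.uint8)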

Examples: https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datamodules.py#L89 (datamodule) and https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datasets.py#L42 (dataset).
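
To make the `tile=(512, 512, 3)` argument more concrete, here is a minimal sketch of how an image could be split into tile windows and how many bytes a uint8 tile would hold. `TileWindow`, `iter_tile_windows` and `tile_nbytes` are hypothetical helpers for illustration only; they are not part of litdata or async-tiff. The sketch also assumes edge tiles are clipped to the image bounds, whereas tiled TIFF files usually store edge tiles padded to the full tile size.

```python
from typing import Iterator, NamedTuple


class TileWindow(NamedTuple):
    """Pixel window covered by one tile (hypothetical helper type)."""
    row_off: int
    col_off: int
    height: int
    width: int


def iter_tile_windows(
    image_height: int,
    image_width: int,
    tile_height: int = 512,
    tile_width: int = 512,
) -> Iterator[TileWindow]:
    """Yield tile windows in row-major order, clipping tiles at the image edges."""
    if min(image_height, image_width, tile_height, tile_width) <= 0:
        raise ValueError("image and tile dimensions must be positive")
    for row_off in range(0, image_height, tile_height):
        for col_off in range(0, image_width, tile_width):
            yield TileWindow(
                row_off,
                col_off,
                min(tile_height, image_height - row_off),
                min(tile_width, image_width - col_off),
            )


def tile_nbytes(window: TileWindow, channels: int = 3, bytes_per_sample: int = 1) -> int:
    """Size in bytes of the decoded tile, assuming interleaved uint8 samples by default."""
    return window.height * window.width * channels * bytes_per_sample


# Example: a 1024 x 1300 image split into 512 x 512 tiles.
windows = list(iter_tile_windows(1024, 1300))
```

With these assumptions, the byte count of a window is the length that the `decoded_bytes` buffer passed to `__getitem__` would need before being reshaped to `(height, width, channels)`.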

Motivation

Pitch

Alternatives

Additional context

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed)
