BREAKING: Refactored SDK I/O #768
aajtodd
announced in
Announcements
-
Hi guys! I have a method that downloads an S3 object in chunks of 1 MiB and sends each chunk to a channel for processing... Something like this
I tried to rewrite it, but I'm getting a NotFound error for the SdkBuffer class.
Thanks in advance!
-
An upcoming release of the AWS SDK for Kotlin includes redesigned I/O abstractions (specifically affecting the ByteStream type).
If you do not unwrap a ByteStream into one of the possible variants or supply a custom implementation of ByteStream, this change should not affect you.
As an example, if you only create and consume instances of ByteStream via convenience functions (e.g. ByteStream.fromFile(...), ByteStream.fromBytes(...), ByteStream.writeToFile(...), ByteStream.toByteArray(), etc.) then you should be unaffected.
Release date
This feature will ship with the v0.19.0 release planned for 2022-12-01.
Background
When the AWS SDK for Kotlin project was first started, there weren't many options for multi-platform I/O abstractions (there still aren't). There was the kotlinx-io incubator project, which looked promising but wasn't ready for consumption yet. kotlinx-io was going to be an attempt to clean up and standardize the APIs in ktor-io (the underlying I/O abstractions used by the Ktor project). We decided to take a minimal subset of the channel API from Ktor with the expectation that we would later upgrade to kotlinx-io and converge on what we believed would become a community standard.
kotlinx-io has since been deprecated, which left us with an I/O API that wasn't necessarily meant to be permanent. This set of breaking changes comes from taking a second look at the underlying I/O abstractions we provide and want to commit to long term.
What's changing?
The SdkByteReadChannel interface has been simplified. Previously it required implementing at least 4 non-trivial read functions, making it hard to wrap or provide custom implementations of. It also forced at least one additional copy between producer and consumer. The new interface requires implementing only a single read method and minimizes copies.
We have introduced a new ByteStream.SourceStream variant that takes an SdkSource type. SdkSource is a new blocking interface. It is simpler to implement and improves performance for file-backed I/O by removing a channel middleman.
How is SdkByteReadChannel different from SdkSource?
SdkByteReadChannel and SdkSource are two different ways of supplying a stream of bytes. SdkSource is a blocking interface whereas SdkByteReadChannel is fully asynchronous. The addition of SdkSource provides additional flexibility in how a stream of bytes can be modeled.
Both interfaces work by requiring data to be written to or read from an SdkBuffer. Forcing a concrete type in the interfaces allows copies to be minimized (because the internal representation of the type is known and can be leveraged when reading and writing data). The inspiration for this (and the current backing implementation) comes from Okio. Moving data between buffers is cheap, which makes layering these types easy and performant (data is frequently re-assigned between buffers rather than copied).
How to Migrate
NOTE: This would be a good time to review any custom code and update to a convenience function if one already exists that matches your use case.
Consuming ByteStream
Any code that consumes a ByteStream and unwraps it into one of the variants will need to be updated to handle the refactored type hierarchy (and differences in the underlying variants).
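The shape of the refactored hierarchy can be sketched without the SDK on the classpath. The block below is a simplified model, not the SDK's declarations: SketchByteStream, SketchReadChannel, SketchSource, and describe are hypothetical names invented for illustration, and the default values of isOneShot are assumptions. Only the three variant names and the shared isOneShot flag come from this post.

```kotlin
// Hypothetical stand-ins for SdkByteReadChannel (asynchronous) and
// SdkSource (blocking). They carry no behaviour in this sketch.
interface SketchReadChannel
interface SketchSource

// Simplified model of the refactored ByteStream hierarchy.
sealed class SketchByteStream {
    // Per the post, replayability is a flag shared by all variants.
    // Assumption: streams default to one-shot unless stated otherwise.
    open val isOneShot: Boolean get() = true

    class Buffer(val bytes: ByteArray) : SketchByteStream() {
        // Assumption: an in-memory buffer can always be read again.
        override val isOneShot: Boolean get() = false
    }

    class ChannelStream(
        val channel: SketchReadChannel,
        override val isOneShot: Boolean = true,
    ) : SketchByteStream()

    class SourceStream(
        val source: SketchSource,
        override val isOneShot: Boolean = true,
    ) : SketchByteStream()
}

// Code that unwraps a stream now handles three variants. Because the class
// is sealed, the compiler checks that this `when` expression is exhaustive.
fun describe(stream: SketchByteStream): String {
    val kind = when (stream) {
        is SketchByteStream.Buffer -> "buffer"
        is SketchByteStream.ChannelStream -> "channel"
        is SketchByteStream.SourceStream -> "source"
    }
    return if (stream.isOneShot) "$kind (one-shot)" else "$kind (replayable)"
}
```

In real code the branches would match on ByteStream.Buffer, ByteStream.ChannelStream, and ByteStream.SourceStream; any branch that previously matched ByteStream.OneShotStream or ByteStream.ReplayableStream would move to the channel branch and consult isOneShot instead.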
ByteStream.ReplayableStream and ByteStream.OneShotStream have been collapsed into ByteStream.ChannelStream and updated to the new SdkByteReadChannel interface.
Whether a ByteStream is replayable is now indicated by the isOneShot flag shared by all variants.
ByteStream.SourceStream is a new variant that reads from the SdkSource type.
Providing a ByteStream
The new ByteStream.SourceStream variant provides an additional option to consider for how to supply bytes to the SDK. We will focus on how to upgrade the channel-based variant here, though, as that is where the breaking change will be felt most acutely.
The SdkByteChannel type has read and write halves, SdkByteReadChannel and SdkByteWriteChannel respectively. This is a single-reader, single-writer channel that is semantically the same in how it should be used before and after this change (think of it as a pipe connecting an asynchronous producer and consumer together). The only update required should be to re-implement your customized read() logic in terms of the updated interface.
As an example, let's say we implemented a custom channel that wraps an underlying channel and logs something every time data is read from it. This might look something like:
Using it may look like:
This can now be written as:
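A minimal sketch of the logging wrapper under the single-method interface follows. It does not use the SDK: SketchBuffer and SketchReadChannel are hypothetical stand-ins for SdkBuffer and SdkByteReadChannel, InMemoryChannel exists only to give the wrapper something to read from, and the read(sink, limit) shape (returning the number of bytes read, or -1 when exhausted) is an assumption modeled on the description above.

```kotlin
// Hypothetical stand-in for SdkBuffer: a growable queue of bytes.
class SketchBuffer {
    private val data = ArrayList<Byte>()
    val size: Long get() = data.size.toLong()

    fun write(bytes: ByteArray) { bytes.forEach { data.add(it) } }

    // Moves up to `limit` bytes from this buffer into `sink` and returns
    // how many bytes were moved.
    fun readInto(sink: SketchBuffer, limit: Long): Long {
        val n = minOf(limit, size).toInt()
        repeat(n) { sink.data.add(data.removeAt(0)) }
        return n.toLong()
    }

    fun toByteArray(): ByteArray = data.toByteArray()
}

// Hypothetical stand-in for the simplified read channel interface.
interface SketchReadChannel {
    // Assumed contract: read up to `limit` bytes into `sink`; return the
    // number of bytes read, or -1 once the channel is exhausted.
    suspend fun read(sink: SketchBuffer, limit: Long): Long
}

// Trivial channel backed by in-memory bytes, for demonstration only.
class InMemoryChannel(bytes: ByteArray) : SketchReadChannel {
    private val pending = SketchBuffer().apply { write(bytes) }

    override suspend fun read(sink: SketchBuffer, limit: Long): Long =
        if (pending.size == 0L) -1L else pending.readInto(sink, limit)
}

// The wrapper overrides the single read method; any other members are
// forwarded to the wrapped channel through interface delegation.
class LoggingReadChannel(
    private val delegate: SketchReadChannel,
    private val log: (String) -> Unit = { println(it) },
) : SketchReadChannel by delegate {
    override suspend fun read(sink: SketchBuffer, limit: Long): Long =
        delegate.read(sink, limit).also { log("read $it bytes") }
}
```

Because only one method needs overriding, the wrapper stays a few lines long, and the bytes land in the caller's buffer without passing through an intermediate byte array owned by the wrapper.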
Additional resources
If you have any questions concerning this change, please feel free to engage with us in this discussion. If you've encountered a bug with these changes, please file an issue.