Add streams/web module #761

Merged
merged 24 commits into from
Jan 16, 2025

Conversation

jackkleeman
Contributor

@jackkleeman jackkleeman commented Dec 31, 2024

Issue # (if available)

#544

Description of changes

This PR adds ReadableStream and WritableStream. TransformStream is still left to do. Almost all the WPT tests for these two objects are passing, with a couple commented out and explained.

Yes, this PR is very large. It is hard to avoid this as simply getting a working readable stream requires many functions to be implemented. If useful I can try to split out readable and writable although I don't think the line count is going to be much better - most of it is in readable anyway.

I have tried to stay as faithful as possible to the spec (https://streams.spec.whatwg.org/) in the implementation. I have also borrowed occasionally from the reference implementation https://github.com/whatwg/streams/tree/main/reference-implementation for error messages and also for the pipeTo implementation which is not tightly defined by the spec.
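For readers who want to try the two classes, here is a minimal sketch that uses only standard WHATWG Streams calls (`getReader`, `read`, `getWriter`, `write`, `close`). The function name `collectChunks` and the sample data are illustrative assumptions, not part of this PR.

```javascript
// Minimal sketch: move chunks from a ReadableStream to a WritableStream by hand.
// Assumes ReadableStream and WritableStream are reachable as globals.
async function collectChunks() {
  const readable = new ReadableStream({
    start(controller) {
      controller.enqueue("hello");
      controller.enqueue("world");
      controller.close();
    },
  });

  const received = [];
  const writable = new WritableStream({
    write(chunk) {
      // The underlying sink sees every chunk passed to writer.write().
      received.push(chunk.toUpperCase());
    },
  });

  const reader = readable.getReader();
  const writer = writable.getWriter();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    await writer.write(value);
  }
  await writer.close();
  return received;
}
```

Calling `await collectChunks()` should resolve to the two chunks in upper case, in the order they were enqueued.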

I have not yet wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Great care is needed throughout to ensure that user objects are not held when we release control to the user, as they can do something re-entrant (i.e., call into one of those objects) and we will then fail to obtain an owned borrow. The WPT tests do a very good job of finding these cases, but there could be more, and each one causes a panic. Maybe fuzzing over the public API could work here. It seems OK if we put an experimental label on the streams/web package for the time being. The other thing to be aware of when releasing control is that the reader/writer may have changed from the one you had a handle on before, which isn't necessarily a panic but can lead to bugs. Again, the WPT tests should catch a lot of these.
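To make the re-entrancy hazard concrete, here is a sketch of the kind of user code that calls back into a stream while the runtime is still inside its own pull steps. It uses only standard API calls. The names `observed` and `drain` are illustrative, and the comment about borrows describes the concern raised above, not the exact internals of this PR.

```javascript
// User code inside pull() touches the stream object again (stream.locked)
// and then enqueues and closes through the controller. If the runtime still
// held an exclusive borrow of the stream at this point, this getter call is
// the sort of access that would fail.
const observed = [];
const stream = new ReadableStream({
  pull(controller) {
    observed.push(stream.locked); // re-entrant access to the stream
    controller.enqueue("chunk");
    controller.close();
  },
});
const reader = stream.getReader();

async function drain() {
  const first = await reader.read();
  const second = await reader.read();
  return { first, second, observed };
}
```

In a conforming implementation, `drain()` should yield the chunk, then a `done: true` result, and `observed` should record that the stream was locked when `pull` ran.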

Key things for review:

  1. Any changes outside of llrt_stream_web
  2. Try it out

Possible optimizations:

  1. pipeTo could have a more native impl instead of copying the JS reference implementation
  2. Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection
  3. Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar

Checklist

  • Created unit tests in tests/unit and/or in Rust for my feature if needed
  • Ran make fix to format JS and apply Clippy auto fixes
  • Made sure my code didn't add any additional warnings: make check
  • Added relevant type info in types/ directory
  • Updated documentation if needed (API.md/README.md/Other)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@nabetti1720
Contributor

nabetti1720 commented Jan 1, 2025

I think it's a great job. Just a comment. :)

I have not yet wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

Can you expose these new classes from globalThis? That would make it more web standards compliant.
https://min-common-api.proposal.wintercg.org/
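As a rough illustration of why global exposure matters to callers, the sketch below prefers the globals and only falls back to a module import. The helper name `resolveWebStreams` is hypothetical, and the module specifier `"stream/web"` is an assumption taken from the naming used in the PR description; it may differ in a given runtime.

```javascript
// Hypothetical helper: find the web stream classes wherever the runtime puts them.
async function resolveWebStreams() {
  if (
    typeof globalThis.ReadableStream === "function" &&
    typeof globalThis.WritableStream === "function"
  ) {
    return {
      ReadableStream: globalThis.ReadableStream,
      WritableStream: globalThis.WritableStream,
      source: "globalThis",
    };
  }
  // Assumption: the classes are exported from a "stream/web" module.
  const mod = await import("stream/web");
  return {
    ReadableStream: mod.ReadableStream,
    WritableStream: mod.WritableStream,
    source: "stream/web",
  };
}
```

Code written against the globals alone stays portable across runtimes that follow the minimum common API proposal linked above.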

@jackkleeman
Contributor Author

Sure, that's easy to do.

@nabetti1720
Contributor

nabetti1720 commented Jan 3, 2025

I have not yet wired this up for use by other packages. It seems the existing llrt_stream implements a node stream for use in child process, and I don't think a web stream is appropriate there (unless we first implement a web <-> node conversion). We will want to have both stream and stream/web available for sure. The obvious first place to wire up would be fetch, though.

As you say, I think support for fetch is the top priority.

The Runtime compatibility document that I often cite seems to check whether stream classes are supported using the following criteria:

// reproduction.js
const judgement = (function () {
    if (!("fetch" in self)) {
        return { result: false, message: "fetch is not defined" };
    }
    var streamPromise = fetch("/favicon/favicon.ico")
        .then(function (response) {
            return response.body;
        })
        .catch(function () {
            return fetch(
                "https://mdn-bcd-collector.gooborg.com/favicon/favicon.ico"
            ).then(function (response) {
                return response.body;
            });
        });
    if (!streamPromise) {
        return { result: false, message: "streamPromise is falsy" };
    }
    var promise = streamPromise.then(function (stream) {
        return stream.getReader();
    });
    if (!promise) {
        return { result: false, message: "Promise variable is falsy" };
    }
    return promise.then(function (instance) {
        return !!instance;
        // To check if a method exists, use the following:
        // return !!instance && "cancel" in instance;
    });
})();

console.log(await judgement);

When I checked it on my laptop, it seemed that some classes were determined to be "unsupported" because this code produced an error.

% llrt reproduction.js
TypeError: cannot read property 'getReader' of null
  at <anonymous> (/Users/shinya/Workspaces/llrt-test/reproduction.js:19:38)

% bun reproduction.js 
true

@Sytten
Copy link
Collaborator

Sytten commented Jan 3, 2025

I will have to do a full review, but something that I saw that bugged me was the eval for creating errors. I am pretty sure we don't need to do that. We should be able to create them in Rust.

@jackkleeman
Contributor Author

@nabetti1720 i think your code is testing whether fetch returns a stream as response.body, but it returns null:

    // FIXME return readable stream when implemented
    #[qjs(get)]
    pub fn body(&self) -> Null {
        Null
    }

lets get this merged first and i can look to wire it up with fetch
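Until fetch is wired up, callers that want to stream a response can guard against a null body instead of calling `getReader()` unconditionally, which is what triggered the TypeError in the reproduction. The helper name `getBodyReader` is hypothetical; the sketch only relies on the standard `body` property and `getReader()` method.

```javascript
// Hypothetical helper: return a reader for response.body, or null when the
// runtime does not (yet) expose the body as a stream.
function getBodyReader(response) {
  const body = response.body;
  if (body === null || body === undefined) {
    return null; // body is not exposed as a stream
  }
  if (typeof body.getReader !== "function") {
    return null; // body exists but is not a ReadableStream-like object
  }
  return body.getReader();
}
```

A caller can then fall back to `await response.arrayBuffer()` or similar when the helper returns null.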

@jackkleeman
Contributor Author

@Sytten yes, laziness on my part. will fix all the ctx evals today

@richarddavison
Contributor

richarddavison commented Jan 3, 2025

Thanks @jackkleeman for this fantastic contribution 🎉

Various ctx.evals which just need to be cleaned up and replaced with ctx.global().get or similar

Like you mentioned, I see some ctx evals for error creation. Best approach here is to use primordials:
BasePrimordials::get(ctx)? where you have access to the type error constructor.
If this is something frequently done you can add a method to BasePrimordials:
fn new_type_error(...)

You can also create them from Rust, but then they'll lack stack traces (sometimes, depending on how they are being used). I opened up an issue regarding this:
https://github.com/quickjs-ng/quickjs/blob/master/quickjs.c#L6862-L6865
quickjs-ng/quickjs#782

Lots of places we use JS promises where we may be able to use Rust async code directly and skip some indirection

Can you elaborate a bit more on this?

@jackkleeman
Contributor Author

I have now refactored all the ctx.global().get and ctx.eval calls into primordials. Currently new_type and new_range are just helper methods in my crate; if you want them to go into BasePrimordials, let me know.

re the promises. the spec has a lot of 'upon fulfillment of promise, do X'. i have a helper method upon_promise for this. the spec also asks that we convert promises into other promises by adding a fulfillment or rejection step. and the spec also asks that we set particular promises as 'handled' ie so they don't complain about unhandled rejections. given that the spec is quite specific, ive implemented things as faithfully as possible. this inevitably means calling .then and .catch a lot with anonymous Function::new() objects.
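The three promise operations described above can be sketched in plain JavaScript as follows. The names `markHandled`, `uponPromise`, and `transformPromise` are illustrative stand-ins for the spec's wording and for the Rust `upon_promise` helper; this is not the code in the PR.

```javascript
// "Mark as handled": attach a no-op rejection handler so a rejection of this
// promise is never reported as unhandled.
function markHandled(promise) {
  promise.then(undefined, () => {});
}

// "Upon fulfillment / rejection of promise, do X": run the steps and discard
// the derived promise, marking it handled so it cannot surface a rejection.
function uponPromise(promise, onFulfilled, onRejected) {
  markHandled(promise.then(onFulfilled, onRejected));
}

// "Transform promise with fulfillment / rejection steps": return the derived
// promise so the caller can keep chaining on the transformed value.
function transformPromise(promise, onFulfilled, onRejected) {
  return promise.then(onFulfilled, onRejected);
}
```

Each call corresponds to one `.then` with anonymous callbacks, which is the pattern the comment above says maps to `Function::new()` objects on the Rust side.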

what i have not done (at all) is use rust async functions, convert promises into futures, etc. as far as i can tell, converting promises into rust futures still calls then and catch with a Function::new() under the hood, but just with a handle to a waker, allowing the previous async code to resume. so maybe performance wise this is very similar, but the argument to use more rust async code might be that it's easier to follow or more idiomatic. however, it will depart from the spec somewhat which could make it harder to debug, and imo it is a lot easier to accidentally hold a reference to a user object across an await point than it is to accidentally move one into a FnOnce that we execute on promise resolution. and as a reminder, holding references to user objects when we release to user code can lead to panics.

one special case is the pipeTo implementation, which is not actually defined in the spec (instead i've copied the reference implementation). i imagine that we could make that implementation simpler and more concise if we used rust futures more there, without the concern that it will depart from the spec.
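For context, this is what the user-facing side of pipeTo looks like, whichever way it is implemented internally. The sketch uses the standard `pipeTo` call with default options; `pipeNumbers` and the sample chunks are illustrative.

```javascript
// Pipe every chunk from a source stream into a sink and report what happened.
async function pipeNumbers() {
  const source = new ReadableStream({
    start(controller) {
      for (const n of [1, 2, 3]) controller.enqueue(n);
      controller.close();
    },
  });

  const written = [];
  let closed = false;
  const sink = new WritableStream({
    write(chunk) {
      written.push(chunk);
    },
    close() {
      closed = true; // runs because preventClose defaults to false
    },
  });

  // Resolves once all chunks are written and the destination is closed.
  await source.pipeTo(sink);
  return { written, closed, sinkLocked: sink.locked };
}
```

After the returned promise settles, the destination should have seen all three chunks, been closed, and been unlocked again.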

Contributor

@richarddavison richarddavison left a comment


Some observations:

  1. First of all, I can't thank you enough for this colossal effort! Huge work 🥇
  2. Great that we avoid accessing globals and use primordials instead (also safer as globals can be modified in user space). However, for perf reasons we should try to clone and store the primordial value in the struct when used frequently and when possible to avoid lookup. Even though primordial lookup is faster than global lookup, it's still expensive for frequent calls.
  3. If possible, avoid using JS Function::new() when the function cannot be provided from user space. This saves a ton of indirection and we can pass fn pointers to Rust functions instead. If accessible/definable from user space and from Rust, use an enum to hold either a JS Function or a Rust function.
  4. Maybe some macros would reduce duplication or simplify instantiation where code is very similar but can't be refactored into a shared function. LMKWYT?
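The point in item 2 about globals being modifiable in user space can be shown with a JavaScript analogy. The sketch below captures constructors once and uses the captured copies afterwards. It is not the Rust `BasePrimordials` API; the names `primordials`, `newTypeError`, and `demoTampering` are hypothetical.

```javascript
// Capture the intrinsics once, before any user code has a chance to replace them.
const primordials = Object.freeze({
  TypeError: globalThis.TypeError,
  RangeError: globalThis.RangeError,
});

// Helper in the spirit of the suggested new_type_error: always build the error
// from the captured constructor, never from a fresh global lookup.
function newTypeError(message) {
  return new primordials.TypeError(message);
}

// Simulate user code overwriting the global, then check that the helper still
// produces a genuine TypeError. The global is restored afterwards.
function demoTampering() {
  const original = globalThis.TypeError;
  globalThis.TypeError = class FakeTypeError {};
  try {
    const err = newTypeError("stream is locked");
    return err instanceof original;
  } finally {
    globalThis.TypeError = original;
  }
}
```

Storing the captured value in a struct field, as suggested above, is the same idea taken one step further: the lookup happens once rather than on every call.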

modules/llrt_stream_web/src/readable/reader.rs (outdated, resolved)
modules/llrt_stream_web/src/lib.rs (outdated, resolved)
modules/llrt_stream_web/src/queueing_strategy.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/byob_reader.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/byob_reader.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/byte_controller.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/byte_controller.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/iterator.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/tee.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/tee.rs (outdated, resolved)
@jackkleeman
Contributor Author

Re macros. I'll do a pass and think about what can be done. If you have any particular ideas let me know

@jackkleeman
Contributor Author

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

@richarddavison
Contributor

I managed to shave off 700 lines with some new functions and type aliases but not seeing anything obvious beyond those - let me know if you see any

Fantastic, I'll take a second look!

Collaborator

@Sytten Sytten left a comment


This is a first pass on the review, I didn't yet get to readable and writable modules. I should be able to do that tomorrow.
General comment is to please over-explain everything for future maintainers and split unrelated items into different files.

libs/llrt_utils/src/bytes.rs (resolved)
libs/llrt_utils/src/clone.rs (resolved)
libs/llrt_utils/src/hash.rs (outdated, resolved)
modules/llrt_abort/src/lib.rs (resolved)
modules/llrt_exceptions/src/lib.rs (outdated, resolved)
modules/llrt_stream_web/src/lib.rs (outdated, resolved)
modules/llrt_stream_web/src/queueing_strategy.rs (outdated, resolved)
modules/llrt_stream_web/src/queueing_strategy.rs (outdated, resolved)
modules/llrt_stream_web/src/queueing_strategy.rs (outdated, resolved)
modules/llrt_stream_web/src/queueing_strategy.rs (outdated, resolved)
@Sytten
Collaborator

Sytten commented Jan 7, 2025

Also, please write the TypeScript typings. I wrote a simplified version since we didn't have most of the API, but now that we do, we should most probably take the Node typings and put them in there.

@jackkleeman
Contributor Author

@Sytten re typing, this should be part of the browser globals types? this isnt node streams, which have their own types

@Sytten
Collaborator

Sytten commented Jan 7, 2025

Right, it looks like they put it under the stream/web.ts file in DefinitelyTyped, so I would follow the same direction @jackkleeman

Collaborator

@Sytten Sytten left a comment


Some further comments

modules/llrt_stream_web/src/readable/mod.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/mod.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/mod.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/mod.rs (outdated, resolved)
modules/llrt_stream_web/src/readable/mod.rs (outdated, resolved)
@Sytten
Collaborator

Sytten commented Jan 7, 2025

Sorry, the line numbers on the comments are off due to the force push; I did the review this morning.
I can help with the typing once we get there. We will want to remove references to Node and double-check that we support everything in there. The Node typing is also sometimes convoluted for no real reason, so I do simplify it when need be.

@jackkleeman
Contributor Author

I believe we do support everything in there, but it's good to double check. Sounds good.

Contributor

@richarddavison richarddavison left a comment


Thanks! Some comments:

  1. We can store the string as we already have a string reference from JS and then use DOMExceptionName as a validation mechanism (and great to have when calling from rust)
  2. Remove the code property (hard-code it to 0) as it's deprecated and should not be used.

modules/llrt_exceptions/src/lib.rs (outdated, resolved)
modules/llrt_exceptions/src/lib.rs (outdated, resolved)
modules/llrt_exceptions/src/lib.rs (outdated, resolved)
modules/llrt_exceptions/src/lib.rs (outdated, resolved)
@jackkleeman jackkleeman force-pushed the streams branch 2 times, most recently from 8724f9f to 4ef0ca4 Compare January 13, 2025 20:44
@jackkleeman
Contributor Author

I decided to just bite the bullet and add the DOMException wpt tests

@richarddavison richarddavison merged commit 3ff3dcd into awslabs:main Jan 16, 2025
11 checks passed
@jackkleeman jackkleeman mentioned this pull request Jan 19, 2025
3 tasks
@jackkleeman
Contributor Author

thanks for the review - i know it was a big one
will keep chipping away at some of the smaller tasks like transform stream and encoder/decoder :)

@jackkleeman jackkleeman deleted the streams branch January 19, 2025 12:01
4 participants