
src: improve performance of http parser #58288

Open · anonrig wants to merge 1 commit into main from yagiz/optimize-http-parser

Conversation

@anonrig (Member) commented May 12, 2025

Adds the V8 fast API to almost all applicable HTTP parser methods.

Benchmarks pending.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1717/

cc @nodejs/performance @nodejs/http

@anonrig anonrig requested review from mcollina and jasnell May 12, 2025 01:20
@nodejs-github-bot (Collaborator)

Review requested:

  • @nodejs/http
  • @nodejs/net

@nodejs-github-bot added labels May 12, 2025: c++ (Issues and PRs that require attention from people who are familiar with C++), lib / src (Issues and PRs related to general changes in the lib or src directory), needs-ci (PRs that need a full CI run)
@anonrig anonrig force-pushed the yagiz/optimize-http-parser branch from f0f2213 to 6ba2682 Compare May 12, 2025 01:23
@anonrig anonrig requested review from H4ad and dario-piotrowicz May 12, 2025 01:53
codecov bot commented May 12, 2025

Codecov Report

Attention: Patch coverage is 89.23077% with 7 lines in your changes missing coverage. Please review.

Project coverage is 90.18%. Comparing base (6184730) to head (6ba2682).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
src/node_http_parser.cc 86.53% 1 Missing and 6 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #58288   +/-   ##
=======================================
  Coverage   90.18%   90.18%           
=======================================
  Files         631      631           
  Lines      186690   186741   +51     
  Branches    36666    36661    -5     
=======================================
+ Hits       168360   168415   +55     
+ Misses      11126    11124    -2     
+ Partials     7204     7202    -2     
Files with missing lines Coverage Δ
src/node_external_reference.h 100.00% <ø> (ø)
src/util.cc 87.21% <100.00%> (+0.35%) ⬆️
src/util.h 91.22% <ø> (ø)
src/node_http_parser.cc 84.64% <86.53%> (+0.37%) ⬆️

... and 28 files with indirect coverage changes


@@ -17,6 +17,11 @@ using CFunctionCallbackWithMultipleValueAndOptions =
v8::Local<v8::Value>,
v8::Local<v8::Value>,
v8::FastApiCallbackOptions&);
using CFunctionVoid = void (*)(v8::Local<v8::Value>);
using CFunctionVoid2 =
Review comment (Member):
Not super excited about the name here but then again none of the macro names here are great.

Local<Value>(),
signature,
0,
v8::ConstructorBehavior::kThrow,
Review comment (Member):

using v8::ConstructorBehavior at the top of the file

c_function);
// kInternalized strings are created in the old space.
const v8::NewStringType type = v8::NewStringType::kInternalized;
Local<v8::String> name_string =
Review comment (Member):

using v8::String at the top

@@ -711,7 +739,6 @@ class Parser : public AsyncWrap, public StreamListener {
}
}

// TODO(@anonrig): Add V8 Fast API
static void Consume(const FunctionCallbackInfo<Value>& args) {
Review comment (Member):

No fast version?

@anonrig (Member, Author) commented May 12, 2025

Benchmark CI results are available at: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1717/consoleText

I think there is something fundamentally wrong with either the V8 fast API or with our benchmarks. These changes shouldn't show an 11% performance degradation. @H4ad will run fastify benchmarks on this pull request today; they might show a different result.

cc @nodejs/performance @nodejs/v8

@BridgeAR (Member) left a comment:

Just requesting changes until the regressions are resolved. Feel free to dismiss this request as soon as that is the case.

@joyeecheung (Member)

I think there is something fundamentally wrong with either v8 fast api, or with our benchmarks

It looks like one of the many cases where using fast API does not make sense, as already explained in #58080 (comment) - getting the Environment requires looking it up from the current context which gets passed through a local handle which needs a handle scope. We should probably restrict the uses of fast API to bindings that do not need the handle scope (and by extension, do not need the Environment).

@anonrig (Member, Author) commented May 12, 2025

@joyeecheung since we have permission checks in almost all functions, it's impossible to add the V8 fast API to many more methods then. I think we should change how Environment::GetCurrent(isolate) creates a handle scope, and only do it when it's needed. For Node.js, the V8 fast API seems almost useless even for basic functions.

@devsnek (Member) commented May 12, 2025

In Deno we store the "environment" reference in a v8::External in FastApiCallbackOptions::data (from FunctionTemplate::data), because this value, although it is a Local<T>, is not actually allocated from a HandleScope and so does not carry that overhead.
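In other words, the pointer is stashed next to the callback when the function template is created, instead of being re-derived from ambient state on every call. A toy, V8-free sketch of that pattern (Environment, CallbackData, FastCallOptions, and FastMethod here are illustrative stand-ins, not the real V8 or Node types):

```cpp
#include <cassert>

// Stand-in for Node's Environment. In the real code, deriving this per call
// via Environment::GetCurrent requires a context lookup and a HandleScope.
struct Environment {
  int realm_id;
};

// Plays the role of the v8::External stored in FunctionTemplate::data.
struct CallbackData {
  Environment* env;
};

// Plays the role of v8::FastApiCallbackOptions, which exposes that data
// to the fast-path callback.
struct FastCallOptions {
  CallbackData* data;
};

// A fast-path callback: no lookup and no handle scope, just a pointer read
// of state that was stored once, at template-creation time.
int FastMethod(FastCallOptions& options) {
  return options.data->env->realm_id;
}
```

The design point is that the expensive per-call derivation is replaced by a one-time store plus a cheap per-call load.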

@H4ad (Member) commented May 13, 2025

I ran the fastify-benchmarks, and to summarize, I don't think the data is good enough to mark this PR as an improvement:

Results

new-node is this PR; current-main is main with this PR's commit reverted.

Version Router Requests/s Latency (ms) Throughput/Mb
fastify.current-main - 65002.4 14.76 11.65
fastify.new-node - 63460.0 15.25 11.38
0http.new-node - 39460.8 24.87 7.04
polkadot.current-main - 39397.8 24.91 7.03
h3.new-node - 39032.8 25.14 6.96
polkadot.new-node - 38391.0 25.56 6.85
0http.current-main - 38285.8 25.62 6.83
h3.current-main - 38166.4 25.71 6.81
node-http.current-main - 37055.4 26.51 6.61
node-http.new-node - 36655.4 26.81 6.54
h3-router.current-main - 35928.6 27.36 6.41
restana.new-node - 35588.2 27.62 6.35
restana.current-main - 35571.8 27.63 6.34
micro.new-node - 35566.2 27.65 6.34
polka.current-main - 35509.0 27.69 6.33
rayo.current-main - 35480.2 27.72 6.33
polka.new-node - 35470.6 27.73 6.33
micro.current-main - 35379.4 27.78 6.31
connect.current-main - 35352.2 27.80 6.30
server-base.new-node - 35347.4 27.80 6.30
server-base-router.current-main - 35282.0 27.87 6.29
server-base.current-main - 35263.8 27.87 6.29
rayo.new-node - 35250.2 27.89 6.29
server-base-router.new-node - 35090.0 28.03 6.26
connect.new-node - 34979.4 28.11 6.24
h3-router.new-node - 33177.2 29.67 5.92
connect-router.new-node - 33090.6 29.78 5.90
connect-router.current-main - 32966.6 29.89 5.88
hono.current-main - 31046.8 31.71 5.09
hono.new-node - 30875.6 31.88 5.06
koa.current-main - 30269.6 32.52 5.40
koa.new-node - 30233.2 32.57 5.39
koa-isomorphic-router.new-node - 28931.6 34.08 5.16
take-five.new-node - 28678.0 34.37 10.31
koa-router.new-node - 28620.8 34.46 5.10
koa-isomorphic-router.current-main - 28554.0 34.54 5.09
take-five.current-main - 28522.4 34.57 10.25
koa-router.current-main - 28496.8 34.61 5.08
hapi.current-main - 26791.2 36.84 4.78
hapi.new-node - 26658.4 37.03 4.75
adonisjs.current-main - 25735.2 38.37 4.59
adonisjs.new-node - 25234.4 39.14 4.50
micro-route.current-main - 23534.8 41.98 4.20
micro-route.new-node - 23052.0 42.87 4.11
express.new-node - 22942.0 43.11 4.09
express.current-main - 22622.8 43.72 4.03
express-with-middlewares.new-node - 19402.4 51.05 7.22
express-with-middlewares.current-main - 19261.2 51.41 7.16
microrouter.new-node - 18202.3 54.43 3.25
microrouter.current-main - 17922.9 55.29 3.20
fastify-big-json.current-main - 11550.4 86.04 132.90
fastify-big-json.new-node - 11421.0 87.03 131.39
trpc-router.current-main - 6675.9 149.28 1.47
trpc-router.new-node - 6631.8 150.21 1.46
restify.current-main - N/A N/A N/A
restify.new-node - N/A N/A N/A

The benchmark doesn't include all the frameworks because some of them failed to run (on both nodes).

Results (with the JSON) in case someone wants to do some more analysis.

I tried to run with as little noise as possible on my setup (for fastify I tried to pin the CPU).

For now, I would trust the benchmarks we have for Node more and try the approach @devsnek suggested.
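For reference, the paired rows in the table above reduce to a relative delta in requests/s. A small helper (hypothetical, not part of the benchmark suite) makes the arithmetic explicit:

```cpp
#include <cassert>
#include <cmath>

// Percent regression of a candidate build relative to a baseline, measured
// in requests/s (higher is better). Positive means the candidate is slower.
double regression_pct(double baseline_rps, double candidate_rps) {
  return (baseline_rps - candidate_rps) / baseline_rps * 100.0;
}
```

Plugging in the fastify row (current-main 65002.4 vs new-node 63460.0) gives roughly a 2.4% regression, and node-http (37055.4 vs 36655.4) roughly 1.1%, both small enough that run-to-run noise matters.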

@joyeecheung (Member) commented May 13, 2025

since we have permission checks in almost all functions, it's impossible to add v8 fast api to any more methods then.

I think permission checking is relatively rare outside of fs and module bindings? e.g. the APIs being touched here do not need that.

I think we should change how environment::getcurrent isolate creates handle scope, and only do it when it's needed.

Actually I think many bindings don't necessarily even need Environment::GetCurrent. The bindings being changed here are only getting it to do CHECK_EQ(env, parser->env()) - that's not even a useful assumption to make if we intend to add cross-realm/context support for builtins. It's also not a good pattern to do Environment::GetCurrent everywhere and then use env->context() etc. instead of only getting the context from the isolate when needed (again, env->context() is an antipattern if we intend to add cross-realm/context support for builtins).

@anonrig (Member, Author) commented May 13, 2025

Yes, I'll do a follow-up for Environment::GetCurrent.

Regarding this pull request: since these changes update prototype methods, they only call this function, and I don't see any possible way to improve the performance of these methods using the fast API.

#define ASSIGN_OR_RETURN_UNWRAP(ptr, obj, ...)                                 \
  do {                                                                         \
    *ptr = static_cast<typename std::remove_reference<decltype(*ptr)>::type>(  \
        BaseObject::FromJSObject(obj));                                        \
    if (*ptr == nullptr) return __VA_ARGS__;                                   \
  } while (0)
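The macro above can be exercised outside of Node with mock stand-ins. In the sketch below, FakeJSObject, Parser, and the free function FromJSObject are hypothetical substitutes for the JS wrapper object, the native class, and BaseObject::FromJSObject (which returns nullptr when the object carries no native wrapper):

```cpp
#include <cassert>
#include <type_traits>

// Mock stand-ins, not Node internals.
struct Parser {
  int id;
};

struct FakeJSObject {
  void* native_ptr;  // the native wrapper, or nullptr if none is attached
};

static void* FromJSObject(const FakeJSObject& obj) { return obj.native_ptr; }

// Same shape as Node's ASSIGN_OR_RETURN_UNWRAP: cast the wrapper pointer to
// the caller's type and bail out early when no wrapper is present.
#define ASSIGN_OR_RETURN_UNWRAP_DEMO(ptr, obj, ...)                           \
  do {                                                                        \
    *ptr = static_cast<typename std::remove_reference<decltype(*ptr)>::type>( \
        FromJSObject(obj));                                                   \
    if (*ptr == nullptr) return __VA_ARGS__;                                  \
  } while (0)

// Returns the parser id, or -1 when the object has no native wrapper.
int GetParserId(const FakeJSObject& obj) {
  Parser* parser = nullptr;
  ASSIGN_OR_RETURN_UNWRAP_DEMO(&parser, obj, -1);
  return parser->id;
}
```

This illustrates why the prototype methods in question are so thin: the unwrap-and-dereference is essentially the entire body, leaving little for a fast-path variant to shave off.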

Labels: c++ · lib / src · needs-ci

9 participants