Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ES Module loader initial implementation #1348

Merged
merged 14 commits into from
Sep 12, 2022
Merged

Add ES Module loader initial implementation #1348

merged 14 commits into from
Sep 12, 2022

Conversation

jmartin4563
Copy link
Contributor

@jmartin4563 jmartin4563 commented Sep 9, 2022

Proposed Release Notes

  • Added loader hook to support instrumentation of CommonJS dependencies in ES Module applications

Links

Closes NEWRELIC-3121

Details

  • Created esm-loader.mjs with resolve hook, based on prototype by @michaelgoin (h/t)
  • Added new npm script for running our ESM flavored unit tests (npm run unit:esm), and updated CI to run this script
  • Introduced testdouble, a library that allows us to fake out import statements like proxyquire does for require
  • Updated eslint configuration to allow for import(), and updated linting commands to run on .mjs files

@CLAassistant
Copy link

CLAassistant commented Sep 9, 2022

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@jmartin4563 jmartin4563 marked this pull request as ready for review September 9, 2022 15:41
Copy link
Member

@michaelgoin michaelgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean implementation, nice work!

A few comments around doing unnecessary work. You/the team can decide if you want to care or not this early. My main thing is they are easily identifiable and I'm not sure how easy to track down if problems do come up with customers.

I'd normally look for an integration test with this sort of thing to confirm actual functionality but I believe that is intended to be tackled in a separate story. So given that, the unit testing looks legit.

esm-loader.mjs Outdated
}

const instrumentationApi = newrelic.shim
const logger = instrumentationApi.logger.child({ component: 'esm-loader' })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to note is via this approach, we'll be instantiating a child logger every single invocation of resolve... so once per every import. This is probably pretty cheap / not noticeable but might recommend avoiding any extra work we can, that is obvious, that can add to startup times for customer applications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use the logging module directly, as that is a pattern applied by basically everything else in this repo

.eslintrc.js Outdated
@@ -8,6 +8,9 @@
module.exports = {
extends: ['@newrelic', 'plugin:jsdoc/recommended'],
plugins: ['jsdoc'],
parserOptions: {
ecmaVersion: '2020'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ideally when we bump allowed versions based on what's allowed (deprecating older node versions), we update the shared definition so all projects are consistent. Fine to land this sort of thing in the project but might follow-up with updating the shared eslint module and then removing here.

Also, I'm not sure what was needed to bump that version but calling out that this will probably allow "spread parameters after optional chaining" which is not available in Node 14. So there's a small risk of introducing something that won't work. Hopefully that would get caught by tests but noting we have been burned at least once since I've been involved with the Node team.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I introduced this is that doing await import() without it caused a parsing error on my unit test file, causing eslint to fail completely on the file (which would then fail CI). Fixing that is not worth the risk of potentially allowing introduction of bugs for v14 IMO, so what do you think about undoing this change and then adding my ESM unit test to .eslintignore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have at times had to have a config specific to the tests folder to bump the ecmascript version earlier than the code base for testing newer features. For example, async-await while still supporting very old node versions.

It looks like we still have a file in there, so I might just bump that one to start: test/.eslintrc.js.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for future posterity 16.3.0 is when es2020 spec is 100% implemented: see https://node.green/

esm-loader.mjs Outdated
logger.debug(`${specifier} is not a CommonJS module, skipping for now`)
}

return { url, format }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would only matter for large projects, or may never matter, but the destructuring to later create an object to return the exact elements does result in an unnecessary allocation on a potentially hot startup path (application with many import statements). It may be negligible in this case but calling out as that is something to consider with agent development that often has less of an impact in CRUD/traditional development. Allocating too much on heavily invoked paths can start impacting GC, even when objects are short lived, which can have non-obvious impacts to perf sometimes (or obvious other times). A more obvious case where this might come up is instrumenting a database "read" method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to be cautious now than have to try and track down these GC thrashes in the wild

esm-loader.mjs Outdated
)
}

logger.debug(`${specifier} is not a CommonJS module, skipping for now`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the 'commonjs' if block does not return, I believe this log message will be logged regardless. Seems like it may not happen anyways as we are unlikely to have anything registered unless one of these projects converts (or does hybrid support? maybe?) but probably want to clean up anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops! that's my bad, will fix :)

esm-loader.mjs Outdated Show resolved Hide resolved
This product includes source derived from [testdouble](https://github.com/testdouble/testdouble.js) ([v3.16.6](https://github.com/testdouble/testdouble.js/tree/v3.16.6)), distributed under the [MIT License](https://github.com/testdouble/testdouble.js/blob/v3.16.6/LICENSE.txt):

```
The MIT License (MIT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIT license ✔️

I only added that parserOptions stanza to fix a parsing issue in my unit test file, but configuring it in the top-level eslintrc.js meant that I updated eslint to allow things that would have broken in node14. By configuring in the test-only eslint config, we limit the blast radius of this potential issue
@bizob2828 bizob2828 self-assigned this Sep 12, 2022
esm-loader.mjs Outdated
* @returns {Promise} Promise object representing the resolution of a given specifier
*/
export async function resolve(specifier, context, nextResolve) {
if (!newrelic.shim) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't seen this idiom before. Other than being shorter, is it better than newrelic.agent.config.agent_enabled?

Copy link
Contributor Author

@jmartin4563 jmartin4563 Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking for that would only cover the case where the configuration explicitly disables the agent, it doesn't cover the case where the agent errors out on bootstrap (like if the configuration file is invalid). Checking for the shim or agent property covers both cases in one if, because those properties get added as this very last portion of initialize(https://github.com/newrelic/node-newrelic/blob/main/index.js#L101)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check for the agent itself, then? That seems a little more intentional than checking to see if the shim API is there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Will update to assert newrelic.agent

}
}

return resolvedModule
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dumb n00b javascript question but is there a difference between returning the actual module here vs returning the promise given by nextResolve above in case there's no instrumentation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to await the parent(nextResolve) so we can get the resolved path. In the case the agent doesn't exist, still question this, you can see above, it means continue the loader chain. But we're essentially halting the loader chain. But we are going to suggest to be last in that chain. so technically we could just return without calling the parent resolve

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern here was having a function that sometimes returns a promise and sometimes does not. I thought that might break API expectations, but if we tell our users, "put us last", then maybe this doesn't matter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API doesn't change. It's async we just do all the awaiting ourselves

bizob2828
bizob2828 previously approved these changes Sep 12, 2022
Copy link
Member

@bizob2828 bizob2828 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work on the paring down from prototype to actual solution. Overall looks great. I tested it with a branch where I'm porting versioned tests and all is well.

* @returns {Promise} Promise object representing the resolution of a given specifier
*/
export async function resolve(specifier, context, nextResolve) {
if (!newrelic.agent) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the check for newrelic.shim was done because it follows the patterns of our external modules. I think checking for agent, shim or awsLambda is enough. but frankly everything we do in here I think is safe without checking anything agent properties but wasn't sure the intention when @michaelgoin wrote this into the prototype.

}
}

return resolvedModule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to await the parent(nextResolve) so we can get the resolved path. In the case the agent doesn't exist, still question this, you can see above, it means continue the loader chain. But we're essentially halting the loader chain. But we are going to suggest to be last in that chain. so technically we could just return without calling the parent resolve

@@ -232,6 +234,7 @@
"THIRD_PARTY_NOTICES.md",
"lib/",
"bin/tracetractor",
"bin/test-naming-rules.js"
"bin/test-naming-rules.js",
"esm-loader.mjs"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch. this would've been a downer to get through the work and forget to include this here.

await td.replaceEsm('../../lib/logger.js', {}, fakeLogger)

// eslint-disable-next-line node/no-unsupported-features/es-syntax
loader = await import('../../esm-loader.mjs')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting this is ignored because of mysticatea/eslint-plugin-node#250. I also logged a ticket to update our plugin so we can remove these: https://issues.newrelic.com/browse/NEWRELIC-3321

esm-loader.mjs Show resolved Hide resolved
@jmartin4563 jmartin4563 merged commit c3a3b89 into newrelic:main Sep 12, 2022
@jmartin4563 jmartin4563 deleted the esm-loader branch September 12, 2022 21:04
@github-actions github-actions bot mentioned this pull request Sep 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants