FIX(regex): @W-18464445@: Use utility based off of p-limit instead of… #287
Conversation
@@ -15,7 +15,8 @@
   "dependencies": {
     "@salesforce/code-analyzer-engine-api": "0.23.0",
     "@types/node": "^20.0.0",
-    "isbinaryfile": "^5.0.4"
+    "isbinaryfile": "^5.0.4",
+    "p-limit": "^3.1.0"
Note that p-limit v4 and higher are ESM-only modules, so we need to stick with version 3.
@randi274 FYI.
And we were already shipping p-limit 3 as an indirect dependency, so this doesn't increase the size of our package at all.
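For reference, a minimal usage sketch of the version-3 API, assuming an esModuleInterop-style TypeScript setup (doWork and items are hypothetical placeholders, not code from this PR):

    import pLimit from 'p-limit'; // v3 still ships a CommonJS build, unlike v4+

    const limit = pLimit(1000); // allow at most 1000 tasks in flight at once

    // Each item is wrapped in a function so p-limit can decide when to start it.
    async function runAll(items: string[], doWork: (item: string) => Promise<void>): Promise<void> {
        await Promise.all(items.map(item => limit(() => doWork(item))));
    }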
export class RegexEngine extends Engine {
    static readonly NAME = "regex";
    private readonly regexRules: RegexRules;
    private readonly regexValues: Map<string, RegExp> = new Map();
    private readonly textFilesCache: Map<string, string[]> = new Map();
    private readonly ruleResourceUrls: Map<string, string[]>;
    private readonly promiseLimiter: PromiseExecutionLimiter = new PromiseExecutionLimiter();
Note that I'm using a single limiter on the engine so that we don't create a bunch of independent limit functions. For example, we had a Promise.all that calls another Promise.all; if we put a limit of 1000 on the outer call and another limit of 1000 on the inner call, that would really be a limit of 1,000,000. Instead, we keep one overall limit of 1000 promises for the entire engine by reusing the same instance of my PromiseExecutionLimiter.
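For illustration, a minimal sketch of that shared-limiter idea (only the PromiseExecutionLimiter name, p-limit, and the 1000 limit come from this PR; the method shape below is an assumption, not the actual implementation):

    import pLimit from 'p-limit';

    export class PromiseExecutionLimiter {
        // One limit function shared across the whole engine, so nested callers
        // draw from the same 1000-task budget instead of multiplying it.
        private readonly limit = pLimit(1000);

        // Hypothetical helper: takes promise-producing functions and runs them
        // under the shared concurrency cap.
        async all<T>(producers: Array<() => Promise<T>>): Promise<T[]> {
            return Promise.all(producers.map(producer => this.limit(producer)));
        }
    }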
        let batchMultiplier = 0;
        const violations: Violation[] = [];
        while (batchMultiplier * FILE_BATCH_SIZE < textFiles.length) {
            // Turns out there's a system-level limit on how many files can be open at once. In order to avoid hitting
            // this limit while still reaping the benefits of `Promise.all()`, we'll process the files in batches.
            const textFileBatch = textFiles.slice(batchMultiplier * FILE_BATCH_SIZE, (batchMultiplier + 1) * FILE_BATCH_SIZE);
            const ruleRunPromises: Promise<Violation[]>[] = textFileBatch.map(file => this.runRulesForFile(file, ruleNames));
            const newViolations = (await Promise.all(ruleRunPromises)).flat();
            violations.push(...newViolations);
            batchMultiplier++;
        }
Removing the old batching because, in reality, when a batch had 1000 files and 999 of them were finished, we were still waiting on that last one before creating the next batch of 1000.
With p-limit we guarantee that 1000 are always running: as soon as one finishes, the next one is allowed in, keeping 1000 running at any given time.
So we actually get a performance boost from this change away from batching.
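As a rough sketch, the removed loop above could collapse to something like the following, using a helper shaped like the all method sketched earlier (the real replacement code may differ):

    const producers = textFiles.map(file => () => this.runRulesForFile(file, ruleNames));
    const violations: Violation[] = (await this.promiseLimiter.all(producers)).flat();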
… batching and replace all Promise.all calls for safety (8ae727a to cfcaf4b)
The function-based setup is a little counterintuitive, but it seems fine to me. Approved.
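The "function-based" part refers to p-limit's API: the limiter is handed a function that starts the work rather than an already-started promise, so it can defer kicking the task off until a slot frees up. A hypothetical illustration (scanFile is not a real function in this repo):

    const limit = pLimit(1000);
    const task = () => scanFile('SomeFile.cls'); // nothing runs yet
    await limit(task);                           // p-limit invokes task only when concurrency allows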
Note that this is an improved fix over #202