Feat/scan url cache #5625
@@ -233,24 +235,46 @@ export type PhishingControllerState = {
  hotlistLastFetched: number;
  stalelistLastFetched: number;
  c2DomainBlocklistLastFetched: number;
  urlScanCache: Record<string, UrlScanCacheEntry>;
I think this should be put into a class that is defined in a separate file.
All five of your functions (setUrlScanCacheTTL, setUrlScanCacheMaxSize, clearUrlScanCache, getFromUrlScanCache, addToUrlScanCache) should really just be methods of the cache class. The PhishingController should not be responsible for caching logic, which is why I think it makes sense to abstract this. It also organizes the files in a way that makes the implementation a lot easier to understand.
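A minimal sketch of what such a standalone cache class could look like. The method names (`setTTL`, `setMaxSize`, `clear`, `get`, `add`), the entry shape, the default values, and the injectable `now` clock are all assumptions made for illustration, not the code that was merged. Persisting entries into controller state is left out.

```typescript
// Hypothetical entry shape; the PR's actual UrlScanCacheEntry may differ.
type UrlScanCacheEntry<Result> = {
  result: Result;
  timestamp: number; // seconds, as reported by the injected clock
};

// Hypothetical options; the defaults below are placeholders.
type UrlScanCacheOptions = {
  ttl?: number; // seconds an entry stays valid
  maxSize?: number; // maximum number of entries kept
  now?: () => number; // injectable clock, useful for tests
};

class UrlScanCache<Result> {
  private readonly entries = new Map<string, UrlScanCacheEntry<Result>>();

  private ttl: number;

  private maxSize: number;

  private readonly now: () => number;

  constructor(options: UrlScanCacheOptions = {}) {
    this.ttl = options.ttl ?? 300;
    this.maxSize = options.maxSize ?? 100;
    this.now = options.now ?? (() => Math.floor(Date.now() / 1000));
  }

  setTTL(ttl: number): void {
    this.ttl = ttl;
  }

  setMaxSize(maxSize: number): void {
    this.maxSize = maxSize;
    this.evict();
  }

  clear(): void {
    this.entries.clear();
  }

  // Returns the cached result, or undefined when missing or expired.
  get(key: string): Result | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() - entry.timestamp > this.ttl) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.result;
  }

  // Stores a result; re-adding a key moves it to the newest position.
  add(key: string, result: Result): void {
    this.entries.delete(key);
    this.entries.set(key, { result, timestamp: this.now() });
    this.evict();
  }

  // A Map iterates in insertion order, so the first key is the oldest.
  private evict(): void {
    while (this.entries.size > this.maxSize) {
      const oldest = this.entries.keys().next();
      if (oldest.done) {
        break;
      }
      this.entries.delete(oldest.value);
    }
  }
}
```

With this shape the controller would only hold a cache instance and call `get` and `add`, while TTL and size limits stay scoped to the cache.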
done!
@@ -304,6 +328,10 @@ export class PhishingController extends BaseController<

  #c2DomainBlocklistRefreshInterval: number;

  #urlScanCacheTTL: number;
Same benefit here as described below: I think you can remove params like this from the PhishingController (it should not be responsible for being initialized with them) so that they are scoped to the cache itself.
   * @param config.messenger - The controller restricted messenger.
   * @param config.state - Initial state to set on this controller.
   */
  constructor({
    stalelistRefreshInterval = STALELIST_REFRESH_INTERVAL,
    hotlistRefreshInterval = HOTLIST_REFRESH_INTERVAL,
    c2DomainBlocklistRefreshInterval = C2_DOMAIN_BLOCKLIST_REFRESH_INTERVAL,
    urlScanCacheTTL = DEFAULT_URL_SCAN_CACHE_TTL,
I think I've already made my point, but I think it is unnecessary to define the cache values within the constructor here. The cache is part of the PhishingControllerState, so it feels like the PhishingController should be in charge of setting those values, rather than letting them be defined at initialization.
I can agree with @mindofmar's suggestions re: cache. I would add that two improvements for the caching logic could include:
- tracking whether a URL is in flight and having a repeated request await the existing promise rather than sending out another request
- having a more efficient add/remove pattern for the cache, e.g. LRU, or the cache keeping two states (the map + the array) so that clients are not constantly reprocessing this array.
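Both suggestions can be sketched together. The class below is a hypothetical illustration, not code from this PR: `DedupingLruScanner`, `ScanResult`, `fetchScan`, `remember`, and `has` are assumed names. It shares one promise per in-flight URL and uses a `Map`'s insertion order as a simple LRU, so no separate array has to be reprocessed. TTL handling is omitted to keep the example short.

```typescript
// Hypothetical result shape, for illustration only.
type ScanResult = { recommendedAction: string };

class DedupingLruScanner {
  // Map iteration order is insertion order, so the first key is least recent.
  private readonly cache = new Map<string, ScanResult>();

  // One shared promise per URL that is currently being fetched.
  private readonly inFlight = new Map<string, Promise<ScanResult>>();

  private readonly fetchScan: (url: string) => Promise<ScanResult>;

  private readonly maxSize: number;

  constructor(fetchScan: (url: string) => Promise<ScanResult>, maxSize = 100) {
    this.fetchScan = fetchScan;
    this.maxSize = maxSize;
  }

  has(url: string): boolean {
    return this.cache.has(url);
  }

  // Inserts or refreshes an entry, then evicts least recently used entries.
  remember(url: string, result: ScanResult): void {
    this.cache.delete(url);
    this.cache.set(url, result);
    while (this.cache.size > this.maxSize) {
      const oldest = this.cache.keys().next();
      if (oldest.done) {
        break;
      }
      this.cache.delete(oldest.value);
    }
  }

  scan(url: string): Promise<ScanResult> {
    const cached = this.cache.get(url);
    if (cached !== undefined) {
      this.remember(url, cached); // mark as most recently used
      return Promise.resolve(cached);
    }

    const pending = this.inFlight.get(url);
    if (pending !== undefined) {
      return pending; // reuse the request that is already running
    }

    const request = this.fetchScan(url).then(
      (result) => {
        this.inFlight.delete(url);
        this.remember(url, result);
        return result;
      },
      (error) => {
        this.inFlight.delete(url); // allow a retry after a failure
        throw error;
      },
    );
    this.inFlight.set(url, request);
    return request;
  }
}
```

Two calls to `scan` for the same URL made before the first one resolves receive the same promise, so only one request is sent.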
…nd add UrlScanCache class
…tionality details
…tionality details
#5688)

### Explanation

The bulk URL scanning functionality in PhishingController previously didn't leverage the URL scan cache that was already implemented for single URL scanning. This meant that even if a URL had been recently scanned, it would be scanned again when included in a bulk scan request, causing unnecessary API calls and increased response times.

This PR extends the caching functionality to the `bulkScanUrls` method, allowing it to:

1. Check the cache for each URL before sending API requests
2. Only scan URLs that aren't already in the cache
3. Add newly scanned results to the cache for future use
4. Return a combined response of both cached and newly scanned results

This optimization significantly reduces API calls and improves response times for bulk scan operations, especially when the same URLs are frequently scanned.

### References

Related to #5625 (Add URL scan cache functionality)
Extends functionality from #5682 (Add bulk scan functionality)
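The four steps listed in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the merged `bulkScanUrls` implementation: `partitionByCache`, `bulkScanWithCache`, `scanBatch`, and the `ScanResult` shape are hypothetical, and a plain `Map` stands in for the real cache, so TTL checks are not shown.

```typescript
// Hypothetical result shape, for illustration only.
type ScanResult = { recommendedAction: string };

// Steps 1 and 2: split the input into cache hits and URLs that still need a scan.
function partitionByCache(
  urls: string[],
  cache: Map<string, ScanResult>,
): { cached: Record<string, ScanResult>; toScan: string[] } {
  const cached: Record<string, ScanResult> = {};
  const toScan: string[] = [];
  for (const url of urls) {
    const hit = cache.get(url);
    if (hit !== undefined) {
      cached[url] = hit;
    } else if (!toScan.includes(url)) {
      toScan.push(url); // skip duplicates within the same request
    }
  }
  return { cached, toScan };
}

// Steps 3 and 4: scan only the misses, store them, and merge with the hits.
// `scanBatch` stands in for the API call and is an assumed signature.
async function bulkScanWithCache(
  urls: string[],
  cache: Map<string, ScanResult>,
  scanBatch: (urls: string[]) => Promise<Record<string, ScanResult>>,
): Promise<Record<string, ScanResult>> {
  const { cached, toScan } = partitionByCache(urls, cache);
  if (toScan.length === 0) {
    return cached; // everything was served from the cache
  }
  const fresh = await scanBatch(toScan);
  for (const url of Object.keys(fresh)) {
    cache.set(url, fresh[url]);
  }
  return { ...cached, ...fresh };
}
```

When every URL is already cached, `scanBatch` is never called, which is where the reduction in API calls comes from.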
Explanation
The PhishingController's `scanUrl` method currently makes an API call for every URL scan request, even when we've recently scanned the same URL.

To address this, I've added a caching layer that:
This should reduce redundant API calls while maintaining security by respecting TTL for cached entries.