-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add block media feature for playwright - blocks images, videos, css #54
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Nice feature that will reduce proxy usage for user. Thank you.
But you've forgotten to add the input blockMedia
into the defaults
object in const.ts
.
Lines 23 to 47 in a4a2641
export const defaults = { | |
debugMode: inputSchema.properties.debugMode.default, | |
dynamicContentWaitSecs: inputSchema.properties.dynamicContentWaitSecs.default, | |
htmlTransformer: inputSchema.properties.htmlTransformer.default, | |
initialConcurrency: inputSchema.properties.initialConcurrency.default, | |
keepAlive: true, // Not in input_schema.json | |
maxConcurrency: inputSchema.properties.maxConcurrency.default, | |
maxRequestRetries: inputSchema.properties.maxRequestRetries.default, | |
maxRequestRetriesMax: inputSchema.properties.maxRequestRetries.maximum, | |
maxResults: inputSchema.properties.maxResults.default, | |
maxResultsMax: inputSchema.properties.maxResults.maximum, | |
minConcurrency: inputSchema.properties.minConcurrency.default, | |
outputFormats: inputSchema.properties.outputFormats.default, | |
proxyConfiguration: inputSchema.properties.proxyConfiguration.default, | |
query: undefined, // No default value in input_schema.json | |
readableTextCharThreshold: 100, // Not in input_schema.json | |
removeCookieWarnings: inputSchema.properties.removeCookieWarnings.default, | |
removeElementsCssSelector: inputSchema.properties.removeElementsCssSelector.default, | |
requestTimeoutSecs: inputSchema.properties.requestTimeoutSecs.default, | |
requestTimeoutSecsMax: inputSchema.properties.requestTimeoutSecs.maximum, | |
serpMaxRetries: inputSchema.properties.serpMaxRetries.default, | |
serpMaxRetriesMax: inputSchema.properties.serpMaxRetries.maximum, | |
serpProxyGroup: inputSchema.properties.serpProxyGroup.default, | |
scrapingTool: inputSchema.properties.scrapingTool.default, | |
}; |
This causes the input processing to drop it from the input and because of that it's never applied.
@MQ37, thank you! I believe we want this enabled by default, right? It should be a non-breaking change that improves speed and reduces traffic with proxies. Could you also provide 2-3 runs with this feature enabled and disabled so we can compare perf? |
Sure, we can enable this by default 👍 Here is the list of runs. Blocking disabled:
Blocking enabled:
Another runs with performance analyzed.
Block enabled:
|
Thank you for noticing 👍 I tested it with local browser and it actually works without being added, but will add that 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the change. Now the Actor acts consistently between STANDALONE and STANDBY mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MQ37, thank you! This looks very promising.
I’m afraid I don’t fully understand the test and the numbers you provided.
Did you call the Actor with several requests and then record the value?
It seems that with each additional request, the response time gets progressively slower.
An average response time of 47s doesn’t sound quite right.
Also, we need to document this option and its benefits in the README 🙏🏻
Added this feature to the README, thank you for noticing 👍 Those performance numbers are from the |
@jirispilka should this PR be closed? |
closes #47 (comment)