Description
Feature request: Limit the maximum number of bytes to parse.
A maximum file size may be enforced per crawler. Content which is after the maximum file size may be ignored. Google currently enforces a size limit of 500 kilobytes (KB).
Source: Google
When forming the robots.txt file, you should keep in mind that the robot places a reasonable limit on its size. If the file size exceeds 32 KB, the robot assumes it allows everything.
Source: Yandex
- Default limit of X bytes, e.g. 524,288 bytes (512 KB / 0.5 MB)
- User-defined limit override
- Make sure the limit is reasonable; throw an exception if it is set dangerously low, e.g. below 24,576 bytes (24 KB)
- It should be possible to disable the limit entirely (no limit), as in the sketch below
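A minimal sketch of how these four points could fit together, assuming a hypothetical `parse()` entry point; the names `DEFAULT_BYTE_LIMIT`, `MIN_BYTE_LIMIT`, and `ByteLimitError` are illustrative, not the library's actual API:

```python
# Illustrative sketch only; constants and names are assumptions, not the library's API.

DEFAULT_BYTE_LIMIT = 524_288  # 512 KB default, per the proposal above
MIN_BYTE_LIMIT = 24_576       # 24 KB floor; anything lower is rejected


class ByteLimitError(ValueError):
    """Raised when a user-supplied byte limit is dangerously low."""


def parse(content: bytes, byte_limit: int | None = DEFAULT_BYTE_LIMIT) -> bytes:
    """Truncate robots.txt content to byte_limit before parsing.

    byte_limit=None disables the limit entirely.
    """
    if byte_limit is not None:
        if byte_limit < MIN_BYTE_LIMIT:
            raise ByteLimitError(
                f"byte limit {byte_limit} is below the minimum of {MIN_BYTE_LIMIT}"
            )
        # Content past the limit is ignored, mirroring Google's 500 KB cutoff.
        content = content[:byte_limit]
    # ...actual directive parsing would happen here...
    return content
```

For example, `parse(data)` would use the 512 KB default, `parse(data, byte_limit=None)` would disable the cap, and `parse(data, byte_limit=1024)` would raise `ByteLimitError`.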