isDisallowed() returns true for / despite no matching Disallow rule in robots.txt #41

Open
@seveun

Description

Hi,

I'm encountering a possible issue with the way isDisallowed() behaves when parsing the robots.txt file from https://www.natureetdecouvertes.com/robots.txt.

Context
I'm checking if crawling the homepage / is allowed for a standard browser User-Agent like:

```
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36
```
When I run:

```js
const isAllowed = robotsTxt.isAllowed('https://www.natureetdecouvertes.com/', userAgent);
const isDisallowed = robotsTxt.isDisallowed('https://www.natureetdecouvertes.com/', userAgent);
```
I get:

```js
isAllowed === undefined
isDisallowed === true
```
However, that robots.txt file contains no `Disallow: /` rule, and under the Robots Exclusion Protocol (RFC 9309) the default is to allow access to a path unless a rule explicitly blocks it.

Expected behavior
`isDisallowed('/')` should return `false` and `isAllowed('/')` should return `true` (or at least not `undefined`, since the fallback to `User-agent: *` should apply).

Notes
The user-agent I’m testing is not listed in any specific User-agent: group, so the fallback to User-agent: * should apply.

There is no Disallow: / or any wildcard rule that matches only /.
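To make the expected behavior concrete, here is a minimal sketch of the RFC 9309 matching logic described above: collect the `Disallow` rules from the `User-agent: *` group and default to "allowed" when no rule matches the path. This is not the library's actual implementation; `isPathDisallowed` is a hypothetical helper, and it only does simple prefix matching (no `*` or `$` wildcards in rule paths).

```js
// Sketch of RFC 9309 default-allow matching, assuming simple prefix
// rules. Not the library's real code -- just the behavior I expect.
function isPathDisallowed(robotsTxtContent, path) {
  const disallowRules = [];
  let currentAgents = [];      // user-agents of the group being read
  let collectingAgents = true; // consecutive User-agent lines form one group

  for (const raw of robotsTxtContent.split('\n')) {
    const line = raw.split('#')[0].trim(); // strip comments
    const match = line.match(/^([A-Za-z-]+)\s*:\s*(.*)$/);
    if (!match) continue;
    const field = match[1].toLowerCase();
    const value = match[2].trim();

    if (field === 'user-agent') {
      // A User-agent line after rules starts a new group.
      if (!collectingAgents) currentAgents = [];
      collectingAgents = true;
      currentAgents.push(value.toLowerCase());
    } else if (field === 'disallow' || field === 'allow') {
      collectingAgents = false;
      // Only keep Disallow rules from the wildcard group;
      // an empty Disallow value means "allow everything".
      if (currentAgents.includes('*') && field === 'disallow' && value !== '') {
        disallowRules.push(value);
      }
    }
  }

  // RFC 9309: if no rule matches, access is allowed by default.
  return disallowRules.some((rule) => path.startsWith(rule));
}
```

With this logic, a robots.txt whose `*` group has no rule matching `/` yields `false` for the homepage, which is what I would expect `isDisallowed` to return here.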

Can you confirm whether this is intended behavior? If not, it looks like a parsing bug or a fallback-handling issue.

Thanks!
