Crashes when encountering weird tags (<div 1>, <div 2>) #886

Closed
begilbert-sys opened this issue Jun 26, 2024 · 2 comments

begilbert-sys commented Jun 26, 2024

I was testing on random news articles and came across the following webpage:
https://news.virginia.edu/content/dungeons-dragons-and-burgers-really-bad-outcomes-when-we-dont-grasp-fractions.

It appears that the site has some weird div tags like <div 1> and <div 2>, which ended up crashing the parser. The error was the following:

Error in event handler: Error: Failed to execute 'setAttribute' on 'Element': '2' is not a valid attribute name.
    at Readability._simplifyNestedElements
    at Readability._postProcessContent
    at Readability.parse

Obviously this is more of the site's fault, but it would be nice to just ignore elements like this instead of breaking.
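
To illustrate what "just ignore elements like this" could look like, here is a minimal sketch. It assumes the crash happens while attributes are copied from one element to another: the HTML parser accepts `<div 2>` and creates an attribute named `2`, but `setAttribute` rejects that name. `copyAttributesSafely` and `makeStubElement` are hypothetical names for this example, not part of Readability's API, and the stub's name check is only a rough stand-in for the real DOM validation so the snippet can run outside a browser.

```javascript
// Hypothetical helper (not Readability's actual code): copy attributes from
// `source` to `target`, skipping any whose names setAttribute rejects.
function copyAttributesSafely(source, target) {
  const skipped = [];
  for (const attr of Array.from(source.attributes)) {
    try {
      target.setAttribute(attr.name, attr.value);
    } catch (err) {
      // In a browser this is where names such as "1" or "2" would throw.
      skipped.push(attr.name);
    }
  }
  return skipped;
}

// Stub element so the sketch runs without a DOM. The regex is an assumed,
// simplified approximation of the attribute-name check, not the real rule.
function makeStubElement(attributes = []) {
  const stored = new Map();
  return {
    attributes,
    stored,
    setAttribute(name, value) {
      if (!/^[A-Za-z_:][\w:.-]*$/.test(name)) {
        throw new Error(`'${name}' is not a valid attribute name.`);
      }
      stored.set(name, value);
    },
  };
}

// Mimics <div class="story" 2>: the bogus "2" attribute is dropped,
// while "class" is copied as usual.
const source = makeStubElement([
  { name: "class", value: "story" },
  { name: "2", value: "" },
]);
const target = makeStubElement();
const skipped = copyAttributesSafely(source, target);
console.log("skipped attributes:", skipped);
```

Catching every error keeps the sketch short; a real fix might only swallow the invalid-name error and rethrow anything else.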

gijsk (Contributor) commented Jun 26, 2024

Thanks for the report. This is fundamentally the same issue as #859.

gijsk closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 26, 2024

gijsk (Contributor) commented Jun 26, 2024

(GitHub somehow only has "not planned" and "fixed" as resolutions for a ticket? Anyway - conversations will continue in #859 I'm sure, and I do really appreciate the additional testcase + report!)
