-
Notifications
You must be signed in to change notification settings - Fork 632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle article:author meta tag. Fixes #938 #942
Conversation
@@ -0,0 +1,10 @@ | |||
{ | |||
"title": "If You Can Picture A Tarot Card, It's Because of These 3 People", | |||
"byline": "Laura June Topolsky", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the key line. Without this change, this byline would contain something like "Laura June Topolsky July 10, 2015".
Readability.js
Outdated
@@ -1726,7 +1740,7 @@ Readability.prototype = { | |||
|
|||
// property is a space-separated list of values | |||
var propertyPattern = | |||
/\s*(article|dc|dcterm|og|twitter)\s*:\s*(author|creator|description|published_time|title|site_name)\s*/gi; | |||
/\s*(article|dc|dcterm|og|twitter)\s*:\s*(author|article:author|creator|description|published_time|title|site_name)\s*/gi; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure it's me but why is this not already being matched? The first capturing subgroup has article
, then there's optional whitespace followed by a colon followed by optional whitespace, and then this bit has author
. After this patch this will also work for e.g. dc:article:author
(and article:article:author
etc. etc.) but that doesn't seem to be the intent and isn't what's in the testcase... so I'm a bit lost! Sorry for being obtuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying locally, the new test passes without this part of the PR. It does fail the BBC test for me locally, where it now finds a byline that it did not find before ("BBC News"). Fixing that, this seems to pass tests without this change, so I'll just merge without this change? If I've missed something, do let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woop!
Fixes #938