Feed results into HTTPS Everywhere #140
Comments
Hi Bill! Excited to get a chance to work on this :) It's pretty close to an opportunity I identified back in February, so really I feel like I'm already six months behind on delivering.

I hope that identifying which domains are eligible for new rules is as easy as you suggest, but I'm worried about how we could pick out which sites are "fully available over HTTPS." Before we figure out how to automate this, I'd like to walk through what I'd do to generate a one-time dump of new rules.

The scorecard has a field called "Available over HTTPS," which is actually a combination of the scraper properties "valid_https" (which must be true) and "downgrades_https" (which must be false). That's certainly a start: of the 131 sites on the scorecard, fully half (66) are in the Goldilocks zone of being available over HTTPS but not HSTS preloaded.

Of those 66, about 44 look like they already have rules. Without delving into the contents of the XML, I found each of these domains, or its slug, in the name of an existing rule. I've listed them at the bottom of this issue.

That leaves about 20 sites that might be rule-eligible. That seems like the ceiling, too, as poking around will certainly shake loose rules that are slightly irregularly named, or in some cases sites that our scanner identifies as "available over HTTPS" but which aren't "fully available over HTTPS," as you specify.

This is probably a small enough number that it makes sense to check those 20 or so domains to confirm that (a) they actually don't have a rule, and (b) everything works over HTTPS. Unless that number changes dramatically, because STN starts tracking many more sites or something like that, my inclination would be to just manually repeat this process every once in a while. How does that work?

Eligible domains that may not have existing rules
Eligible domains with existing rules found
Sorry, didn't mean to close!
@thisisparker We spoke a couple weeks ago about your PRs and whether any of the ruleset generation scripts helped you get further. Any progress?
Since you're already maintaining a list of news sites with HTTPS support, you could easily auto-generate rulesets for HTTPS Everywhere.
Any site which is fully available over HTTPS (i.e., no content becomes unavailable when the domain is loaded only over HTTPS) but not HSTS preloaded is eligible for inclusion in HTTPS Everywhere.
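As a rough illustration of that eligibility test, here is a minimal Python sketch. It assumes each scan result exposes the `valid_https` and `downgrades_https` properties mentioned in the comments, plus an `hsts_preloaded` flag; that last field name, the `Scan` class, and the sample domains are hypothetical and not taken from the scanner's actual schema. Note that this only approximates "fully available over HTTPS": a site can pass both checks and still have content that breaks over HTTPS, so the output is a list of candidates to review, not a final answer.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Hypothetical stand-in for one scorecard scan result."""
    domain: str
    valid_https: bool
    downgrades_https: bool
    hsts_preloaded: bool  # assumed field name, for illustration only


def available_over_https(scan: Scan) -> bool:
    # "Available over HTTPS" = valid HTTPS and no downgrade back to HTTP.
    return scan.valid_https and not scan.downgrades_https


def rule_candidates(scans: list[Scan]) -> list[str]:
    # Candidates are available over HTTPS but not already HSTS preloaded.
    return [
        s.domain
        for s in scans
        if available_over_https(s) and not s.hsts_preloaded
    ]


# Made-up sample data covering each case.
scans = [
    Scan("a.example", valid_https=True, downgrades_https=False, hsts_preloaded=False),
    Scan("b.example", valid_https=True, downgrades_https=True, hsts_preloaded=False),
    Scan("c.example", valid_https=True, downgrades_https=False, hsts_preloaded=True),
    Scan("d.example", valid_https=False, downgrades_https=False, hsts_preloaded=False),
]

print(rule_candidates(scans))
```

Only `a.example` survives the filter here; the other three are excluded for downgrading, already being preloaded, and lacking valid HTTPS, respectively.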
To create a new HTTPS Everywhere ruleset, you can clone the repo and run a simple ruleset generation script:
You can follow the common format generated from this example to create rules for other sites and subdomains. Refer to https://github.com/EFForg/https-everywhere/blob/master/CONTRIBUTING.md for full contribution documentation.
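For a sense of what a generated rule might look like, the sketch below builds a trivial ruleset as a string. The `ruleset` / `target` / `rule` layout reflects my understanding of the HTTPS Everywhere rule format, and the `trivial_ruleset` helper, the ruleset name, and the hosts are hypothetical; this is not the project's own generation script, so check the output against CONTRIBUTING.md before submitting anything.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def trivial_ruleset(name: str, hosts: list[str]) -> str:
    """Return a minimal ruleset that rewrites http: to https: for each host."""
    lines = [f"<ruleset name={quoteattr(name)}>"]
    for host in hosts:
        # One <target> element per covered hostname.
        lines.append(f"\t<target host={quoteattr(host)} />")
    # A single rule upgrading the scheme; assumes the whole site works over HTTPS.
    lines.append('\t<rule from="^http:" to="https:" />')
    lines.append("</ruleset>")
    return "\n".join(lines) + "\n"


# Hypothetical site; real names and hosts would come from the scorecard.
xml_text = trivial_ruleset("Example News", ["example.com", "www.example.com"])
print(xml_text)

# Sanity check: the generated text should at least be well-formed XML.
root = ET.fromstring(xml_text)
```

Building one file per eligible domain this way would still leave the manual step of confirming that every page and subresource actually loads over HTTPS.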