-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[html] Allow ampersands in attribute values. #2036
[html] Allow ampersands in attribute values. #2036
Conversation
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation. |
PR HealthBreaking changes ✔️
Changelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. Coverage ✔️
This check for test coverage is informational (issues shown here will not fail the PR).
API leaks
|
Package | Leaked API symbols |
---|---|
html | HtmlTokenizer Token HtmlInputStream TagToken DoctypeToken StringToken TreeBuilder ActiveFormattingElements StartTagToken TagAttribute CommentToken CharactersToken SpaceCharactersToken EndTagToken |
This check can be disabled by tagging the PR with skip-leaking-check
.
License Headers ⚠️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files |
---|
pkgs/html/lib/src/tokenizer.dart |
pkgs/html/test/parser_feature_test.dart |
All source files should start with a license header.
Unrelated files missing license headers
Files |
---|
pkgs/bazel_worker/benchmark/benchmark.dart |
pkgs/bazel_worker/example/client.dart |
pkgs/bazel_worker/example/worker.dart |
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart |
pkgs/boolean_selector/example/example.dart |
pkgs/clock/lib/clock.dart |
pkgs/clock/lib/src/clock.dart |
pkgs/clock/lib/src/default.dart |
pkgs/clock/lib/src/stopwatch.dart |
pkgs/clock/lib/src/utils.dart |
pkgs/clock/test/clock_test.dart |
pkgs/clock/test/default_test.dart |
pkgs/clock/test/stopwatch_test.dart |
pkgs/clock/test/utils.dart |
pkgs/coverage/lib/src/coverage_options.dart |
pkgs/coverage/test/collect_coverage_config_test.dart |
pkgs/coverage/test/config_file_locator_test.dart |
pkgs/html/example/main.dart |
pkgs/html/lib/dom.dart |
pkgs/html/lib/dom_parsing.dart |
pkgs/html/lib/html_escape.dart |
pkgs/html/lib/parser.dart |
pkgs/html/lib/src/constants.dart |
pkgs/html/lib/src/encoding_parser.dart |
pkgs/html/lib/src/html_input_stream.dart |
pkgs/html/lib/src/list_proxy.dart |
pkgs/html/lib/src/query_selector.dart |
pkgs/html/lib/src/token.dart |
pkgs/html/lib/src/treebuilder.dart |
pkgs/html/lib/src/utils.dart |
pkgs/html/test/dom_test.dart |
pkgs/html/test/parser_test.dart |
pkgs/html/test/query_selector_test.dart |
pkgs/html/test/selectors/level1_baseline_test.dart |
pkgs/html/test/selectors/level1_lib.dart |
pkgs/html/test/selectors/selectors.dart |
pkgs/html/test/support.dart |
pkgs/html/test/tokenizer_test.dart |
pkgs/pubspec_parse/test/git_uri_test.dart |
pkgs/stack_trace/example/example.dart |
pkgs/watcher/test/custom_watcher_factory_test.dart |
pkgs/yaml_edit/example/example.dart |
This check can be disabled by tagging the PR with skip-license-check
.
@sigmundch - can you take a look? Do you think it's safe to silently ignore parse errors here? I think it would be better if we can track down a way to avoid parsing URL attribute values as HTML entities but I don't see an obvious way to track that in the current implementation. |
FWIW the diff I assumed we'd land in this code location is something like generateSpans: true, attributeSpans: true);
tokenizer.moveNext();
- final token = tokenizer.current as StartTagToken;
+ final token = tokenizer.current;
+ if (token is ParseErrorToken) {
+ throw ParseError(token.data, token.span, token.messageParams);
+ }
+ token as StartTagToken;
if (token.attributeSpans == null) return; // no attributes But that would require us to track down and fix the spurious parse error. |
@natebosch I tracked the tokenizer error to here (I added a couple of links to the related issue):
Fixing it gracefully there... well, I didn't find a good solution. I tried the navie thing of not emitting those non-fatal tokenizer errors, but that seemed to mess up the computation of the spans. There's also a test validating that the
So I guessed that token is a feature, and attempted to fix it one level up. (Throwing if |
Yeah a better error message is my intention with that diff.
Is it a hard requirement to parse this invalid HTML? Or is it possible for you to fix the input to I don't fully understand how or whether we differentiate fatal and non-fatal errors. I expect browser probably allow this syntax to work but if we can be more strict to the standards in the parser that's definitely easier for us. |
Yes, using
I disagree with this being invalid HTML. In practice, it is rare that anybody is going to use the As for the standard; the URI rfc specifically lists The HTML Standard says that Character references like I think it's a little unfortunate that the HTML parser behaves this way, but I can't suggest a better solution than this PR. |
Interesting, reading this part of the spec, it does appear like the parse error was not intended to happen within an attribute value. In particular, see the text around being "consumed as part of an attrinbute". The location you found @ditman in tools/pkgs/html/lib/src/tokenizer.dart Lines 342 to 346 in 9c53358
does appear to be the place to address this issue. Note that there is a parameter if (!fromAttribute) {
_addToken(ParseErrorToken(...))
}
stream.unget(...)
output = ...; Just trying that out locally seems to still keep all tests passing and make your new test pass as well. |
@sigmundch what a cool find Siggi, I totally missed that param! (That method is long and full of terrors :P) Let me add that change and another test case to the tokenizer. |
lol, totally :) |
…side an attribute value. Co-Authored-By: sigmundch <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks David, looks great!
This seems to be all green, but I can't click the submit button :) /cc @sigmundch |
Revisions updated by `dart tools/rev_sdk_deps.dart`. core (https://github.com/dart-lang/core/compare/9f43210..61e6771): 61e67710 2025-03-12 Moritz Check for mandatory when using option (dart-lang/core#871) 77d33c83 2025-03-04 Devon Carew Update publish.yaml (dart-lang/core#870) db610bb5 2025-02-28 Erik Ernst Adjust the implements clause of IntX (dart-lang/core#866) 22d8879f 2025-02-27 Devon Carew [package:lints] fix the changelog (dart-lang/core#867) 2f0325e9 2025-02-26 Devon Carew Contribute a Gemini Code Assist config (dart-lang/core#865) ecosystem (https://github.com/dart-lang/ecosystem/compare/a3cc42d..5b8e6b8): 5b8e6b8 2025-03-17 Devon Carew [dart_flutter_team_lints] remove 'discarded_futures'; prep for publishing (dart-lang/ecosystem#347) b96e5d2 2025-03-14 Devon Carew Add the discarded_futures lint; rev for publishing (dart-lang/ecosystem#346) http (https://github.com/dart-lang/http/compare/001665e..9129a96): 9129a96 2025-03-13 Brian Quinlan Prepare to release cupertino_http 2.1.0 (dart-lang/http#1727) 91d8719 2025-03-13 Brian Quinlan Upgrade to `package:objective_c` 7.0.0 (dart-lang/http#1726) i18n (https://github.com/dart-lang/i18n/compare/b09c822..d9cce0b): d9cce0b 2025-03-13 Moritz Grab new artifacts from version `intl4x-icu-v.0.11.2-artifacts` (dart-lang/i18n#957) protobuf (https://github.com/dart-lang/protobuf/compare/7838e44..0bab78d): 0bab78d 2025-03-14 Devon Carew rename -dev pubspec versions to -wip (google/protobuf.dart#968) tools (https://github.com/dart-lang/tools/compare/9c53358..62bc13b): 62bc13bc 2025-03-17 Kevin Moore [markdown, stream_transform] Fix deprecated usage of pkg:web (dart-lang/tools#2046) f1a5e7a2 2025-03-17 Devon Carew [package:markdown] update package:web references in the example (dart-lang/tools#2039) a4ae1759 2025-03-17 Nate Bosch [package_config] Update language version and reformat (dart-lang/tools#2040) a2af1447 2025-03-17 Nate Bosch [package_config] Remove unnecessary casts to Uint8List (dart-lang/tools#2041) 07ebd15d 2025-03-14 Devon Carew [package:mime] generate a markdown table of the current mime mappings (dart-lang/tools#2038) 920fdb64 2025-03-13 dependabot[bot] Bump dart-lang/setup-dart from 1.7.0 to 1.7.1 in the github-actions group (dart-lang/tools#2022) b55643da 2025-03-13 David Iglesias [html] Allow ampersands in attribute values. (dart-lang/tools#2036) Change-Id: I2998ebecfdc7b1790a74a8ab2128895694658a5f Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/416023 Reviewed-by: Konstantin Shcheglov <[email protected]> Commit-Queue: Devon Carew <[email protected]>
Change the tokenizer so it allows ambiguous ampersands in attribute values, instead of emitting an ErrorToken.
Contribution guidelines:
dart format
.Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.