-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tag Processor] Merge the test files into a single file #44593
Conversation
array( | ||
'" onclick="alert(\'1\');"><span onclick=""></span><script>alert("1")</script>', | ||
'" onclick="alert('1');"><span onclick=""></span><script>alert("1")</script>', | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still uncomfortable with re-encoding the tests of esc_attr()
in our test here. this adds a deep coupling that can come back and frustrate people trying to make unrelated changes.
for instance, in our actual code and discussion of the tag processor we have essentially delegated responsibility for attribute encoding and sanitization to esc_attr()
. we don't do additional work and we don't depend on any particular behavior other than its own guarantees.
why should we then extend additional constraints in the tests that we never wanted to make in the code or in conversation? we're not actually testing if this prevents XSS anyway - we're testing that it runs through esc_attr()
, and I'd rather leave the XSS prevention testing to that function.
ultimately this isn't the biggest problem and our code is littered with deeply-coupled integration tests like it, but at the same token our test suite is already in a place where it's frustrating how often tests fail probabilistically and where it can be hard to understand or figure out what's wrong when seemingly-unrelated tests fail.
if we're going to take responsibility for XSS protection and incorporate that code ourselves (which I don't think we'll do) then I think it would be more appropriate to test here. if we're going to delegate to esc_attr()
then we should either lean on its tests to cover the scenarios, or if it doesn't have them, lean on the history and reporting that has happened over the years to vet it.
"
-> "
, for example, is neither an XSS attack nor evidence that we're protected. it's a good demonstration that we encode the HTML entities and that we won't break out of the syntax; "
-> "
does neither and really appears to assert an internal behavior of esc_attr()
and nothing else.
it would be nice if it were easier and natively-available in PHP to create test spies, which is what I was emulating by comparing against esc_attr( $value )
, but every time I've tried that it involved user-space libraries with particular restrictions.
if we want to retain some of these tests then our naming is confusing. we could name it test_set_attribute_properly_escapes_html_entities()
and then give explanations for the constraints we are setting on it, which would free someone to swap out esc_attr()
while maintaining our requirements. I don't think our requirements are as strict as shown here.
array(
'escapes double-quote to avoid breaking out of HTML' => array( '"', '"' ),
'avoids double-encoding existing HTML entities' => array( '"', '"' ),
);
I'm not sure if any of the other constraints is really there. the example with onclick
seems to be covered by the assertion on quotes alone.
lots of thoughts 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmsnell I agree – I only did that to have an automated check in place to confirm that esc_attr
is indeed used in all the different cases. Ideally we'd have a spy there, but it won't work if we load WordPress. Or maybe it would? We could register the esc_attr
call and augment the return value to something like ESC_ATTR( $original_input )
using a filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well blimey, we can do this with the attribute_escape
filter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in 80452d8
should be easy to revert if we need to.
I removed the comment at the top about running this in isolation since we can't anymore; not if we want to assert that we called esc_attr()
and also verify behaviors of esc_attr()
. this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good to me, too. This PR needs a rebase but should be good otherwise – I'll look into that
Redoing this PR on thunk will be easier than rebasing. Here's the raw diff for reference: diff --git a/phpunit/html/wp-html-tag-processor-standalone-test.php b/phpunit/html/wp-html-tag-processor-test.php
similarity index 97%
rename from phpunit/html/wp-html-tag-processor-standalone-test.php
rename to phpunit/html/wp-html-tag-processor-test.php
index cef9b45f7c5d5..dd6e193715133 100644
--- a/phpunit/html/wp-html-tag-processor-standalone-test.php
+++ b/phpunit/html/wp-html-tag-processor-test.php
@@ -1,26 +1,11 @@
<?php
/**
* Unit tests covering WP_HTML_Tag_Processor functionality.
- * This file takes about 100ms to run because it does not load
- * any WordPress libraries:
- *
- * ```
- * ./vendor/bin/phpunit --no-configuration ./phpunit/html/wp-html-tag-processor-test.php
- * ```
- *
- * Put all new WP_HTML_Tag_Processor tests here, and only add new cases to
- * wp-html-tag-processor-test-wp.php when they cannot run without WordPress.
*
* @package WordPress
* @subpackage HTML
*/
-if ( ! function_exists( 'esc_attr' ) ) {
- function esc_attr( $string ) {
- return htmlspecialchars( $string, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401, 'utf-8', false );
- }
-}
-
if ( ! class_exists( 'WP_UnitTestCase' ) ) {
abstract class WP_UnitTestCase extends \PHPUnit\Framework\TestCase {}
}
@@ -256,20 +241,20 @@ public function test_set_attribute_on_a_non_existing_tag_does_not_change_the_mar
}
/**
- * Passing a double quote inside of an attribute values could lead to an XSS attack as follows:
+ * Passing unescaped attribute values could lead to XSS attacks and malformed HTML.
+ * This test ensures that we properly escape the values to avoid these problems.
*
+ * Example:
* <code>
* $p = new WP_HTML_Tag_Processor( '<div class="header"></div>' );
* $p->next_tag();
- * $p->set_attribute('class', '" onclick="alert');
- * echo $p;
- * // <div class="" onclick="alert"></div>
- * </code>
+ * $p->set_attribute( 'class', '" onclick="alert' );
*
- * To prevent it, `set_attribute` calls `esc_attr()` on its given values.
+ * // Not this!
+ * '<div class="" onclick="alert"></div>'
*
- * <code>
- * <div class="" onclick="alert"></div>
+ * // This instead…
+ * '<div class="" onclick="alert"></div>'
* </code>
*
* @ticket 56299
@@ -277,10 +262,20 @@ public function test_set_attribute_on_a_non_existing_tag_does_not_change_the_mar
* @dataProvider data_set_attribute_escapable_values
* @covers set_attribute
*/
- public function test_set_attribute_prevents_xss( $attribute_value ) {
+ public function test_set_attribute_escapes_value($value ) {
+ $spy = function ( $safe_text, $text ) use ( $value ) {
+ $this->assertEquals( $value, $text );
+
+ return $safe_text;
+ };
+
+ add_filter( 'attribute_escape', $spy );
+
$p = new WP_HTML_Tag_Processor( '<div></div>' );
$p->next_tag();
- $p->set_attribute( 'test', $attribute_value );
+ $p->set_attribute( 'test', $value );
+
+ remove_filter( 'attribute_escape', $spy );
/*
* Testing the escaping is hard using tools that properly parse
@@ -297,7 +292,7 @@ public function test_set_attribute_prevents_xss( $attribute_value ) {
preg_match( '~^<div test=(.*)></div>$~', (string) $p, $match );
list( , $actual_value ) = $match;
- $this->assertEquals( $actual_value, '"' . esc_attr( $attribute_value ) . '"' );
+ $this->assertEquals( $actual_value, '"' . esc_attr( $value ) . '"' );
}
/**
diff --git a/phpunit/html/wp-html-tag-processor-wp-test.php b/phpunit/html/wp-html-tag-processor-wp-test.php
deleted file mode 100644
index 6c3a30340b267..0000000000000
--- a/phpunit/html/wp-html-tag-processor-wp-test.php
+++ /dev/null
@@ -1,91 +0,0 @@
-<?php
-/**
- * Unit tests covering WP_HTML_Tag_Processor functionality.
- * This file is only for tests that require WordPress libraries.
- * Put all other tests in wp-html-tag-processor-test-standalone.php.
- *
- * Run these tests locally as follows:
- *
- * ```
- * npx wp-env run phpunit "phpunit -c /var/www/html/wp-content/plugins/gutenberg/phpunit.xml.dist --filter WP_HTML_Tag_Processor_Test_WP"
- * ```
- *
- * @package WordPress
- * @subpackage HTML
- */
-
-require_once __DIR__ . '/../../lib/experimental/html/index.php';
-
-/**
- * @group html
- *
- * @coversDefaultClass WP_HTML_Tag_Processor
- */
-class WP_HTML_Tag_Processor_Test_WP extends WP_UnitTestCase {
-
- /**
- * Passing a double quote inside of an attribute values could lead to an XSS attack as follows:
- *
- * <code>
- * $p = new WP_HTML_Tag_Processor( '<div class="header"></div>' );
- * $p->next_tag();
- * $p->set_attribute('class', '" onclick="alert');
- * echo $p;
- * // <div class="" onclick="alert"></div>
- * </code>
- *
- * To prevent it, `set_attribute` calls `esc_attr()` on its given values.
- *
- * <code>
- * <div class="" onclick="alert"></div>
- * </code>
- *
- * @ticket 56299
- *
- * @dataProvider data_set_attribute_escapable_values
- * @covers set_attribute
- */
- public function test_set_attribute_prevents_xss( $value_to_set, $expected_result ) {
- $p = new WP_HTML_Tag_Processor( '<div></div>' );
- $p->next_tag();
- $p->set_attribute( 'test', $value_to_set );
-
- /*
- * Testing the escaping is hard using tools that properly parse
- * HTML because they might interpret the escaped values. It's hard
- * with tools that don't understand HTML because they might get
- * confused by improperly-escaped values.
- *
- * For this test, since we control the input HTML we're going to
- * do what looks like the opposite of what we want to be doing with
- * this library but are only doing so because we have full control
- * over the content and because we want to look at the raw values.
- */
- $match = null;
- preg_match( '~^<div test=(.*)></div>$~', (string) $p, $match );
- list( , $actual_value ) = $match;
-
- $this->assertEquals( $actual_value, '"' . $expected_result . '"' );
- }
-
- /**
- * Data provider with HTML attribute values that might need escaping.
- */
- public function data_set_attribute_escapable_values() {
- return array(
- array( '"', '"' ),
- array( '"', '"' ),
- array( '&', '&' ),
- array( '&', '&' ),
- array( '€', '€' ),
- array( "'", ''' ),
- array( '<>', '<>' ),
- array( '"";', '&quot";' ),
- array(
- '" onclick="alert(\'1\');"><span onclick=""></span><script>alert("1")</script>',
- '" onclick="alert('1');"><span onclick=""></span><script>alert("1")</script>',
- ),
- );
- }
-
-} |
Part of #44410
What?
Follows up on #44447 (comment):
This sentiment makes sense, so in this PR I merge the two test files into a single file.
Test instructions
Confirm the CI is green.
cc @dmsnell