Make WordPress Core

Opened 5 years ago

Last modified 4 years ago

#48955 new defect (bug)

WP 5.3.1 changes cause potential backwards compatibility breakage with kses

Reported by: icaleb's profile iCaleb Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 5.3.1
Component: Security Keywords: needs-patch
Focuses: Cc:

Description

Kses used to allow an array to be (incorrectly) passed in, and would just return the same array. Now it will return an empty string.

Before:

$test = ['this', 'is', 'an', 'array', 'but', 'shouldnt', 'be'];

// Returns the above array
wp_kses_post( $test );

After 5.3.1:

$test = ['this', 'is', 'an', 'array', 'but', 'shouldnt', 'be'];

// Returns empty string
wp_kses_post( $test );

Now to be clear, passing an array and not a string into wp_kses is wrong and in the past wouldn't do anything for you. But this kind of just "worked by accident" I guess. So while it is incorrect usage, this release does change behavior.

This stems from the changes with the new wp_pre_kses_block_attributes filter I believe that is hooked onto pre_kses. Looking at the changes, I don't think it technically needed to cause this break, was more of a side effect.

At a minimum, I'm thinking maybe we should add some tests around this behavior to catch this sort of change in the future?

Attachments (1)

48955.test.diff (1.1 KB) - added by jnylen0 5 years ago.
Test for wp_kses + array

Download all attachments as: .zip

Change History (30)

#1 @aduth
5 years ago

Since it's documented to accept a string, I don't know that backwards-compatibility would cover the incidental behavior of an invalid usage?

https://developer.wordpress.org/reference/functions/wp_kses_post/

Even prior to 5.3.1, the behavior for an improper type was not strictly-speaking an identity function, and could return a different result than what was given:

wp> 5 === wp_kses_post( 5 );
=> bool(false)
wp> wp_kses_post( 5 );
=> string(1) "5"
Last edited 5 years ago by aduth (previous) (diff)

#2 follow-up: @iCaleb
5 years ago

We're seeing quite a few errors stem from this unfortunately on a number of sites, with some PHP errors coming from popular plugins.

It does highlight incorrect usage and potential assumptions that lead to security issues, but I'd argue this sort of change would better be made with advanced notice. Or potentially even deprecation notices followed by strict type checking later on.


Also noting that this affects the larger family of wp_kses* functions, not just wp_kses_post. And also that previously, arrays were actually escaped, at least somewhat:

<?php
$test = ['text', '<script>alert("hi")</script>'];

wp_kses_post($test);
array(2) {
  [0] => string(4) "text"
  [1] => string(11) "alert("hi")"
}

At the end of the day, it just needs to be decided one way or the other regarding array support, whether permanent or even temporary while being phased out.

As-is, it just broke with this new filter as a side affect of how it does it's business in the loop. Which is suboptimal.

Last edited 5 years ago by iCaleb (previous) (diff)

This ticket was mentioned in Slack in #core by aduth. View the logs.


5 years ago

#4 @SergeyBiryukov
5 years ago

  • Version set to 5.3.1

#5 @SergeyBiryukov
5 years ago

The changes in question, for reference: [46896].

This ticket was mentioned in Slack in #forums by joostdevalk. View the logs.


5 years ago

#7 @johnbillion
5 years ago

  • Milestone changed from Awaiting Review to 5.3.2

Milestoning for visibility

#8 @aduth
5 years ago

On some initial investigation, the reason these functions would have appeared to work previously is due to the fact that the functions used by KSES to transform a string tend to support an overloaded form accepting an array.

See $subject argument:

Again, since the KSES functions are documented to accept a string, and the names and types of the arguments are designated accordingly ({string} and $string in the singular forms), that this happened to work is coincidental. It's also unclear whether there are other references in code to the string which would not work if the argument is provided as a non-string.

Some possible action items:

  • Close this as "working as intended", in that the functions are documented to accept a string, and should be passed a string.
  • Consider this as an enhancement request to allow an array of strings to be passed to KSES functions.
  • Make a one-off exception in wp_pre_kses_block_attributes to handle $string passed as an array of strings.

The last of these would be non-ideal from the perspective of a contributor or third-party developer in leaving ambiguity surrounding whether KSES is expected to support non-string values. As a solution, it would be for the sole purpose of maintaining support for an existing, undocumented behavior.

This ticket was mentioned in Slack in #forums by ipstenu. View the logs.


5 years ago

#10 @joelcj91
5 years ago

@iCaleb wp_kses_post_deep can (should) be used for an array.

Last edited 5 years ago by joelcj91 (previous) (diff)

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


5 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


5 years ago

#13 @audrasjb
5 years ago

  • Keywords needs-patch added

As per today's bug scrub, it would be a good first step to add a warning for plugin authors and users that are misusing kses functions.

#14 @aduth
5 years ago

@iCaleb To help clarify the impact, are you able to share some specific examples of plugins affected by this change?

#15 @iCaleb
5 years ago

Sites will be generating PHP warnings when an array is passed in now. So that is, in a way, a warning for plugin authors. Example:

Warning: preg_match() expects parameter 2 to be string, array given in /wp-includes/class-wp-block-parser.php on line 417

Warning: strlen() expects parameter 1 to be string, array given in /wp-includes/class-wp-block-parser.php on line 489

As for examples, here are two:

#16 @audrasjb
5 years ago

  • Milestone changed from 5.3.2 to 5.3.3

Moving to 5.3.3 as 5.3.2 RC1 is going to be released.

#17 follow-up: @RyanNovotny
5 years ago

https://github.com/WordPress/WordPress/blob/317465e2feb965ea7d86529e54908b3fbea539a8/wp-includes/default-filters.php#L246

add_filter( 'pre_kses', 'wp_pre_kses_block_attributes', 10, 3 );

This filter that was added to apparently fix some stored XSS exploit in the editor now hooks in to every call of wp_kses and does block parsing on it (bad). Pretty please fix it to remove that hook so we can filter HTML w/o running it through the block parser

#18 in reply to: ↑ 17 @aduth
5 years ago

Replying to RyanNovotny:

does block parsing on it (bad)

Can you elaborate on this? While "block parser" might seem worryingly opaque as an apparently expensive process, the actual implementation does pretty well to treat non-block HTML as a trivial operation. It effectively amounts to a single regular expression match and string concatenation.

The reason for running all HTML through the parser is a combination of:

  • Lack of context in knowing how the HTML will be used, deferring to the safest, widest application of the block escaping.
  • Mirroring the exact treatment of markup as it is processed via the the_content filter (which already also parses all incoming HTML using the block parser)

Whether the application of escaping could be scaled back based on certain conditions would need to be evaluated very carefully in consideration of the security implications in allowing a regression of the previous abusable behavior. Since discussion of these improvements are not directly related to this ticket in question, a separate ticket would be most appropriate.

#19 @RyanNovotny
5 years ago

Actually on further review I believe wp_kses basically allowed an array and now the regex parse inside the block parser encounters an error which exposes the bug/unexpected use of wp_kses with arrays.

I think you can perhaps just add a "doing it wrong" to wp_kses if an array or something besides a string is passed(?). Right now it seems quite cryptic to find the problem since its pointing to something you wouldn't expect to be used/wasn't previously (block-parser) in this function

@jnylen0
5 years ago

Test for wp_kses + array

#20 @jnylen0
5 years ago

Passing an array to the kses functions worked previously because it was just a chain of preg_replace_callback calls, and that function accepts either a string or an array.

The approach that is friendliest to the existing ecosystem would be to add unit tests to make sure the KSES functions continue working with an array as they have for many years, fix this behavior again in the code, and document passing an array as a valid option.

attachment:48955.test.diff is a first step towards this approach. It is still passing against the 4.9 branch.

#21 @aduth
5 years ago

What about plugins which respect the documented string type, and would likewise break if a feature enhancement to support arrays was adopted?

For example:

https://github.com/Automattic/jetpack/blob/e67bea6/modules/shortcodes/youtube.php#L109
Filtering as string: https://github.com/Automattic/jetpack/blob/e67bea6/modules/shortcodes/youtube.php#L37

https://plugins.trac.wordpress.org/browser/category-icons/trunk/category_icons.php?rev=744383#L37
Filtering as string: https://plugins.trac.wordpress.org/browser/category-icons/trunk/category_icons.php?rev=744383#L447

https://plugins.trac.wordpress.org/browser/twitter-embed/trunk/twitter-embed.php?rev=2074143#L29
Filtering as string: https://plugins.trac.wordpress.org/browser/twitter-embed/trunk/twitter-embed.php?rev=2074143#L47

https://plugins.trac.wordpress.org/browser/article-directory/trunk/article-directory.php?rev=333204#L37
Filtering as string: https://plugins.trac.wordpress.org/browser/article-directory/trunk/article-directory.php?rev=333204#L33

https://plugins.trac.wordpress.org/browser/yubikey-plugin/trunk/yubikey.php?rev=832195#L421
Filtering as string: https://plugins.trac.wordpress.org/browser/yubikey-plugin/trunk/yubikey.php?rev=832195#L407

https://plugins.trac.wordpress.org/browser/superpack/tags/0.3.1/inc/class-superpack-shortcodes.php#L67
Filtering as string: https://plugins.trac.wordpress.org/browser/superpack/tags/0.3.1/inc/class-superpack-shortcodes.php#L101

https://plugins.trac.wordpress.org/browser/another-soundcloud-quicktag/trunk/another-soundcloud-quicktag.php?rev=333652#L30
Filtering as string: https://plugins.trac.wordpress.org/browser/another-soundcloud-quicktag/trunk/another-soundcloud-quicktag.php?rev=333652#L33

https://plugins.trac.wordpress.org/browser/shortcake-bakery/trunk/inc/class-shortcake-bakery.php?rev=1649994#L77
Filtering as string: https://plugins.trac.wordpress.org/browser/shortcake-bakery/trunk/inc/class-shortcake-bakery.php?rev=1649994#L105

https://plugins.trac.wordpress.org/browser/article-directory-redux/trunk/icwp-adr.php?rev=900910#L67
Filtering as string: https://plugins.trac.wordpress.org/browser/article-directory-redux/trunk/icwp-adr.php?rev=900910#L189

https://plugins.trac.wordpress.org/browser/shoplocket/trunk/shoplocket.php?rev=688408#L44
Filtering as string: https://plugins.trac.wordpress.org/browser/shoplocket/trunk/shoplocket.php?rev=688408#L187

https://plugins.trac.wordpress.org/browser/wpdirectory/trunk/wpdirectory.php?rev=185104#L29
Filtering as string: https://plugins.trac.wordpress.org/browser/wpdirectory/trunk/wpdirectory.php?rev=185104#L25

Arguably, it was never the case that wp_kses would have worked as expected with an array value filtered by any of the plugins above.

The reason I keep harping on about the documentation is that it serves as a contract against which backwards-compatibility should be measured. It acts as a promise that the framework will behave in a particular way under a specific set of conditions, of which the given arguments shape is one such condition. For plugins which passed an array value, there was no guarantee on the behavior, in much the same way that I could call wp_insert_post( new DateTime() ), and it might do something different years from now than it does today.

If a plugin comes to rely on an incidental, undocumented behavior, it's unfortunate if that behavior changes over time. But for each one of those cases, there are many more where a plugin does target the documented signature, and may similarly break if we were to change the behavior of the function. Given the choice, we should seek to uphold support for the documented behavior.

There's also a question of just how far feature support should extend. Without delving into specifics, I can already imagine at least one potential security concern that could result from the proposed enhancement to KSES functions. This leads me to resurface a point I had raised in a recent Slack conversation (link requires registration), wherein we should be especially cautious with added feature support for these functions, given their sensitive nature and the expanded complexity of supporting additional combinations of arguments.

If we treat this as a feature enhancement, it should be evaluated on its own merits, and implemented in mind of the potential impact on existing usage as documented. On that basis, I'm personally not convinced it should be supported.

#22 follow-up: @xkon
5 years ago

Sorry for bumping in as I didn't know about this ticket and it took me way longer to find out why Customizer was producing massive amounts of errors for me today as an array was trying to pass from some custom options.

It seems that due to the mentioned hook add_filter( 'pre_kses', 'wp_pre_kses_block_attributes', 10, 3 ); the parse_blocks() is running everywhere throughout the admin. Is this expected behavior?

As an example:

1] I've added an error_log( $content ); at wp-includes\blocks.php:511.
2] Went into my Appearance -> Themes

debug.log shows:

[23-Dec-2019 20:36:57 UTC] Twenty Twenty
[23-Dec-2019 20:36:57 UTC] the WordPress team
[23-Dec-2019 20:36:57 UTC] Our default theme for 2020 is designed to take full advantage of the flexibility of the block editor. Organizations and businesses have the ability to create dynamic landing pages with endless layouts using the group and column blocks. The centered content column and fine-tuned typography also makes it perfect for traditional blogs. Complete editor styles give you a good idea of what your content will look like, even before you publish. You can give your site a personal touch by changing the background colors and the accent color in the Customizer. The colors of all elements on your site are automatically calculated based on the colors you pick, ensuring a high, accessible color contrast for your visitors.

Same thing happens on Plugins page for each existing plugin so on so forth.

---

Also note that if the filter is simply commented out, there are no errors produced at all from any of the Arrays that options in the Customizer might be returning.

Again sorry if this might be throwing the conversation off, but it hasn't been making any sense to me for hours now the part of why having an output everywhere from the admin area when there are no blocks involved at all (at least as far as I know ).

#23 in reply to: ↑ 22 @aduth
5 years ago

Replying to xkon:

It seems that due to the mentioned hook add_filter( 'pre_kses', 'wp_pre_kses_block_attributes', 10, 3 ); the parse_blocks() is running everywhere throughout the admin. Is this expected behavior?

Yes. These screens use one or more of the wp_kses family of functions. The default KSES behavior will now filter a markup string for block content.

See also: comment:18

#24 in reply to: ↑ 2 @jnylen0
5 years ago

Replying to aduth:

What about plugins which respect the documented string type, and would likewise break if a feature enhancement to support arrays was adopted?

(examples)

Arguably, it was never the case that wp_kses would have worked as expected with an array value filtered by any of the plugins above.

The reason I keep harping on about the documentation is that it serves as a contract against which backwards-compatibility should be measured.

This is a fair point, and the examples you listed (which look like they will mostly either do nothing or cause a warning and perhaps some minor breakage) are a good argument against changing the long-documented behavior. However I don't think accidentally breaking an undocumented behavior that is being actively used is necessarily the right idea, either.

If a plugin comes to rely on an incidental, undocumented behavior, it's unfortunate if that behavior changes over time. [...] Given the choice, we should seek to uphold support for the documented behavior.

Making functions conform more closely to their documented behavior is good, but that doesn't mean incidental, undocumented behavior that people are relying on has to stop working.

Another path to take (already suggested above) would be to restore the array behavior but add a deprecation notice when an array is passed to any of the relevant functions. I'm thinking now that this is probably the best solution.

we should be especially cautious with added feature support for these functions, given their sensitive nature and the expanded complexity of supporting additional combinations of arguments.

Also a fair point, but you'd be trading a hypothetical concern against something that is known to be a problem today:

Replying to iCaleb:

We're seeing quite a few errors stem from this unfortunately on a number of sites, with some PHP errors coming from popular plugins.

From Caleb's same comment, "deprecation notices followed by strict type checking later on" seems like a sensible way to handle this, addressing broken sites in the short term and predictability/complexity in the longer term.

#25 @audrasjb
5 years ago

  • Milestone changed from 5.3.3 to 5.4

Moving all unfixed tickets from 5.3.3 to milestone 5.4, as there is no plan for a 5.3.3 maintenance release for now.

This ticket was mentioned in Slack in #core by david.baumwald. View the logs.


4 years ago

#27 @davidbaumwald
4 years ago

  • Milestone changed from 5.4 to Future Release

This ticket still needs a decision, and with 5.4 Beta 3 landing tomorrow, this is being moved to Future Release. If any maintainer or committer feels this can be resolved in time or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

#28 @whyisjake
4 years ago

We are now a major version past this issue, I am thinking about closing this as wontfix, but am open to ideas here.

#29 @iCaleb
4 years ago

At a minimum, I think a decision needs to be made here other than "wontfix".

  • If enforcing a string is the decision, then let's enforce it in a way other than just a side effect of one specific filter.
  • If allowing an array, even if temporarily, let's specify that and have a clear acceptance path (along with deprecation notices if it's temporary).
  • Either way, let's get a unit test to dictate what should happen so it doesn't flip back and forth in the future.
Note: See TracTickets for help on using tickets.