Prevent entity normalization mangling HTML like `&amp;#39;` #9099
1af9cfc to c25967a
dmsnell left a comment
While I think it would be best to find another reviewer (perhaps @xknown or @johnbillion or @mdawaffe), I think this is sound at face value and it’s not breaking tests.
The fix makes sense theoretically and behaves as we expect.
As a side note, this entire function is ripe for replacement with the HTML API whereby we can reliably resolve encoding and escaping issues, but for now this change brings an immediate improvement without raising new questions about legacy bugs or interfaces.
This change should break behavior with old sites, but as demonstrated in the comment in the code, I believe that any existing code relying on the broken behavior is bound to be broken in some way already, making this change a wash.
For HTML saved as intended, this preserves existing behaviors.
as a clarification, I wanted to note that while this introduces a change of behavior, it’s not doing so in a way I would expect it to break things. previously, those characters might be corrupted through Core, and if some plugin looks for them and tries to repair them, perhaps that code would now see different data coming through the filter stack. in all of the cases where the behavior is different though, the existing options are fundamentally broken because the corruption happened by Core at the start. this should only improve the situation.
c25967a to c325e48
I shared this and requested review at the July 16, 2025 Core devchat. I plan to land this in the next few days unless there are reviews that raise concerns or other issues.
Add test cases to the `wp_kses_normalize_entities` test to cover https://core.trac.wordpress.org/ticket/63630
The wp_kses_normalize_entities function should not decode double-encoded inputs like `&#2E;` to `E;`. Ensure that the normalization steps are processed in the correct order so that the input is normalized and its value is preserved.
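The ordering problem described in this commit message can be illustrated with a small model. The following is a minimal sketch in Python, not the WordPress implementation: the function names, the tiny `NAMED` allow-list, and the omission of hexadecimal references, value validation, and zero-padding are all assumptions made for illustration. It assumes the normalizer first rewrites every `&` as `&amp;` and then restores recognized references with regular expressions, and it compares the two possible orders of the restore steps.

```python
import re

# Hypothetical, tiny allow-list standing in for the real list of named entities.
NAMED = {"amp", "lt", "gt", "quot"}


def _restore_named(text):
    # Turn "&amp;name;" back into "&name;" when the name is allowed.
    return re.sub(
        r"&amp;([A-Za-z]{2,8}[0-9]{0,2});",
        lambda m: f"&{m.group(1)};" if m.group(1) in NAMED else m.group(0),
        text,
    )


def _restore_decimal(text):
    # Turn "&amp;#123;" back into "&#123;" (no range checks in this sketch).
    return re.sub(r"&amp;#(0*[0-9]{1,7});", lambda m: f"&#{m.group(1)};", text)


def normalize_named_first(text):
    # Problematic order: named references are restored before numeric ones.
    text = text.replace("&", "&amp;")
    text = _restore_named(text)
    return _restore_decimal(text)


def normalize_numeric_first(text):
    # Order that preserves the value: numeric references are restored first.
    text = text.replace("&", "&amp;")
    text = _restore_decimal(text)
    return _restore_named(text)


if __name__ == "__main__":
    escaped = "&amp;#39;"  # HTML text that should display the characters &#39;
    print(normalize_named_first(escaped))    # &#39;  (now an apostrophe reference)
    print(normalize_numeric_first(escaped))  # &amp;#39;  (value preserved)
```

In this model, restoring named references first collapses `&amp;amp;` back to `&amp;`, which leaves `&amp;#39;` for the numeric step to unescape. Running the numeric step first avoids that because no `&amp;#` sequence exists yet at that point.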
c325e48 to 133bd8e
… references. Fixes an issue where `wp_kses_normalize_entities` would transform inputs like "&amp;#39;" into "&#39;", changing the intended HTML text. This behavior has been present since the initial version of KSES was introduced in [649]. [2896] applied the normalization to post content for users without the "unfiltered_html" capability. Developed in #9099. Props jonsurrell, dmsnell, sirlouen. Fixes #63630. git-svn-id: https://develop.svn.wordpress.org/trunk@60616 602fd350-edb4-49c9-b593-d223f7449a82
… references. Fixes an issue where `wp_kses_normalize_entities` would transform inputs like "&amp;#39;" into "&#39;", changing the intended HTML text. This behavior has been present since the initial version of KSES was introduced in [649]. [2896] applied the normalization to post content for users without the "unfiltered_html" capability. Developed in WordPress/wordpress-develop#9099. Props jonsurrell, dmsnell, sirlouen. Fixes #63630. Built from https://develop.svn.wordpress.org/trunk@60616 git-svn-id: https://core.svn.wordpress.org/trunk@59952 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Prevent the `wp_kses_normalize_entities` function from transforming inputs like `&amp;#39;` to `&#39;`, changing its value. That transformation changes the input in a way that is not a normalization and results in significantly different HTML.

Trac ticket: https://core.trac.wordpress.org/ticket/63630
✅ (merged)
This change includes #9095, which should be reviewed and landed first.

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.