Closed
Changes from 32 commits
39 commits
39a3907
Add output buffering for the rendered template
westonruter Feb 26, 2025
f576c3c
Rework filter to exclusively apply to an HTML output buffer
westonruter Mar 13, 2025
33b524b
Add wp_final_output_buffer action
westonruter Mar 13, 2025
5e49712
Merge branch 'trunk' of https://github.com/WordPress/wordpress-develo…
westonruter Mar 13, 2025
8a6f5a7
Merge branch 'trunk' of https://github.com/WordPress/wordpress-develo…
westonruter Sep 22, 2025
e9a46aa
Add wp_before_include_template action and move OB handling to new fun…
westonruter Sep 22, 2025
2c074fa
Add wp_template_output_buffer filter
westonruter Sep 22, 2025
7f79ebd
Ensure at least one tag is present for HTML content type
westonruter Sep 23, 2025
6637ef5
Add test for ending and cleaning an output buffer to avoid processing
westonruter Sep 23, 2025
b98b81e
Add test case for JSON
westonruter Sep 23, 2025
9f07ad0
Account for multiple content-type headers
westonruter Sep 23, 2025
4e7029b
Fix passing back the filtered output buffer
westonruter Sep 23, 2025
fd79c2a
Remove extra line break
westonruter Sep 23, 2025
322c930
Merge branch 'trunk' into trac-43258
westonruter Sep 26, 2025
32cc514
Add test for calling ob_clean() instead of ob_end_clean()
westonruter Sep 26, 2025
59cf047
Include original buffer in param to filters
westonruter Oct 1, 2025
b755a24
Fix alignment
westonruter Oct 1, 2025
2dabd76
Clarify that the wp_before_include_template action occurs immediately…
westonruter Oct 10, 2025
e65811c
Remove redundant/unreliable HTML content type check for the presence …
westonruter Oct 10, 2025
69df1d0
Eliminate non-HTML output buffering and remove action
westonruter Oct 10, 2025
853082e
Reframe output buffer as being for optimization only
westonruter Oct 10, 2025
d38dc86
Merge branch 'trunk' of https://github.com/WordPress/wordpress-develo…
westonruter Oct 10, 2025
7f309f5
Only start output buffer if filters are present by default
westonruter Oct 10, 2025
6ee2a45
Account for only the first listed content-type header being sent
westonruter Oct 11, 2025
1b4a2c4
Add missing filter param to phpdoc
westonruter Oct 11, 2025
7f43ee5
Fix typo
westonruter Oct 11, 2025
b242dc0
Be explicit about the DOM API to use
westonruter Oct 11, 2025
ad4c8d4
Clarify purpose of filter and why chunking is disabled
westonruter Oct 11, 2025
bbfdaa6
Merge branch 'trunk' into trac-43258
westonruter Oct 11, 2025
80d24ae
Refer to enhancement rather than optimization
westonruter Oct 11, 2025
16d7928
Add wp_should_output_buffer_template_for_enhancement() helper function
westonruter Oct 11, 2025
dfd6e5e
Add assertions for wp_template_enhancement_output_buffer_started action
westonruter Oct 11, 2025
5409e44
Remove unnecessary return phpdoc
westonruter Oct 12, 2025
44d52df
Remove unnecessary default params
westonruter Oct 12, 2025
6bcbf0e
Add missing assertion messages
westonruter Oct 12, 2025
d4a3da3
Merge branch 'trunk' of https://github.com/WordPress/wordpress-develo…
westonruter Oct 12, 2025
6333403
Improve docs for wp_finalize_template_enhancement_output_buffer()
westonruter Oct 15, 2025
611a482
Merge branch 'trunk' into trac-43258
westonruter Oct 15, 2025
f49563c
Remove whitespace at end of line
westonruter Oct 15, 2025
1 change: 1 addition & 0 deletions src/wp-includes/default-filters.php
@@ -422,6 +422,7 @@
add_action( 'do_all_pings', 'generic_ping', 10, 0 );
add_action( 'do_robots', 'do_robots' );
add_action( 'do_favicon', 'do_favicon' );
add_action( 'wp_before_include_template', 'wp_start_template_enhancement_output_buffer', 10, 1 );
add_action( 'set_comment_cookies', 'wp_set_comment_cookies', 10, 3 );
add_action( 'sanitize_comment_cookies', 'sanitize_comment_cookies' );
add_action( 'init', 'smilies_init', 5 );
9 changes: 9 additions & 0 deletions src/wp-includes/template-loader.php
@@ -113,6 +113,15 @@
	 */
	$template = apply_filters( 'template_include', $template );
	if ( $template ) {
		/**
		 * Fires immediately before including the template.
		 *
		 * @since 6.9.0
		 *
		 * @param string $template The path of the template about to be included.
		 */
		do_action( 'wp_before_include_template', $template );

		include $template;
	} elseif ( current_user_can( 'switch_themes' ) ) {
		$theme = wp_get_theme();
139 changes: 139 additions & 0 deletions src/wp-includes/template.php
@@ -823,3 +823,142 @@ function load_template( $_template_file, $load_once = true, $args = array() ) {
	 */
	do_action( 'wp_after_load_template', $_template_file, $load_once, $args );
}

/**
 * Checks whether the template should be output buffered for enhancement.
 *
 * By default, an output buffer is only started if a {@see 'wp_template_enhancement_output_buffer'} filter has been
 * added by the time a template is included at the {@see 'wp_before_include_template'} action. This allows template
 * responses to be streamed as much as possible when no template enhancements are registered to apply.
 *
 * @since 6.9.0
 *
 * @return bool Whether the template should be output-buffered for enhancement.
 */
function wp_should_output_buffer_template_for_enhancement(): bool {
	/**
	 * Filters whether the template should be output-buffered for enhancement.
	 *
	 * By default, an output buffer is only started if a {@see 'wp_template_enhancement_output_buffer'} filter has been
	 * added. For this default to apply, a filter must be added by the time the template is included at the
	 * {@see 'wp_before_include_template'} action. This allows template responses to be streamed as much as possible
	 * when no template enhancements are registered to apply. This filter allows a site to opt in to adding such
	 * template enhancement filters during the rendering of the template.
	 *
	 * @since 6.9.0
	 *
	 * @param bool $use_output_buffer Whether an output buffer is started.
	 */
	return (bool) apply_filters( 'wp_should_output_buffer_template_for_enhancement', has_filter( 'wp_template_enhancement_output_buffer' ) );
}

/**
 * Starts the template enhancement output buffer.
 *
 * This function is called immediately before the template is included.
 *
 * @since 6.9.0
 *
 * @return bool Whether the output buffer successfully started.
 */
function wp_start_template_enhancement_output_buffer(): bool {
	if ( ! wp_should_output_buffer_template_for_enhancement() ) {
		return false;
	}

	$started = ob_start(
		'wp_finalize_template_enhancement_output_buffer',
		0, // Unlimited buffer size so that entire output is passed to the filter.
		/*
		 * Instead of the default PHP_OUTPUT_HANDLER_STDFLAGS (cleanable, flushable, and removable) being used for
		 * flags, the PHP_OUTPUT_HANDLER_FLUSHABLE flag must be omitted. If the buffer were flushable, then each time
		 * that ob_flush() is called, a fragment of the output would be sent into the output buffer callback. This
		 * output buffer is intended to capture the entire response for processing, as indicated by the chunk size of 0.
		 * So the buffer does not allow flushing to ensure the entire buffer can be processed, such as for optimizing an
		 * entire HTML document, where markup in the HEAD may need to be adjusted based on markup that appears late in
		 * the BODY.
		 *
		 * If this ends up being problematic, then PHP_OUTPUT_HANDLER_FLUSHABLE could be added to the $flags and the
		 * output buffer callback could check if the phase is PHP_OUTPUT_HANDLER_FLUSH and abort any subsequent
		 * processing while also emitting a _doing_it_wrong().
		 *
		 * The output buffer needs to be removable because WordPress calls wp_ob_end_flush_all() and then calls
		 * wp_cache_close(). If the buffers are not all flushed before wp_cache_close() is called, then some output buffer
		 * handlers (e.g. for caching plugins) may fail to be able to store the page output in the object cache.
		 * See <https://github.com/WordPress/performance/pull/1317#issuecomment-2271955356>.
		 */
		PHP_OUTPUT_HANDLER_STDFLAGS ^ PHP_OUTPUT_HANDLER_FLUSHABLE
	);

	if ( $started ) {
		/**
		 * Fires when the template enhancement output buffer has started.
		 *
		 * @since 6.9.0
		 */
		do_action( 'wp_template_enhancement_output_buffer_started' );
	}

	return $started;
}
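
Editorial note: the flag choice explained in the comment above can be tried outside WordPress. The following is a minimal standalone sketch, not code from this PR; the closure and the variable names `$calls`, `$flushed`, and `$result` are illustrative assumptions. It starts a buffer that is cleanable and removable but not flushable, attempts an `ob_flush()` midway, and then ends the buffer so that the handler receives the whole output in a single call.

```php
<?php
// Records every invocation of the output handler (illustrative bookkeeping only).
$calls = array();

// Hypothetical handler standing in for wp_finalize_template_enhancement_output_buffer().
$handler = static function ( string $buffer, int $phase ) use ( &$calls ): string {
	$calls[] = array(
		'length' => strlen( $buffer ),
		'phase'  => $phase,
	);

	// Mirror the PR: pass the buffer through untouched when it is being cleaned.
	if ( ( $phase & PHP_OUTPUT_HANDLER_CLEAN ) !== 0 ) {
		return $buffer;
	}

	// Stand-in "enhancement" so the effect of the handler is visible.
	return strtoupper( $buffer );
};

ob_start(); // Outer buffer, used here only to capture the result for inspection.
ob_start( $handler, 0, PHP_OUTPUT_HANDLER_STDFLAGS ^ PHP_OUTPUT_HANDLER_FLUSHABLE );

echo '<head></head>';
$flushed = @ob_flush(); // Not flushable: PHP raises a notice (suppressed here) and returns false.
echo '<body></body>';

ob_end_flush();           // Allowed because the buffer is removable; the handler runs once.
$result = ob_get_clean(); // What the handler returned, as captured by the outer buffer.
```

In this sketch the failed `ob_flush()` never reaches the handler, so the handler is expected to see the head and body markup together rather than in fragments.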

/**
 * Finalizes the template enhancement output buffer.
 *
 * @since 6.9.0
 *
 * @see wp_start_template_enhancement_output_buffer()
 *
 * @param string $output Output buffer.
 * @param int    $phase  Phase.
 * @return string Finalized output buffer.
 */
function wp_finalize_template_enhancement_output_buffer( string $output, int $phase ): string {
	// When the output is being cleaned (e.g. pending template is replaced with error page), do not send it through the filter.
	if ( ( $phase & PHP_OUTPUT_HANDLER_CLEAN ) !== 0 ) {
		return $output;
	}

	// Detect if the response is an HTML content type.
	$is_html_content_type = null;
	$html_content_types   = array( 'text/html', 'application/xhtml+xml' );
	foreach ( headers_list() as $header ) {
		$header_parts = preg_split( '/\s*[:;]\s*/', strtolower( $header ) );
		if (
			is_array( $header_parts ) &&
			count( $header_parts ) >= 2 &&
			'content-type' === $header_parts[0]
		) {
			$is_html_content_type = in_array( $header_parts[1], $html_content_types, true );
			break; // PHP only sends the first Content-Type header in the list.
		}
Member

There’s no break from the loop here, which means we are going to scan every header even after we have our answer. According to my understanding, any request with more than one Content Type should be rejected. Now we’re in a pickle here because WordPress or plugin code is the one creating multiple content types. In such a case, I think that the appropriate course of action is to avoid guessing: perhaps not apply the filtering?

in any case, I think we could make our check more explicit. we know we’re looking for the content type header, so we don’t need to split it apart.

$is_html = false;
foreach ( headers_list() as $header ) {
	if ( 1 === preg_match( '~^content-type:(?:(?:\r\n)?[ \t]+)(?:text/html|application/xhtml\+xml)(?:$|;)~i', $header ) ) {
		$is_html = true;
		break;
	}
}
if ( ! $is_html && in_array( ini_get( 'default_mimetype' ), array( 'text/html', 'application/xhtml+xml' ), true ) ) {
	$is_html = true;
}

I guess that code doesn’t check for duplicate headers, and could probably be broken into some helper functions to separate the act of parsing headers from the act of checking for HTML mime types.

just think it would be worth being strict on the HTTP header parsing since PHP isn’t and since that’s a common opportunity for mistakes.
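
Editorial note: the split into helper functions suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code that was merged or the follow-up in #10293. The function names `example_parse_content_type_header()` and `example_is_html_response()` are hypothetical, the pattern is a simplified one rather than a full RFC 9110 parser, and returning `false` when conflicting Content-Type headers are present is one possible reading of "avoid guessing".

```php
<?php
/**
 * Hypothetical helper: returns the lowercased media type if the raw header
 * line is a Content-Type header, or null otherwise.
 */
function example_parse_content_type_header( string $header ): ?string {
	// The field name must be followed directly by a colon; parameters after ";" are ignored.
	$pattern = '~^content-type:[ \t]*([^;,\s]+/[^;,\s]+)[ \t]*(?:;|$)~i';
	if ( 1 !== preg_match( $pattern, $header, $matches ) ) {
		return null;
	}
	return strtolower( $matches[1] );
}

/**
 * Hypothetical helper: decides whether a list of raw header lines describes an HTML response.
 *
 * @param string[] $headers          Raw header lines, e.g. from headers_list().
 * @param string   $default_mimetype Fallback, e.g. ini_get( 'default_mimetype' ).
 */
function example_is_html_response( array $headers, string $default_mimetype ): bool {
	$html_types = array( 'text/html', 'application/xhtml+xml' );

	$found = array();
	foreach ( $headers as $header ) {
		$type = example_parse_content_type_header( $header );
		if ( null !== $type ) {
			$found[] = $type;
		}
	}

	// No Content-Type header at all: fall back to the configured default.
	if ( 0 === count( $found ) ) {
		return in_array( strtolower( $default_mimetype ), $html_types, true );
	}

	// Conflicting Content-Type headers: do not guess (assumed policy).
	if ( count( array_unique( $found ) ) > 1 ) {
		return false;
	}

	return in_array( $found[0], $html_types, true );
}
```

Keeping the parsing separate from the policy means the stricter header check and the handling of duplicates can each be tested on plain arrays of strings without sending any headers.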

Member Author

I had thought that the last Content-Type header sent by PHP would be the one that gets sent over HTTP, but now this doesn't seem to be the case. I just tried with this plugin code:

add_action( 'template_redirect', static function () {
	header( 'Content-Type: text/plain', false );
	header( 'Content-Type: text/html', false );
	header( 'Content-Type: application/json', false );
	header( 'Content-Type: text/plain', false );

	var_dump( headers_list() );
	exit;
} );

The result is:

HTTP/1.1 200 OK
Server: nginx/1.29.1
Date: Sat, 11 Oct 2025 17:57:58 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/8.2.29

array(6) {
  [0]=>
  string(24) "X-Powered-By: PHP/8.2.29"
  [1]=>
  string(38) "Content-Type: text/html; charset=UTF-8"
  [2]=>
  string(38) "Content-type: text/plain;charset=UTF-8"
  [3]=>
  string(37) "Content-type: text/html;charset=UTF-8"
  [4]=>
  string(30) "Content-Type: application/json"
  [5]=>
  string(38) "Content-type: text/plain;charset=UTF-8"
}

If I use $replace=true on the above application/json call, then it changes to:

HTTP/1.1 200 OK
Server: nginx/1.29.1
Date: Sat, 11 Oct 2025 17:58:11 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/8.2.29

array(3) {
  [0]=>
  string(24) "X-Powered-By: PHP/8.2.29"
  [1]=>
  string(30) "Content-Type: application/json"
  [2]=>
  string(38) "Content-type: text/plain;charset=UTF-8"
}

So it seems it is the first Content-Type header appearing in the headers_list() array which is getting sent. This is somewhat confusing to me because if I use another header like X-Content-Type then every one is sent.

See also the RFC 9110 section on Content-Type:

Although Content-Type is defined as a singleton field, it is sometimes incorrectly generated multiple times, resulting in a combined field value that appears to be a list. Recipients often attempt to handle this error by using the last syntactically valid member of the list, leading to potential interoperability and security issues if different implementations have different error handling behaviors.

It is defined as a singleton, so I suppose this is why PHP serves only one of the headers, but PHP's behavior seems to differ from what is described above: it serves the first valid Content-Type instead of the last. As I understand it, sending duplicate headers is syntactically the same as sending one header with multiple comma-separated values.

Member Author

How about 6ee2a45?

Member

meh. it’s actually worse. PHP will send up to two Content-Type headers, because why wouldn’t it? despite it long being a singleton header. it seems to create two buckets, one for the $replace = true case and one for the $replace = false case. it will send the last version of the Content-Type header for each bucket.

header( 'Content-Type: text/plain' );
header( 'Content-Type: text/html' );
header( 'Content-Type: text/xml' );
header( 'Content-Type: application/xml', true );
header( 'Content-Type: application/octect-stream', false );
header( 'Content-Type: application/xml+svg', true );
header( 'Content-Type: application/json', false );

This sends the following response.

HTTP/1.1 200 OK
Host: localhost:8000
Date: Wed, 15 Oct 2025 17:20:32 GMT
Connection: close
X-Powered-By: PHP/8.4.13
Content-Type: application/xml+svg
Content-Type: application/json

> How about 6ee2a45?

I think it’s still going to work well enough for most happy-path cases, but it could help to step back and think about this from the HTTP perspective.

  • I don’t believe that the assertion in the code about the first header is accurate, as demonstrated in my included snippet.
  • The code is still vulnerable to spoofing based on non-spec header parsing. It’s not that much additional work to parse the headers in a spec-compliant way, so I think that would be a valuable addition here. (It will parse Content-Type; text/plain as a real Content-Type header, which it isn’t).

We can probably debate how important it is to safeguard this code, but I wouldn’t want to rule out subtle ways that code could force-enable or force-disable the output filtering.

what we have here is something I see a lot, which is a kind of elaboration on code to try and cover more edge cases, but the approach isn’t stemming from the language of HTTP. we could start from that language and it would involve a comparable amount of code. I believe that the PCRE pattern I provided is accurate (though maybe it’s wrong). Alternatively, a simple str_starts_with( strtolower( $header ), 'content-type:' ) would expressively cover affirmative matches, missing some valid ones, but is also not prone to the false-positives from the [:;] group.
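
Editorial note: the false positive described above can be reproduced in isolation. The sketch below puts the header-name test from the diff (splitting on `[:;]`) next to the `str_starts_with()` alternative; the two function names are hypothetical and exist only for this comparison, and `str_starts_with()` assumes PHP 8.0 or later (or the WordPress polyfill).

```php
<?php
// Header-name test as written in the diff: splits on ":" or ";" and compares the first part.
function example_split_based_match( string $header ): bool {
	$parts = preg_split( '/\s*[:;]\s*/', strtolower( $header ) );
	return is_array( $parts ) && count( $parts ) >= 2 && 'content-type' === $parts[0];
}

// Alternative from the comment above: requires the literal "content-type:" prefix.
function example_prefix_based_match( string $header ): bool {
	return str_starts_with( strtolower( $header ), 'content-type:' );
}

$genuine   = 'Content-Type: text/html; charset=UTF-8';
$malformed = 'Content-Type; text/plain'; // Not a Content-Type header: no colon after the name.

$results = array(
	'split_genuine'    => example_split_based_match( $genuine ),
	'split_malformed'  => example_split_based_match( $malformed ),
	'prefix_genuine'   => example_prefix_based_match( $genuine ),
	'prefix_malformed' => example_prefix_based_match( $malformed ),
);
```

Both checks accept the genuine header, but only the split-based one also accepts the malformed line, because the `;` is treated as if it were the name/value separator.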

Member Author

@dmsnell OK, I don't actually see two Content-Type response headers being sent, at least in PHP 8.2 in the wordpress-develop env.

This WP PHP code:

add_action( 'template_redirect', static function () {
	header( 'Content-Type: text/plain' );
	header( 'Content-Type: text/html' );
	header( 'Content-Type: text/xml' );
	header( 'Content-Type: application/xml', true );
	header( 'Content-Type: application/octect-stream', false );
	header( 'Content-Type: application/xml+svg', true );
	header( 'Content-Type: application/json', false );
	var_dump( headers_list() );
	exit;
} );

Results in:

HTTP/1.1 200 OK
Server: nginx/1.29.1
Date: Wed, 15 Oct 2025 22:50:38 GMT
Content-Type: application/xml+svg
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/8.2.29

array(3) {
  [0]=>
  string(24) "X-Powered-By: PHP/8.2.29"
  [1]=>
  string(33) "Content-Type: application/xml+svg"
  [2]=>
  string(30) "Content-Type: application/json"
}

Maybe Nginx is stripping out the second response header?

Member Author

I've modified the parsing of the headers in a follow-up PR, although I didn't go with the preg_match() since I wanted to have one list of content types in an array to use when looking at the default MIME type: #10293

Member

could be. I added your code above but put it inside the init hook and used error_log() instead of exit.

dmsnell@Maximo ~/c/wordpress-develop (scrapyard)> curl http://localhost:3880 -D -
HTTP/1.1 200 OK
Server: nginx/1.29.1
Date: Wed, 15 Oct 2025 23:51:31 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/8.4.13
Link: <http://localhost:3880/wp-json/>; rel="https://api.w.org/"
Server-Timing: wp-before-template;dur=70.72

array(3) {
  [0]=>
  string(24) "X-Powered-By: PHP/8.4.13"
  [1]=>
  string(33) "Content-Type: application/xml+svg"
  [2]=>
  string(30) "Content-Type: application/json"
}
bool(true)
<!DOCTYPE html>
<html lang="en-US">
...

the unifying factor here seems to be that this is inconsistent and I think that should encourage us to consider the potential for unexpected or messed-up headers.

it seems very likely that nginx will refuse to send more than one header, but even when doing so, we need to understand which one it picks and also acknowledge that headers_list() may not match what’s actually sent.

	}
	if ( null === $is_html_content_type ) {
		$is_html_content_type = in_array( ini_get( 'default_mimetype' ), $html_content_types, true );
	}

	// If the content type is not HTML, short-circuit since it is not relevant for enhancement.
	if ( ! $is_html_content_type ) {
		return $output;
	}

	$filtered_output = $output;
Member

what’s the point of creating this copy of the variable? is it for documenting the apply_filters() call?

Member Author

Yes, but it was also because there were multiple filters applying so it needed to have the $output be kept as-is. We could do this now:

diff --git a/src/wp-includes/template.php b/src/wp-includes/template.php
index 35a171e2a8..452edb1f5c 100644
--- a/src/wp-includes/template.php
+++ b/src/wp-includes/template.php
@@ -913,8 +913,6 @@ function wp_finalize_template_optimization_output_buffer( string $output, int $p
 		return $output;
 	}
 
-	$filtered_output = $output;
-
 	/**
 	 * Filters the template optimization output buffer prior to sending to the client.
 	 *
@@ -931,5 +929,5 @@ function wp_finalize_template_optimization_output_buffer( string $output, int $p
 	 * @param string $output          Original HTML template output buffer.
 	 * @return string HTML template optimization output buffer.
 	 */
-	return (string) apply_filters( 'wp_template_optimization_output_buffer', $filtered_output, $output );
+	return (string) apply_filters( 'wp_template_optimization_output_buffer', $output, $output );
 }

But having the $output duplicated in the apply_filters() params seems strange to look at.

Member

not a quality issue of course. I figured it was this reason but never saw the multiple filters.

> having the $output duplicated in the apply_filters() params seems strange to look at.

it jumps out less to me than creating a variable copy¹ and then doing nothing with it but pass it alongside the parent. still, it was clear that the purpose was to communicate something through its name. I have seen the duplicated variable and Core has one with $email.

either way, it’s fine. it just stood out to me and I wanted to ask if it was an oversight.

¹ using simplified language because I don’t know how to briefly say “have PHP create a new binding to the variable”

Member Author

I've removed the copy of the variable in #10293


	/**
	 * Filters the template enhancement output buffer prior to sending to the client.
	 *
	 * This filter only applies to the HTML output of an included template. This filter is a progressive enhancement
	 * intended for applications such as optimizing markup to improve frontend page load performance. Sites must not
	 * depend on this filter applying since they may opt to stream the responses instead. Callbacks for this filter are
	 * highly discouraged from using regular expressions to do any kind of replacement on the output. Use the HTML API
	 * (either `WP_HTML_Tag_Processor` or `WP_HTML_Processor`), or else use {@see DOM\HtmlDocument} as of PHP 8.4 which
	 * fully supports HTML5.
	 *
	 * @since 6.9.0
	 *
	 * @param string $filtered_output HTML template enhancement output buffer.
	 * @param string $output          Original HTML template output buffer.
	 * @return string HTML template enhancement output buffer.
	 */
	return (string) apply_filters( 'wp_template_enhancement_output_buffer', $filtered_output, $output );
}
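
Editorial note: a callback for the filter documented above might look like the following. This is a hedged sketch rather than code from the PR: the function name `example_enhance_img_decoding()` and the choice of adding `decoding="async"` to images are illustrative assumptions. It uses `WP_HTML_Tag_Processor` as the docblock recommends, and it returns the markup unchanged when that class is unavailable so the file can also be loaded outside WordPress.

```php
<?php
/**
 * Hypothetical enhancement callback: adds decoding="async" to IMG tags that lack the attribute.
 *
 * @param string $filtered_output HTML as modified by earlier callbacks.
 * @param string $output          Original HTML (unused in this sketch).
 * @return string Possibly modified HTML.
 */
function example_enhance_img_decoding( string $filtered_output, string $output ): string {
	// Outside WordPress, or if the HTML API is not loaded, leave the markup untouched.
	if ( ! class_exists( 'WP_HTML_Tag_Processor' ) ) {
		return $filtered_output;
	}

	$processor = new WP_HTML_Tag_Processor( $filtered_output );
	while ( $processor->next_tag( array( 'tag_name' => 'IMG' ) ) ) {
		if ( null === $processor->get_attribute( 'decoding' ) ) {
			$processor->set_attribute( 'decoding', 'async' );
		}
	}

	return $processor->get_updated_html();
}

// Register only when WordPress is loaded. By default the filter must be added before the
// 'wp_before_include_template' action fires, otherwise no output buffer is started.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_template_enhancement_output_buffer', 'example_enhance_img_decoding', 10, 2 );
}
```

Since the filter is described as a progressive enhancement, a callback like this should only make changes the page can do without when the response is streamed instead.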