@westonruter
Member

@westonruter westonruter commented Feb 26, 2025

This introduces output buffering of the rendered template for applying enhancements. Authors must not depend on the output buffer for essential page content. It is intended for applications such as optimization, analytics, and server timing.

A new wp_before_include_template action (cf. the wp_before_load_template action) is fired after the template_include filter applies, and right before the template file is included. This new action provides a designated hook at which to start output buffering, without hacking the template_include filter to do so, and without the risk of buffering non-template output, as could happen if template_redirect were used.

At the new wp_before_include_template action, the function wp_start_template_enhancement_output_buffer() runs and starts the output buffer. This output buffer is configured to have an unlimited chunk size and is not flushable, so that the wp_finalize_template_enhancement_output_buffer() output buffer callback receives the entire rendered template exactly once, when it has finished rendering. This means that calls to ob_flush() will be ignored during template rendering (and they'll actually result in a PHP warning). However, if ob_end_clean() is called while a template is being rendered, the template output buffer is cancelled without any hooks applying in the callback. Calling ob_clean(), on the other hand, discards whatever part of the template has been rendered so far, allowing the output to start over and still be passed through the filters in the output buffer callback.
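
The buffer configuration described above can be sketched in plain PHP, outside of WordPress. This is only an illustration: the names demo_finalize_buffer() and demo_render() are hypothetical, and the uppercase transformation is a placeholder for applying filters. Only ob_start(), its flags, and the other ob_*() functions are real PHP APIs here.

```php
<?php
/**
 * Hypothetical stand-in for the output buffer callback.
 * With a chunk size of 0 it receives the complete buffer in one call.
 */
function demo_finalize_buffer( string $output, int $phase ): string {
	// On ob_clean(), PHP invokes the callback with the CLEAN flag and
	// discards the return value, so skip any processing in that case.
	if ( ( $phase & PHP_OUTPUT_HANDLER_CLEAN ) !== 0 ) {
		return $output;
	}
	return strtoupper( $output ); // Placeholder for applying filters.
}

function demo_render(): array {
	ob_start(); // Outer buffer, present only so the demo can capture the result.
	ob_start(
		'demo_finalize_buffer',
		0, // Chunk size 0: the buffer is never flushed automatically.
		PHP_OUTPUT_HANDLER_STDFLAGS ^ PHP_OUTPUT_HANDLER_FLUSHABLE // Cleanable and removable, but not flushable.
	);

	echo 'discarded draft';
	$flushed = @ob_flush(); // Returns false: the buffer is not flushable (diagnostic suppressed with @).
	ob_clean();             // Allowed: discards what has been rendered so far.
	echo 'final template';
	ob_end_flush();         // Runs the callback with the complete remaining output.

	return array( $flushed, ob_get_clean() );
}
```

Calling demo_render() returns the result of the attempted flush together with the processed output, which makes it easy to see that the draft text never reaches the callback's final invocation.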

If the output buffer is not cleaned, then the content type of the response is sniffed by looking at the headers_list() and the default_mimetype. If the response is not HTML, then the output buffer processing short-circuits and the buffer is sent as-is as the response. If the response is HTML, then the output buffer is passed through the wp_template_enhancement_output_buffer filter, with the first argument being the filtered output buffer and the second argument being the original output buffer.
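
As a rough sketch of the content type check described above (not the code in this PR), the decision could look something like the following. The helper name demo_is_html_response() is hypothetical, and treating application/xhtml+xml as HTML is an assumption of this sketch.

```php
<?php
/**
 * Hypothetical helper: decides whether a response looks like HTML based on
 * its headers, falling back to a default MIME type when no Content-Type
 * header was sent.
 *
 * @param string[] $headers          Header lines, e.g. from headers_list().
 * @param string   $default_mimetype Fallback, e.g. ini_get( 'default_mimetype' ).
 */
function demo_is_html_response( array $headers, string $default_mimetype ): bool {
	$content_type = $default_mimetype;
	foreach ( $headers as $header ) {
		$parts = explode( ':', $header, 2 );
		if ( 2 === count( $parts ) && 'content-type' === strtolower( trim( $parts[0] ) ) ) {
			// Drop parameters such as "; charset=UTF-8". The last header wins.
			$content_type = trim( explode( ';', $parts[1], 2 )[0] );
		}
	}

	// Assumption: XHTML is treated the same as HTML.
	return in_array( strtolower( $content_type ), array( 'text/html', 'application/xhtml+xml' ), true );
}
```

Inside a request this would be called as demo_is_html_response( headers_list(), (string) ini_get( 'default_mimetype' ) ).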

After the wp_template_output_buffer_html filter has applied (or if it is not applied since the response does not appear to be HTML), then the wp_template_output_buffer filter applies. This allows for filtering non-HTML templates, for example JSON or some other format.

Finally, after all filters have applied, the final filtered value and the original output buffer are passed into a wp_final_template_output_buffer action. This gives caching plugins the opportunity to store the resulting output.

This implementation is based on the same output buffering logic that was developed for Optimization Detective and Gutenberg's Full Page Client-Side Navigation Experiment.

A plugin can prevent the output buffer from being started by removing this action:

remove_action( 'wp_before_include_template', 'wp_start_template_enhancement_output_buffer' );

But the better way is to use the wp_should_output_buffer_template_for_enhancement filter. The filtered value can be queried via the wp_should_output_buffer_template_for_enhancement() function. The default value for this filter is true if there are any wp_template_enhancement_output_buffer filters added. If you want to prevent the template output from being buffered, you can do so via:

add_filter( 'wp_should_output_buffer_template_for_enhancement', '__return_false' );

Alternatively, if you want to add an output buffer but don't know for sure whether you'll be adding a wp_template_enhancement_output_buffer filter, or you'll be adding the filter after the wp_before_include_template action has fired, you can force the output buffer to start via:

add_filter( 'wp_should_output_buffer_template_for_enhancement', '__return_true' );

A new wp_template_enhancement_output_buffer_started action fires right after the output buffer has started, so that you can know for sure at that point that the wp_template_enhancement_output_buffer filters will apply.
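
For illustration, a plugin hooking into the filter might look roughly like this. The callback name demo_append_status_comment() and the comment it appends are hypothetical; the two-argument signature follows the description above (filtered output first, original output second).

```php
<?php
/**
 * Hypothetical filter callback: appends a status comment to the document.
 *
 * @param string $filtered_output Output as filtered so far.
 * @param string $original_output Output as originally rendered.
 */
function demo_append_status_comment( string $filtered_output, string $original_output ): string {
	$comment = sprintf( '<!-- demo: %d bytes rendered -->', strlen( $original_output ) );
	return $filtered_output . "\n" . $comment;
}

// Registration is guarded so that this file also loads outside of WordPress.
if ( function_exists( 'add_filter' ) ) {
	// Adding the filter before wp_before_include_template fires is enough
	// for the output buffer to be started by default.
	add_filter( 'wp_template_enhancement_output_buffer', 'demo_append_status_comment', 10, 2 );
}
```

Since a site may opt out of output buffering, a callback like this should only add something non-essential, as in this example.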

Example Integrations

Examples for how this can be used:

  • Always Load Block Styles on Demand: In classic themes a lot more CSS is added to a page than is needed because the HEAD is rendered before the rest of the page, so it is not yet known what blocks will be used. This can be fixed with output buffering.
  • Always Print Script Modules in Head: In classic themes script modules are forced to print in the footer since the HEAD is rendered before the rest of the page, so it is not yet known what script modules will be enqueued. This can be fixed with output buffering. (Actually, per Core-63486, this is not always desirable.)
  • Gutenberg's Full Page Client-Side Navigation Experiment: No longer would it need to start its own output buffer, but it could just reuse the wp_template_output_buffer filter.
  • Optimization Detective: The plugin would also be able to eliminate its output buffering, in favor of just reusing the wp_template_output_buffer filter.
  • Caching plugins would also not need to output buffer the response, but they could reuse the filter to capture the output for storing in a persistent object cache while also appending some status HTML comment.
  • Other optimization plugins (e.g. WP Rocket, AMP, etc) would similarly not need to do their own output buffering.
  • Adding fetchpriority=high to an IMG can do so for the largest image, not just the first sufficiently-large image. (This is particularly relevant for classic themes.)

Trac ticket: https://core.trac.wordpress.org/ticket/43258

Stale Gemini Analysis

Summary

This branch introduces a new output buffering mechanism for templates.

Here's a summary of the key changes:

  • A new wp_before_include_template action is fired before a template is included.
  • A new wp_start_template_output_buffer() function is hooked to this action to start an output buffer.
  • The wp_finalize_template_output_buffer() function processes the buffered output.
  • New wp_template_output_buffer_html and wp_template_output_buffer filters allow modification of the template output for HTML and all content types, respectively.
  • A wp_final_template_output_buffer action is fired after all filtering is complete.
  • New PHPUnit tests have been added to verify the functionality for both HTML and JSON responses.

In essence, these changes provide a robust way to intercept and modify the complete output of a template before it is sent to the client.

Review

This is a high-quality contribution that is well-documented, thoroughly tested, and adheres to WordPress coding standards. The new output buffering mechanism is a valuable addition, providing a robust method for developers to modify template output.

Here are a few specific points from the review:

src/wp-includes/template.php

  • The use of PHP_OUTPUT_HANDLER_STDFLAGS ^ PHP_OUTPUT_HANDLER_FLUSHABLE in wp_start_template_output_buffer() is a clever approach to prevent the buffer from being flushable, and the accompanying comment is very helpful.
  • In wp_finalize_template_output_buffer(), the check for PHP_OUTPUT_HANDLER_CLEAN is a good safeguard.
  • The logic to detect the content type is reasonable, and the subsequent use of WP_HTML_Tag_Processor to confirm the presence of HTML tags is an excellent defensive measure.
  • The filter and action order is logical, with the more specific wp_template_output_buffer_html filter executing before the general wp_template_output_buffer filter, followed by the wp_final_template_output_buffer action.

tests/phpunit/tests/template.php

  • The tests are comprehensive, covering HTML and JSON responses, as well as the case where the buffer is cleaned.
  • The tests correctly assert that the filters and actions are fired as expected and that the output is modified correctly.

Suggestion

In wp_finalize_template_output_buffer(), the content type detection logic is a bit complex due to the use of preg_split on the headers. While this implementation is acceptable within the context of WordPress Core, it could be a point of failure if the headers are malformed. This is a minor point, and the current implementation is functional, but it's something to be aware of for future maintenance.

Overall, this is an excellent piece of work.


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@github-actions

github-actions bot commented Feb 26, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props westonruter, flixos90, dmsnell, jorbin, peterwilsoncc.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Member

@felixarntz felixarntz left a comment

@westonruter If there is consensus on going with a simple filter for the entire HTML string, this would probably be good (other than missing tests, which is why I'm requesting changes). But in any case, I don't think it should be committed yet.

I think the approach requires more careful consideration due to the large impact the usage of such a new extension point can have. See also my latest reply in https://core.trac.wordpress.org/ticket/43258#comment:25.

@westonruter westonruter marked this pull request as draft March 13, 2025 16:41

// If the content type is HTML, require that there be at least one tag.
if ( $is_html_content_type ) {
$is_html_content_type = ( new WP_HTML_Tag_Processor( $output ) )->next_tag();
Member Author

@westonruter westonruter Sep 23, 2025

Maybe this should look specifically for the first tag being <html>.

Member

basically every document will parse as HTML5; this will detect XML as HTML and also any kind of JSON containing HTML or otherwise.

what might help here too is a function whose purpose is to try and detect HTML. we can focus the purpose there on detecting it properly and change that in isolation.


disregarding the fact that Hi is a valid HTML document, we could start guessing by looking for something specifically like this…

function likely_starts_a_complete_html_document( $input ) {
	/*
	 * Arbitrarily-chosen cutoff point after which to stop
	 * looking for hints that this is HTML. Adjust as appropriate
	 * to balance performance impact and reliability of detection.
	 */
	$first_chunk = substr( $input, 0, 8 * 1024 );

	$processor = new WP_HTML_Tag_Processor( $first_chunk );
	// Visit every token (text, comments, DOCTYPE), not only tags.
	while ( $processor->next_token() ) {
		switch ( $processor->get_token_name() ) {
			// The only text nodes before a document should be whitespace.
			case '#text':
				// @todo This function doesn’t exist but it should.
				if ( ! $processor->is_whitespace_only_text_node() ) {
					return false;
				}
				continue 2;

			// A DOCTYPE declaration is the strongest signal that this is HTML.
			case 'html':
				if ( 'html' === ( $processor->get_doctype_info()->name ?? '' ) ) {
					return true;
				}
				continue 2;

			// PI-node lookalikes might be XML declarations like `<?xml version…`.
			case '#comment':
				if ( WP_HTML_Tag_Processor::COMMENT_AS_HTML_COMMENT !== $processor->get_comment_type() ) {
					return false;
				}
				continue 2;

			// Many documents start at `<html>` or `<body>`.
			case 'HTML':
			case 'BODY':
				return true;

			// All other tags, comments, and malformed markup are probably something other than HTML.
			default:
				return false;
		}
	}

	// No positive signal was found within the first chunk.
	return false;
}

Member Author

I think 1024 bytes makes sense given that the META[charset] tag must be located within the first kilobyte of a document, per MDN:

<meta> elements which declare a character encoding must be located entirely within the first 1024 bytes of the document.

The other consideration is what to do if you have WP_DEBUG_DISPLAY enabled and PHP is printing out before the DOCTYPE:

<br />
<b>Warning</b>:  Something bad happened. in <b>/var/www/src/wp-content/plugins/foo/foo.php</b> on line <b>564</b><br />
<!DOCTYPE html>
<html lang="en-US">
<head>
	<meta charset="UTF-8" />
	<meta name="viewport" content="width=device-width, initial-scale=1" />

Should such a response go through normal output buffer processing? Probably. But then the error message and/or stack trace could end up being more than 1024 bytes.

In this case, maybe it's better just to rely on the Content-Type alone and not try to go the extra mile to detect if the response body also looks like HTML. What do you think?

Member

somehow I missed this reply, sorry about the delay!

I think 1024 bytes makes sense given that the META[charset] tag must be located within the first kilobyte of a document, per MDN:

This is definitely true for character encoding, but I was thinking about documents which may not have that META element. Perhaps it’s good enough if we expect this to come from WordPress and WordPress will “always” print one.

The other consideration is what to do if you have WP_DEBUG_DISPLAY enabled and PHP is printing out before the DOCTYPE:

This is a good point. Though, this almost seems to tie-in with some of my questions about the way that the design of this new API influences developers. In a situation like this, if we were thrown off and didn’t catch that this is HTML, none of the filters would run. However, if these are progressive enhancement filters, that wouldn’t be so much of a problem because this page render would already be in an abnormal state.

What I mean is that if something bad really happened, then likely the fact that some SCRIPT optimization doesn’t occur is not really that problematic.


On the other hand, I think another benefit of creating a new named function (to detect if a given string likely starts an HTML document) is that we can address these in a focused and meaningful way. We can file a bug report saying that it misdirected documents because of this case.

While I would caution against checking, “are there any HTML tags discovered within the first 1,024 bytes,” because XML and other non-HTML formats will frequently masquerade as such, we could attempt to detect common HTML patterns, or WordPress responses, and make the detector more robust over time without changing the intent of the function.

Member

out of curiosity, I sampled my set of top domains and found this:

  • 293,965 HTML pages processed
  • 28,984 contained a <meta charset> or <meta http-equiv=content-type>
  • 781 of those did not fall completely within the first 1,024 bytes
  • one was found only within the first 626,211 bytes

there is nothing actionable from this: just sharing

Member Author

@westonruter westonruter Oct 8, 2025

@dmsnell Thank you for your replies. It sounds like it would be simpler and safer then to just check the Content-Type that was sent to see if it is HTML. Doing additional sniffing to see if the body also looks like HTML would seem to be redundant (and potentially unpredictable). And the browser is going to treat it as HTML regardless. In that way, we can just remove this logic:

// If the content type is HTML, require that there be at least one tag.
if ( $is_html_content_type ) {
	$is_html_content_type = ( new WP_HTML_Tag_Processor( $output ) )->next_tag();
}

@westonruter westonruter marked this pull request as ready for review September 26, 2025 05:59
Member

@dmsnell dmsnell left a comment

Really happy to see this coming together @westonruter. I know we’ve shared privately about this and I know this will lead to some marked improvement on processing of outbound HTML.


While I may have left this comment before, I am very hesitant to support a built-in system which forces buffering of the entire output by default. This one decision essentially prevents any kind of streaming output from WordPress which otherwise might reduce latency to the client.

I’d much rather see us push hard on encouraging developers to think about streaming interfaces by default and only reaching for full output buffering when necessary, and in a way that it can be identified (e.g. “Plugin [My Slow-Down Plugin] is delaying the start of rendering your site by 13ms by holding on to the full contents until the page is finished.”).

Note that we have long-established plans to build ->extend( $next_chunk_of_html ) into the Tag Processor and HTML Processor. The ->paused_at_incomplete_token() is there in part to facilitate that feature (as well as a kind of flushing mechanism too).

Now some of this is purely my own bias because I have made a number of systems take advantage of streaming output to minimize latency to the client. The change in this PR would remove all of the optimization those bring and give me no way to bring it back.

A plugin could remove wp_start_template_output_buffer if they wanted to deliver a faster experience in the browser, but if they do so they would also then be cutting off every other plugin which Core has encouraged to hook into the filter.

So this is the big concern I have; not that code will choose to eliminate the ability to stream a response and get it out quicker, but because it ultimately prevents any plugin from streaming.

Where is streaming really useful? when prepping many rows of independent records from the database, for example:

  • When sending bell notifications on WordPress.com
  • When returning a post list where we could be flushing after each iteration of the loop.
  • When returning a post list in the JSON API, where we could flush out each record as we get them.
  • When part of a response depends on network calls which could delay over a second, such as when reporting available updates.

These kinds of use-cases can have dramatic improvements on the end-user experience because delay mid-request can put the response on hold for seconds at a time. Good front-end code, including browsers, will do really well at partially rendering partial responses as they arrive.
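
To make the flush-per-record idea concrete, here is a minimal sketch in plain PHP. The function names are hypothetical and the list-item markup is only an example; whether the chunks actually reach the client early depends on there being no non-flushable output buffer (or buffering proxy) holding the response.

```php
<?php
/**
 * Hypothetical example: yields one chunk of markup per record so that the
 * caller can send each chunk as soon as it is ready.
 *
 * @param iterable $records Records to render, e.g. rows fetched lazily.
 */
function demo_render_records( iterable $records ): Generator {
	foreach ( $records as $record ) {
		yield '<li>' . htmlspecialchars( (string) $record ) . "</li>\n";
	}
}

function demo_stream_records( iterable $records ): void {
	foreach ( demo_render_records( $records ) as $chunk ) {
		echo $chunk;
		// Push the chunk toward the client. This has no effect while a
		// non-flushable output buffer is holding the response.
		if ( ob_get_level() > 0 ) {
			@ob_flush();
		}
		flush();
	}
}
```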

$template = apply_filters( 'template_include', $template );
if ( $template ) {
/**
* Fires before including the template.
Member

would be nice to leave some clue for why someone might like to hook into this. I’m struggling to know what I should do as a plugin author here. what if I want to disable output buffering? what might I want to do besides turning on output buffering?

would there be any value to making this a filter where I could change the template being included?

Member Author

would be nice to leave some clue for why someone might like to hook into this. I’m struggling to know what I should do as a plugin author here. what if I want to disable output buffering? what might I want to do besides turning on output buffering?

Yeah, I don't really imagine what specific things a developer would want to do here at this action, other than it provides a way for them to disable output buffering by having done:

remove_action( 'wp_before_include_template', 'wp_start_template_output_buffer' );

In this way, a plugin could replace the core output buffering function with their own implementation.

add_action( 'wp_before_include_template', 'my_custom_plugin_start_template_output_buffer' );

would there be any value to making this a filter where I could change the template being included?

There is already a filter for this, yeah? The template_include filter which is applied immediately before.

Member Author

Oh, I just remembered. Something which would be of value to use this action for is capturing the start time for a template being rendered, so you can capture the time it takes to render the template.

The Performance Lab plugin hacks the template_include filter to send a Server-Timing header at the last possible moment before a template is included, but if this action were available then it would be used instead.
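
A rough sketch of how the timing use case could look with the new action, assuming the output buffer is active so that headers have not yet been sent when the filter runs. The helper demo_server_timing_value(), the metric name, and the global variable are hypothetical and are not how Performance Lab implements it.

```php
<?php
/**
 * Hypothetical helper: formats a Server-Timing header value in milliseconds.
 */
function demo_server_timing_value( string $metric, float $start, float $end ): string {
	// %F is locale-independent, so the decimal separator is always a dot.
	return sprintf( '%s;dur=%.2F', $metric, ( $end - $start ) * 1000 );
}

// Registration is guarded so that this file also loads outside of WordPress.
if ( function_exists( 'add_action' ) && function_exists( 'add_filter' ) ) {
	add_action(
		'wp_before_include_template',
		static function () {
			// Record when template rendering begins.
			$GLOBALS['demo_template_start'] = microtime( true );
		}
	);
	add_filter(
		'wp_template_enhancement_output_buffer',
		static function ( $output ) {
			// A header can only be added if nothing has been sent yet.
			if ( isset( $GLOBALS['demo_template_start'] ) && ! headers_sent() ) {
				$value = demo_server_timing_value( 'template', $GLOBALS['demo_template_start'], microtime( true ) );
				header( 'Server-Timing: ' . $value, false );
			}
			return $output;
		}
	);
}
```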



@westonruter
Member Author

@dmsnell:

While I may have left this comment before, I am very hesitant to support a built-in system which forces buffering of the entire output by default. This one decision essentially prevents any kind of streaming output from WordPress which otherwise might reduce latency to the client. […] So this is the big concern I have; not that code will choose to eliminate the ability to stream a response and get it out quicker, but because it ultimately prevents any plugin from streaming.

Thanks for the feedback and for raising this concern, which we did discuss a bit before. While the lack of streaming was indeed a potential drawback to output buffering in the past with classic themes, the reality is that now with block themes that ship has largely sailed. This is because all the blocks have to be rendered and then wp_head() runs. See template-canvas.php. This is great for performance because it allows for scripts and styles to be enqueued which are actually used on the page, but it means the response cannot be streamed. So block-based theme templates are essentially using output buffering already, except without using ob_start(). So I do not see that adding document-level output buffering will introduce any significant latency in practice, not only due to how block themes work, but also due to page caching layers and/or other optimization plugins which are already doing output buffering (each in their own ad hoc way without any standardization).

@dmsnell
Member

dmsnell commented Sep 28, 2025

Thanks for the thoughtful response @westonruter and the link.

the reality is that now with block themes that ship has largely sailed. This is because all the blocks have to be rendered and then wp_head() runs. See template-canvas.php.

another way to look at this is that we introduced a regression there too, and I think we can look at that system for further optimization ideas. in a similar way that a browser starts with a full parser and a speculative parser, I bet WordPress could accomplish a lot of what it needs for enqueuing styles and scripts through a fast speculative parse, ship the HEAD, and then render the blocks.

this is something that the work in #9105 makes easier than ever, where we can quickly and efficiently process the block structure in a post before doing any real processing. that would, for instance, let us see every block type in use and check for things like block supports or even for the presence of CSS classes on a block’s “wrapping element.”


with the content type check I guess we are certain this won’t run on “REST” API calls? or RSS feeds, or XML-RPC calls?

Contributor

@peterwilsoncc peterwilsoncc left a comment

I've added a few minor notes inline.

Testing batcache, I can see that it does cache unauthenticated REST API requests. I think it would be good to account for that to allow it to eventually migrate (I was testing with the Human Made variation).

If implementing for the REST API is overly complex for this PR, perhaps you could:

  • rename the functions & hooks to be generic (ie, remove the template references)
  • adding a context parameter to the various hooks with the response type: html, json, etc

westonruter and others added 2 commits October 12, 2025 15:57
Co-authored-by: Peter Wilson <519727+peterwilsoncc@users.noreply.github.com>
Co-authored-by: Peter Wilson <519727+peterwilsoncc@users.noreply.github.com>
@westonruter
Member Author

Testing batcache, I can see that it does cache unauthenticated REST API requests. I think it would be good to account for that to allow it to eventually migrate (I was testing with the Human Made variation).

If implementing for the REST API is overly complex for this PR, perhaps you could:

  • rename the functions & hooks to be generic (ie, remove the template references)
  • adding a context parameter to the various hooks with the response type: html, json, etc

@peterwilsoncc The description was out of date from the original purpose, which was to allow for this output buffer to be of use for page caches. Since then, the focus has sharpened to be specifically for enhancing HTML template responses. So it should not run for the REST API and it should not run for feeds or anything else that isn't HTML template responses that get loaded via the template_include filter. I've updated the description to be up-to-date.

westonruter and others added 2 commits October 12, 2025 16:29
Co-authored-by: Peter Wilson <519727+peterwilsoncc@users.noreply.github.com>
@github-actions

A commit was made that fixes the Trac ticket referenced in the description of this pull request.

SVN changeset: 60930
GitHub commit: 9d03e8e

This PR will be closed, but please confirm the accuracy of this and reopen if there is more work to be done.

@github-actions github-actions bot closed this Oct 14, 2025
@westonruter
Member Author

I don't understand why this PR was closed by committing r60930. Re-opening.

@westonruter
Member Author

@dmsnell Any further concerns?

Member

@aaronjorbin aaronjorbin left a comment

I think this is in great shape for commit (pending the one documentation suggestion I have).

pento pushed a commit that referenced this pull request Oct 15, 2025
This introduces an output buffer for the entire template rendering process. This allows for post-processing of the complete HTML output via filtering before it is sent to the browser. This is primarily intended for performance optimizations and other progressive enhancements. Extenders must not rely on output buffer processing for critical content and functionality since a site may opt out of output buffering for the sake of streaming. Extenders are heavily encouraged to use the HTML API as opposed to using regular expressions in output buffer filters. 

* A new `wp_before_include_template` action is introduced, which fires immediately before the template file is included. This is useful on its own, as it avoids the need to misuse `template_include` filter to run logic right before the template is loaded (e.g. sending a `Server-Timing` header).
* The `wp_start_template_enhancement_output_buffer()` function is hooked to this new action. It starts an output buffer, but only if there are `wp_template_enhancement_output_buffer` filters present, or else if there is an explicit opt-in via the `wp_should_output_buffer_template_for_enhancement` filter.
* The `wp_finalize_template_enhancement_output_buffer()` function serves as the output buffer callback. It applies `wp_template_enhancement_output_buffer` filters to the buffered content if the response is identified as HTML. 
* The output buffer callback passes through (without filtering) any content for non-HTML responses, identified by the `Content-Type` response header.
* This provides a standardized way for plugins (and core) to perform optimizations, such as removing unused CSS, without each opening their own ad hoc output buffer.

Developed in #8412.

Props westonruter, nextendweb, dmsnell, flixos90, jorbin, peterwilsoncc, swissspidy, DrewAPicture, DaanvandenBergh, OptimizingMatters, tabrisrp, jonoaldersonwp, SergeyBiryukov.
Fixes #43258.


git-svn-id: https://develop.svn.wordpress.org/trunk@60936 602fd350-edb4-49c9-b593-d223f7449a82
@github-actions
A commit was made that fixes the Trac ticket referenced in the description of this pull request.

SVN changeset: 60936
GitHub commit: a721cf9

This PR will be closed, but please confirm the accuracy of this and reopen if there is more work to be done.

@github-actions github-actions bot closed this Oct 15, 2025
markjaquith pushed a commit to WordPress/WordPress that referenced this pull request Oct 15, 2025
Built from https://develop.svn.wordpress.org/trunk@60936


git-svn-id: http://core.svn.wordpress.org/trunk@60272 1a063a9b-81f0-0310-95a4-ce76da25c4cd
github-actions bot pushed a commit to platformsh/wordpress-performance that referenced this pull request Oct 15, 2025
Built from https://develop.svn.wordpress.org/trunk@60936


git-svn-id: https://core.svn.wordpress.org/trunk@60272 1a063a9b-81f0-0310-95a4-ce76da25c4cd
@dmsnell dmsnell left a comment

my review is too late, but I left a couple thoughts!

'content-type' === $header_parts[0]
) {
$is_html_content_type = in_array( $header_parts[1], array( 'text/html', 'application/xhtml+xml' ), true );
}

meh. it’s actually worse. PHP will send up to two Content-Type headers, because why wouldn’t it? despite it long being a singleton header. it seems to create two buckets, one for the $replace = true case and one for the $replace = false case. it will send the last version of the Content-Type header for each bucket.

header( 'Content-Type: text/plain' );
header( 'Content-Type: text/html' );
header( 'Content-Type: text/xml' );
header( 'Content-Type: application/xml', true );
header( 'Content-Type: application/octet-stream', false );
header( 'Content-Type: application/xml+svg', true );
header( 'Content-Type: application/json', false );

This sends the following response.

HTTP/1.1 200 OK
Host: localhost:8000
Date: Wed, 15 Oct 2025 17:20:32 GMT
Connection: close
X-Powered-By: PHP/8.4.13
Content-Type: application/xml+svg
Content-Type: application/json

How about 6ee2a45

I think it’s still going to work well enough for most happy-path cases, but it could help to step back and think about this from the HTTP perspective.

  • I don’t believe that the assertion in the code about the first header is accurate, as demonstrated in my included snippet.
  • The code is still vulnerable to spoofing based on non-spec header parsing. It’s not that much additional work to parse the headers in a spec-compliant way, so I think that would be a valuable addition here. (It will parse Content-Type; text/plain as a real Content-Type header, which it isn’t).

We can probably debate how important it is to safeguard this code, but I wouldn’t want to rule out subtle ways that code could force-enable or force-disable the output filtering.

what we have here is something I see a lot, which is a kind of elaboration on code to try and cover more edge cases, but the approach isn’t stemming from the language of HTTP. we could start from that language and it would involve a comparable amount of code. I believe that the PCRE pattern I provided is accurate (though maybe it’s wrong). Alternatively, a simple str_starts_with( strtolower( $header ), 'content-type:' ) would expressively cover affirmative matches, missing some valid ones, but it is also not prone to the false-positives from the [:;] group.
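
To make the stricter matching concrete, here is a minimal sketch of checking one raw header line, as returned by `headers_list()`. It is an assumption-laden illustration, not the code in the patch nor the PCRE pattern mentioned above: the function name `is_html_content_type_header` is hypothetical. It treats the text before the first colon as the field name, so `Content-Type; text/plain` is rejected, and compares only the media type that precedes any `;` parameters.

```php
<?php
/**
 * Hypothetical helper: reports whether one raw header line is a Content-Type
 * header naming an HTML media type.
 */
function is_html_content_type_header( string $header ): bool {
	// The field name is everything before the first colon.
	$colon = strpos( $header, ':' );
	if ( false === $colon ) {
		return false;
	}

	// Field names are case-insensitive and must match exactly (no trailing space or ';').
	if ( 0 !== strcasecmp( substr( $header, 0, $colon ), 'content-type' ) ) {
		return false;
	}

	// The media type is the part of the value before any ";" parameters.
	$value      = substr( $header, $colon + 1 );
	$media_type = strtolower( trim( explode( ';', $value, 2 )[0] ) );

	return in_array( $media_type, array( 'text/html', 'application/xhtml+xml' ), true );
}
```

This only classifies a single line. Deciding which line wins when PHP emits two `Content-Type` headers, as in the response shown earlier, is a separate question this sketch does not settle.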

return $output;
}

$filtered_output = $output;

not a quality issue of course. I figured it was this reason but never saw the multiple filters.

having the $output duplicated in the apply_filters() params seems strange to look at.

it jumps out less to me than creating a variable copy¹ and then doing nothing with it but pass it alongside the parent. still, it was clear that the purpose was to communicate something through its name. I have seen the duplicated variable and Core has one with $email.

either way, it’s fine. it just stood out to me and I wanted to ask if it was an oversight.

¹ using simplified language because I don’t know how to briefly say “have PHP create a new binding to the variable”

@dmsnell
dmsnell commented Oct 15, 2025

well done!
