Caching in TYPO3—Part 2

The Mysteries of cHash unraveled—Cache Variants of a Page

Benni Mack

We all want our TYPO3 pages to load faster. By understanding the concept of cache variants of a single page, you will know exactly how to build and refine your TYPO3 websites, and you will never have to worry about editors complaining about performance again. Let’s wrap our heads around the cHash parameter!

Recap of Part 1

Rendering the same page multiple times should not mean computing it over and over again. That’s why TYPO3 has a strong caching mechanism built in. We learned about “first hits,” “cached hits,” “uncacheable” parts and a “fully cached page” in the first part of our Caching Blog series.

If your TYPO3 page is slow, it might be because it does not take advantage of being a fully cached page. The worst thing to do in this scenario is to set the “no_cache” parameter via the URL query string or TypoScript, because you miss out on all of TYPO3’s caching power. Converting the affected parts to “uncacheable” objects (USER_INT / COA_INT) at least lets the static parts of the page be stored for reuse.

Caching Plugins from Extensions

Often, plugins render business logic on a webpage, rather than simple content blocks. For Extbase-based plugins, each action can be defined as cacheable or uncacheable. We’ll take the example of a news plugin—we’ll call it “simplenews”.

The plugin has two different actions/views to render: a list view, containing the most recent news, and a single view, showing the details of a specific news item. To be clear: neither of these actions should be considered “uncacheable,” as none of this information differs from visitor to visitor. But how can a page be “fully cacheable” if such a plugin reacts to a query parameter like tx_simplenews_list[action]=detail in the URL? It should render different results, even though it’s the same TYPO3 page. We call these “variants” of the same page. And by marking your plugin’s actions as cacheable, you’re already good to go.

What Needs to be Cached?

TYPO3 creates a cache entry for a page based on certain factors, which I want to explain in full detail. These parameters make up the cache identifier to store the data in the “pages” cache—not just the simple “Page ID”.

| Parameter name | What does it do? |
| --- | --- |
| id | Page ID, resolved by the URL Routing / Slug in a PSR-15 middleware. It’s the “uid” of the “pages” table of the default language. Previously found in the “id” GET parameter. |
| type | a.k.a. typeNum; allows specifying a different render format via TypoScript. This is a great way to show the contents of a page as an RSS or JSON formatted view. |
| groupIds | The user groups of a logged-in user. If no user is logged in, the default group values [0,-1] apply. |
| MP | The MountPoint identifier, which refers to the “context” in which a page is called when using the MountPoint pages functionality. |
| site | The identifier of the current site (from Site Handling). |
| siteBase | The base URL of the site, including the language. |
| staticRouteArguments | Any arguments that have been resolved by the page router and its enhancers. |
| dynamicArguments | Any relevant GET parameters appended to the URL that could affect a page variant. |
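To make the table a little more concrete, here is a minimal Python sketch of how such factors could be folded into a single cache identifier. It is not TYPO3’s actual implementation: the function name `page_cache_identifier`, the JSON serialization and the SHA-256 hash are assumptions chosen for illustration, as are the example group ID 4 and the site identifier “main”.

```python
import hashlib
import json


def page_cache_identifier(page_id, type_num, group_ids, mount_point,
                          site, site_base, static_route_arguments,
                          dynamic_arguments):
    """Fold all factors from the table into one stable identifier (illustrative only)."""
    factors = {
        "id": page_id,
        "type": type_num,
        "groupIds": sorted(group_ids),
        "MP": mount_point,
        "site": site,
        "siteBase": site_base,
        "staticRouteArguments": static_route_arguments,
        "dynamicArguments": dynamic_arguments,
    }
    # sort_keys keeps the serialization independent of dictionary ordering.
    serialized = json.dumps(factors, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


BASE = "https://www.example.test/en/"

# Same page, three different cache entries:
anonymous = page_cache_identifier(13, 0, [0, -1], None, "main", BASE, {}, {})
member = page_cache_identifier(13, 0, [0, -2, 4], None, "main", BASE, {}, {})
detail = page_cache_identifier(
    13, 0, [0, -1], None, "main", BASE, {},
    {"tx_simplenews_list": {"action": "detail", "news": "23"}},
)
```

The point of the sketch is only that changing any single factor, such as the user groups or the dynamic arguments, yields a different identifier and therefore a separate cache entry for the same page ID.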

The most interesting part for us today is the “dynamicArguments” part, which has been connected to a concept named “cHash” since its inception in TYPO3 v3.8. The creator of TYPO3, Kasper Skårhøj, wrote a great article on this topic and why cHash got created back in 2005—read his “The mysteries of cHash” article to get some more information about the history.

The Main Problem for our Cache

The URL to a page with a plugin of my simplenews extension would look like this:

www.example.test/en/my-page showing the list view (fully cacheable for at least 24hrs).

If I create a link on the detail action in my Fluid template, the link would look like this:

www.example.test/en/my-page?tx_simplenews_list[action]=detail&tx_simplenews_list[news]=23&cHash=ef5f1456ee5a3585b65a69afc9b3c9f8

The cHash is like a signature, verifying that the same page (in this case the page with the URL “/my-page”) should render something different, but still fetching this page variant from the cache.

But why is the cHash query parameter needed at all? We want our websites to load fast, so we want this page, even with query parameters like “detail”, to be fetched from the cache when visited multiple times. Without a safeguard, though, I could write a crawler script and have it visit your TYPO3 website like this:

www.example.test/en/my-page?benni=cool
www.example.test/en/my-page?benni=coolcool
www.example.test/en/my-page?benni=coolcoolcool
www.example.test/en/my-page?benni=coolcoolcoolcool

Each of these URLs might be another variant that should be cached separately. Who would know? TYPO3 could add each one to its cache to be faster on the next hit, even though none of the plugins on this page would render anything different. That would leave TYPO3 open to a so-called “Denial-of-Service” attack: anybody could fill up TYPO3’s caches until your web server has no space left, taking the whole server down. That’s not what you want, that’s not what TYPO3 wants, and that’s why we need to verify that a URL called with GET parameters was actually generated by TYPO3. TYPO3 therefore automatically adds the GET parameter &cHash=aNonReadableString to any generated URL that carries such parameters.

The “cHash” parameter is a summary of all GET parameters that are relevant for having a different variant of a rendered page, leading to a different cache entry. The summary is then hashed (a one-way operation) together with your specific TYPO3 encryption key, a key set during the installation process, so nobody else can compute a valid cHash … unless somebody gains access to your server.
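The idea of “summarize the relevant parameters, then hash them with a secret key” can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm TYPO3 uses internally: the HMAC-SHA256 construction, the URL-encoded summary, the function names and the placeholder key are all assumptions.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Assumption: stands in for the encryption key set during installation.
ENCRYPTION_KEY = b"hypothetical-encryption-key"


def calculate_chash(parameters, key=ENCRYPTION_KEY):
    """Sign the cache-relevant GET parameters with the secret key."""
    relevant = {name: value for name, value in parameters.items() if name != "cHash"}
    # Sorting makes the summary independent of the parameter order in the URL.
    summary = urlencode(sorted(relevant.items()))
    return hmac.new(key, summary.encode("utf-8"), hashlib.sha256).hexdigest()


def build_link(path, parameters):
    """Append the signature to a generated link."""
    signed = dict(parameters, cHash=calculate_chash(parameters))
    return path + "?" + urlencode(signed)


link = build_link("/en/my-page", {
    "tx_simplenews_list[action]": "detail",
    "tx_simplenews_list[news]": "23",
})
```

Because the key never leaves the server, a visitor can read the signature in the URL but cannot produce a valid one for parameters of their own choosing.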

Generating and Resolving Links with TYPO3

TYPO3 takes care of two things:

1. Generating a Link from a Plugin with Custom Query Parameters

When generating a link to a page with additional GET parameters, a signature is added in form of an additional GET parameter called “cHash” so TYPO3 knows we’re going to access a variant of a cached page.

2. Resolving a URL with Additional Query Parameters

When accessing a page with additional GET parameters, TYPO3 checks whether a “cHash” parameter is present and recalculates the signature. If the calculated cHash matches the one from the GET parameter, TYPO3 will look for a non-expired cached variant of the page, or build the page contents to put into cache. If there is a cHash mismatch or no cHash given, then TYPO3 will return an error page (“404 Not found”) or in a less restrictive mode, disable caching for calling this page with these exact arguments. The less restrictive mode can be enabled by setting the global setting “pageNotFoundOnCHashError” to false, but might result in slower website rendering.
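The resolving step described above can be sketched as follows. Again, this is a hedged Python illustration rather than TYPO3 code: `resolve`, `render`, the in-memory `page_cache` dictionary and the HMAC-based signature are assumptions, and only the `pageNotFoundOnCHashError` behaviour is taken from the text.

```python
import hashlib
import hmac
from urllib.parse import urlencode

ENCRYPTION_KEY = b"hypothetical-encryption-key"  # assumption, see lead-in
page_cache = {}  # cache identifier -> rendered HTML


def calculate_chash(parameters):
    summary = urlencode(sorted(parameters.items()))
    return hmac.new(ENCRYPTION_KEY, summary.encode("utf-8"), hashlib.sha256).hexdigest()


def render(path, parameters):
    """Stand-in for the expensive page rendering."""
    return f"<html>{path} {sorted(parameters.items())}</html>"


def resolve(path, query, page_not_found_on_chash_error=True):
    """Return (status, html) and store only verified variants in the cache."""
    parameters = {k: v for k, v in query.items() if k != "cHash"}
    if not parameters:
        identifier = path  # plain page without a variant
    else:
        expected = calculate_chash(parameters)
        given = query.get("cHash", "")
        if not hmac.compare_digest(given.encode("utf-8"), expected.encode("utf-8")):
            if page_not_found_on_chash_error:
                return 404, None
            # Less restrictive mode: render, but never write to the cache.
            return 200, render(path, parameters)
        identifier = path + "#" + expected
    if identifier not in page_cache:
        page_cache[identifier] = render(path, parameters)
    return 200, page_cache[identifier]
```

In this sketch a crawler requesting `?benni=cool` either gets a 404 or an uncached response, so it can never grow `page_cache`, while a correctly signed detail link is rendered once and then served from the cache.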

To summarize: The cHash mechanism is a great thing because it allows TYPO3 to cache a multitude of different variants of a single page! This is not the case with “no_cache=1” or “uncacheable” objects, where plugins always take up rendering time, as they have to be re-rendered for each visitor. For a news list view and detail view, that re-rendering is completely unnecessary.

In the past, RealURL and other tools tried to circumvent having a cHash GET parameter in a URL for a page variant, sometimes because people think it looks ugly or believe it harms search engine rankings. The latter is not true at all: Search engines crawl the web, and do not really care about additional GET parameters. In addition, it was never clear when to explicitly set the “useCacheHash” parameter in typolink (TypoScript) or Fluid ViewHelpers (Templates).

How Can I Get Rid of cHash?

Short answer: You can’t. Best answer: You want to have the cHash mechanism available in order to leverage caching. TYPO3 v9 introduced site handling, which allows you to use cHash transparently by building a “speaking URL”—where cHash is calculated into the URL path, so you never have to worry about ugly URLs from TYPO3 anymore. In previous versions, TYPO3 did not know about routing in general, and only cared about index.php?id=13, looking straight at the GET parameters.

TYPO3 v9’s Site Handling and page slugs got rid of three demons of the past: the “id” and “L” GET parameters, as well as the “sys_domain” records. So-called “speaking URLs” are built in by default. The only reason a “cHash” GET parameter is still added to your URLs is that a plugin links to a different variant of a page, and this variant needs its own cache entry. And to repeat myself: the only reason for wanting to get rid of the cHash is that people think it looks ugly. There might be use cases for sharing links in emails, but that is also cosmetic in most situations.

If you’ve ever reset your password on any website, you’ll get an “ugly link” like this:

accounts.example.test/recovery/?user=123123asduoasd8sdf8sdzf&hash=1n23nxc89xcxsc

TYPO3 does exactly the same, and these URLs can never be “beautiful” as they are one-time links. Nobody needs to remember them, and search engines should not crawl them.

Route Enhancers to the Rescue

A link to a news detail page however should be more approachable—and that’s why TYPO3 v9’s Site Handling allows you to configure Route Enhancers. Route Enhancers move GET parameters of plugins into the path part of a URL, and where there are no GET parameters, there’s no need for cHash. Route Enhancers are responsible for transforming these arguments for the plugin into the URL (“creating a link”) and out of the URL (“resolving a URL”)—both managed in the TYPO3 PageRouter class.

The simplenews extension with its sample “list” plugin creates links like this:

example.test/news?tx_simplenews_list%5Baction%5D=detail&tx_simplenews_list%5Bcontroller%5D=Main&tx_simplenews_list%5Bnews%5D=2&cHash=ef5f1456ee5a3585b65a69afc9b3c9f8 

Route Enhancers will create links that look like this:

example.test/news/hooray-my-first-news

and still keep all functionality of dealing with cHash—and thus performant page variants loaded from cache—intact!

In order to configure Route Enhancers, a site administrator currently has to define an Enhancer in the configuration file. So in order to get rid of the cHash in our news plugin, our sites/main/config.yaml contains the following:

```yaml
routeEnhancers:
  MyNews:
    type: Extbase
    limitToPages: [13,14,15]
    extension: simplenews
    plugin: list
    routes:
      - { _controller: 'Main::list', routePath: '' }
      - { _controller: 'Main::detail', routePath: '/{news}' }
    aspects:
      news:
        type: PersistedAliasMapper
        tableName: 'tx_simplenews_domain_model_news'
        routeFieldName: 'slug'
```

Excerpt from a site configuration yaml file

The source code for the extension can be found in our GitHub repository.

A Route Enhancer defines the possible routes for one plugin. Each aspect handles one GET parameter (e.g. “news”) and transforms it into a speaking path segment.

With this technique, TYPO3 knows where to look up the information when building a URL, and where to resolve it when finding out if the page was cached already.
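The two directions, “creating a link” and “resolving a URL”, can be pictured with a small Python sketch. It only mimics the idea behind the aspect in the configuration above: the lookup table `NEWS_SLUGS` stands in for the slug column of the news table, and the function names are hypothetical, not part of TYPO3’s PageRouter.

```python
# Assumption: hypothetical uid -> slug rows, standing in for the "slug" field
# of tx_simplenews_domain_model_news.
NEWS_SLUGS = {2: "hooray-my-first-news"}
UID_BY_SLUG = {slug: uid for uid, slug in NEWS_SLUGS.items()}


def generate_path(base, arguments):
    """Creating a link: move the plugin arguments into the URL path."""
    if arguments.get("action") == "detail":
        return f"{base}/{NEWS_SLUGS[arguments['news']]}"
    return base


def resolve_path(base, path):
    """Resolving a URL: turn the path segment back into plugin arguments."""
    if path.rstrip("/") == base:
        return {"controller": "Main", "action": "list"}
    prefix = base + "/"
    if not path.startswith(prefix):
        return None
    uid = UID_BY_SLUG.get(path[len(prefix):])
    if uid is None:
        return None  # unknown slug: no route matches
    return {"controller": "Main", "action": "detail", "news": uid}


detail_path = generate_path("/news", {"action": "detail", "news": 2})
detail_arguments = resolve_path("/news", detail_path)
```

Because only slugs that exist in the lookup table resolve to arguments, the path itself cannot be used to invent arbitrary page variants, which is why no visible cHash is needed for such a route.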

So—those were some serious insights, and luckily TYPO3 does most of the magic for us!

It is worth looking at all plugins (all extensions in typo3conf/ext/) and verifying that their actions, such as list and detail, are actually marked as “cacheable,” so they benefit from TYPO3’s caching mechanism for page variants.

Excluding GET Parameters from Page Variants

But wait: People get a lot of “page not found” errors because they clicked on a link from Facebook or a newsletter, where campaign IDs or fbclid GET parameters were added. How would TYPO3 know the right cHash for that? These links weren’t generated by TYPO3! And it shouldn’t matter if a campaign ID was added; the content is still the same.

This is the moment where most people turn off the global “pageNotFoundOnCHashError” flag because they need a simple fix so that it “just works”, with the result that these requests are no longer served from the cache. But looking further, there is a better solution for this as well!

All cHash Options in One Place

TYPO3 has some global options to completely ignore GET parameters for caching purposes and cHash calculations.

In your LocalConfiguration.php file or via the Settings module in the TYPO3 Backend, you can find a lot of options within $TYPO3_CONF_VARS[FE][cacheHash].

Let’s look at all system-wide options related to cHash calculation and validation which are available:

| Option | Description |
| --- | --- |
| $TYPO3_CONF_VARS[FE][cacheHash][cachedParametersWhiteList] (array) | When set, only the given set of parameters is used for the cHash calculation. This is especially useful if your website is very specific and clear in terms of plugins, so cHash calculation and generation can become very simple and effective. |
| $TYPO3_CONF_VARS[FE][cacheHash][requireCacheHashPresenceParameters] | Parameters that require a cHash to be present. If one of these parameters is set but no cHash is given, TYPO3 triggers the configured cHash error behaviour directly. |
| $TYPO3_CONF_VARS[FE][cacheHash][excludedParameters] | The option most typically adjusted. TYPO3 ships with common “campaign” GET parameters such as “utm_source”, “gclid” and “pk_campaign” already pre-configured. Add any GET parameter that does not affect the output of the page here. |
| $TYPO3_CONF_VARS[FE][cacheHash][excludedParametersIfEmpty] | Parameters that are only relevant for the cHash if they carry a value, for example when your URLs sometimes contain an empty ‘&ftu=’ parameter. Set excludeAllEmptyParameters to true to skip all empty parameters. |
| $TYPO3_CONF_VARS[FE][cacheHash][excludeAllEmptyParameters] (boolean) | If enabled, GET parameters that are sometimes set without a value, like ‘&ftu=’, are ignored, as they have no meaning in the regular website flow. Parameters are then only considered for the cHash if they are non-empty. |
| $TYPO3_CONF_VARS[FE][pageNotFoundOnCHashError] | Enabled by default: when cHash validation fails on a request, the request is treated as if the page was not found. If turned off, pages with an invalid hash will have page caching disabled; turning this off is highly discouraged. |
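How the exclusion options narrow down the parameters that count for the cHash can be sketched like this. The Python function and its argument names are illustrative assumptions that mirror the option names above; they are not TYPO3’s implementation, and the parameter values are made up.

```python
def cache_relevant_parameters(query, excluded=(), excluded_if_empty=(),
                              exclude_all_empty=False):
    """Return only the GET parameters that should influence the cHash."""
    relevant = {}
    for name, value in query.items():
        if name == "cHash" or name in excluded:
            continue  # never part of the page variant
        if value == "" and (exclude_all_empty or name in excluded_if_empty):
            continue  # empty and configured to be ignored
        relevant[name] = value
    return relevant


newsletter_click = {
    "tx_simplenews_list[news]": "23",
    "utm_source": "newsletter",
    "fbclid": "abc123",
    "ftu": "",
}

relevant = cache_relevant_parameters(
    newsletter_click,
    excluded=("utm_source", "fbclid"),
    excluded_if_empty=("ftu",),
)
```

With this configuration, a link decorated with campaign parameters boils down to the same relevant parameters as the link TYPO3 generated, so the original cHash still validates and the cached variant is served.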

Be sure to look up the other settings if they apply to your configuration, where you can set fine-grained exclusions and requirements for calculating the cHash parameter.

And while you’re at it: Make sure to turn the “disableNoCacheParameter” option on so you know you will be saving resources!

Summary

A lot of things are going on under the hood of TYPO3 frontend to make the best out of caching. Our main goal is: Delivering fast websites with TYPO3—even if they have complex plugins on their pages. The cHash argument is a good thing for caching variants of the same page of your TYPO3 site, and you don’t have to be afraid of it once you’ve enabled Site Handling in TYPO3 v9.

Now that you know how to effectively cache page variants with cHash, we can’t wait to show you how to continue making your pages and all their variants lightning fast. If you want us to look at a specific scenario of your TYPO3 project, you can hire us to help you get the most out of your TYPO3 installation!

Read more about uncacheable blocks, “no_cache”-parameters and optimizing TYPO3’s caches in Part 1.