Addressing Duplicate Content in TYPO3
How duplicate website content impacts search engine optimization
Duplicate website content impacts search engine optimization (SEO), so you should aim to minimize it where possible. What about translated and localized content? The good news is that this is not deemed to be duplicate content by search engines, but that doesn’t mean you can just translate your site and still rank for your regional visitors. International SEO, particularly for localized websites, requires thought and care. TYPO3 comes with many SEO features that help you avoid creating duplicate content, whether in one language or many. You can achieve both market and language diversification in a single system with TYPO3—and avoid duplicate content while you’re at it.
First, let’s look at duplicate content. You need to understand it and know how your CMS handles URLs, particularly when it comes to multiple languages and multiple domains, so that you can implement an effective content and SEO strategy.
What is duplicate content and what are its impacts?
Duplicate content is identical (or very similar) content that is available at multiple URLs, either within the same website or across domains.
Google long ago busted the myth of the "duplicate content penalty," but duplicate content still has a very real impact on your search engine results page (SERP) rankings.
- Lower rankings and reduced traffic: Search engines don’t want to show multiple pages with the same content, so they try to determine which result is most relevant to a particular search query, but there’s a chance they’ll get it wrong. This can lead to lower rankings and consequently, less organic traffic to your site.
- Wasted crawl budget means less indexing: Duplicate pages waste your crawl budget (the time and resources search engine bots spend crawling your site), limiting the number of unique pages search engines can index. This is particularly important for websites with lots of pages (like ecommerce sites). When your valuable content isn’t indexed, your site’s visibility in search results drops.
Typically, you handle duplicate content on your site with redirects, canonical tags, and a sitemap that lists the canonical URLs.
Use redirects
Redirects are a simple way to address duplicate content on your site. Minimize duplicate content by redirecting an old URL to the new version. TYPO3 automatically creates redirects for you. Find out more in our guide, Navigating page movements and redirects in TYPO3.
Use canonical tags
Canonical tags (also called rel="canonical" tags) are HTML snippets that specify the preferred URL for duplicate content. Placed in the <head> section, they tell search engines which page to index and display in search results. If you don’t explicitly tell a search engine which URL is canonical, it will make the choice for you. TYPO3 automatically adds canonical links to pages and has support for many other essential SEO features.
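To see which URL a page actually declares as canonical, you can read the tag straight out of the HTML. The following is a minimal sketch using only the Python standard library; the markup and the example.com URL are made up for illustration, and the function names are our own, not part of TYPO3.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; a missing value is None.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return the declared canonical URL.

    Returns None when the page declares no canonical URL or declares
    several conflicting ones, since both cases leave the choice to the
    search engine.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    unique = set(finder.canonicals)
    return unique.pop() if len(unique) == 1 else None


# Hypothetical markup for a product listing page.
page = """
<html><head>
<title>T-shirts</title>
<link rel="canonical" href="https://example.com/shop/t-shirts/">
</head><body></body></html>
"""
print(find_canonical(page))
```

Running a check like this across the URL variants of one page is a quick way to confirm they all point to the same canonical address.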
Hreflang tags for translated pages
An hreflang tag is an HTML attribute that specifies a webpage’s language and, optionally, its geographical targeting. In TYPO3, hreflang tags are added automatically to link the different language or regional versions of the same page, so search engines know these versions are not duplicate content. You can read more about why hreflang tags are important for global SEO.
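As an illustration of what these tags look like, here is a short sketch that builds the set of alternate links for one page. It is not how TYPO3 generates them internally; the function name, the language codes, and the example.com URLs are assumptions chosen for the example.

```python
from html import escape


def hreflang_links(versions, default=None):
    """Build <link rel="alternate"> tags for every version of one page.

    versions maps an hreflang value (for example "en-US") to an absolute URL.
    default, if given, is the key whose URL is also used for "x-default".
    """
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in sorted(versions.items())
    ]
    if default is not None:
        fallback = escape(versions[default])
        tags.append(f'<link rel="alternate" hreflang="x-default" href="{fallback}">')
    return tags


# Hypothetical language versions of a single pricing page.
versions = {
    "en-US": "https://example.com/pricing/",
    "en-GB": "https://example.com/uk/pricing/",
    "de-DE": "https://example.com/de/preise/",
}
for tag in hreflang_links(versions, default="en-US"):
    print(tag)
```

Each language version of the page would carry the same full set of tags in its <head>, including the entry that points to itself.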
How your CMS should help
One of the best ways to avoid duplicate content is understanding how your content management system (CMS) constructs URLs. Your CMS could be creating duplicate pages you don’t even know about. For example, say that you run an ecommerce site and have a product page that sells t-shirts. Ideally, every size and color of that t-shirt will have the same URL. But your CMS might create a new URL for every different version of the t-shirt, resulting in hundreds of duplicate content pages.
In particular, look out for:
- www and non-www variations
- HTTP and HTTPS variations
- Variants of a URL with and without a trailing slash ("/"). Find out more about the significance of trailing slashes in URLs in our article, To slash or not to slash?
- Separate mobile and desktop versions
- Extra parameters like session IDs, query strings, click tracking, or print-friendly versions.
You can help search engines by keeping the URL as clean as possible, and by submitting a sitemap with the canonical version of each URL.
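To get a feel for how many of the variations listed above exist on your own site, you can reduce each crawled URL to one comparable form and group the results. This is a rough sketch under stated assumptions: the list of ignored parameters and the "always add a trailing slash" policy are illustrative choices, not a standard, and the URLs are invented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of parameters that never change page content; adjust per site.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "print"}


def normalize(url):
    """Reduce a URL to one comparable form (an illustrative policy only)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # hostname also drops any port
    if host.startswith("www."):
        host = host[4:]
    # Policy for this sketch: every path ends with exactly one slash.
    path = parts.path.rstrip("/") + "/"
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Sort so that parameter order does not create a "different" URL.
    query = urlencode(sorted(kept))
    return urlunsplit(("https", host, path, query, ""))


def group_duplicates(urls):
    """Map each normalized URL to the original URLs that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize(url), []).append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


crawled = [
    "http://www.example.com/shop/t-shirts",
    "https://example.com/shop/t-shirts/",
    "https://example.com/shop/t-shirts/?utm_source=newsletter",
    "https://example.com/about/",
]
for canonical, variants in group_duplicates(crawled).items():
    print(canonical, "<-", variants)
```

Every group with more than one member is a candidate for a redirect or a canonical tag.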
TYPO3’s built-in SEO features give you granular control over your content and pages, including how you want to construct your URLs, automatic canonical and hreflang tags, and generating a sitemap.xml file. At b13, we’ve developed several URL-specific extensions to handle various scenarios like case sensitivity, alias mapping, and trailing slashes.
Reaching an international audience
Translation and localization add complexity to the duplicate content issue. You might want to translate your content, adapt your content to specific locales, or both. But you don’t want to be penalized for duplicate content.
Good news: translated and localized content (customized for regional markets) is an acceptable form of duplicate content. But unique content is always best practice for international SEO.
Simply copying your American English product pages, adjusting the spelling for the UK version, and adding hreflang tags to distinguish them won't suffice. Inevitably, the .com version will outrank the .co.uk version. The best way to make your content rank for regional audiences is to truly localize it. Re-frame the same content to make it culturally relevant and authentic—and avoid content duplication.
In many CMSs, working with translated and localized content across multiple websites to implement your international SEO strategy can quickly get complicated. TYPO3 lets you share content and configuration between multiple websites and domains in one installation — and the page tree structure makes it intuitive to navigate.
Link your multisite, multilingual content
The page tree is unique in TYPO3 and represents the sites and hierarchical page structure of your TYPO3 instance. There are two approaches for setting up your website for translation and/or localization: single-tree and multitree. The approach you choose can impact your risk of duplicate content.
- Single tree - One page tree for all languages. In this setup, translated pages are treated as different versions of the same page, and hreflang tags are created automatically. It’s clear to search engines that this is not duplicate content.
- Multitree - A separate page tree for each language. This is a common approach for many organizations but comes with a downside. In this setup, the translated versions are not linked by hreflang tags, so search engines cannot tell how the pages relate to each other and might treat them as duplicate content.
We created an extension that creates hreflang links between pages regardless of which page tree approach you are using. This not only solves the SEO dilemma of duplicate content, but also brings other benefits, like a more complete sitemap and the ability to assign permissions to regional editor roles.
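Whichever approach you use, hreflang links only help if they point in both directions: search engines such as Google expect each alternate page to link back to the page that references it. The sketch below checks that property on a small, hand-written map of pages. It is independent of our extension and of TYPO3; the data structure, function name, and URLs are assumptions for illustration.

```python
def missing_return_links(annotations):
    """Find hreflang links that are not reciprocated.

    annotations maps each page URL to the set of alternate URLs it declares.
    Returns sorted (source, target) pairs where target does not link back
    to source.
    """
    missing = []
    for source, targets in annotations.items():
        for target in targets:
            if target == source:
                continue  # a self-reference needs no return link
            if source not in annotations.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


# Hypothetical multitree setup: the German page does not link back yet.
pages = {
    "https://example.com/pricing/": {"https://example.de/preise/"},
    "https://example.de/preise/": set(),
}
for source, target in missing_return_links(pages):
    print(f"{target} is missing a return link to {source}")
```

In a multitree setup, this kind of gap is exactly what appears when the trees are not linked to each other.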
Avoiding duplicate content with TYPO3
Some CMSs are good at handling translations while other CMSs are better suited for multiple sites. TYPO3 excels at both. It allows you to manage markets and translations seamlessly, targeting different regions with tailored content in different tones and on different domains. TYPO3 provides the flexibility and tools to handle everything within a single system, avoiding duplicate content issues. You can set up multiple domains, customize URLs, translate and localize content, and use the same system for content management and the same workflow for publishing across your global company.
At b13, we always recommend using one tool for the job. If you want to implement a complex, international content and SEO strategy, choose the right system for it: TYPO3.