Global audiences are becoming increasingly sophisticated with digital technologies, and even people in more remote parts of the world are gaining access to data-enabled mobile devices. As a result, billions of people are accessing websites from every corner of the world, making multilingual sites a high priority for companies and organizations looking to expand into this global marketplace. As an enterprise customer experience management platform, Sitecore has been perfectly architected to manage and deliver multilingual digital solutions. Over the next couple of months, we will be publishing a multi-part series that explores several topics relevant to implementing multilingual solutions in Sitecore, along with key considerations to be aware of as you embark on this process, including language fallback, content translation options, and more.

In this first post of the series, we’re going to look at one of the first tasks to prioritize when you're building a multilingual site in Sitecore: how to organize your content within the CMS. To do this, you need a thorough understanding of a number of things, including:

  • the business goals you want to achieve
  • the audiences you're trying to target
  • the breadth of your content
  • how your content needs will evolve over time
  • your editorial processes

Spending the necessary time upfront to sit down with your stakeholders and gather this information in detail will go a long way towards designing a content hierarchy that better meets your business goals and operational needs.

Content Architecture Options for Your Multilingual Sitecore Solution

Within Sitecore, content is stored, accessed, and served via the Content Tree. Architecting this tree the right way, at the outset, is important as it has significant implications for how your editorial teams will manage content, as well as how this content will be delivered to your target audiences across your multilingual websites.

Fundamentally, you have three choices for setting up your Content Tree. You can leverage the versioning feature of Sitecore and have all your multilingual websites share the same tree. Alternatively, you can split the content into separate trees, one for each language, or you can use a hybrid approach. Each has benefits and drawbacks, as we’ll detail below.

Multilingual Content Architecture Option 1: the Shared Tree

What is a shared content tree?

A shared content tree is one in which the localized content (i.e. content that is not only translated to the local language but adapted to meet the local area’s cultural, currency, legal, and other specific nuances, thus potentially requiring different layouts, etc.) is stored in language versions of the content items. You can conceptualize this by understanding that all Sitecore item versions have both a numeric and language-specific identifier. You may not have noticed this previously if you’ve been working within a Sitecore solution that covers only one language or localization, as the language identifier of the version becomes somewhat redundant in this case.

Versioning of items in Sitecore. An item has both a language version and a numerical version.
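
To make the two dimensions concrete, here is a minimal sketch in Python. It is not Sitecore's API; the Item class, its methods, and the sample field values are illustrative assumptions that only model the idea that every version is addressed by a language plus a number.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Item:
    """Toy model of a content item (hypothetical, not a Sitecore type)."""
    name: str
    # versions[language][number] -> field values for that version
    versions: Dict[str, Dict[int, Dict[str, str]]] = field(default_factory=dict)

    def add_version(self, language: str, fields: Dict[str, str]) -> int:
        """Add the next numbered version in the given language."""
        numbered = self.versions.setdefault(language, {})
        number = max(numbered, default=0) + 1
        numbered[number] = fields
        return number

    def latest(self, language: str) -> Optional[Dict[str, str]]:
        """Return the highest-numbered version in a language, if any."""
        numbered = self.versions.get(language)
        if not numbered:
            return None
        return numbered[max(numbered)]


water = Item("water")
water.add_version("en-US", {"Title": "Water"})
water.add_version("en-US", {"Title": "Water (revised)"})
water.add_version("es-MX", {"Title": "Agua"})
```

In a single-language solution only the inner, numeric key ever varies, which is why the language key is easy to overlook.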

Key points for consideration.

Since versioning on language is the way in which localized content will be created and managed in a shared tree architecture, there are some things you’ll want to consider when adopting this approach. Your content is going to share the same tree structure, which means all localizations across languages will have the same content hierarchy. Therefore, if a page exists in one language, it will exist in all languages unless you take steps to hide it.

Additionally, Sitecore uses the item name when resolving URLs, so this specific part of the URL will be the same across all localized content. This could have SEO implications. For example, say you have a page about water in US English (en-US), and the item name is "water". The same page exists in Mexican Spanish (es-MX), and is translated to Spanish by the content author, who updates the title to "agua". The default URL resolver will still resolve the page on the site for Mexico as "/water", rather than "/agua". This behavior can be overridden, like many things in Sitecore, but doing so introduces added complexity and customization. The URL "/water" will also resolve on language sites where no version of the item has been created in that language. To prevent this, you can add a custom step to your URL resolution pipeline or take advantage of the Enforce Version Presence setting.
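
The behavior described above can be modeled with a short Python sketch. This is a toy, not Sitecore's resolver: the SHARED_TREE data, the resolve function, and the "pricing" item are hypothetical, and the enforce_version_presence flag only mimics the intent of the Enforce Version Presence setting.

```python
from typing import Optional

# Hypothetical shared tree: item name -> languages that have a version.
SHARED_TREE = {
    "water": {"en-US", "es-MX"},
    "pricing": {"en-US"},
}


def resolve(path: str, language: str,
            enforce_version_presence: bool = False) -> Optional[str]:
    """Return the item name a URL path maps to, or None (a 404)."""
    item_name = path.strip("/").lower()
    if item_name not in SHARED_TREE:
        # Translated titles such as "agua" are not item names.
        return None
    if enforce_version_presence and language not in SHARED_TREE[item_name]:
        # The item exists, but has no version in the requested language.
        return None
    return item_name
```

In this model, resolve("/water", "es-MX") finds the item while resolve("/agua", "es-MX") does not, and "/pricing" only stops resolving for es-MX once the flag is turned on.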

Adding language versions alongside numbered versions essentially adds another dimension to your content, and it can become confusing. Consider an item with 3 versions in US English, 2 versions in Canadian English, and 1 version in Canadian French. Authors can lose track of which version they should be working on. When language versions are combined with the Language Fallback and Shared Layout features, content authors can easily make inadvertent edits to localized versions that they didn't intend to update. On top of this, each multilingual site could have its own separate workflow process, making the publishing of this content more complex.

Versioning of content by language and locality

These are powerful capabilities that Sitecore allows you to take advantage of so that you can finely control the creation, management, and promotion of content. Therefore, it's important to train the authors and editors using your Sitecore solution in the best way to manage language versions, or to create specific roles and permissions that facilitate ease of use. These trade-offs need to be weighed carefully when considering your content architecture. A system design may fulfill all the business requirements, but if it's too complex for the people using it, then the design has failed.

When to use a shared tree architecture.

Use a shared tree architecture if:

  • your multilingual sites will have identical or very similar content structures,
  • your page layouts (renderings and datasources) will be the same or similar,
  • your content will share a common workflow across language versions, or
  • you require language fallback. (Language versions on items are required in order to take advantage of Sitecore's language fallback capabilities.)

A Shared Tree architecture works best for companies or organizations whose content authors agree to use shared page templates (and don’t intend to make changes to them that would subsequently affect shared elements). When using this approach, language should be the primary differentiator between locales, rather than cultural, currency, or legal variations that would require entirely different presentations (page layout templates) or page hierarchies.

Multilingual Content Architecture Option 2: The Split Tree

What is a split content tree?

A split content tree is one where your localized content is separated into its own content tree rather than being managed by creating language versions of a given content item. You would essentially have a distinct home node for every localized site, each with its own subtree of pages, content datasource items, settings items, and so on. This approach presents an alternative to some of the aspects described in the shared tree approach, while introducing new points to consider. Typically, these trees are bound to separate <site> nodes in your Sitecore configuration, and are managed much like a multi-site Sitecore solution. Language resolution can be done at the hostname level instead of using cookies or language parts in the URL (e.g. mysite.com, mysite.ca, mysite.uk).
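
As a rough illustration of hostname-level resolution, the Python sketch below maps each hostname to its own home node and language. It is not Sitecore configuration or API: the hostnames come from the example above, while the SiteDefinition class, the root paths, and the language assigned to each site are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteDefinition:
    """Toy stand-in for one <site> node (hypothetical fields)."""
    name: str
    host_name: str
    root_path: str   # assumed location of this site's home node
    language: str    # assumed default language for the site


SITES = [
    SiteDefinition("us", "mysite.com", "/sitecore/content/US/Home", "en-US"),
    SiteDefinition("ca", "mysite.ca", "/sitecore/content/Canada/Home", "en-CA"),
    SiteDefinition("uk", "mysite.uk", "/sitecore/content/UK/Home", "en-GB"),
]


def resolve_site(host: str) -> Optional[SiteDefinition]:
    """Pick the site, and with it the tree and language, from the hostname."""
    host = host.lower().split(":")[0]  # ignore case and any port suffix
    for site in SITES:
        if site.host_name == host:
            return site
    return None
```

Because the hostname alone selects both the tree and the language, no cookie or language segment in the URL is needed in this model.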

Separation of content by locality and language. Each localized site has its own content tree.

Key points for consideration.

A split tree opens up the possibility of having a unique content structure per localization. This can be reflected in the item names themselves, such as our "water" page example above. The item can be named “agua” for a Spanish language site and have its URL path reflect the same, thereby representing the target audience's language without any customization to Sitecore's URL resolver. If you don't need a "water" page at all in this localization, you don't need to take steps to hide it; you simply do not create it. Similarly, you can create entire sections specific to that locality without worrying about hiding them in all the other languages. You can even alter the content hierarchy to one that makes more sense for that target audience without worrying about the impact on the other multilingual sites.

As an author, your process is simplified as well, since the language dimension is essentially removed from the versioning scheme. You can take advantage of the shared layout and know it will only affect this locality instead of all localities. Without Language Fallback to consider, you can make edits without worrying about a change cascading to other language versions without review.

When to use a split tree architecture.

Use a split tree architecture if:

  • Your multilingual sites require distinct content structures.
  • You do not require language fallback.
  • Your multilingual sites will not share any content items or just a few, at most.
  • Your multilingual content requires unique workflows.

A Split Tree architecture offers simplicity, cutting out the extra editorial effort that comes with more complex versioning, as long as your content structure needs meet the criteria above.

Multilingual Content Architecture Option 3: The Hybrid Tree

What is a hybrid content tree?

Sometimes your organization's needs can be complex, and you may realize that there are benefits to both the shared tree and split tree approaches that you’d like to incorporate into your solution. It's possible to combine these strategies into a hybrid tree structure, where there are separate trees for specific content hierarchies, and within those trees there are language versions of the content. With this approach, you are taking on the complexity of both architectures, but with a carefully considered design you can also use the benefits of each strategically.

Separation of content by region. Authoring teams manage their own content tree with language versions relevant to their region.

Key points for consideration.

Let’s say, for example, that you are in charge of a large organization that markets products worldwide. Your teams are organized to target specific regions of the world, such as a North American team and a European team, and the marketing needs for each region are distinct, along with their product catalogs. In this case, you decide to create one tree for North America and a separate one for Europe, each managed by that region's team of content authors and editors. Within those regional sites, you create language versions for the countries in those regions. The North America tree has versions for the US (en-US), Canada (en-CA and fr-CA), and Mexico (es-MX). The European tree has versions for the UK (en-GB), Spain (es-ES), Germany (de-DE), and other countries you market to.

Hybrid architecture. A split tree containing different language versions.

Using this approach, your teams can create specific campaign pages for a region and not worry about the other region stumbling upon them in their tree, or having them conflict with that region's campaign pages. You can take advantage of language fallback within that region, but still separate the "base" languages from each other. Your product catalogs can differ, so products unique to one region are not displayed in a region that doesn't offer them. Granted, products that are offered in both regions will need to be duplicated.
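
The Python sketch below illustrates fallback that stays inside a region. It is a simplified model, not Sitecore's Language Fallback implementation: the fallback chains, the "spring-sale" item, its titles, and the get_title helper are all hypothetical.

```python
from typing import Optional

# Hypothetical fallback chains, defined per regional tree.
# en-GB is the standard language tag for British English.
FALLBACK = {
    "north-america": {"en-CA": "en-US", "fr-CA": "en-CA", "es-MX": "en-US"},
    "europe": {"es-ES": "en-GB", "de-DE": "en-GB"},
}

# Hypothetical content: region -> item name -> language -> title.
CONTENT = {
    "north-america": {"spring-sale": {"en-US": "Spring Sale"}},
    "europe": {
        "spring-sale": {"en-GB": "Spring Sale (EU)", "de-DE": "Frühlingsangebot"},
    },
}


def get_title(region: str, item: str, language: str) -> Optional[str]:
    """Walk the region's fallback chain until a language version exists."""
    versions = CONTENT.get(region, {}).get(item, {})
    chain = FALLBACK.get(region, {})
    seen = set()
    current: Optional[str] = language
    while current is not None and current not in seen:
        if current in versions:
            return versions[current]
        seen.add(current)  # guard against accidental cycles in the chain
        current = chain.get(current)
    return None
```

Here a fr-CA request for the North American page falls back through en-CA to en-US, while the European tree resolves independently against en-GB, so the two "base" languages never mix.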

Another example could be something as simple as a shared content folder. Perhaps you use a split tree approach across your localities, but they all share the same product catalog. Rather than duplicating the products in each tree, each localized site uses the same product tree, which contains language versions of the product items.

When to use a hybrid tree architecture.

The hybrid architecture allows you to utilize both a shared tree structure (for pages that just need translation to different languages) and a split tree structure (for content that is specific to a given region or locale and uses different hierarchies and presentations).
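
As a rough way to summarize the three sets of criteria, here is a small Python helper. It is only a heuristic of our own making, not a Sitecore feature: the Requirements fields paraphrase the bullet lists in this post, and the rule "all shared signals means shared, none means split, anything in between means hybrid" is an assumption that real projects will want to refine with their stakeholders.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Answers gathered from stakeholders (hypothetical field names)."""
    similar_structure: bool        # same or similar content hierarchy
    similar_layouts: bool          # same or similar renderings and datasources
    shared_workflow: bool          # one workflow across language versions
    needs_language_fallback: bool  # fallback between languages is required


def suggest_architecture(req: Requirements) -> str:
    """Return 'shared', 'split', or 'hybrid' as a starting point only."""
    signals = [
        req.similar_structure,
        req.similar_layouts,
        req.shared_workflow,
        req.needs_language_fallback,
    ]
    if all(signals):
        return "shared"
    if not any(signals):
        return "split"
    return "hybrid"
```

For example, a team that needs language fallback but wants distinct structures and workflows per region would land on "hybrid" in this model.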

Failing to Plan Is Planning to Fail When It Comes to Your Multilingual Solution

Once you decide how you'll structure your content, you’ll likely be locked into it for the foreseeable future. If you've built components, written code, and deployed pages and campaigns that assume a certain content architecture, changing it after the fact is going to cost a significant amount of time and money. Taking a “figure it out as you go” approach is a recipe for failure. Therefore, investing the time up front to carefully consider your content hierarchy, tree structure, and approach to localization and multilingual content is critical. With some forethought and a solid investment in creating your website’s multilingual strategy, you will be able to implement a solution that better connects with your global audience and helps your organization thrive over the long term in the global marketplace.

Have you worked with multilingual implementations in Sitecore? Do you have feedback and insights to share? We’d love to hear your thoughts and experiences. Please add to the discussion via the comments below or Tweet Us, @Velir!