Flatten the Sitecore web database

This post describes a concept for resolving content at publish time in Sitecore 8.1, but most of it applies to older versions as well.

With Sitecore 8.1 we get language fallback out of the box. Previously we had the Partial Language Fallback module written by Alex Shyba. The new built-in version is a rewrite.

It is a very powerful feature, but it also uncovers some problems that occur when relying on any kind of fallback mechanism. The most obvious one is performance. It does take some extra CPU cycles to resolve a fallback value, but the impact should not be that big with correctly sized caches.

A less obvious, but to me greater, problem is that it may be quite hard for editors to really know what’s being published. An editor may be presented with the expected content in the Experience Editor (or in the Content Editor for that matter), but a field may fall back to an item that is not publishable or is not in its final workflow step. Since the fallback is resolved at run time, the content presented on the published site may come from an older version, or even be null/empty if the referred item doesn’t exist in the web database.

To solve this, I wanted to flatten the content of the web database in a similar way to how clones work. When a cloned item is published, the clone reference is removed and the source content is copied into each clone. Thereby the final value is always written into each field in the web database and no fallback processing is needed. This also eliminates the problem with unpublished referred item content.
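
To make the idea concrete, here is a minimal sketch of flattening that is independent of Sitecore. The ItemVersion and Flattener types and the fallback map are hypothetical stand-ins made up for illustration, not Sitecore APIs. They only show the principle: resolve the fallback chain once, at publish time, and store the resolved value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one language version of an item (not a Sitecore type).
public sealed class ItemVersion
{
    public ItemVersion(string language, Dictionary<string, string> fields)
    {
        Language = language;
        Fields = fields;
    }

    public string Language { get; }
    public Dictionary<string, string> Fields { get; }
}

public static class Flattener
{
    // Walk the fallback chain until a language version has its own value for the field.
    // Returns null if no language in the chain has a value.
    public static string Resolve(
        string fieldName,
        string language,
        IReadOnlyDictionary<string, ItemVersion> versions,
        IReadOnlyDictionary<string, string> fallbackLanguage)
    {
        var visited = new HashSet<string>(); // guards against circular fallback chains
        string current = language;
        while (current != null && visited.Add(current))
        {
            if (versions.TryGetValue(current, out ItemVersion version) &&
                version.Fields.TryGetValue(fieldName, out string value) &&
                !string.IsNullOrEmpty(value))
            {
                return value;
            }

            current = fallbackLanguage.TryGetValue(current, out string next) ? next : null;
        }

        return null;
    }

    // Build the version to write to the target: every field holds its resolved value,
    // so reading it later needs no fallback processing.
    public static ItemVersion Flatten(
        string language,
        IEnumerable<string> fieldNames,
        IReadOnlyDictionary<string, ItemVersion> versions,
        IReadOnlyDictionary<string, string> fallbackLanguage)
    {
        var flattened = new Dictionary<string, string>();
        foreach (string name in fieldNames)
        {
            string value = Resolve(name, language, versions, fallbackLanguage);
            if (value != null)
            {
                flattened[name] = value;
            }
        }

        return new ItemVersion(language, flattened);
    }
}
```

In this model, flattening a Danish version whose Title is empty and whose fallback language is English yields a Danish version that carries the English Title as its own value, while fields that already have a Danish value are left as they are.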

There are a few ways to hook into the publishing process. For this scenario I found it quite simple to actually replace the publishing provider with a custom one that overrides the default PipelinePublishProvider. In Sitecore there is a PublishHelper that is quite simple to extend. By overriding the publish provider, we can provide our own PublishHelper class, like this:

using Sitecore.Diagnostics;
using Sitecore.Publishing;

public class CustomPipelinePublishProvider : Sitecore.Publishing.PipelinePublishProvider
{
	public override PublishHelper CreatePublishHelper(PublishOptions options)
	{
		Assert.ArgumentNotNull(options, "options");
		return new CustomPublishHelper(options);
	}
}

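Then we need to take a look at the default implementation of PublishHelper. Most of the magic is performed in a private method called TransformToTargetVersion. Since a private method can’t be overridden, we use our favorite reflection tool to decompile it and build our own version, together with copies of the methods that call it or that it depends on: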

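using Sitecore.Collections;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Publishing;

public class CustomPublishHelper : Sitecore.Publishing.PublishHelper
{
	public CustomPublishHelper(PublishOptions options) : base(options)
	{
	}

	protected virtual Item TransformToTargetVersion(Item sourceVersion)
	{
		// Keep the majority of the existing code from the original method,
		// including the code that creates targetVersion.
		// Depending on your requirements, add custom flattening code
		// at a suitable place, such as this:
		FieldCollection fields = sourceVersion.Fields;
		fields.ReadAll();
		foreach (Field field in fields)
		{
			// Only non-shared fields with language fallback enabled
			// that have no value of their own
			if (field.Definition != null && !field.Definition.IsShared &&
				field.Definition.IsSharedLanguageFallbackEnabled && !field.HasValue)
			{
				// field.Value is the resolved (fallback) value;
				// force a copy of it into the target field
				targetVersion.Fields[field.ID].SetValue(field.Value, true);
			}
		}

		// Keep the remainder of the original method, ending by returning the target item
		return targetVersion;
	}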

	public override void CopyToTarget(Item sourceVersion)
	{
		// Unchanged code from original method
	}

	private void ReplaceFieldValues(Item targetVersion)
	{
		// Unchanged code from original method
	}

	private void CopyBlobFields(Item sourceVersion, Item targetVersion)
	{
		// Unchanged code from original method
	}
}

Finally we need to configure our new publish provider. The publishing process is also executed in the publisher site context, so in order for this to work, we have to enable language fallback on that site as well. Such configuration could look something like this:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <publishManager>
      <patch:attribute name="defaultProvider">custom</patch:attribute>
      <providers>
        <add name="custom" type="Custom.Namespace.CustomPipelinePublishProvider, Custom.Assembly" />
      </providers>
    </publishManager>
    <sites>
      <site name="publisher">
        <patch:attribute name="enableItemLanguageFallback">false</patch:attribute>
        <patch:attribute name="enableFieldLanguageFallback">true</patch:attribute>
      </site>
    </sites>
  </sitecore>
</configuration>

As discussed previously, the result of this is that the evaluated value is written into the web database, i.e. the content that an editor sees in Sitecore is the actual content being published. This is typically the expected behavior, since that is the piece of content the author approves through a workflow and publishes.

However, one could argue that if the fallback value is not in a publishable state, an inheriting item should not be allowed to publish this content either. Luckily, the language fallback function itself is implemented as part of the getFieldValue pipeline, so the GetLanguageFallbackValue processor can probably be replaced with one that always returns the field value from the latest approved and publishable item. I haven’t tried this myself yet though. That’s a subject for a follow-up post.