Yet another Sitecore scheduled publishing engine

When you schedule an item in Sitecore, it doesn’t mean that the item gets published or unpublished at that date. It just means that the item can be published within the given period. An actual publish still has to be performed for the change to take effect.

There are already many great modules that handle this, such as Scheduled Publish Module for Sitecore by Hedgehog.

In my scenario, we just wanted to trigger a publish for items as they become due. Scanning through the whole database is very expensive though, so I decided to build an engine that uses our search index instead. We use Solr in our solution, but it’ll probably work just as well with Lucene.

First, I created a computed field that stores the dates when a publish needs to occur. This can be either a start date or an end date. The field is defined as a list of dates in the index.

using System;
using System.Collections.Generic;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;

public class ScheduledPublishComputedField : AbstractComputedIndexField
{
	public override object ComputeFieldValue(IIndexable indexable)
	{
		Sitecore.Data.Items.Item item = indexable as SitecoreIndexableItem;
		if (item == null)
			return null;

		var dateList = new List<DateTime>();

		// Item-level restrictions. An unset start date equals the minimum
		// date and an unset end date equals the maximum date; skip those.
		if (item.Publishing.PublishDate != DateTimeOffset.MinValue.UtcDateTime)
			dateList.Add(item.Publishing.PublishDate);
		if (item.Publishing.UnpublishDate != DateTimeOffset.MaxValue.UtcDateTime)
			dateList.Add(item.Publishing.UnpublishDate);

		// Version-level restrictions, with the same sentinel values.
		if (item.Publishing.ValidFrom != DateTimeOffset.MinValue.UtcDateTime)
			dateList.Add(item.Publishing.ValidFrom);
		if (item.Publishing.ValidTo != DateTimeOffset.MaxValue.UtcDateTime)
			dateList.Add(item.Publishing.ValidTo);

		// Returning null leaves the field out of the index document.
		return dateList.Count == 0 ? null : dateList;
	}
}

The field is then added to the list of computed fields in the configuration:

<fields hint="raw:AddComputedIndexField">
  <field fieldName="scheduledpublish" returnType="datetimeCollection">Stendahls.Sc.ScheduledPublishing.ScheduledPublishComputedField, Stendahls.Sc.ScheduledPublishing</field>
</fields>

Then we need an object that represents the items to publish, which we’ll use when querying the index:

// Declared protected, so this class has to be nested inside another class
// (for example PublishAgent); it does not compile at namespace level.
protected class PublishItem
{
	[IndexField("_group")]
	public Guid ID { get; internal set; }

	[IndexField("_fullpath")]
	public string FullPath { get; internal set; }

	[IndexField("_language")]
	public string LanguageName { get; internal set; }

	[IndexField("_latestversion")]
	public bool LatestVersion { get; internal set; }

	// Used only as a filter in the search query.
	[IndexField("scheduledpublish")]
	[IgnoreIndexField]
	internal DateTime ScheduledPublishDateTime { get; set; }
}

Next we need an agent that checks the index for items that need publishing. All it has to remember is the timestamp of its previous run, so instead of creating new database tables, I decided to store that timestamp as a database property (the code stores it on the source database). The agent wrapper becomes quite simple:

public void Run()
{
	var database = Factory.GetDatabase(SourceDatabase);
	var now = DateTime.UtcNow;

	var lastRun = LoadLastPublishTimestamp(database);
	if (lastRun == DateTime.MinValue)
	{
		SaveLastPublishTimestamp(database, now);
		return;
	}

	try
	{
		PerformPublish(lastRun, now);
	}
	finally
	{
		SaveLastPublishTimestamp(database, now);
	}
}

private string PropertiesKey
{
	get
	{
		return "ScheduledPublishing_" + Settings.InstanceName;
	}
}

public DateTime LoadLastPublishTimestamp(Database db)
{
	string str = db.Properties[PropertiesKey];
	DateTime d;
	DateTime.TryParseExact(str, "s", CultureInfo.InvariantCulture,
		DateTimeStyles.None, out d);
	return d;
}

public void SaveLastPublishTimestamp(Database db, DateTime time)
{
	db.Properties[PropertiesKey] = time.ToString("s", CultureInfo.InvariantCulture);
}
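
To make the first-run behaviour easier to see, here is a minimal sketch of the same load/save round trip using only the .NET base class library. The `TimestampStoreSketch` class and its in-memory dictionary are hypothetical stand-ins for `db.Properties`, not part of Sitecore. It illustrates two things: a missing or unparsable value comes back as `DateTime.MinValue`, which is what sends `Run()` into its "just save and return" branch, and the "s" format keeps whole seconds only.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TimestampStoreSketch
{
	// Hypothetical stand-in for db.Properties; not a Sitecore API.
	private static readonly Dictionary<string, string> Properties =
		new Dictionary<string, string>();

	public static DateTime Load(string key)
	{
		string raw;
		Properties.TryGetValue(key, out raw); // raw stays null if the key is missing

		DateTime parsed;
		// TryParseExact returns false for null or malformed input and
		// sets the out value to DateTime.MinValue in that case.
		DateTime.TryParseExact(raw, "s", CultureInfo.InvariantCulture,
			DateTimeStyles.None, out parsed);
		return parsed;
	}

	public static void Save(string key, DateTime time)
	{
		// "s" is the sortable pattern yyyy-MM-ddTHH:mm:ss (no fractions, no offset).
		Properties[key] = time.ToString("s", CultureInfo.InvariantCulture);
	}
}
```

Since fractions of a second are dropped on save, the stored value can be slightly earlier than the `now` that was used as the upper bound of the query. With an agent interval measured in minutes this should rarely matter, but it is worth knowing if you shorten the interval.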

Now we can easily find which items need publishing with a regular search query:

List<PublishItem> publishItemQueue = new List<PublishItem>();

// Dispose the search context as soon as the paging loop is done.
using (var searchContext = ContentSearchManager.GetIndex(SourceIndex).CreateSearchContext())
{
	int skip = 0;
	bool fetchMore;
	do
	{
		// Fetch one page of at most 500 hits whose scheduled date falls
		// after publishSpanFrom and up to and including publishSpanUntil.
		var queryResult = searchContext.GetQueryable<PublishItem>()
			.Filter(f => f.LatestVersion && 
				f.ScheduledPublishDateTime.Between(publishSpanFrom, publishSpanUntil, Inclusion.Upper))
			.OrderBy(f => f.FullPath)
			.Skip(skip)
			.Take(500)
			.GetResults();
		skip += 500;

		publishItemQueue.AddRange(queryResult.Hits.Select(h => h.Document));
		fetchMore = queryResult.TotalSearchResults > skip;
	} while (fetchMore);
}
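
The paging loop and the date window can be tried out without Sitecore. The sketch below is only an illustration under stated assumptions: `PagingSketch` and its in-memory list are hypothetical stand-ins for the index, and the `d > from && d <= until` predicate mirrors how I understand `Inclusion.Upper` (lower bound excluded, upper bound included). That half-open window is what keeps a date that falls exactly on a run timestamp from being picked up by two consecutive runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
	// Collects every date in (from, until], one page at a time.
	// indexedDates is a hypothetical in-memory stand-in for the search index.
	public static List<DateTime> CollectDue(
		IList<DateTime> indexedDates, DateTime from, DateTime until, int pageSize)
	{
		var queue = new List<DateTime>();
		int skip = 0;
		int total;
		do
		{
			// Each round trip re-runs the full query and slices out one page.
			var matches = indexedDates
				.Where(d => d > from && d <= until)
				.OrderBy(d => d)
				.ToList();
			total = matches.Count;

			queue.AddRange(matches.Skip(skip).Take(pageSize));
			skip += pageSize;
		} while (total > skip); // stop once every match has been paged through
		return queue;
	}
}
```

Calling `CollectDue` for two consecutive windows, with the first window's upper bound reused as the second window's lower bound, returns each date exactly once.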

I decided to publish related items along with each scheduled item. This may need adapting to your needs. The complete source of the agent ended up like this in my case:

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using Sitecore.Collections;
using Sitecore.Configuration;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Data.Managers;
using Sitecore.Diagnostics;
using Sitecore.Diagnostics.PerformanceCounters;
using Sitecore.Globalization;
using Sitecore.Publishing;

namespace Stendahls.Sc.ScheduledPublishing
{
	public class PublishAgent
	{
		public string SourceDatabase { get; private set; }
		public string SourceIndex { get; private set; }
		public List<string> TargetDatabases { get; private set; }
		public PublishMode Mode { get; private set; }

		public PublishAgent(string sourceDatabase, string sourceIndex, string targetDatabases)
		{
			Assert.ArgumentNotNullOrEmpty(sourceDatabase, "sourceDatabase");
			Assert.ArgumentNotNullOrEmpty(sourceIndex, "sourceIndex");
			Assert.ArgumentNotNullOrEmpty(targetDatabases, "targetDatabases");
			SourceDatabase = sourceDatabase;
			SourceIndex = sourceIndex;
			TargetDatabases = ParseDatabases(targetDatabases);
		}

		public void Run()
		{
			var database = Factory.GetDatabase(SourceDatabase);
			var now = DateTime.UtcNow;

			var lastRun = LoadLastPublishTimestamp(database);
			if (lastRun == DateTime.MinValue)
			{
				SaveLastPublishTimestamp(database, now);
				return;
			}

			try
			{
				PerformPublish(lastRun, now);
			}
			finally
			{
				SaveLastPublishTimestamp(database, now);
			}
		}

		public void PerformPublish(DateTime publishSpanFrom, DateTime publishSpanUntil)
		{
			List<PublishItem> publishItemQueue = new List<PublishItem>();

			// Dispose the search context as soon as the paging loop is done.
			using (var searchContext = ContentSearchManager.GetIndex(SourceIndex).CreateSearchContext())
			{
				int skip = 0;
				bool fetchMore;
				do
				{
					// Fetch one page of at most 500 hits whose scheduled date falls
					// after publishSpanFrom and up to and including publishSpanUntil.
					var queryResult = searchContext.GetQueryable<PublishItem>()
						.Filter(f => f.LatestVersion && 
							f.ScheduledPublishDateTime.Between(publishSpanFrom, publishSpanUntil, Inclusion.Upper))
						.OrderBy(f => f.FullPath)
						.Skip(skip)
						.Take(500)
						.GetResults();
					skip += 500;

					publishItemQueue.AddRange(queryResult.Hits.Select(h => h.Document));
					fetchMore = queryResult.TotalSearchResults > skip;
				} while (fetchMore);
			}

			if (publishItemQueue.Count == 0)
				return;

			var db = Factory.GetDatabase(SourceDatabase);
			var publishingTargets = GetPublishingTargets(db, TargetDatabases);

			// Loop over the queue, but not using a regular foreach, since
			// we'll remove items from the queue as they are processed as related items.
			while (publishItemQueue.Count > 0)
			{
				var publishItem = publishItemQueue.First();
				publishItemQueue.RemoveAt(0);

				if (publishItem.LanguageName == null)
					continue;

				var language = LanguageManager.GetLanguage(publishItem.LanguageName);
				if (language == null)
					continue;

				var item = db.GetItem(new ID(publishItem.ID), language);
				if (item == null)
					continue;

				PublishItemAsyncWithRelatedItems(item, publishingTargets);
			}
		}

		protected virtual void PublishItemAsyncWithRelatedItems(Item item, List<string> publishingTargets)
		{
			if (item == null)
				return;

			var targetDb = Factory.GetDatabase(TargetDatabases.First());
			var options = new PublishOptions(item.Database, targetDb, PublishMode.Incremental, item.Language,
				DateTime.UtcNow, publishingTargets)
			{
				RootItem = item,
				Deep = true,
				PublishRelatedItems = true
			};
			var publisher = new Publisher(options);
			publisher.PublishAsync();

			// Increment performance counter
			JobsCount.TasksPublishings.Increment();
		}

		private List<string> GetPublishingTargets(Database sourceDatabase, ICollection<string> targetDatabases)
		{
			var targets = new List<string>();
			var parent = sourceDatabase.GetItem("/sitecore/system/publishing targets");
			// Loop over all targets and add those that matches the database list
			foreach (Item target in parent.GetChildren(ChildListOptions.SkipSorting))
			{
				if (targetDatabases.Contains(target["Target database"]))
					targets.Add(target.ID.ToString());
			}
			return targets;
		}

		private string PropertiesKey
		{
			get
			{
				return "ScheduledPublishing_" + Settings.InstanceName;
			}
		}

		public DateTime LoadLastPublishTimestamp(Database db)
		{
			string str = db.Properties[PropertiesKey];
			DateTime d;
			DateTime.TryParseExact(str, "s", CultureInfo.InvariantCulture,
				DateTimeStyles.None, out d);
			return d;
		}

		public void SaveLastPublishTimestamp(Database db, DateTime time)
		{
			db.Properties[PropertiesKey] = time.ToString("s", CultureInfo.InvariantCulture);
		}

		private static List<string> ParseDatabases(string databases)
		{
			return databases.Split(',')
				.Select(s => s.Trim())
				.Where(s => !string.IsNullOrWhiteSpace(s))
				.ToList();
		}
	}
}

The agent and the computed field are then registered with a configuration patch file:

<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:x="http://www.sitecore.net/xmlconfig/">
	<sitecore>
		<scheduling>
			<agent type="Stendahls.Sc.ScheduledPublishing.PublishAgent, Stendahls.Sc.ScheduledPublishing" method="Run" interval="00:30:00" >
				<param desc="source database">master</param>
				<param desc="source index">sitecore_master_index</param>
				<param desc="publish targets">web</param>
			</agent>
		</scheduling>
		<contentSearch>
			<indexConfigurations>
				<defaultSolrIndexConfiguration>
					<fields hint="raw:AddComputedIndexField">
						<field fieldName="scheduledpublish" returnType="datetimeCollection">Stendahls.Sc.ScheduledPublishing.ScheduledPublishComputedField, Stendahls.Sc.ScheduledPublishing</field>
					</fields>
				</defaultSolrIndexConfiguration>
			</indexConfigurations>
		</contentSearch>
	</sitecore>
</configuration>

6 thoughts on “Yet another Sitecore scheduled publishing engine”

  1. First of all, thank you. Now I have a question: which DLL references should be used for the following?

    using Sitecore.ContentSearch;
    using Sitecore.ContentSearch.Linq;
    using Sitecore.Globalization;

  2. Sitecore.ContentSearch.dll, Sitecore.ContentSearch.Linq.dll and Sitecore.Kernel.dll if I remember right.

  3. When trying to build, I am getting these errors:

    Error CS1061 ‘PublishItem’ does not contain a definition for ‘LatestVersion’ and no extension method ‘LatestVersion’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)
    Error CS1061 ‘PublishItem’ does not contain a definition for ‘ScheduledPublishDateTime’ and no extension method ‘ScheduledPublishDateTime’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)
    Error CS1061 ‘PublishItem’ does not contain a definition for ‘FullPath’ and no extension method ‘FullPath’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)
    Error CS1061 ‘PublishItem’ does not contain a definition for ‘LanguageName’ and no extension method ‘LanguageName’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)
    Error CS1061 ‘PublishItem’ does not contain a definition for ‘LanguageName’ and no extension method ‘LanguageName’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)
    Error CS1061 ‘PublishItem’ does not contain a definition for ‘ID’ and no extension method ‘ID’ accepting a first argument of type ‘PublishItem’ could be found (are you missing a using directive or an assembly reference?)

    I am running Sitecore 8.2. Is that because I am using the new Kernel.dll?

    • Hi,
      sorry about the blog post being a bit unclear about this. The solution requires three classes in order to work, and it looks like you’re missing one or two of them. The big code block at the end only covers the PublishAgent class. You also need the ScheduledPublishComputedField and PublishItem classes as described previously in the post.

      Please also note that 8.2 has a new optional publish engine that you may install separately. This code has not been tested with 8.2 yet. I believe it will work if you’re still on the default/old publish engine, but I assume some changes are needed for it to work with the new stand-alone .NET Core based publishing service.
