About Mikael Högberg

Work at Stendahls. Blog about coding, hobby electronics and stuff

Why TDS may want to sync non-existing item versions

I recently ran into a strange issue where I noticed TDS wanted to sync a lot of item versions that didn’t exist. It took me a while before I nailed down what was actually causing this. In the example below, the “Home” item only has an “en” language version. There are no versions in the other languages. Still, TDS wants to serialize many more languages.

If I serialize this item to disk, it indeed adds all these versions as well.

Note: This is not a problem with TDS itself. It’s a consequence of a faulty database.

Continue reading

Sitecore “Switchers”

Sitecore has tons of “Switchers” that are pretty useful in many scenarios. You’re probably familiar with a few of them, such as EditContext, SecurityDisabler and LanguageSwitcher. Others are less known but very valuable when needed.

Switchers temporarily put Sitecore in a different mode for a piece of code and then restore it to its previous mode when done. A typical usage would look something like this:

using (new LanguageSwitcher(lang)) {
  // perform custom operations on "lang" instead of the context language
}
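The mechanism behind this pattern can be sketched with a small disposable class. This is only an illustration under my own assumptions, not Sitecore's actual implementation: the `ModeSwitcher` class and its string value are hypothetical. The constructor pushes a value on a per-thread stack and `Dispose`, which runs when the `using` block ends, pops it again.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified switcher. Not Sitecore's real implementation.
public sealed class ModeSwitcher : IDisposable
{
    // One stack per thread, so nested switchers restore in the right order.
    [ThreadStatic] private static Stack<string> _stack;

    // The innermost active value, or null when no switcher is active.
    public static string Current =>
        _stack != null && _stack.Count > 0 ? _stack.Peek() : null;

    public ModeSwitcher(string value)
    {
        if (_stack == null) _stack = new Stack<string>();
        _stack.Push(value);
    }

    // Called at the end of the using block: restores the previous value.
    public void Dispose()
    {
        if (_stack != null && _stack.Count > 0) _stack.Pop();
    }
}
```

Inside `using (new ModeSwitcher("sv")) { ... }`, `ModeSwitcher.Current` returns "sv". After the block it returns whatever it was before, which is why switchers nest safely.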

I made a list of common, and not so common, switchers. Hope you find some of them useful.

Continue reading

Improved Sitecore Links Database provider

I have somewhat of a love-hate relationship with the Sitecore Links Database. It’s really useful in some scenarios, especially integrations, batch operations etc., where a Solr index or similar might not be accurate enough. The Links Database has also caused me many performance headaches over the years due to some issues with its implementation.

The main problem with the default Links Database provider is that it performs multiple small synchronous SQL “insert” and “delete” operations in the database. As the Sitecore database grows, so does the Links table. With many insert/delete operations, the table indexes become very fragmented, which slows down the system. Application performance may also be quite heavily affected if latency to SQL Server increases.

I’ve had it on my radar for a few years to write a new one. Finally I made some time to improve the implementation a bit and put it on GitHub.

Continue reading

Generating predictive Sitecore ID’s

Sitecore IDs, or Guids, are often great, as content can be created in different places and moved between databases without the risk of colliding IDs. When integrating stuff into Sitecore, you may need to represent some data as items. When you author new content in Sitecore that references such imported data, it suddenly becomes a bit trickier to move data between instances etc. You’d basically have to transfer all the integrated data as well to ensure data consistency. It also removes the ability to just remove integrated data and re-run the process, as it would generate new IDs.

So what if we could integrate data from an external source and represent it as items with a predictive Sitecore ID? I.e. give each item a unique ID that won’t collide with anything else, but will always be the same every time the process is run, in every system.
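One common way to get such stable IDs is to hash a fixed prefix together with the key from the external system and use the 16-byte digest as a Guid. Here is a minimal sketch of that idea, assuming MD5 as the hash. The `PredictiveId` class, the prefix and the key format are hypothetical names for illustration, and this is not necessarily the approach used in the full post.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a stable Guid from a source prefix and an external key.
public static class PredictiveId
{
    public static Guid Create(string prefix, string externalKey)
    {
        using (var md5 = MD5.Create())
        {
            // MD5 is used here only as a stable 16-byte mapping, not for security.
            // 16 bytes is exactly the size of a Guid.
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(prefix + "|" + externalKey));
            return new Guid(hash);
        }
    }
}
```

The same prefix and key give the same Guid on every run and on every machine, and the Guid can then be wrapped in a Sitecore `ID` when creating the item. The prefix keeps keys from different external sources from mapping to the same ID.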

Continue reading

Dealing with Solr Managed Schema through Sitecore config files

When working with Content Search and Solr in Sitecore, it’s quite common that a field needs to be managed in both Sitecore and Solr. It can, for example, be a computed field, or a basic field where you want specific field processing etc., where you need to slightly change the Solr schema. Having the configuration for such a scenario split in two places can be quite annoying.

A while ago, I wrote a small module that lets you put your Solr Managed Schema configuration in the standard Sitecore configuration. This gives a better overview, and configuration that plays together, stays together.

The module adds a new <solrManagedSchema> section in the contentSearch/indexConfiguration-section. The content of this section is almost identical to what you’d put in the Solr schema. The only difference is that fields, field types, dynamic fields and copy-fields are grouped separately with a Sitecore “hint” attribute.

Let’s describe the benefits of this with an example. Let’s say you need a computed field returning a set of decimal numbers, like a set of variant specs etc. You’d obviously need the computed field class itself, but you’d also need to do a type match (because lists of floats are not within the default mapping) and you’d need to modify the schema to support fields with multiple floats. The configuration could end up something like this:

Continue reading

Sitecore Azure Blob Storage module findings

A few years ago, I wrote about storing Sitecore binaries in an external blob storage service instead of having them in the database. You can read more about it here and the code is available on GitHub. It has several benefits and it works great! I’ve used this implementation in production on large scale solutions for many years.

In Sitecore 9.3, Sitecore introduced its own Azure Blob Storage module that uses the same principles. Sitecore also slightly changed how databases are configured, so as it stands right now, my old module only works up to 9.2.

Since Sitecore supports their own module, it makes sense to use that one instead of running a custom one. However, in true Sitecore spirit, the module was released without testing, so beware of the findings below before using it:

Update: Sitecore have issued a cumulative update to Sitecore 9.3 that addresses the two major faults described in this post. Make sure you have SC Hotfix 415404-1 or later installed.

Continue reading

Prevent Sitecore editors playing around with rendering cache options

Caching of renderings is a vital part of well-performing Sitecore websites. Getting all the settings right isn’t trivial, and it’s really only the developers of the components who know the details about how each component can be cached. Therefore caching options are typically managed on the rendering items and kept serialized together with the solution source.

However, rendering cache options can also be set on a rendering in the layouts field when it’s used on a page. This is done in the “Control Properties” dialog. Personally I’ve never found a use case where this is needed, but the option is there.

Whether a user has access to the rendering parameters in the “Control Properties” dialog is controlled by “write” access to the Standard Rendering parameters template’s standard value. Sigh!

Continue reading

Faster indexing in Sitecore

Sitecore indexes are very powerful for getting various items fast, especially when they are located in various places. There are also some pitfalls that one needs to be aware of. In this post I’ll describe how I reduced indexing time from around three hours down to two minutes for a specific scenario, using methods that could be considered in some other scenarios as well.

Continue reading

Installing Content Hub CMP connector to Sitecore CM having Publish Service installed

There is a dll version conflict for Polly.dll between Sitecore CMP connector and Sitecore Publish Service 4.1.0. The Publishing Service Module comes with Polly.dll version 5.9.0 and the Sitecore Connect for CMP 1.0.0 comes with Polly 6.0.1. This will cause Sitecore CM to stop working during the CMP install, if the SPS module is already in the system.

Update: As Sitecore 9.3 and SPS 4.2.0 were released just a few hours after I wrote this post, I noticed this applies to that version as well. SPS module 9.3 also comes with Polly 5.9.0.

Update 2: After some help from Sitecore Support and some additional adjustments, I got the following solution to work on my machine:

Keep your 5.9.0 version of Polly.dll in the bin folder. Create a cmp subfolder (bin/cmp) and put the 6.0.1 version of Polly.dll in that folder. Then add the following to the assembly binding section of your web.config:

<dependentAssembly>
   <assemblyIdentity name="Polly" publicKeyToken="c8a3ffc3f8f825cc" />
   <codeBase version="5.9.0.0" href="bin\Polly.dll" />
   <codeBase version="6.0.0.0" href="bin\cmp\Polly.dll" />
   <codeBase version="6.0.1.0" href="bin\cmp\Polly.dll" />
</dependentAssembly>

I don’t think the 5.9.0.0 binding is really needed, but I found in my logs that the CMP connector was trying to load both 6.0.0.0 and 6.0.1.0.

Update: It turned out I tricked myself. I thought I had everything working, but the CMP connector throws exceptions in the log while importing content from Content Hub. My original idea of an easy fix was to add assembly redirects to the web.config file before installing the connector and keep the old file:

<dependentAssembly>
  <assemblyIdentity name="Polly" /><!-- publicKeyToken="c8a3ffc3f8f825cc" -->
  <bindingRedirect oldVersion="5.8.0.0-6.0.1.0" newVersion="5.9.0.0"/>
</dependentAssembly>

As described in this stackoverflow post, it’s not possible to do an assembly redirect between assemblies with different public keys. Polly 5.9 doesn’t have a public key (i.e. it’s null), so at least I couldn’t make the binding work with the newer 6.0 version.

As of writing this, I don’t have a solution to this problem. I’m currently awaiting an answer from Sitecore on whether CMP and SPS can live together or not.

Clean up Sitecore database and avoid corrupt published content

We’ve discovered a rare issue in Sitecore Publish Service (SPS) where it may publish incorrect content to some fields. Even though I think SPS does this wrong, the root cause was inconsistent data in the master database. It turned out such inconsistencies exist in most databases, even in a clean Sitecore install.

Continue reading

Improving Sitecore code quality with ReSharper External Annotations

I guess most of us Sitecore developers are familiar with the JetBrains ReSharper plugin for Visual Studio. The tool actually made me accept moving from the Java/IntelliJ IDEA world to the .Net/Visual Studio world many many years ago and I’m still on the Idea keyboard shortcut scheme.

Besides all the nice refactoring tools, code hints etc. that come with ReSharper, it also comes with a framework for annotating code with attributes. One can argue whether this should be used in your own code, but it opens up a really nice way of improving code quality when working with external libraries.

As Sitecore has grown over the years, the API has become larger and larger, and there are sometimes multiple ways of achieving the same thing. Sometimes the API is a bit ambiguous to new developers and some operations should be avoided from a performance perspective etc. With ReSharper External Annotations we can give developers code hints and feedback directly in Visual Studio when using the Sitecore API in a way that may not be intended.

Continue reading

Learnings from a year of implementing Sitecore Publishing Service

Sitecore Publishing Service (SPS) is a replacement for the built-in publish function. It’s built on .NET Core and runs as a separate microservice, unlike the built-in publisher that runs in-process.

I’ve been using it, or rather tried to use it, for about a year now on a large Sitecore 9.0.1 solution. It was everything but a smooth ride, so I thought it would be worth sharing my experience and what I learned during the process.

SPS has clear advantages regarding the speed at which it publishes content. It’s not as “lightning fast” as Sitecore claims it to be, but still a lot faster than the built-in one. The greatest advantage, in my opinion, is that it runs outside the Sitecore Content Management (CM) worker process, so an ongoing publish process doesn’t break due to an IIS application pool recycle. Those two reasons were also why we tried moving to SPS.

Note: This post contains my experiences while working with SPS 3.0 to 3.1.3. Some of the issues have been fixed in later versions. Some issues may also remain in SPS 4 as it was released before 3.1.3. Many of the issues turned out to exist in 2.x as well.

Continue reading

Indexing and OCR scanning PDF documents in Sitecore

PDF documents in the Sitecore media library can be indexed using IFilters, but this approach has its limitations regarding Azure support etc. and isn’t very effective from a performance point of view. The way the extracted content is indexed also makes it harder to use in multi-language solutions.

I’ve taken a different approach to indexing PDF documents, making it more accurate and increasing the performance at the same time. The IFilter approach is a generic approach, supporting multiple file formats. I’ve focused on PDF documents in this post, as it’s a common format. Similar principles can be applied to other file formats as well.

In this post:

  • Avoiding heavy computation during index time
  • Extracting document content through PDF libraries
  • OCR scanning of image/non-text based PDF documents
  • Indexing documents with language stemming

Continue reading

Inherited and non-inherited fields to Sitecore clone items

When an item is cloned in Sitecore, the clone inherits its values from the source item. This is represented by a null value in each field, meaning that it inherits its value from the clone source item. When a value is written to a field in a clone, that value is used instead, hence breaking the inheritance. This works great in most cases.

In some scenarios you might not want to inherit all the fields. You might want to exclude some of them, enforcing a local value in each field for such clones. By default a few fields are not inherited. Those are __Created, __Created by, __Updated, __Updated by, __Revision, __Source, __Source item, __Workflow, __Workflow state and __Lock. It’s quite natural that those fields are not inherited by clones, since each item, the source and the clone, should keep its own values for those fields.

You can add your own fields to this list by modifying the ItemCloning.NonInheritedFields setting. It’s a string setting where you can provide a pipe (|) separated list of field IDs or field keys. The drawback of the setting being a pipe-separated list is that it’s hard to add additional fields through config patch files. I hope Sitecore will change this in the future.

Continue reading

Defragment the SQL Server heap on Sitecore databases

I discovered that the heap gets very fragmented in SQL Server in some of our solutions. Large tables, such as the Items, Shared-, Versioned- and Unversioned-fields, Blobs, Descendants and Links tables, that easily occupy a few GB on disk, also suffered from heavy fragmentation. More than 90% fragmentation was common.

From what I’ve found, the only way to fix SQL Server Heap fragmentation (the heap is where all the table data is stored), is to have a clustered index on each table.

However, I noticed that no tables in the Sitecore databases have any clustered indexes. All indexes are non-clustered in the common master/core/web databases. Sitecore used to have clustered indexes back in 5.2, but over the course of multiple Sitecore versions, the database schema has changed to non-clustered indexes.

A clustered index means that the table rows are stored physically on disk in the index order. That’s also why there can be only one clustered index per table. With a non-clustered index, there is a second list that has pointers to the physical rows. It’s generally faster to read from a clustered index, but it may be slower to write to it as there may be a need to rearrange the table data.

Continue reading

Correcting ambiguous Sitecore field scopes

As you probably know, all fields in Sitecore can have one of three field scopes: Versioned (aka Normal), Unversioned and Shared. Versioned fields have individual version numbers for each language. Unversioned fields have individual values for each language in the same way as versioned fields, but there can be only one value per language. Shared fields are just a single value regardless of language and item version. There is no such thing as a “versioned shared” field type.

This is configured using two check boxes on the field level: Shared and Unversioned. If neither is checked, the field becomes a versioned field. As you can see, there’s an ambiguous “invalid” state where both check boxes are checked. In this case, Shared takes precedence.
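The rule can be written down as a small function. This is just a sketch of the precedence described above, with hypothetical names (`FieldScope`, `FieldScopeResolver`). It does not use the Sitecore API.

```csharp
// Hypothetical names for illustration; not part of the Sitecore API.
public enum FieldScope { Versioned, Unversioned, Shared }

public static class FieldScopeResolver
{
    // Maps the two check boxes to the effective field scope.
    public static FieldScope Resolve(bool shared, bool unversioned)
    {
        // Shared wins, also in the ambiguous state where both boxes are checked.
        if (shared) return FieldScope.Shared;
        return unversioned ? FieldScope.Unversioned : FieldScope.Versioned;
    }
}
```

Four check box combinations map to only three scopes, so both (true, false) and (true, true) end up as Shared.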

Continue reading

Sorting with Sitecore Content Search/Solr

Sorting search results is rather straightforward at first glance, but there are some pitfalls to be aware of. When using Sitecore Content Search, the Linq provider supports the OrderBy method and it gets serialized into a sort statement in a Solr query. Example:

var result = searchContext.GetQueryable<MyModel>()
   .Where(...)
   .OrderBy(x => x.DisplayName)
   .GetResults();

will be serialized into a Solr query like

?q=...&fq=...&sort=_displayname ASC

This usually works quite well, but consider the following list of item display names:

Continue reading

Sitecore MVP 2019

Sitecore MVP Technology 2019

Thank you Sitecore for awarding me “Most Valuable Professional” (MVP) again! Seven years in a row!

The Sitecore MVP Award celebrates the most active Sitecore community members from around the world who provide valuable online and offline expertise that enriches the community and makes a difference.

My contributions to Sitecore and the community over the last year have, besides the nine posts on this blog, mostly focused on improving the product through dialog with various Sitecore staff. During 2018 I filed over 50 confirmed bugs, mostly related to Sitecore Publish Service and Content Search, and a handful of accepted product enhancements.

Optimize Sitecore Solr queries

I’ve written a few posts on Sitecore Content Search and Solr already, but there seems to be an infinite amount of things to discover and learn in this area. Previously I’ve pointed out the importance of configuring the Solr index correctly and the benefit of picking the fields to index, i.e. not indexing all fields as default (<indexAllFields>false</indexAllFields>). This will vastly improve the performance of Content Search operations and reduce the index size in large solutions.

Recently I’ve been investigating a performance issue with one of our Sitecore solutions. This one is running Sitecore 9 with quite a lot of data in it. It’s been performing quite well, but as the client was loading more data into it, it got a lot slower. Our metrics also showed that the response time (P95) in the data center got quite high. It measured around 500 ms instead of the normal 100 ms.

Continue reading

An easy way to create Sitecore config files

Some people find it a bit tricky to write Sitecore config files. It can sometimes be time consuming to get the element structure correct. Ever found yourself debugging an issue where it turned out the config file wasn’t applied properly due to an element structure mistake?

The XPath Tools plugin, by Uli Weltersbach, for Visual Studio is a great help for creating those config patch files. Here’s a way to create those in a fast and simple way:

Continue reading