Sorting with Sitecore Content Search/Solr

Sorting search results is rather straightforward at first glance, but there are some pitfalls to be aware of. When using Sitecore Content Search, the LINQ provider supports the OrderBy method, which gets serialized into a sort parameter in the Solr query. Example:

var result = searchContext.GetQueryable<MyModel>()
   .OrderBy(x => x.DisplayName);

will be serialized into a Solr query like

?q=...&fq=...&sort=_displayname ASC

This usually works quite well, but consider the following list of item display names:

Continue reading

Sitecore MVP 2019

Thank you Sitecore for awarding me “Most Valuable Professional” (MVP) again! Seven years in a row!

The Sitecore MVP Award celebrates the most active Sitecore community members from around the world who provide valuable online and offline expertise that enriches the community and makes a difference.

My contributions to Sitecore and the community over the last year have, besides the nine posts on this blog, mostly focused on improving the product through dialog with various Sitecore staff. During 2018 I filed over 50 confirmed bugs, mostly related to Sitecore Publish Service and Content Search, along with a handful of accepted product enhancements.

Optimize Sitecore Solr queries

I’ve written a few posts on Sitecore Content Search and Solr already, but there seems to be an infinite amount of things to discover and learn in this area. Previously I’ve pointed out the importance of configuring the Solr index correctly and the benefit of picking the fields to index, i.e. not indexing all fields by default (<indexAllFields>false</indexAllFields>). This vastly improves the performance of Content Search operations and reduces the index size in large solutions.

Recently I’ve been investigating a performance issue with one of our Sitecore solutions. This one is running Sitecore 9 with quite a lot of data in it. It had been performing quite well, but as the client loaded more data into it, it got a lot slower. Our metrics also showed that the response time (P95) measured in the data center had become quite high: around 500 ms instead of the normal 100 ms.

The cause of this wasn’t obvious and most things looked good. Database queries were fast, caches worked as expected, and CPU load was somewhat high but not overloaded. I also looked at our Solr queries, and the query time reported by Solr was typically below 25 ms. It wasn’t until I looked at the network traffic between the servers that I discovered how much data was being sent between Solr and IIS. IIS received about ten times as much data from Solr as it exchanged with any other system.

So I started looking at the actual queries being performed. I knew some queries return many records, but I was surprised when I found queries returning payloads of 500 kB or more. This creates latency, and CPU load when parsing it!

So I started digging into our code and at first glance everything looked quite good. The code followed a regular pattern, like:

public class MyModel {
    public ID ItemId { get; set; }
    public string LanguageName { get; set; }
    // more needed fields ....
}

var result = searchContext.GetQueryable<MyModel>()
    // query filters go here
    .GetResults();
foreach (var myModel in result.Hits) {
    // do stuff
}

I soon realized that Sitecore puts fl=*,score (Field List) in the Solr query string. You can see this in your Search.log. This means all stored fields are returned in the result. But in most cases we were only interested in one or a few fields, like the ones specified in our model MyModel. So I looked for a way to write the queries so that only the needed fields were returned. It turned out that LinqToSolrIndex was indeed managing a field set and builds the Solr fl parameter out of that.
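To make the difference concrete at the Solr level, here is a minimal sketch of the two kinds of query strings. It is written in Python only because Solr is queried over plain HTTP; it is not Sitecore code, and the helper name, the query and the field names are assumptions made for illustration.

```python
from urllib.parse import urlencode, parse_qs

def build_select_params(q, fields=None, rows=10):
    """Build the query string for a Solr /select request.

    fields=None mimics the behaviour described above (fl=*,score);
    otherwise only the listed stored fields are requested.
    """
    fl = "*,score" if fields is None else ",".join(fields)
    return urlencode({"q": q, "rows": rows, "fl": fl})

# Hypothetical query and field names, for illustration only.
everything = build_select_params("_language:en", rows=500)
selected = build_select_params("_language:en", fields=["_group", "_language"], rows=500)

print(parse_qs(everything)["fl"])  # ['*,score']
print(parse_qs(selected)["fl"])    # ['_group,_language']
```

With the second form, Solr only serializes the listed fields for each hit, which is what shrinks the payload.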

So it turns out we can use the Select feature before GetResults to control the field list. However, it complicates the code a bit, because it means that MyModel in the example above won’t be returned, but it is used when picking the fields and for constructing the query. The code pattern below explains it a bit more. Note: You don’t have to use different models and the query model doesn’t have to inherit the result model. I’ve written it like this to clarify the relation. It’ll probably become easier to understand when you write this in Visual Studio and see what IntelliSense gives you.

public class MyResultModel {
    public ID ItemId { get; set; }
    // more fields requested
}

public class MyQueryModel : MyResultModel {
    public string LanguageName { get; set; }
    // more fields needed for filtering....
}

var result = searchContext.GetQueryable<MyQueryModel>()
    // query filters on MyQueryModel go here
    .Select(x => new { x.ItemId, /* more fields requested */ })
    .GetResults()
    .Hits
    // map each hit's anonymous document back into a typed model
    .Select(x => new MyResultModel {
        ItemId = x.Document.ItemId,
        /* mapping of more requested fields */
    });
foreach (var myResultModel in result) {
    // do stuff
}

Essentially, you can use Select to control what fields are returned, but you’ll also get an anonymous type returned that you probably want to convert back into a typed model.

So, was it worth making those code changes in the project? Well, the answer was YES! I went through all Content Search queries that could return more than ten rows, i.e. queries that didn’t have a low .Take(n) value.

As I mentioned previously, we had already optimized the index storage, so we were already storing only the fields we’re actually using in the application. Still, I found that every record was quite heavy. In the serialized XML used in the network transport between Solr and Sitecore, every record was around 3630 bytes. When picking only the fields needed, we got down to around 270 bytes each. The performance within the data center improved a lot. The response time figure looked like this before/after deploying this code change:

[Figure: Data center response time before/after optimizing the Solr fl parameter]
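To put the per-record numbers in perspective, here is a back-of-the-envelope calculation, sketched in Python. The two byte counts come from the measurements above; the row count of 150 is a hypothetical example, not a figure from the project.

```python
BYTES_PER_RECORD_ALL_FIELDS = 3630  # measured with fl=*,score
BYTES_PER_RECORD_SELECTED = 270     # measured with a limited field list

def payload_kb(rows, bytes_per_record):
    """Approximate response payload in kB for a given number of rows."""
    return rows * bytes_per_record / 1000

def reduction(before, after):
    """Fraction of the payload saved when going from `before` to `after`."""
    return 1 - after / before

rows = 150  # hypothetical result size
print(payload_kb(rows, BYTES_PER_RECORD_ALL_FIELDS))  # 544.5
print(payload_kb(rows, BYTES_PER_RECORD_SELECTED))    # 40.5
print(round(reduction(BYTES_PER_RECORD_ALL_FIELDS, BYTES_PER_RECORD_SELECTED) * 100, 1))  # 92.6
```

So a query returning 150 records drops from roughly 545 kB to roughly 40 kB, a reduction of about 93 percent per record regardless of the row count.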

I was blown away by how much difference this made to the solution as a whole. And the Solr queries are still not perfect. Looking at the search log, there are still a few additional fields returned from Solr that I don’t need. It turns out that Sitecore adds the fields _uniqueid, _datasource and score regardless of what I select. So my 270 bytes/record could go as low as 70 bytes/record if those were removed too, but that would probably be sub-optimizing…

I don’t know why Sitecore always adds the first two, and the last one, score, turned out to be a bug. The GetResults method has an optional GetResultsOptions enum parameter. It may be Default or GetScores. However this parameter doesn’t do anything. Sitecore will always ask for score, unless you do something like ToArray instead of GetResults. That could be an option to reduce it a bit more, but then you won’t get the TotalSearchResults or Facets properties.

So my recommendation is to look through your Search.log and pick out the entries where fl=*,score is combined with a large rows value. Consider optimizing those by limiting the field list or the number of returned rows.
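A log scan like that can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes each logged query appears on one line with its fl and rows parameters in the query string, the threshold of 100 rows is arbitrary, and the sample lines are made up rather than copied from a real Search.log.

```python
import re

# Matches fl=*,score in plain or URL-encoded form.
FL_ALL = re.compile(r"[?&]fl=(\*|%2A)(,|%2C)score", re.IGNORECASE)
ROWS = re.compile(r"[?&]rows=(\d+)")

def heavy_queries(lines, min_rows=100):
    """Yield (rows, line) for queries that fetch all fields and many rows."""
    for line in lines:
        rows = ROWS.search(line)
        if FL_ALL.search(line) and rows and int(rows.group(1)) >= min_rows:
            yield int(rows.group(1)), line.strip()

# Hypothetical log lines, for illustration only.
sample = [
    "INFO  Solr Query - ?q=_language:(en)&start=0&rows=500&fl=*,score",
    "INFO  Solr Query - ?q=_language:(en)&start=0&rows=5&fl=*,score",
    "INFO  Solr Query - ?q=_language:(en)&start=0&rows=500&fl=_group,score",
]

for rows, line in heavy_queries(sample):
    print(rows, line)
```

Only the first sample line is reported: the second has a low rows value and the third already uses a limited field list. To scan a real file, pass an open file object instead of the sample list.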

An easy way to create Sitecore config files

Some people find it a bit tricky to write Sitecore config files, and it can be time consuming to get the element structure correct. Ever found yourself debugging an issue where it turned out the config file wasn’t applied properly due to a mistake in the element structure?

The XPath Tools plugin, by Uli Weltersbach, for Visual Studio is a great help for creating those config patch files. Here’s a way to create those in a fast and simple way:

Start with a Sitecore config boilerplate:
[Screenshot: Sitecore config template]

Then grab the expanded Sitecore config file through Sitecore Rocks:
[Screenshot: Open Expanded Web.config]

Navigate to the section where you want to apply your config change and select “Copy XML structure”. In this example I’ll add a computed field to the default Solr index section:
[Screenshot: Copy XML Structure]

Then paste it into your config template:
[Screenshot: Sitecore config structure]

From here on it’s a simple task to just remove the surplus attributes you don’t need, such as database="SQLServer" xmlns:patch… and so on:
[Screenshot: Complete patch file]

So XPath Tools is a small but very helpful plugin that makes life just a little bit easier.

You can get the XPath tool from within Visual Studio or download it here, or grab it from GitHub.

Improving Editing Performance when using Sitecore Publish Service

The Sitecore Publish Service vastly improves the publish performance in Sitecore. For me it was really hard to get it working properly and I’ve blogged about some of the issues before. I received a lot of good help from Sitecore Support and now it seems like I’ve reached a fairly stable state.

However, there is a downside to the Publish Service that may affect the editing performance. Publish Service doesn’t use the PublishQueue table for knowing what to publish. Instead it has an event mechanism for detecting what needs to be published. As an item is saved, Sitecore emits events to the Publish Service so that it knows what pages should be put into the publish manifest.

Note: The solution in this post may not suit every project. Address this only if you’re experiencing the performance degradation described, and make sure you test everything well. Make sure you fully understand this approach before dropping it into your project.

As part of the Publish Service package, an item:saved event handler is added to do some post processing. When an unversioned field is changed on an item, the event handler loops over all versions of that language and updates the __Revision field. When a shared field is changed on an item, the event handler loops over all versions in all languages and updates the __Revision field. Thereby the Publish Service gets a notification that the content of the item has been changed.
Continue reading

Sitecore X-Forwarded-For handling

A Sitecore solution is typically behind one or several reverse proxies, such as load balancers, content delivery networks etc. From a Content Delivery server perspective, the remote address, i.e. “the visible client IP”, is the closest proxy instead of the connecting client. To solve this, each proxy in the chain adds an HTTP header with the IP address it’s communicating with. This header is typically called X-Forwarded-For or X-Real-IP.
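As a rough illustration of how such a header can be evaluated, here is a minimal Python sketch. It is not how Sitecore handles the header; the function name and the idea of a configured list of trusted proxy networks are assumptions, and the addresses are documentation examples.

```python
import ipaddress

def client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Return the first address, walking from the server outwards, that is
    not one of our own proxies. Raises ValueError on malformed addresses."""
    trusted = [ipaddress.ip_network(p) for p in trusted_proxies]

    def is_trusted(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in trusted)

    # Header entries are ordered client first; the connecting peer comes last.
    chain = [p.strip() for p in x_forwarded_for.split(",") if p.strip()]
    chain.append(remote_addr)

    for ip in reversed(chain):
        if not is_trusted(ip):
            return ip
    return chain[0]  # every hop was a trusted proxy

# Hypothetical setup: a CDN edge in 198.51.100.0/24 and an internal load balancer.
print(client_ip("10.0.0.5", "203.0.113.7, 198.51.100.20",
                ["10.0.0.0/8", "198.51.100.0/24"]))  # 203.0.113.7
```

Walking the chain from the right matters because the leftmost entries are supplied by the client and can be spoofed; only the entries appended by your own proxies can be relied on.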

Below is an example of such a setup. Each proxy adds the IP it’s receiving the connection from:

Continue reading

Sitecore Publish Service 3.1 update-1

After having tons of problems and several filed tickets on the initial release of Sitecore Publish Service 3.1, I was happy to find that Sitecore have addressed many of the problems of the previous versions. This update contains 12 fixes and I found my customer support ticket number listed six times.

Unfortunately the update didn’t solve these issues properly, so while I’m waiting for new patches I thought I’d share a UI fix that wasn’t included in the release. When working with multiple languages, the language list isn’t very user friendly in the Publish Service UI. It essentially just becomes a small letterbox with unsorted languages and a large area for displaying the targets.

This is the layout provided as default when having multiple languages:

[Screenshot: Default Publish Service dialog]
Continue reading

Memory hungry Sitecore indexing

While investigating stability issues, I’ve found a few things that may need addressing.

Sitecore updates indexes in batches. This is good in general, but it turned out it may be very memory hungry. There are essentially two config parameters you can control the batch size with:

<setting name="ContentSearch.ParallelIndexing.Enabled" value="true" />
<setting name="ContentSearch.IndexUpdate.BatchSize" value="300" />

The default config above essentially means Sitecore will start multiple threads processing 300 indexable objects each. This might not be an issue at all, but when combined with a multi-language setup, media indexing and crazy authors, this may become a real problem.
Continue reading

Sitecore Language Fallback caching issue

Language Fallback is a powerful feature in Sitecore. It’s been around for years as a module and since Sitecore 8.1 it is part of the core product. In short, it allows values to be inherited between item language versions. This lets you show default content when a translation is missing. You may have dialects of a language, such as US English vs British English, and you can use Language Fallback to avoid translating content that is the same for the two dialects etc.

Increase Caching.SmallCacheSize to about 10MB if you’re using Language Fallback in Sitecore.

Continue reading

Things to test when using Sitecore Content Search and Publish Service

This is partly a follow-up to my previous post on Working with Solr and Sitecore Content Search in Sitecore 9. In that post I raised a few issues that need to be dealt with, and I’ve found some more. Most of what’s in this post I’ve found on Sitecore 9.0 update-1 and/or Sitecore 8.2 update-5, and it seems like most of the things apply to many more versions too. So I’ve focused on how you can verify if your solution needs patching too.

This is essentially just a brief list of some issues I’ve found over the last few weeks, while working against the clock towards the releases of two large Sitecore projects. Big thank you to all my great colleagues that have put an enormous effort into getting things to work.
Continue reading

Working with Content Search and Solr in Sitecore 9

During an upgrade project to Sitecore 9, I got some insights worth sharing. Some findings in this post apply to multiple Sitecore versions and some are specific to Sitecore 9. I’ve been using SolrCloud 6.6, but some of it applies to other versions as well. It became a long, yet very abbreviated, post covering many areas.

In this post:

  • Solr Managed schemas in Sitecore 9
  • Tune and extend Solr managed schema to fit your needs
  • How to fix Sitecore config for correct Solr indexing and stemming
  • How to make switching index work with Solr Cloud
  • How to reduce index sizes and gain speed using opt-in
  • How to make opt-in work with Sitecore (bug workaround)
  • Why (myfield == Guid.Empty) won’t give you the result you’re expecting

Continue reading

Sitecore MVP 2018

Thank you Sitecore for awarding me “Technology Most Valuable Professional” (MVP) again! Six years in a row!

The Sitecore MVP Award celebrates the most active Sitecore community members from around the world who provide valuable online and offline expertise that enriches the community experience and makes a difference.

My contributions to Sitecore and the community over the last year have, besides this blog, been a continuous dialog with Sitecore staff on how to improve the product, and I’ve filed around thirty confirmed bugs. I’ve also held a few talks at Sitecore User Group Gothenburg (SUGGOT) and shared a few modules on GitHub.

External Blob Storage in Sitecore

By default, Sitecore stores media files as blobs in the database. This is usually good, but if you have a large volume of files, this can become too heavy to handle. So I wrote a Sitecore plugin where you can seamlessly store all the binaries in the cloud.
Continue reading

Improved Sitecore delete item access rights

Sitecore has a quite advanced access right management system. However, I’ve found a few quite common requirements that, as far as I know, aren’t supported out of the box. One is to allow content authors to remove individual item versions without allowing them to remove the entire item. This is especially useful for multi-language sites. Another requirement is to allow authors to delete items they have created themselves, but no other items.

I’ve seen people work around these kinds of issues by playing around in the core database, modifying ribbon buttons etc. Personally I don’t like that approach. It just hides the button, and if the user can initiate the command in any other way, Sitecore will gladly perform the delete action. It’s easy to forget a command action in a context menu or something like that.

Instead I created a very small Sitecore module to solve these issues. All the source is available on GitHub.
Continue reading

Adding rel=”noopener” to Sitecore

Some say target="_blank" is one of the most underestimated vulnerabilities on the web. When you open a link to an external site in a new tab or window, that site gets access to your page through window.opener. Whether it’s a design flaw in browsers or something else is debatable, but luckily there is a simple fix: adding rel="noopener" to external links.
Continue reading

Happy Sitecore Experience Award winner

Sitecore have awarded Volvo Construction Equipment, with partner Stendahls, winner of the Sitecore Experience Award 2016, Marketing Agility category. The Marketing Agility award recognizes marketing teams that have made significant, measurable gains in productivity and marketing ROI through the Sitecore platform. This year’s winners outlined clear before and after scenarios for team output and content publishing times as well as any associated organizational or team advantages, as a result of time/resource savings.

I’ve spent most of my working time during 2015 and 2016 on this huge project, where we built the whole marketing platform and rolled out 123 market and dealer websites, representing Volvo’s business in more than 140 countries in 30+ languages. The solution contains about 370k items and is managed by more than 150 editors worldwide, all in the same Sitecore solution.

A lot of my time went into creating a streamlined development process, building and maintaining a fast and stable hosting platform, and building a very efficient authoring environment. Many of my blog posts during the last year have sprung out of this project.

Volvo CE Press release

Creating successful Sitecore websites

Some time ago, Martin Davies started a thread on the Sitecore MVP community forum about where to store page content. It caught a lot of attention and got many replies. Since that’s a closed forum, I thought I might as well write a public post with my view on this topic, and also put it into a wider perspective. It almost ended up as a recipe for how to build Sitecore websites. Nothing in this post is really new, so I’ve kept each topic as brief as possible.

This post is only about the basics of Sitecore as a CMS. Sitecore can do so much more. But I’ve seen so many, in my opinion, implementation failures, where the customer can’t get past the content management hell and never starts walking up the experience maturity ladder. The implementing partners play a very important role here, ensuring the content authoring environment doesn’t become messy, time consuming or non-intuitive.
Continue reading

Sitecore MVP 2017

Thank you Sitecore for awarding me “Technology Most Valuable Professional” (MVP) again! Fifth year in a row!

The Sitecore MVP Award celebrates the most active Sitecore community members from around the world who provide valuable online and offline expertise that enriches the community experience and makes a difference.

My contributions to Sitecore and the community over the last year have, besides this blog, been a continuous dialog with Sitecore staff on how to improve the author experience in the product, and over forty confirmed bug reports.

As a lead architect and DevSecOp of this huge website project, I expect a lot of blog posts during 2017 about our findings and how to create and maintain great websites.

Sitecore MVC Controller cache vs error handling

I faced a problem where I needed to control the Sitecore html cache from within a controller action. There are probably many scenarios where this is needed. In my particular case, the output from the controller is cacheable in the Sitecore html cache, but my controller action is also dependent on a third party system. The communication with this system could sometimes fail. Essentially it means that if the controller fails, I don’t want the html output to be cached either. If it is cached, as it is by default, my error message will stay there until the html cache is cleared, regardless of whether the third party system is online or not.
Continue reading