Auto-suggest with Solr Facets in Sitecore

Sitecore’s auto-suggest feature for search in the Content Authoring environment is pretty slick, but there is some confusing documentation from Sitecore about how to set it up properly with Solr.  As of today, Sitecore’s documentation on integrating with Solr indicates…

“When you implement Solr with Sitecore you need to enable term support in the Solr search handler.  The term functionality is built into Solr but is disabled by default. To power the dropdowns in the UI you must enable the terms component.”

The documentation quoted above will be updated at some point by Sitecore, since it no longer applies to the latest version of Sitecore, 8.2 rev. 161221 (Update-2).

In earlier versions of Sitecore, search in the Sitecore Content Editor could make use of the Solr “terms” component to populate suggestions.  This is why this guidance has previously been part of the Solr integration documentation from Sitecore.  Read more about Solr’s use of this auto-suggest through terms at https://cwiki.apache.org/confluence/display/solr/The+Terms+Component.

Sitecore’s strategy of making use of the “terms” component has changed recently, however.  Sitecore now uses faceting with Solr instead of terms.

To prove this out, I’m going to turn to the Solr logs after I try some queries for content in the Sitecore client.  Refer to this documentation from Sitecore if you’re looking for more context on how to use the search facility; there are a lot of features that are very under-utilized, in my experience.  I’ll specify a clause by typing Updatedby: and then “siteco” to engage the auto-suggest feature:

[screenshot: searbhby]

Very nice, right?

Under the covers, the Solr logs will reveal something like this . . .

2017-02-17 19:33:07.546 INFO  (qtp33171127-11) [   x:trial_core] o.a.s.c.S.Request [trial_core]  webapp=/solr path=/select params={q=*:*&facet.field=parsedupdatedby_s&facet.prefix=siteco&rows=0&facet=true&version=2.2&facet.sort=true} hits=24626 status=0 QTime=2

. . . and that can be further debugged by turning it into the URL request powering that auto-suggest response . . .

http://server:port/solr/sitecore_master_index/select?q=*:*&facet.field=parsedupdatedby_s&facet.prefix=siteco&rows=0&facet=true&version=2.2&facet.sort=true

. . . and that would return results like the following:

solrresponse

If instead we tried an author: search in Sitecore, for example, the facet.field would be parsedcreatedby_s instead of parsedupdatedby_s.
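To make the request shape concrete, here is a minimal Python sketch that rebuilds the facet-prefix URL from the log entry above and flattens a facet response into suggestions. The helper names, the localhost URL and the sample response values are my own illustrations, not anything Sitecore ships. The sketch adds wt=json (so the response is JSON) and omits the version and facet.sort parameters from the logged request. It also assumes Solr's default flat JSON layout, where each facet field is a list alternating term, count, term, count.

```python
from urllib.parse import urlencode

# Search clause -> Solr field. Only these two mappings appear in the post;
# any other clause would be a guess, so none are listed.
CLAUSE_TO_FIELD = {
    "updatedby": "parsedupdatedby_s",
    "author": "parsedcreatedby_s",
}


def build_suggest_url(core_url, clause, prefix):
    """Build a facet-prefix /select URL like the one seen in the Solr log."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", CLAUSE_TO_FIELD[clause.lower()]),
        ("facet.prefix", prefix),
        ("rows", "0"),   # only the facet counts are needed, not documents
        ("wt", "json"),  # assumption: JSON output instead of the default
    ]
    # Keep '*' and ':' unescaped so the URL reads like the logged one.
    return core_url + "/select?" + urlencode(params, safe="*:")


def parse_suggestions(response, field):
    """Turn Solr's flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    url = build_suggest_url(
        "http://localhost:8983/solr/sitecore_master_index", "Updatedby", "siteco"
    )
    print(url)

    # Hypothetical response body, invented purely to show the parsing step.
    sample = {
        "facet_counts": {
            "facet_fields": {
                "parsedupdatedby_s": ["sitecoreadmin", 120, "sitecoreanonymous", 3]
            }
        }
    }
    print(parse_suggestions(sample, "parsedupdatedby_s"))
```

Pasting the printed URL into a browser (with your own server, port and core name) is a quick way to see what the auto-suggest dropdown is working from.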

I don’t want to go too far down this rabbit hole.  I really just wanted to share that despite what the documentation shows, it’s not necessary to enable the Solr terms component on the /select requestHandler in Solr if you’re using the most recent version of Sitecore.  I’ve confirmed with official Sitecore support that this change was tagged as change #444661 and that it has been part of the product since Sitecore 8.1 update-1 (rev. 151207); the release notes for 8.1 update-1 are vague, but here it is:

Autocomplete for known fields such as language did not work in the Content Editor Search tab using the SOLR provider. The problem was related to the SOLR server configuration. This has been fixed so that Sitecore no longer depends on this configuration. (444661)

Happy faceting to all!

 

High Availability of Azure Search with Sitecore

I’ve been investigating Azure Search with Sitecore’s new Azure App Service offering.  I’ve got a giant Excel file of benchmarks and charts based on several permutations and configurations, and several other interesting tidbits that I need to organize into posts to this blog . . . so look for much more about this general topic in the future.

For now, I thought I’d share a point I’ve confirmed with Sitecore support regarding a limitation of Azure Search with Sitecore’s CloudSearchProviderIndex.  The CloudSearchProviderIndex is what the standard Platform-As-A-Service product from Sitecore will use in place of Lucene or Solr or Coveo to power content search for Sitecore.  This is the key building block for working with Azure Search through Sitecore.  While I was running performance benchmarks for search re-indexing with Sitecore, I noticed the Azure Search document count would drop to 0 and I’d see odd results from Sitecore requests that depended on the search index.  This was classic “search index is being worked on, don’t rely on querying it until the work is done” behaviour.  This was corrected several years ago through Sitecore’s addition of a SwitchOnRebuildLuceneIndex and equivalent for Solr . . . but there is no such equivalent for the CloudSearchProviderIndex used by Azure PaaS solutions.  Essentially: Sitecore is using a single copy of each search index for both query and re-indexing operations, limiting the availability of search during maintenance work.

One could argue this may not be such a big deal because one may not rebuild Azure Search indexes with any frequency.  I’m not sold on this argument, however, since the Sitecore projects I know will frequently perform re-indexing due to development changes to the schema, content synchronization demands, or just routine deployment standard practices.

Further complicating this issue, my benchmarking shows that Azure Search re-indexing through Sitecore leaves a lot to be desired.  It can be slow.  This could make for an extended period of search index unavailability due to the CloudSearchProviderIndex’s limitations.  I’ll share the full battery of testing I’ve done in a future post, but for now let me share the timings I’m observing regardless of the number of Azure Search partitions or replicas I’m working through (partitions should generally improve indexing performance; replicas should generally improve querying performance):

App Service Configuration                                      Time for 20,000 Sitecore Items to Re-Index with Azure Search
Azure PaaS Standard (S1) CM IIS (OOTB from the Marketplace)    66 minutes
Azure S2 CM IIS                                                35 minutes
Azure S3 CM IIS                                                25 minutes
Azure P2 CM IIS                                                35 minutes
Azure P3 CM IIS                                                24 minutes

For reference, with Lucene indexes this operation would take 5 minutes or less.  The scaling options for Azure Search, Partition count and Replica count, have a minimal impact on the re-indexing operation.
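To put the timings above on a common scale, this small Python snippet converts them to items per minute. The tier labels are shortened from the table. The Lucene figure uses the 5-minute upper bound, so it is a floor on Lucene throughput and not a measurement.

```python
ITEMS = 20_000  # items re-indexed in every run above

# Minutes per run, copied from the timings above.
MINUTES_BY_TIER = {"S1": 66, "S2": 35, "S3": 25, "P2": 35, "P3": 24}
LUCENE_MAX_MINUTES = 5  # "5 minutes or less"


def items_per_minute(items, minutes):
    """Average re-indexing throughput for one run."""
    return items / minutes


if __name__ == "__main__":
    for tier, minutes in MINUTES_BY_TIER.items():
        print(f"{tier}: about {items_per_minute(ITEMS, minutes):.0f} items/minute")
    floor = items_per_minute(ITEMS, LUCENE_MAX_MINUTES)
    print(f"Lucene: at least {floor:.0f} items/minute")
```

Even the fastest App Service tier here works out to roughly a fifth of the Lucene floor.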

I’ll go into details of this later, but it could be that . . .

  • 20,000 Sitecore items is too small a figure to benefit from scaling with Azure Search?  Many customers have 100,000 or more items, so perhaps I should evaluate a larger data set.
  • there are bottlenecks at the SQL tier?  App Insights here I come…
  • the fact Sitecore isn’t using Azure Search Indexers to ingest data and relies on the Sitecore crawling logic to handle data indexing is artificially slowing this process down

For the time being, Sitecore has responded that improving the availability of Azure Search indexes during rebuilds is an official “feature request” and assigned it reference number 146822.

In the meantime, if a project needs high availability for Azure Search indexes, one may need to roll up one’s sleeves and craft a custom SwitchOnCloudSearchProviderIndex.  It appears fairly straightforward based on reviewing how this is solved for Solr, just as one example.  A key caveat is in the Azure Search capacity planning documentation:

High availability for Azure Search pertains to queries and index updates that don’t involve rebuilding an index. If you add or delete a field, change a data type, or rename a field, you will need to rebuild the index. To rebuild the index, you must delete the index, re-create the index, and reload the data.

To maintain index availability during a rebuild, you must have a copy of the index with a different name on the same service, or a copy of the index with the same name on a different service, and then provide redirection or failover logic in your code.

It looks like providing for high availability would double the price of Azure Search indexes, so there is a cascade of complications related to this.
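The “copy of the index with a different name plus redirection logic” idea can be sketched in a few lines of Python. This is a toy in-memory model of the switch-on-rebuild pattern. It is not Sitecore’s SwitchOnRebuild implementation and it makes no Azure Search calls; the class name, index names and methods are all hypothetical.

```python
class SwitchOnRebuildIndex:
    """Toy model: queries hit the active copy while a rebuild fills the other."""

    def __init__(self, primary_name, secondary_name):
        # Two named copies of the same index; plain lists stand in for them.
        self._indexes = {primary_name: [], secondary_name: []}
        self._active = primary_name
        self._standby = secondary_name

    @property
    def active_name(self):
        return self._active

    def query(self):
        """Read from the active copy only, so rebuilds never empty the results."""
        return list(self._indexes[self._active])

    def rebuild(self, documents):
        """Refill the standby copy, then redirect queries to it."""
        standby_docs = self._indexes[self._standby]
        standby_docs.clear()  # the "drop to 0 documents" step hits standby only
        for doc in documents:
            standby_docs.append(doc)
        # Switch only after the standby copy is fully populated.
        self._active, self._standby = self._standby, self._active


if __name__ == "__main__":
    index = SwitchOnRebuildIndex("items-a", "items-b")
    index.rebuild(["home", "about"])
    print(index.active_name, index.query())
```

The cost is visible in the constructor: two copies of every index exist at all times, which is where the doubled price comes from.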

My investigations into Sitecore and Azure Search yielded this complication — it’s not insurmountable, and I actually find it fascinating how an on-premises product (classic Sitecore) will evolve into a cloud-first product.  This is just one piece of the evolutionary story.  I expect this will be addressed sooner rather than later in an official upgrade or patch from Sitecore, and until then it’s important to understand this nuance to the Sitecore PaaS landscape.

Digesting Sitecore Commerce 8.2.1

A whole new take on Sitecore Commerce is hot off the presses and I had an opportunity to dig into it briefly this week.  Taking from the release notes and the documentation, which is actually fairly extensive:

This is Sitecore’s new re-envisioned Commerce product.

“Release number 8.2.1 has been assigned to reflect the compatibility with release 8.2 of the Sitecore Experience Platform (Sitecore XP). However, Sitecore Commerce 8.2.1 is not an update to previous Commerce 8.2 releases, but is an entirely new Commerce product and release.”

I worked on a few Sitecore Commerce implementations a while back, but it had been over a year since I ran a proof-of-concept or even completed the installation.  My background with the permutations of “Commerce” on Windows goes back over 15 years, starting with the Microsoft Site Server product and the initial craze around XSLT rendering HTML output from content engines . . . I remember a horrendous e-commerce project designed with a Commerce Server beta and the “elegance” of XML was a complete productivity killer.  It’s a poor worker who blames their tools, right? 🙂   I digress…

Anyway, the last real work I did with Sitecore Commerce was in 2015 and I recall the installation/configuration process being arduous, with both Web and Desktop elements, lots of security hoops to jump through, COM everywhere, and even registry edits for good measure.

This new 8.2.1 Release installation process is certainly an improvement over what is now considered “Legacy” Sitecore Commerce . . . but standing up a baseline installation to kick the tires will still likely occupy a solid day of your time.  The documentation is good, but not 100% bulletproof because there are so many moving parts.  I know I ended up needing to install some new .NET elements for ASP.NET Core . . . and I needed to install an old .NET Framework SDK to get another piece of the puzzle to run on the IIS server. I took notes on what extra steps I needed to perform, but I was using a fairly old Rackspace server image, so they are not particularly applicable to everyone.  A few examples from my notes, however:

  1. Re-install the Default Web Site to IIS (our scripted Sitecore installation cleans out the Default Web Site in IIS, so I needed to add it back in to satisfy an assumption one of the various installers made)
  2. Configure IIS 6 Metabase Compatibility to satisfy a requirement for the Commerce Server installation

It’s these sorts of nuances that I recall from previous run-ins with the Commerce platform Sitecore inherited and now fully owns.  In some respects, not all that much has changed.

On the bright side, however, there are clean new SPEAK applications for working with Commerce data:

threecommerce

To get this far, however, you really have to earn it.  There are eight Sitecore “packages” that must be installed, for example, once you get the base Commerce Server + Sitecore + Commerce Core running . . . oh, and they need to be installed in a specific order that is NOT alphabetical, either:

packages

On the bright side, there is a lot more documentation than I’ve seen before on this set of products.  I worked with Sitecore Commerce at a time when there was essentially no real current information about the product, so maybe I’m satisfied too easily with what is now available . . . but I really found this an area Sitecore has improved upon.

Based on this documentation, I was able to pull out some of Sitecore’s diagrams of the product and compile this single visual of the Sitecore Commerce platform as I understand it for version 8.2.1:

8-2-1-annotation

The above is just consolidated from a variety of pictures and notes contained throughout the official documentation from Sitecore on the subject, but one of the ways I digest a system is by diagramming and scribbling notes as I go through a project.  Maybe others will find it useful, too.