Updating Sitecore 9 to connect to Solr over HTTP instead of HTTPS

It’s swimming against the tide to use HTTP instead of HTTPS with Sitecore and Solr, but there were a series of circumstances that had us reviewing the SSL default and evaluating other options. I had to go to Sitecore support for clarification on setting up Solr over HTTP instead of HTTPS (now that HTTPS is the default in the Sitecore 9 space). There wasn’t succinct documentation from Sitecore, so let me share the findings here in case it helps others . . .

For the standard Sitecore CM and CD roles, one can update the  ContentSearch.Solr.ServiceBaseAddress URL — just use http instead of https.

For the Sitecore xConnect role, one needs to change the following configuration:

1) For the Collection instance, update the App_data\config\sitecore\CollectionSearch\sc.Xdb.Collection.IndexReader.SOLR.xml file:

<Options>
<ConnectionStringName>solrCore</ConnectionStringName>
<RequireHttps>false</RequireHttps>
</Options>

2) For the xConnect Indexer instance, update the App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\CollectionSearch\sc.Xdb.Collection.IndexReader.SOLR.xml file:

<Options>
<ConnectionStringName>solrCore</ConnectionStringName>
<RequireHttps>false</RequireHttps>
</Options>

3) Also for the xConnect Indexer instance, update the App_data\jobs\continuous\IndexWorker\App_data\Config\Sitecore\SearchIndexer\sc.Xdb.Collection.IndexWriter.SOLR.xml file:

<Options>
<ConnectionStringName>solrCore</ConnectionStringName>
<RequireHttps>false</RequireHttps>
<Encoding>utf-8</Encoding>
</Options>

<Options>
<ConnectionStringName>solrCore</ConnectionStringName>
<RequireHttps>false</RequireHttps>
<MaximumUpdateBatchSize>1000</MaximumUpdateBatchSize>
<MaximumDeleteBatchSize>1000</MaximumDeleteBatchSize>
<MaximumCommitMilliseconds>1000</MaximumCommitMilliseconds>
<ParallelizationDegree>4</ParallelizationDegree>
<MaximumRetryDelayMilliseconds>5000</MaximumRetryDelayMilliseconds>
<RetryCount>5</RetryCount>
<Encoding>utf-8</Encoding>
</Options>

Curious case of the LocationsDictionaryCache

We have some cache monitoring in place for some enterprise Sitecore customers and that system has found one particular cache being evicted roughly 40 times per hour for one particular customer. It’s curious, as this cache isn’t covered in the standard documentation for Sitecore’s caches. Sitecore support had to point me in the right direction on this . . . and it’s an interesting case.

The LocationsDictionaryCache cache is defined in Sitecore.Analytics.Data.Dictionaries.LocationsDictionary and there’s a hard coded 600 second expiration in that object definition:

Expires

There’s also a maxCacheSize set of 0xf4240 hex (1,000,000 or 976 KB). You can’t alter these settings through configuration, you’d have to compile a new DLL to alter these values.

It’s not clear to me that the quick eviction/turn-over of this cache is a perf issue to worry about . . . I think, at this point, it’s working as expected and indicative of a busy site with lots of Sitecore Analytics and Tracking behaviour. Reviewing the decompiled Sitecore code that uses this class (like in Sitecore.Analytics.Tracking.CurrentVisitContext or Sitecore.Analytics.Pipelines.CommitSession.ProcessSubscriptions), it appears that this cache serves as a short-term lookup along the same lines as devices, user-agents, etc. Why this particular cache is active in ways UserAgentDictionaryCache and others is not, besides the obvious 600 second life span, is something we need to dig further into — but I don’t know that it’s a perf bottleneck for our given scenario.

 

Which Solr Node Is Responding to My Sitecore Query?

In working with Sitecore + Solr, eventually you may need to determine which Solr server responded to a specific query in order to validate Solr replication or to compare Solr responses across machines. If you can’t imagine a reason why you’d want to do this, you’re probably fortunate and should maybe go buy a lottery ticket instead of reading this blog post 🙂

Brute Force Method with Solr Master/Slave

If you’re working with Solr master/slave behind a load-balancer, or with multiple slaves behind a load-balancer, I haven’t found a reliable way of determining which Solr server responded to a particular query besides the brute force method of comparing Sitecore and Solr logs. Specifically, from the Sitecore logs in Data\logs\Search.Logs.[Timestamp].txt you should see something like the following for each query:

Sitecore’s Query Log

15:11:48 INFO Serialized Query – ?q=(_template:(f613d8a8d9324b5f84516424f49c9102) AND (-_name:(“__Standard Values”) AND _language:(en-US)))&start=0&rows=1&fl=*,score&fq=_indexname:(sitecore_web_index)&sort=searchdate_tdt desc

Solr’s Query Log

You can cross-reference this Sitecore log data with Solr logs (like S:\solr-4.10.4\sitecore\logs\solr.log):

INFO – 2018-08-01 15:11:48.514; org.apache.solr.core.SolrCore; [sitecore_web_index] webapp=/solr path=/select params={q=(_template:(f613d8a8d9324b5f84516424f49c9102)+AND+(-_name:(“__Standard+Values”)+AND+_language:(en-US)))&fl=*,score&start=0&fq=_indexname:(sitecore_web_index)&sort=searchdate_tdt+desc&rows=1&version=2.2} hits=2958 status=0 QTime=16

It’s tedious to match up the exact query and time in these logs, but the Solr node with the matching record will reveal which one serviced the request.

Now, one can craft some PowerShell to scrounge the Sitecore and Solr logs and determine where we have matches — crazy as it may sound — but I’m not interested in sharing all that here. It’s academic to read the two logs and look for matches, anyway, so I’ll move on to the Solr Cloud scenario that is more interesting and forward-looking since Sitecore is steadily progressing towards full Solr Cloud support across the board.

Solr Cloud Debug Query Command

Solr Cloud supports a debug command where you append debug=true to the URL and Solr includes diagnostic output in the results. For example, a RESTful query to Solr like http://10.215.118.28:8983/solr/sitecore_web_index_shard1_replica2/select?q=_name%3ANEWS&wt=xml&indent=true&amp;debug=true. Using the XML formatting, debug=true adds something like this to the response from Solr:

Capture

There can be interesting tidbits in each of those debug sections, but I’m going to focus on the track node that shares information about the different phases of the distributed request Solr is making. Under the “EXECUTE_QUERY” item is a “name” attribute that will specify which Solr nodes, shards, and replica were involved in responding to the query, for example:

<lst name=”http://10.215.140.12:8983/solr/sitecore_web_index_shard2_replica1/|http://10.215.140.13:8983/solr/sitecore_web_index_shard2_replica2/”>

I’ve also found the “shard.url” value of the Response (nested under the EXECUTE_QUERY data) to share the same information. It’s possible that’s more reliable across Solr versions etc, but something to keep an eye on. Here’s a fragment of the XML response for the debug information:

Capture

A careful reader might point out that the “rid”  value includes the IP address of the server responding to the request, but this is designed to be the “request ID” that traces the query through Solr’s various moving parts — I wouldn’t rely on the “rid” to tell you the source for the response, though, as it could be changing across versions.

Here’s a quick run through of the other diagnostic data in that EXECUTE_QUERY data that I know about:

  1. QTime: the number of milliseconds it took Solr to execute a search with no regard for time spent sending a response across the network etc
  2. ElapsedTime: the number of milliseconds from when the query was sent
    to Solr until the time the client gets a response. This includes QTime, assembling the response, transmission time, etc.
  3. NumFound: the count of results

There is a ton to all this and we’re only scratching the surface, but as Sitecore gets more serious about scalable search with Solr, we’re all going to be learning a lot more about this in the months and years to come!