Azure Search compared to Solr for Sitecore PaaS (Chapter 1: Ingestion)

I’ve been investigating Azure PaaS architectures for Sitecore lately, and I wanted to take a few minutes and summarize some recent findings around the standard Sitecore search providers of Solr and, new for Sitecore PaaS, Azure Search.

To provision Azure PaaS Sitecore environments, I used a variant of the ARM Template approach outlined in this blog.  For simplicity, I evaluated a basic “XP-0” which is the name for the Sitecore CM/CD server combined into a single App Service.  This is considered a basic setup for development or testing, but not real production . . . that’s OK for my purposes, however, as I’m interested in comparing the Sitecore search providers to get an idea for relative performance.

The Results

I’ll save the methodology and details for lower in this post, since I’m sure most readers don’t want to wait to get an idea of the results.  The Solr search provider performed faster, no matter which App Service or DB tier I evaluated in Azure PaaS:

ChartComparison

The chart shows the average time, in minutes, to perform the full re-index operation.  You may want to refer to my earlier post about the lack of HA with Sitecore’s use of Azure Search; rest assured Sitecore is addressing this in a product update soon, but for now it casts a more significant shadow over the 60+ minutes one could spend waiting for the search re-index to complete.

Methodology

In these PaaS trials, I set up the sample site LaunchSitecore.  I performed rebuilds of the sitecore_core_index through the Sitecore Control Panel as my benchmark; I like using this operation as a benchmark since the index has over 80,000 documents.  It doesn’t particularly exercise the querying aspects of Sitecore search, though, so I’ll save that dimension for another time.  I’ve got time set aside for JMeter testing that will shed light on this later…

To get the duration the system took to complete the re-index, I queried the PaaS Sitecore logs as described in this Sitecore KB article.  I worked from the log timestamps in results like the following, since I’ve found the Sitecore UI to be unreliable in reporting duration for index rebuilds.

queries

You can get at this data yourself in App Insights with a query such as this:

traces
| where timestamp > now(-3h)
| where message contains " Crawler [sitecore_core_index]" 
| project timestamp, message
| sort by timestamp desc

Remember, I’ve used the XP-0 PaaS ARM Templates which combine the CM and CD roles, so there’s no need for a “where cloud_RoleInstance == 'CloudRoleBlahBlah'” clause in the App Insights query.
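If you export the timestamp and message columns from that query, the rebuild duration is just the gap between the earliest and latest crawler entries.  Here’s a minimal Python sketch of that arithmetic.  The sample rows and their message text are made up for illustration (they are not real Sitecore log output), and rebuild_minutes is a hypothetical helper name of my own.

```python
from datetime import datetime


def rebuild_minutes(rows):
    """Return the minutes between the earliest and latest timestamps in rows.

    rows is a list of (iso_timestamp, message) tuples, for example the
    timestamp/message columns exported from the App Insights query.
    """
    stamps = [datetime.fromisoformat(ts) for ts, _ in rows]
    if len(stamps) < 2:
        raise ValueError("need at least two log rows to measure a duration")
    return (max(stamps) - min(stamps)).total_seconds() / 60.0


# Hypothetical sample rows; the message text is a placeholder, not real log output.
sample_rows = [
    ("2017-01-10T14:02:00", "Crawler [sitecore_core_index] (placeholder: rebuild started)"),
    ("2017-01-10T15:07:30", "Crawler [sitecore_core_index] (placeholder: rebuild finished)"),
]

print(f"{rebuild_minutes(sample_rows):.1f} minutes")
```

Taking the min and max means the row order doesn’t matter, which is handy since the query sorts newest-first.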

Methodology – Azure Search

For my Azure Search testing, I experimented with scaling options for Azure Search.  For speedier document ingestion, the guidance from Microsoft says:

“Partitions allow for scaling of document counts as well as faster data ingestion by spanning your index over multiple Azure Search Units”

The trials should perform more quickly with additional Azure Search Partitions, but I found changing this made zero difference.  My instincts tell me the fact Sitecore isn’t using Azure Search Indexers could be a reason scaling Azure Search doesn’t improve performance in my trials.  Sitecore is making REST calls to index documents with Azure Search, which is fine, but possibly not the best fit for high-volume operations.  I haven’t looked in the DLLs, but perhaps there are other async models one could use in the Azure Search provider when it comes to full re-indexes?  It could also be that the 80,000 documents in the sitecore_core_index are too few to take advantage of Azure Search’s scaling options.  This will be an area for additional research in the future.

Methodology – Solr

To host Solr for this trial, I used a basic Solr VM in the Rackspace cloud.  One benefit to working at Rackspace is easy access to these sorts of resources 🙂  I picked a 4 GB server running Solr 5.5.1.  I used one Solr core per Sitecore index (1:1 mapping); see my write-up on Solr core organization if you’re not following why this might be relevant.

For my testing with the Solr search provider, Sitecore running in Azure PaaS needed to connect outside Azure, so I selected a location near Azure US-East where my App Service was hosted.  I had some concerns about outbound data charges, since data leaving Azure will trigger egress bandwidth fees (see this schedule for pricing).  For the few weeks while I collected this data, the outbound data fee totaled less than $40, and that includes other people using the same Azure account for other experiments.  I estimate around 10% (just $4) is due to my experiments.  Suffice it to say using a Solr environment outside of Azure isn’t a big expense to worry about.  Just the same, running Solr in an Azure VM would certainly be the recommendation for any real Sitecore implementation following this pattern.  For these tests, I chose the Rackspace VM since I already had it handy.

I’d be remiss to not mention the excellent work Sitecore’s Ivan Sharamok has posted to help make Solr truly enterprise ready with Sitecore.  Basic Auth for Solr with Sitecore is important for the architecture I exercised; this post is another gem of Ivan’s worth including here, even if I didn’t make use of it in this specific set of evaluations.  Full disclosure: I worked with Ivan while I was at the Sitecore mothership, so I’m biased that his contributions are valuable, but just because I’m biased doesn’t mean I’m wrong.

Conclusions

I’ll include my chart once again:

ChartComparison

These findings lead me to more questions than answers, so I’m hesitant to make any sweeping generalizations here.  I’m safe declaring Sitecore’s search provider for Solr to be faster than the Azure Search alternative when it comes to full index re-builds; that’s clear by an order of magnitude in some cases.  Know that this is not a judgement about Solr versus Azure Search;  this is about the way Sitecore makes use of these two search technologies out of the box.  The Solr provider for Sitecore is battle-tested and has gone through many years of development; I think the Azure Search provider for Sitecore could be considered a beta at this point, so it’s important to not get ahead of ourselves.

A couple other conclusions could be:

  1. Whether using Solr or Azure Search, there is no improvement to search re-index performance when moving from the S3 to the P3 tier in Azure App Services.
  2. Changing from the S1 to S3 tiers, on the other hand, makes a big perf difference in terms of search re-indexing.
    • Honestly, the S1 tier is almost unusable as the single CPU core and 1.75 GB RAM are way too low for Sitecore; the S3 with 4 cores and 7 GB RAM is much more reasonable to work with.

Next Up

It’s time for me to consider the more fully scaled PaaS options with Sitecore, and I need to exercise the query side of the Sitecore search provider instead of just the indexing side.

Solr Configuration for Integration with Sitecore

I’ve got a few good Solr and Sitecore blogs around 75% finished, but I’ve been too busy lately to focus on finishing them.  In the meantime, I figure a picture can be worth 1,000 words sometimes so let me post this visual representation of Solr strategies for Sitecore integrations.  One Solr core per index is certainly the best practice for production Sitecore implementations, but now that Solr support has significantly matured at Sitecore, a single Solr core for all the Sitecore indexes is a viable, if limited, option:

draft

There used to be a bug (or two?) that made this single Solr core for every Sitecore index unstable, but that’s been corrected for some time now.

More to follow!

Sitecore Gets Serious About Search, by Leaving the Game

Sitecore is steadily reducing the use-cases for using Lucene as the search provider for the Sitecore product.  Sitecore published this document on “Solr or Lucene” a few days ago, going so far as to state this one criterion where you must use Solr:

  • You have two or more content delivery servers

Read that bullet point again.  If you’ve worked with Sitecore for a while, this should trigger an alarm.

Sound the alarm: most implementations have 2 or more CD servers!  I’d say 80% or more of Sitecore implementations are using more than a single CD server, in fact.  Carrying this logic forward, 80% of Sitecore implementations should not be running on Lucene!  This is a big departure for Sitecore as a company, who would historically dodge conversations about what is the right technology for a given situation.  I think the Sitecore philosophy was that advocating for one technical approach over another is risky and means a Sitecore endorsement could make them accountable if a technology approach fails for one reason or another.  For as long as I’ve known Sitecore, it’s a software company intent on selling licenses, not dictating how to use the product.  Risk aversion has kept them from getting really down in the weeds with customers.  This tide has been turning, however, with things like Helix and now this more aggressive messaging about the limitations of Lucene.  I think it’s great for Sitecore to be more vocal about best practices; it’s just taken years for them to come around to the idea.

As a bit of a search geek, I want to state for the record that this new Solr over Lucene guidance from Sitecore is not really an indictment of Lucene.  The Apache Lucene project, and its cousin the .Net port Lucene.net that Sitecore makes use of out-of-the-box, was groundbreaking in many ways.  As a technology, Lucene can handle enormous volumes of documents and performs really well.  Solr is making use of Lucene underneath it all, anyway!  This recent announcement from Sitecore is more acknowledgement that Sitecore’s event plumbing is no substitute for Solr’s CAP-theorem straddling acrobatics.  Sitecore is done trying to roll their own distributed search system.  I think this is Sitecore announcing that they’re tired of patching the EventQueue, troubleshooting search index update strategies for enterprise installations, and giving up on ensuring clients don’t hijack the Sitecore heartbeat thread and block search indexing with some custom boondoggle in the publishing pipeline.  They’re saying: we give up, just use Solr.

Amen.  I’m fine with that.

To be honest, I think this change of heart can also be explained by the predominant role Azure Search plays in the newest Sitecore PaaS offering.  Having an IP address for all the “search stuff” is awful nice whether you’re running in Azure or anywhere else.  It’s clear Sitecore decided they weren’t keen to recreate the search wheel a few years ago, and are steadily converging around these technologies with either huge corporate backing (Azure Search) or a vibrant open source community (Solr).

I should also note, here, that Coveo for Sitecore probably welcomes this opportunity for Sitecore teams to break away from the shackles of the local Lucene index.  I’m not convinced that long-term Coveo can out-run the likes of Azure Search or Solr, but I know today if your focus is quickly ramping up a search heavy site on Sitecore you should certainly weigh the pros/cons of Coveo.

With all this said, I want to take my next few posts and dig really deep into Solr for Sitecore and talk about performance monitoring, tuning, and lots of areas that get overlooked when it comes to search with Sitecore and Solr.   So I’ll have more to say on this topic in the next few days!

 

The Sitecore Pie: strategic slicing for better implementations

The Sitecore Pie

pie
At some point I want to catalog the full set of features a Sitecore Content Delivery and Content Management server can manage in a project.  My goal would be to identify all the elements that can be split apart into independent services.  This post is not a comprehensive list of those features, but serves to introduce the concept.

Think of Sitecore as a big blueberry pie that can be sliced into constituent parts.  Some Sitecore sites can really benefit from slicing the pie into small pieces and letting dedicated servers or services manage that specific aspect of the pie.  Too often, companies don’t strategize around how many different kinds of work their Sitecore solution is doing.

An example will help communicate my point: consider IIS and how it serves as the execution context for Sitecore.   Many implementations will run logic for the following through the same IIS server that is handling the Sitecore request for rendering a web page.  These are all slices of the Sitecore pie for a Sitecore Content Delivery server:

  1. URL redirection through Sitecore
  2. Securing HTTP traffic with SSL
  3. Image resizing for requests using low-bandwidth or alternative devices
  4. Serving static assets like CSS, JS, graphics, etc
  5. Search indexing and query processing (if one is using Lucene)

If you wanted to cast a broader net, you could include HTTP Session state for when InProc mode is chosen, Geo-IP look-ups for certain CD servers, and others to this list of pie slices.  Remember, I didn’t claim this was an exhaustive list.  The point is: IIS is enlisted in all this other work besides processing content into HTML output for Sitecore website visitors.

Given our specific pie slices above, one could employ the following alternatives to relieve IIS of the processing:

  1. URL Redirection at the load balancer level can be more performant than having Sitecore process redirects
  2. Apply SSL between the load balancer and the public internet, but not between the IIS nodes behind your load balancer; this is called “SSL Offloading” or “SSL Termination”
  3. There are services like Akamai that fold in dynamic image processing as part of their suite of products
  4. Serving static assets from a CDN is common practice for Sitecore
  5. Coveo for Sitecore is an alternative search provider that can take a lot of customer-facing search aspects and shift them to dedicated search servers or even Coveo’s Cloud.  One can go even further with Solr for Sitecore or still other search tiers if you’re really adventurous

My point is, just like how we hear a lot this election season about “let Candidate X be Candidate X” — we can let Sitecore be Sitecore and allow it to focus on rendering content created and edited by content authors and presenting it as HTML responses.  That’s what Sitecore is extremely valuable for.

Enter the Cache

I’ve engaged with a lot of Sitecore implementations who were examining their Sitecore pie and determining what slices belong where . . . and frequently we’d make the observation that the caching layer of Sitecore was tightly coupled with the rest of the Sitecore system and caching wasn’t a good candidate for slicing off.  There wasn’t a slick provider model for Sitecore caches, and while certain caches could be partially moved to other servers, it wasn’t clean, complete, or convenient.

That all changed officially with the initial release of Sitecore 8.2 last month.  Now there is a Sitecore.Caching.DefaultCacheManager class, a Sitecore.Caching.ICache interface, and other key extension points as part of the standard Sitecore API.  One can genuinely add the Sitecore cache to the list of pie slices one can consider for off-loading.

In my next post, I will explore using this new API to use Redis as the cache provider for Sitecore instead of the standard in-memory IIS cache.

Sitecore RemoteRebuild Strategy Best Practices

I spent entirely too long troubleshooting a customer’s approach to the RemoteRebuild indexing strategy for Sitecore.  The official documentation is fairly straight-forward, but there are some significant gaps left up to the reader to infer or figure out.

I think the heading “Best Practice” on that documentation page is great, and I hope Sitecore continues to expand those notes to be as thorough as possible.  That being said, I would include the following example patch configuration showing how to apply the change without manually editing Sitecore base install files:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="sitecore_web_index" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <strategies hint="list:AddStrategy">
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/remoteRebuild" />
            </strategies>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>

This patch should be applied on the Sitecore CD servers where you want to perform the Lucene re-indexing operations.  There are no configuration changes necessary to the Sitecore CM servers.

Speaking of the CM server, one might think the posted Sitecore prerequisites cover everything:

  • The name of the index on the remote server must be identical to the name of the index that you forced to rebuild.
  • You must enable the EventQueue.
  • The database you assign for system event queue storage (core by default) must be shared between the Sitecore instance where the rebuild takes place and the other instances.

But the biggest addition to this I found was that the Indexing Manager feature in the Control Panel does not call the proper API to trigger the RemoteRebuild activity, so this screen is not where one initiates the remote rebuild:

Won’t work:

remoterebuild

The only way to properly activate the RemoteRebuild is via the Developer ribbon in the Sitecore Content Editor.

This works:

remoterebuild2

See this quick video on how to enable this Developer area in the Content Editor, in case this is new to you.

Apparently this dependence on the Developer ribbon is a bug in Sitecore and scheduled to be corrected in a future release.  I know I spent several hours troubleshooting configuration changes and attempted many permutations of the configuration for the RemoteRebuild strategy before Sitecore support confirmed this fact (as did the ever-helpful Ivan Sharamok).

The only other detail I’ll share on the Sitecore RemoteRebuild strategy is that one should verify whether the Lucene index you are adding RemoteRebuild to uses Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex or just Sitecore.ContentSearch.LuceneProvider.LuceneIndex.  The patch .config file one uses should reference the appropriate type attribute.

A final note I’ll add about rationale for the RemoteRebuild . . . this customer has several Sitecore Content Delivery servers spread in different data centers, and is currently reliant on the Lucene search provider for Sitecore.  One could make a strong case the customer should be using Solr instead, and indeed this is on their implementation road map, but in the meantime we suggested the RemoteRebuild as a way for them to centrally manage the state of indexes on all their disparate CD servers.  The alternative would be depending on the onPublishEndAsync strategy (which works well, but has limited applications in certain cases), or doing some administrative connection to each CD server (via browser or PowerShell etc) and doing something along the lines of the following when they need to rebuild specific indexes:

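// Requires: using System; using Sitecore.ContentSearch;
public void Page_Load(object sender, EventArgs e)
{
    // Allow up to one hour, in case the rebuild takes a while
    Server.ScriptTimeout = 3600;
    // Rebuild the named index on this server only
    ContentSearchManager.GetIndex("your_index_name").Rebuild();
}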

This quickly becomes unwieldy and hard to maintain, however, so getting RemoteRebuild wired up in this case is going to be a valuable addition to the implementation.
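For a sense of what that per-server routine looks like once it’s scripted, here’s a minimal Python sketch that requests a rebuild page like the one above on each CD server in turn.  The host names and the /admin/rebuild.aspx path are hypothetical placeholders I made up for illustration, and this assumes the page is reachable (and appropriately secured) from wherever the script runs.

```python
from urllib.request import urlopen

# Hypothetical CD host names and page path; substitute your own.
CD_SERVERS = ["cd1.example.com", "cd2.example.com", "cd3.example.com"]
REBUILD_PAGE = "/admin/rebuild.aspx"


def rebuild_urls(servers, page=REBUILD_PAGE):
    """Build the URL of the rebuild page on each CD server."""
    return [f"https://{host}{page}" for host in servers]


def trigger_rebuilds(servers, timeout_seconds=3600):
    """Request the rebuild page on each server, one at a time.

    Returns a dict of url -> HTTP status code, or the error text if the
    request failed.
    """
    results = {}
    for url in rebuild_urls(servers):
        try:
            with urlopen(url, timeout=timeout_seconds) as response:
                results[url] = response.status
        except OSError as err:  # URLError and socket timeouts are OSError subclasses
            results[url] = str(err)
    return results


# Building the URLs is harmless; calling trigger_rebuilds(CD_SERVERS)
# would actually hit every server and wait on each rebuild in turn.
print(rebuild_urls(CD_SERVERS))
```

Every new CD server means another entry in that list, plus another place for a timeout or a firewall rule to go wrong, which is exactly the maintenance burden RemoteRebuild avoids.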

Search User Group Talk

I did a talk for the New England Search Technology (NEST) user group yesterday.  Even though the meetings in Boston are a good 90+ minutes away for me, I try to make the trip there a couple times a year since the topics are usually very relevant to what I’m up to with Sitecore.   I offered to do a talk bridging the Sitecore and Search domains, and they took me up on it.  The audience is typically serious Solr and Elasticsearch technologists, some regular committers to those projects, so it was fun to combine those domains with Sitecore’s relative immaturity when it comes to the platform of search.

I don’t want to just post the PowerPoint presentations and say “have at it” since the presentations require context to make sense (and there are 2 different PowerPoints to sift through).  I’m a non-conventional guy, and I try to avoid PowerPoint with bullet points galore and hyper-structured material.  The talk was more a conversation about search and the challenges unique to search with Sitecore (and other CMS platforms that build on Lucene).

My premise relied heavily on Plato’s Allegory of the Cave where I was a “prisoner” experiencing the world of search through the “cave” of Sitecore.  In reality, search is an enormous space with lots of complexity and innovation across the technology . . . but in terms of Sitecore, we experience a filtered reality (the shadows on the cave wall).  This graphic represents a traditional Allegory of the Cave concept:

cave

I’m not going to summarize the entire talk, but I want to state for the record this isn’t a particular criticism of Sitecore — it’s just the nature of working with a product that is built on top of other products.  In Sitecore’s case with Lucene, for example, there’s a .Net port of Lucene (Lucene.Net) and a custom .Net event pipeline in Sitecore used to orchestrate Lucene.Net activities via an execution context of IIS, and so on.

Understanding Search technology with Sitecore is always going to be filtered through the lens of Sitecore.  My talk was addressing that fact, and how Search (with a capital “S”) is a far broader technology than what is put to use in Sitecore.  Understanding the world of Search beyond the confines of Sitecore can be very revealing, and opens up a lot of opportunity for improving a Sitecore implementation.

I should also note that my talk benefited from insightful input from Sitecore-Coveo Jeff, Tim “Sitecore MVP” Braga, Al Cole from Lucidworks, and others.  The host, Monster.com, shared a great meeting space for us and opened the evening with a quick survey of their specific experience with search using Solr, Elasticsearch, and their own proprietary technology.

While not specific to Sitecore, one link I wanted to share in particular was the talk about Bloomberg’s 3-year journey with Solr/Lucene; it’s a talk from the Berlin Buzzwords conference a couple weeks ago and thoroughly worth watching.  Getting search right can take persistence and smart analysis, regardless of the platform.  With Sitecore, too often implementations assume Search will just work out of the box and not appreciate that it’s a critical set of features worthy of careful consideration.

I’ll have a few follow-up posts covering more of the points I made in my talk; some lend themselves to distinct blog posts instead of turning this into a sprawling re-hash of the entire evening.

Adding Custom Fields to Sitecore Search Indexes in Solr vs Lucene

We’re doing a lot with Solr lately, and I wanted to make a note of the difference in how one defines custom fields to include in your Sitecore indexes. I’m not including how one defines a full custom index, but just the field definition; if one has a “Description” and “Conversation” field to include in a search index for Sitecore, this summarizes the basics of how to make that happen.

Lucene

One adds <field> nodes to the fieldNames and fieldMap sections of the XML configuration.  Note the caveat: this example alters the defaultLuceneIndexConfiguration for Sitecore, which isn’t a best practice.  It’s generally better to define your own index that is laser-focused on just the content you want to use for your implementation, but I wanted as succinct an example as possible.

 <sitecore>  
   <contentSearch>  
    <indexConfigurations>  
      <defaultLuceneIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">  
       <fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">  
         <fieldNames hint="raw:AddFieldByFieldName">  
          <field fieldName="Description" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">  
            <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />  
          </field>  
          <field fieldName="Conversation" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">  
            <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />  
          </field>  
         </fieldNames>  
       </fieldMap>  
      </defaultLuceneIndexConfiguration>  
    </indexConfigurations>  
   </contentSearch>  
 </sitecore>  

Solr

One adds <fieldReader> nodes to the mapFieldByTypeName and FieldReaders sections of the XML configuration.  Same caveat applies here for Solr as with Lucene — it’s generally recommended to define your own Solr index and work from that, but I wanted a minimal example:

 <sitecore>  
  <contentSearch>    
   <indexConfigurations>  
    <defaultSolrIndexConfiguration type="Sitecore.ContentSearch.SolrProvider.SolrIndexConfiguration, Sitecore.ContentSearch.SolrProvider">  
     <FieldReaders type="Sitecore.ContentSearch.FieldReaders.FieldReaderMap, Sitecore.ContentSearch">  
      <mapFieldByTypeName hint="raw:AddFieldReaderByFieldTypeName">  
       <fieldReader fieldTypeName="Description"  fieldReaderType="Sitecore.ContentSearch.FieldReaders.RichTextFieldReader, Sitecore.ContentSearch" />  
       <fieldReader fieldTypeName="Conversation"  fieldReaderType="Sitecore.ContentSearch.FieldReaders.RichTextFieldReader, Sitecore.ContentSearch" />  
      </mapFieldByTypeName>  
     </FieldReaders>  
    </defaultSolrIndexConfiguration>  
   </indexConfigurations>  
  </contentSearch>  
 </sitecore>  

 

This is really just the tip of the iceberg, as fieldReaderType definitions for Solr and Lucene analyzers (just two examples) open up so many possibilities.  I struggled to find this side-by-side info for Lucene and Solr as it applies to Sitecore (Sitecore 8.1 update-1, specifically), so I wanted to share it here.