Sitecore is steadily reducing the use cases for Lucene as the search provider for the Sitecore product. Sitecore published this document on “Solr or Lucene” a few days ago, going so far as to state this one criterion under which you must use Solr:
- You have two or more content delivery servers
Read that bullet point again. If you’ve worked with Sitecore for a while, this should trigger an alarm.
Sound the alarm: most implementations have 2 or more CD servers! I’d say 80% or more of Sitecore implementations are using more than a single CD server, in fact. Carrying this logic forward, 80% of Sitecore implementations should not be running on Lucene! This is a big departure for Sitecore as a company, which has historically dodged conversations about what the right technology is for a given situation. I think the Sitecore philosophy was that advocating for one technical approach over another is risky, because a Sitecore endorsement could make them accountable if that approach fails for one reason or another. For as long as I’ve known Sitecore, it has been a software company intent on selling licenses, not dictating how to use the product. Risk aversion has kept them from getting really down in the weeds with customers. This tide has been turning, however, with things like Helix and now this more aggressive messaging about the limitations of Lucene. I think it’s great for Sitecore to be more vocal about best practices; it has just taken years for them to come around to the idea.
As a bit of a search geek, I want to state for the record that this new Solr-over-Lucene guidance from Sitecore is not really an indictment of Lucene. The Apache Lucene project, and its cousin the .NET port Lucene.NET that Sitecore makes use of out of the box, were groundbreaking in many ways. As a technology, Lucene can handle enormous volumes of documents and performs really well. Solr is making use of Lucene underneath it all, anyway! This recent announcement from Sitecore is more an acknowledgement that Sitecore’s event plumbing is no substitute for Solr’s CAP-theorem-straddling acrobatics. Sitecore is done trying to roll their own distributed search system. I think this is Sitecore announcing that they’re tired of patching the EventQueue, tired of troubleshooting search index update strategies for enterprise installations, and giving up on ensuring clients don’t hijack the Sitecore heartbeat thread and block search indexing with some custom boondoggle in the publishing pipeline. They’re saying: we give up, just use Solr.
Amen. I’m fine with that.
To be honest, I think this change of heart can also be explained by the predominant role Azure Search plays in the newest Sitecore PaaS offering. Having a single IP address for all the “search stuff” is awfully nice whether you’re running in Azure or anywhere else. It’s clear Sitecore decided a few years ago that they weren’t keen to reinvent the search wheel, and they are steadily converging on technologies with either huge corporate backing (Azure Search) or a vibrant open source community (Solr).
I should also note, here, that Coveo for Sitecore probably welcomes this opportunity for Sitecore teams to break away from the shackles of the local Lucene index. I’m not convinced that Coveo can outrun the likes of Azure Search or Solr over the long term, but I know that today, if your focus is quickly ramping up a search-heavy site on Sitecore, you should certainly weigh the pros and cons of Coveo.
With all this said, I want to use my next few posts to dig really deep into Solr for Sitecore and talk about performance monitoring, tuning, and the many areas that get overlooked when it comes to search with Sitecore and Solr. So I’ll have more to say on this topic in the next few days!