Azure Search compared to Solr for Sitecore PaaS (Chapter 2: Querying)

I carried forward my Azure PaaS benchmarking work from earlier this month (see this post on the indexing side of the equation for the start of the story).

For a quick refresher, I’ve used an ARM template based deployment of Sitecore to get a system resembling the following:

ARM Templates Arch

The element I’m exercising in the benchmarks is how Sitecore’s web servers work with the “Search” icon in the diagram above.  I tackled the document ingestion side (how data gets into the search indexes) in my earlier post.  This post addresses the querying side of things (how data gets out of the search indexes).

By default, Azure PaaS search with Sitecore is configured to use Azure Search.  Solr is another viable option.

Here’s where I’ll interject that Coveo also has an excellent search technology for Sitecore.  There are specific use-cases where Coveo is a strong fit, however, and in my indexing the sitecore_core_index evaluations in the earlier post Coveo would not be considered a good fit.  This changes, however, for the set of benchmarks I’ve run in this post.  I am in the process of testing the Coveo approach in Azure PaaS for Sitecore . . . it’s hot off the presses, so there are still rough edges to work around . . . but Coveo is not part of this write-up for the time being.  I will post an update here once I’ve completed the analysis involving Coveo.

In considering Azure Search vs Solr, I used a methodology with JMeter laid out in a great KB article from Sitecore at  I have a LaunchSitecore site running and I use JMeter to automate visits to the site, simulating simple user behaviour.  I don’t go too crazy with this, because I’m more interested in exercising a basic Sitecore work load than doing a deep-dive in xDB traffic simulation.

My first post showed a clear advantage to Solr for the indexing side of search, but for the querying side I can say there is very little variance between Azure Search and Solr.  Sitecore does a good job of protecting data repositories with layers of data and html caches, but even with those those features disabled (we’re talking cacheHtml=”false”on the site definition, <cacheSizes> configuration all set to a heretical zero (“0”), etc) there isn’t a significant difference between the two technologies.

I’m not going to put up a graph of it, because the throughput as measured by JMeter for tests of 20, 50, 100, 200, or  more visitors performed almost the same.

I could develop a more search heavy set of benchmarks, performing a random dictionary of searches against a large custom index that Sitecore responds to but must bypass all caches etc, but that feels like overkill for what I’m looking to achieve.  Maybe that’s appropriate once I bring Coveo into the benchmarking fun.

For this, I wanted to get a sense for the relative performance between Azure Search and Solr as it relates to Sitecore PaaS and I think I’ve done that.  Succinctly:

  1. Solr is considerably faster at search indexing (courtesy of the search provider implementation in Sitecore)
  2. both Azure Search and Solr perform about the same when it comes to querying a basic Sitecore site like LaunchSitecore (again, courtesy of the search provider implementation in Sitecore)

This isn’t the definitive take on the topic.  It’s more like the beginning.  Azure Search is native to Azure, so there are significant advantages there.  There is a lot of momentum around Azure and Sitecore in general, so that story will continue to evolve.

There are Solr as a service options out there that make Solr for Sitecore much easier (such as which I’ll blog about in the next few days), but Solr can be a lot for corporate IT departments to take on, so it isn’t a simple choice for everyone.



Azure Search compared to Solr for Sitecore PaaS (Chapter 1: Ingestion)

I’ve been investigating Azure PaaS architectures for Sitecore lately, and I wanted to take a few minutes and summarize some recent findings around the standard Sitecore search providers of Solr and, new for Sitecore PaaS, Azure Search.

To provision Azure PaaS Sitecore environments, I used a variant of the ARM Template approach outlined in this blog.  For simplicity, I evaluated a basic “XP-0” which is the name for the Sitecore CM/CD server combined into a single App Service.  This is considered a basic setup for development or testing, but not real production . . . that’s OK for my purposes, however, as I’m interested in comparing the Sitecore search providers to get an idea for relative performance.

The Results

I’ll save the methodology and details for lower in this post, since I’m sure most don’t want to wait for an idea for the results.  The Solr search provider performed faster, no matter the App Service or DB Tier I evaluated in Azure PaaS:


The chart shows averages to perform the full re-index operation in minutes.  You may want to refer to my earlier post about the lack of HA with Sitecore’s use of Azure Search; rest assured Sitecore is addressing this in a product update soon, but for now it casts a more significant shadow over the 60+ minutes one could spend waiting for the search re-index to complete.


In these PaaS trials, I setup the sample site LaunchSitecore.  I performed rebuilds of the sitecore_core_index through the Sitecore Control Panel as my benchmark; I like using this operation as a benchmark since it has over 80,000 documents.  It doesn’t particularly exercise the querying aspects of Sitecore search, though, so I’ll save that dimension for another time.  I’ve got time set aside for JMeter testing that will shed light on this later…

To get the duration the system took to complete the re-index, I queried the PaaS Sitecore logs as described in this Sitecore KB article.  Using results like the following, I took the timestamps since I’ve found the Sitecore UI to be unreliable in reporting duration for index rebuilds.


You can get at this data yourself in App Insights with a query such as this:

| where timestamp > now(-3h)
| where message contains " Crawler [sitecore_core_index]" 
| project timestamp, message
| sort by timestamp desc

Remember, I’ve used the XP0 PaaS ARM Templates which combine CM and CD roles together, so there’s no need for the “where cloud_RoleInstance == ‘CloudRoleBlahBlah'” in the App Insights query.

Methodology – Azure Search

For my Azure Search testing, I experimented with scaling options for Azure Search.  For speedier document ingestion, the guidance from Microsoft says:

“Partitions allow for scaling of document counts as well as faster data ingestion by spanning your index over multiple Azure Search Units”

The trials should perform more quickly with additional Azure Search Partitions, but I found changing this made zero difference.  My instincts tell me the fact Sitecore isn’t using Azure Search Indexers could be a reason scaling Azure Search doesn’t improve performance in my trials.  Sitecore is making REST calls to index documents with Azure Search, which is fine, but possibly not the best fit for high-volume operations.  I haven’t looked in the DLLs, but perhaps there’s other async models one could use in the the Azure Search provider when it comes to full re-indexes?  It could also be that the 80,000 documents in the sitecore_core_index is too small a number to take advantage of Azure Search’s scaling options.  This will be an area for additional research in the future.

Methodology – Solr

To host Solr for this trial, I used a basic Solr VM in the Rackspace cloud.  One benefit to working at Rackspace is easy access to these sorts of resources 🙂  I picked a 4 GB server running Solr 5.5.1.  I used a one Solr core per Sitecore index (1:1 mapping), see my write-up on Solr core organization if you’re not following why this might be relevant.

For my testing with the Solr search provider, Sitecore running  Azure PaaS needed to connect outside Azure, so I selected a location near to Azure US-East where my App Service was hosted.  I had some concerns about outbound data charges, since data leaving Azure will trigger egress bandwidth fees (see this schedule for pricing).  For the few weeks while I collected this data, the outbound data fee totaled less than $40 — and that includes other people using the same Azure account for other experiments.  I estimate around 10% (just $4) is due to my experiments.  Suffice it to say using a Solr environment outside of Azure isn’t a big expense to worry about.  Just the same, running Solr in an Azure VM would certainly be the recommendation for any real Sitecore implementation following this pattern.  For these tests, I chose the Rackspace VM since I already had it handy.

I’d be remiss to not mention the excellent work Sitecore’s Ivan Sharamok has posted to help make Solr truly enterprise ready with Sitecore.  Basic Auth for Solr with Sitecore is important for the architecture I exercised; this post is another gem of Ivan’s worth including here, even if I didn’t make use of it in this specific set of evaluations.  Full disclosure: I worked with Ivan while I was at the Sitecore mothership, so I’m biased that his contributions are valuable, but just because I’m biased doesn’t mean I’m wrong.


I’ll include my chart once again:


These findings lead me to more questions than answers, so I’m hesitant to make any sweeping generalizations here.  I’m safe declaring Sitecore’s search provider for Solr to be faster than the Azure Search alternative when it comes to full index re-builds, that’s clear by an order of magnitude in some cases.  Know that this is not a judgement about Solr versus Azure Search;  this is about the way Sitecore makes use of these two search technologies out of the box.  The Solr provider for Sitecore is battle-tested and has gone through many years of development; I think the Azure Search provider for Sitecore could be considered a beta at this point, so it’s important to not get ahead of ourselves.

A couple other conclusions could be:

  1. Whether using Solr or Azure Search, there is no improvement to search re-index performance when changing between the S3 to P3 tiers in Azure App Services.
  2. Changing from the S1 to S3 tiers, on the other hand, makes a big perf difference in terms of search re-indexing.
    • Honestly, the S1 tier is almost unusable as the single CPU core and 1.75 GB RAM are way too low for Sitecore; the S3 with 4 cores and 7 GB RAM is much more reasonable to work with.

Next Up

It’s time for me to consider the more fully scaled PaaS options with Sitecore, and I need to exercise the query side of the Sitecore search provider instead of just the indexing side.

Auto-suggest with Solr Facets in Sitecore

Sitecore’s auto-suggest feature for search in the Content Authoring environment is pretty slick, but there is some confusing documentation from Sitecore about how to set it up properly with Solr.  As of today, Sitecore’s documentation on integrating with Solr indicates…

“When you implement Solr with Sitecore you need to enable term support in the Solr search handler.  The term functionality is built into Solr but is disabled by default. To power the dropdowns in the UI you must enable the terms component.

That above documentation will be updated at some point by Sitecore, since it’s no longer the case for the latest version of Sitecore — 8.2 rev. 161221 (Update-2).

In earlier versions of Sitecore, search in the Sitecore Content Editor could make use of the Solr “terms” component to populate suggestions.  This is why this guidance has previously been part of the Solr integration documentation from Sitecore.  Read more about Solr’s use of this auto-suggest through terms at

Sitecore’s strategy of making use of the “terms” component has changed with recently, however.  Sitecore now uses faceting with Solr instead of terms.

To prove this out, I’m going to turn to the Solr logs after I try some queries for content in the Sitecore client.  Refer to this documentation from Sitecore if you’re looking for more context on how to use the search facility — there are a lot of features that are very under-utilized, in my experience.  I’ll specify a clause by typing Updatedby: and then “siteco” to engage the auto-suggest feature:searbhby

Very nice, right?

Under the covers, the Solr logs will reveal something like this . . .

2017-02-17 19:33:07.546 INFO  (qtp33171127-11) [   x:trial_core] o.a.s.c.S.Request [trial_core]  webapp=/solr path=/select params={q=*:*&facet.field=parsedupdatedby_s&facet.prefix=siteco&rows=0&facet=true&version=2.2&facet.sort=true} hits=24626 status=0 QTime=2

. .  . and that can be further debugged by turning it into the URL request powering that auto-suggest response . . .


. . . and that would return results like the following:


If instead we tried an author: search in Sitecore, for example, the facet.field would be parsedcreatedby_s instead of parsedupdatedby_s.

I don’t want to go too far down this rabbit hole.  I really just wanted to share that despite what the documentation shows, it’s not necessary to enable the Solr term component on the /select requestHandler in Solr if you’re using the most recent version of Sitecore.  I’ve confirmed with official Sitecore support that this change was tagged as change #444661 and that’s it was incorporated into the product since Sitecore 8.1 update-1 (rev. 151207); the release notes for 8.1 update-1 are vague, but here it is:

Autocomplete for known fields such as language did not work in the Content Editor Search tab using the SOLR provider. The problem was related to the SOLR server configuration. This has been fixed so that Sitecore no longer depends on this configuration. (444661)

Happy faceting to all!


Strategies for Sitecore Index Organization into Solr Cores

A few days ago, I shared a graphic I put together to illustrate how Solr can be used to organize Sitecore “indexes” into Solr “cores” — this post has the complete graphic.  I want to elaborate on how one sets Sitecore up to use these two approaches, and dig further into the details.

1:1 Sitecore Index to Solr Core Strategy

To start, here’s a visual showing the typical way Sitecore “indexes” are structured in Solr using a one-to-one (1:1) mapping:


This shows each of the default search indexes defined by Sitecore organized into their own cores defined in Solr.  It’s a 1:1 mapping.  This 1:1 strategy means each index has their own configuration (“conf”) directory in Solr, so seperate stopwords.txt, solrconfig.xml, schema.xml, and so on; it also means each index has their own (“data”) directory in Solr, so separate tlog folders, separate Segment files, etc.

This is the setup one achieves by following the community documentation on setting up Sitecore with Solr; specifically, this quote from that write-up is where you’re doing a lot of the grunt work around setting up distinct Solr cores for each Sitecore index:

“Use the process detailed in Steps 4-7 to create new cores for all the remaining indexes you would like to move to SOLR.”

Since this is the common strategy, I’m not going to go into more details as it’s straight-forward to Sitecore teams.

Kitchen Sink (∞:1 Sitecore Index to Solr Core) Strategy

Here is the comparable graphic showing the ∞:1 strategy of structuring Sitecore indexes in Solr; I like to think of this as the Kitchen Sink container for all Sitecore indexes, since everything goes into that single core just like the kitchen sink:


With this approach, a single data and configuration definition is shared by all the Sitecore indexes that reside in Solr.  The advantages are reduced management (setting up the Solr replicationHandler, for example, requires updating 15 solrconfig.xml files in the 1:1 approach, but the Kitchen Sink would require only one solrconfig.xml file to update).  There are significant drawbacks to consider with the Kitchen Sink, however, as you’re sacrificing scaling options specific to each Sitecore index and enforcing a common schema.xml for every index stored in this single core.  There are plenty of reasons not to do this for a production installation of Sitecore, but for a crowded Sitecore environment used for acceptance testing or other use-cases where bullet-proof stability and lots of flexibility when it comes to performance tuning, sharding, etc is not necessary, you could make a good case for the Kitchen Sink strategy.

The only change necessary to a standard Sitecore configuration to support this Kitchen Sink approach is to patch the contentSearch definitions for the Sitecore indexes where the name of the Solr “core” is specified (stored by default in config files like Sitecore.ContentSearch.Solr.Index.Master.config,  Sitecore.ContentSearch.Solr.Index.Web.config, etc).   This is telling Sitecore which Solr core contains the index, but the actual name of the core doesn’t factor into the ContentSearch API code one uses with Sitecore.   A patch such as the following would handle both the sitecore_master_index and the sitecore_web_index to organize into a Solr Core named “kitchen_sink:”

<configuration xmlns:patch="">
          <index id="sitecore_master_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
            <param desc="core">kitchen_sink</param>
          <index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
            <param desc="core">kitchen_sink</param>

If you peek into the Solr Admin for the kitchen_sink core that I’m using, specifically the Schema Browser in the Solr Admin UI, it becomes clear how Sitecore uses a field named “_indexname” to represent the Sitecore index value.  For this screenshot below, I’ve set the kitchen_sink core to contain two Sitecore indexes: sitecore_master_index and sitecore_web index:


This shows us the two terms stored in that _indexname field, and that there are 18,774 for sitecore_master_index and 5,851 for sitecore_web_index.  Even though the indexes are contained in the same Solr Core, Sitecore ContentSearch API code like this . . .

Sitecore.ContentSearch.ISearchIndex index = 
    using (Sitecore.ContentSearch.IProviderSearchContext ctx = 

. . . doesn’t care whether all the Sitecore indexes reside in a single Solr “Core” or if they’re in their own following a 1:1 mapping strategy.

Caveats and Going In A Different Direction

There was a bug or two in earlier versions of Sitecore related to this, so be careful with early Sitecore 7.2 or Sitecore 8 implementations (and if you’re using Sitecore 7.5, you’ve got plenty of other things to worry about so don’t sweat a Solr Core organization strategy!).

I should also note that while this post is looking at combining Sitecore indexes into a single Solr Core for convenience and to reduce the management headaches of having 15 sets of Solr Cores to update etc, there are some implementations that go in the opposite direction.  Consider a strategy like the following:



There may be circumstances where keeping Sitecore indexes in their own Solr Core — and even isolating them further into their own Solr implementation — could be in order.  Solr runs in a JVM and this could certainly factor in, but there are other shared run-time resources that Solr sets aside for the whole Solr application.

I’m not familiar enough with these sorts of implementations that I want to comment further or recommend any course of action related to this right now, but it’s good to think about and consider with Solr tuning scenarios.  I just wanted to share it, as it’s a logical dimension to consider given the two previous strategies in this post.


The Solr *Optimize Now* Button for Sitecore Use Cases

If you’ve worked with Sitecore and Solr, you’re no stranger to the Solr Admin UI.  There are great aspects to that UI, and some exciting extension points with Sitecore implications too, but I know one element of that Solr UI that causes some head-scratching . . . the “optimize now” feature:



The inclusion of the badcauses people to think “something is wrong . . . red bad . . . must click now!”

What is this Optimize?

I’m writing this for the benefit of Sitecore developers who may not be coming at this from a deep search background: do not worry if your Solr cores show this badicon encouraging you to optimize now.  Fight that instinct.  For standard Sitecore Solr cores that are frequently updating, such as the sitecore_core_index, sitecore_master_index, sitecore_analytics_index, and — depending on your publishing strategy — sitecore_web_index, one may notice these cores almost always appear with this “optimize now” button in the Solr Admin UI.  Other indexes, too, may be heavily in use depending on how your Sitecore implementation is structured.  If you choose the optimize now option and then reload the screen, you’ll see the friendly green check mark next to Optimized and you should notice the Segment Count drops to a value of 1:


Segments are Lucene’s (and therefore Solr’s) file system unit.  On disk, “segments” are where data is durably stored and organized for Lucene.  In Solr version 5 and newer, one can visualize Segment details for each Solr Core via the Solr Admin UI Segments Info screen.  This shows 2 Segments:


If your Segment count is greater than 1, the Solr Admin UI will report that your Solr Core is in need of Optimization (with that somewhat alarmingbadicon).  The Optimize operation re-organizes all the Segments in a Core down to just one single Segment . . . and for busy Sitecore indexes this is not something to do very often (or at all!).

To track an optimize operation through at the file system level, consider this snapshot of the /data/index directory for a sitecore_master_index before performing optimization; note the quantity of files:optimizebefore

After the optimization, consider the same file system:


When in doubt, don’t optimize

Solr’s optimize now command is like cleaning up a house after a party.  It reduces the clutter and consolidates the representation of the Solr Core on disk to a minimal footprint.  The problem, is, however, optimizing takes longer the larger the index is — so the act of optimizing may produce very non-optimal performance while it’s doing the work.  Solr has to read a copy of the entire index and restructure the copy into a single Segment.  This can be slow.  Caches must be re-populated after an optimization, too, compounding the perf impact.  To continue the analogy of the optimize now being like cleaning after a party, imagine cleaning up during a party; maybe you pause the music and ask everyone to leave the house for 20 minutes while you straighten everything up.  Then everyone returns and the partying resumes, with the cleaning being a mostly useless delay.

To draw from the official Solr documentation at

“Optimizing is very expensive, and if the index is constantly changing, the slight performance boost will not last long. The trade-off is not often worth it for a non static index.”

For those Sitecore indexes in Solr that are decidedly non-static, then, ignore that “optimize now” feature of the Solr Admin UI.  It’s better to pay attention to Solr “Merge Policies” for a rules based approach to maintaining Segments; this is a huge topic, one left for another time.

When to consider optimizing

Knowing more about the optimization process in Solr, then, we can think about when it may be appropriate to apply the optimize command.  For external data one is pulling into Solr, for example, a routine optimization could make sense.  If you have a weekly product data load, for instance, where 10,000 items are regularly loaded into a Solr Core and then they remain un-changed, optimization after the load completes makes a lot of sense.  That data in the Core is not dynamic.  When the data load completes, you could include an API call to Solr that triggers the optimize.

An API call to trigger an optimize in Solr is available through an update handler call : http://solr-server:8983/solr/sitecore_product_catalog/update?stream.body=<optimize><query>*:*</query></optimize&gt;

Sitecore search has a very checkered past with the Lucene Optimize operation.  I’ve worked on big projects that were crippled by too frequent optimizing work like that discussed in Uli Weltersbach’s post.  We ended up customizing the Optimize methods to be no-op statements, or another variation like that.  For additional validation, check out the Lucene docs on the optimize method:

“This method has been deprecated, as it is horribly inefficient and very rarely justified.”

Since Solr sits on top of Lucene, the heritage of Lucene’s optimize is still relevant and — in the Solr Admin UI — we see a potential performance bottleneck button ripe for clicking . . . fight that instinct!



Solr Configuration for Integration with Sitecore

I’ve got a few good Solr and Sitecore blogs around 75% finished, but I’ve been too busy lately to focus on finishing them.  In the meantime, I figure a picture can be worth 1,000 words sometimes so let me post this visual representation of Solr strategies for Sitecore integrations.  One Solr core per index is certainly the best practice for production Sitecore implementations, but now that Solr support has significantly matured at Sitecore a one Solr core for all the Sitecore indexes is a viable, if limited, option:


There used to be a bug (or two?) that made this single Solr core for every Sitecore index unstable, but that’s been corrected for some time now.

More to follow!

Sitecore and Solr Diagnostics: Pings and Logs

Like I mentioned last week in this post about Sitecore + Solr – Lucene, I wanted to share some other posts about running Sitecore with Solr.  While 2017 is shaping up to be the year of Azure for Sitecore, I also expect Solr to continue to grow in prominence across Sitecore implementations.

Solr exposes lots of information about it’s current condition and state, but it’s not that easy to access.  I’ll take a run at shedding light on some of the monitoring basics and let’s see where this goes:


A proof-of-life for Solr is usually part of an implementation, whether it’s for a load-balancer in front of Solr nodes or some other form of a health check.  This is typically done via the Solr /admin/ping endpoint so a request to http://localhost:8983/solr/sitecore_master_index/admin/ping?wt=json would result in something like:

  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "{!lucene}*:*",
      "distrib": "false",
      "df": "text",
      "rows": "10",
      "wt": "json",
      "echoParams": "all"
  "status": "OK"

I chose the JSON response writer for Solr (more about this at  Solr’s default response writer is XML and returns a lot more formatting text around the exact same information.

There is actually a lot of flexibility around this ping RequestHandler for Solr (check out the documentation).  I’m scratching the surface here since, for a plain vanilla Sitecore install, knowing that ping will either be “OK” or return the failing HTTP error code is usually sufficient.


So the ping is easy, but only reports OK or NOT OK.  It doesn’t provide any real granularity around that determination . . . so let’s assume you’ve got a Solr server that is not returning an “OK” status from the ping.  If you’ve restarted Solr on the server, and Solr still isn’t responding “OK” then it’s probably time to dig into the Solr logs.  Solr defaults to using a variation of Log4j to capture trace logs, so the documentation on Log4j can be helpful (just note that currently Log4j is on version 2.x, while the Solr versions I’ve worked on [up to Solr 6.2] are all using the earlier Log4j 1.2 version [log4j-1.2.17.jar]).

Solr will use the file server\resources\ as the basis for Log4j configuration, but this can vary depending on how you’ve installed Solr.   If you need to determine where this file is configured in your implementation, you can check the Dashboard of the Solr web admin tool.  This shows the runtime configuration settings for Solr.  I’ve highlighted the two log related settings in my example:


The -Xloggc parameter is for Java Garbage Collection (GC) logging and usually not something you would consider Solr specific, but good to know it exists if that becomes a topic to investigate.

Inside a standard file one finds the following:

#  Logging level
log4j.rootLogger=INFO, file


log4j.appender.CONSOLE.layout.ConversionPattern=%-4r %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n

#- size rotation with log cleanup.

#- File to log to and log format
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m\n

# set to INFO to enable infostream log messages

That configuration file is fairly self-explanatory.  Adjust the rootLogger value off of INFO if you’d prefer DEBUG or WARN or some other level.  I think I have just one other note on the above configuration file:

the MaxFileSize setting of older Solr installations was set to 4 MB; this is pretty small for a production instance, so the example above (using a Solr 6 default) of using 50 MB makes more sense for production.

If one peeks into a solr log file, there’s a couple common record types by default.  Here’s what a query logged for Solr looks like:

webapp=/solr path=/select params={df=text&distrib=false&fl=_uniqueid&fl=score&shards.purpose=4&start=0&fsv=true&fq=_indexname:(sitecore_testing_index)&shard.url=|} hits=0 status=0 QTime=0

After the query parameters, the log reports the number of hits for the query, the status (0 means “OK”) and the duration for this request in milliseconds.  Note that Solr queries are logged after they complete . . . so if you have a run-away query you may not see it in the log with the standard configuration.  More about this at a later time, perhaps.

Updates to Solr are logged using a similar pattern, they will start with webapp=/solr path=/update.

It’s possible to  setup Log4j to create more narrowly targeted logs.  For example, you could have queries and updates written to different logs for easier troubleshooting (see this example for the nitty gritty on targeting Solr logs).

If your project doesn’t want all this “noise” about each update and query to Solr, or there’s a concern that all this logging could become a bottleneck, you can set the log level for Solr to WARN and only see warnings and errors in your logs.  You will be sacrificing that query performance data, however, so you’ll not have a record to review for slow performing queries . . . so a recommended practice is to trigger a “WARN” condition when a query exceeds a certain threshold.  In solrconfig.xml add a <slowQueryThresholdMillis> element to the query section of solrconfig.xml:


Any queries that take longer than 2 seconds (2000 milliseconds defined above) will be logged as “slow” queries at the WARN level.  Note that this setting would need to be duplicated for every core that’s defined in Solr (meaning, with Sitecore’s one core per index  approach you’ll have several solrconfig.xml files to touch).

Besides these fairly routine entries in Solr logs, there will also be exceptions and sometimes using the Logging section of the Solr web admin provides a convenient way to see the most recent set of these exceptions.  This is a convenient way of viewing Log4j items for your Solr implementation in near real-time; note if you make changes to the Level setting from this web interface, it is not persistent after a Solr restart.

Example of a Solr logging section in the web admin interface:


There are a number of other ways to review the Log4j data, old-school Chainsaw is still around, some I know some prefer Otros log viewer, but I’ve had success with Splunk (see screen shot below):


I’ll get into other methods for diagnostic work with Solr in the next post.