Sitecore’s SessionDictionaryData class saves the day

More and more projects are using Azure SQL as the database back-end for Sitecore (so long as they’re running Sitecore 8.2 or newer, if alignment to Sitecore official support guidance is important to you). This sets up a new class of performance considerations around Azure SQL, and I want to share one tuning option we learned while investigating high DTU usage for the Sitecore xDB “ReferenceData” database in a Sitecore 9 PaaS build. We wanted to off-load some of the work this “ReferenceData” database was doing, and our investigation into which Azure SQL queries were causing the DTU spikes pointed to INNER JOINs between the ReferenceData.DefinitionMonikers and ReferenceData.Definitions tables.

Sitecore support pointed us in the right direction at this juncture: since the default DictionaryData was using Azure SQL for persistence, we should consider a store more suited to rapid key/value access. If this sounds like a job for Redis, you’re correct, and fortunately Sitecore has an implementation that’s suited to this type of dictionary access in the Sitecore.Analytics.Data.Dictionaries.DictionaryData.Session.SessionDictionaryData class.

The standard Sitecore pipeline we’re talking about is the getDictionaryDataStorage pipeline, which Sitecore Analytics uses to store Device, UserAgent, and other key/value pair lookups. Here’s its definition:

The alternative we moved to is to use session state for storing that rapidly requested data,  so we updated the DictionaryData node to instead use the class Sitecore.Analytics.Data.Dictionaries.DictionaryData.Session.SessionDictionaryData. For this Azure PaaS solution, it amounts to using Azure Redis for this work since that’s where the session state is managed. Here’s the new definition:

What this boils down to is that the Sitecore.Analytics.DataAccess.Dictionaries.DataStorage.ReferenceDataClientDictionary implementation in Sitecore.Analytics.DataAccess.dll was shown to be a performance bottleneck for this particular project, so switching to Sitecore.Analytics.Data.Dictionaries.DictionaryData.Session.SessionDictionaryData in Sitecore.Analytics.dll aligns the project with a better-fit persistence mechanism.
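For readers who want a starting point, the swap can be sketched as a Sitecore patch file along the following lines. This is a minimal sketch, not the exact configuration from this project: the pipeline name and the SessionDictionaryData type come from the discussion above, while the `<pipelines>` parent, the nesting of DictionaryData under a processor element, and the assumption that the pipeline has a single processor are all guesses that should be checked against /sitecore/admin/showconfig.aspx for your Sitecore version.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <getDictionaryDataStorage>
        <!-- Assumption: the pipeline contains a single processor, so no type
             attribute is needed to match it. Add the type attribute from your
             own showconfig.aspx output if there is more than one. -->
        <processor>
          <!-- Replace the existing DictionaryData node (the one pointing at
               ReferenceDataClientDictionary) with the session-backed class. -->
          <DictionaryData patch:instead="DictionaryData"
                          type="Sitecore.Analytics.Data.Dictionaries.DictionaryData.Session.SessionDictionaryData, Sitecore.Analytics" />
        </processor>
      </getDictionaryDataStorage>
    </pipelines>
  </sitecore>
</configuration>
```

After deploying a patch like this, showconfig.aspx should show exactly one DictionaryData node in the pipeline, with the new type.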

We considered whether we could improve upon this progress by extending the SessionDictionaryData class to be IIS in-memory regardless of the Sitecore session-state configuration; there would be no machine boundary to cross to resolve the (apparently) volatile data. With this approach, though, site visitors would require affinity to a specific App Service host in Azure, and it’s possible, or even likely, that Sitecore assumes this is shared state across an entire implementation. We talked ourselves out of seriously considering a pure IIS in-memory solution.

I think it’s possible we could improve the performance with the default ReferenceDataClientDictionary by tuning any caches around this analytics data, but I didn’t look into that since time was of the essence for this investigation and the SessionDictionaryData class looked like such a quick win. I may revisit that in the next iteration, however, depending on how this new solution performs over the long term.

Sitecore Session Persistence Notes

I’ve neglected this blog of late, being focused on a number of “not easily blogged about” scenarios across several Sitecore projects.  It’s too bad, because the work is very interesting, but it doesn’t lend itself to a page or two write-up with a digestible take-away for the general Sitecore community out there.

I do want to keep in the habit of blogging, though, so I’m going to mention this ongoing discussion I’ve been a part of about session management with regard to Sitecore.  There are a few options for managing HTTP session state with Sitecore covered in https://doc.sitecore.net/sitecore_experience_platform/setting_up_and_maintaining/xdb/session_state/session_state: SQL Server, MongoDB, and Redis.  Those three technologies are really just the tip of the iceberg, as the implementation details for each can get quite involved.  For the discerning Sitecore implementation, it can be useful to understand the nuances of each session state provider.  While not an exhaustive look at any one of these solutions, I wanted to post some notes on each one given the current state of Sitecore architecture (June 2017):

SQL Server

This is often the default session provider we gravitate to.  The SQL Server “Boost” script from Sitecore is something we’ve used on implementations (see “Optimize SQL Server performance” on that link), but it is not without its rough edges (see our Rackspace write-up on how to alter permissions so TempDB is reliably available across service restarts).

You’ll notice the approach for improving SQL Server performance with session state is all about getting session state “in-memory” to the furthest extent possible.  Remember this when we examine the other two providers below . . .

I will say that, generally speaking, SQL Server is easy to administer: it’s a well-known technology, and updating it, scaling it, managing fail-overs, etc. is simple compared with the alternatives.  SQL Server has been part of the Windows dev stack for ages now, which is another reason it’s so often the default choice.

MongoDB

With MongoDB serving as the persistence layer for Sitecore’s xDB, it became a fully supported and viable option for HTTP session state with Sitecore at the same time.  The comparative performance between MongoDB and SQL Server is up for debate (Redis too, for that matter!), and it usually comes down to testing based on how the specific implementation is using session with Sitecore etc; I’m not going to hazard any generalizations on relative perf, as that’s not really the point of this post.

Instead, I’d like to point out that MongoDB does not come in just a single flavor.  The two most common flavors, or “storage engines,” are MMAPv1 and WiredTiger, but there are still others designed to serve specific use cases.  Take, for example, the Percona Server for MongoDB hosted by ObjectRocket that has a posted option for the RocksDB storage engine.  RocksDB with MongoDB may not be a great fit for Sitecore session state (RocksDB is tuned for write-heavy workloads, although if you’re making extensive use of TTL indexes for Sitecore then RocksDB fits those scenarios in certain appealing ways), but it does open the door to MongoDB being more than just a one-size-fits-all data repository (read more about RocksDB and its Facebook pedigree here).  One MongoDB storage engine option that is easily overlooked is WiredTiger “in-memory,” which forces data to be stored in RAM . . . and this is perfect for HTTP Session State for most Sitecore builds.

In fact, if you consider the SQL Server “boost” approach that uses TempDB to store session state for Sitecore . . . WiredTiger “in-memory” is attacking the problem from the same direction.  Store everything in RAM!  This is why one must be cautious with general comparisons between SQL Server and MongoDB; the devil is always in the details.  A far better comparison would be “boosted” SQL Server for Sitecore using TempDB vs the MongoDB WiredTiger “in-memory” storage engine.  And note the network latency . . . and the size of the session objects . . . and you’re getting the point, I trust.  To really answer the SQL Server vs MongoDB question for Sitecore sessions, one has to develop a matrix of performance evaluations and level assumptions across the board.  “It depends” is the only honest answer that doesn’t come with a list of caveats.

If you’re curious about this MongoDB topic for your project, go to http://objectrocket.com/docs/mongodb_plans.html and spin up a WT 3.2 storage engine plan for 5 GB of storage (this allows 1.5 GB for RAM).  1.5 GB for RAM is going to be overkill for most small/medium Sitecore implementations, but again, you’ll want to test with your specific session data set to see!  Furthermore, network latency of 10 ms or less is going to help make the most of an ObjectRocket hosted MongoDB service like this; otherwise, the network latency may not make it worth the money.  Let me know if you pursue this with ObjectRocket, as there are some benchmarking measures we want to do but we haven’t had a real implementation to try them out on.  So if you feel like being a guinea pig, please let me know at grant.killian [at] rackspace.com.  It would be great to have real world metrics to prove this all out.

Redis

If the way to get the best session management perf out of SQL Server and MongoDB is to find in-memory solutions, Redis looks like the slam dunk since it’s just an in-memory storage solution.  We find most clients aren’t interested in managing Redis infrastructure, so again a hosted option such as ObjectRocket has appeal.

Sitecore relies on the StackExchange.Redis assembly, which doesn’t support Redis Sentinel — it’s a bit of a saga at https://github.com/StackExchange/StackExchange.Redis/pull/406;  therefore there’s not a great high availability story with the self-hosted Redis and Sitecore right now.  How concerned one should be with HA of fairly transient HTTP Session State for Sitecore, however, is an open question.  I usually wouldn’t worry about it too much.  Honestly, Redis is a technology that we’re just now starting to get really serious about at Rackspace so our sophistication in this space will improve dramatically in the months to come.  Between Azure Redis and all the Sitecore PaaS movement we’re seeing, it’s become a key player in a lot of Sitecore architectures.
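For reference, wiring Sitecore to Redis for session state is a web.config change along these lines. Treat it as a sketch under assumptions rather than a drop-in file: the provider type and attributes are written as I recall them from the Sitecore session state documentation linked earlier, and the provider name, connection string name, host, and key are placeholders to replace with your own values.

```xml
<!-- web.config, inside <system.web>. Attribute values are illustrative. -->
<sessionState mode="Custom" customProvider="redis" timeout="20">
  <providers>
    <add name="redis"
         type="Sitecore.SessionProvider.Redis.RedisSessionStateProvider, Sitecore.SessionProvider.Redis"
         applicationName="private"
         connectionString="session"
         pollingInterval="2" />
  </providers>
</sessionState>

<!-- ConnectionStrings.config: "session" points at the Redis endpoint.
     Host and access key below are placeholders. -->
<add name="session"
     connectionString="your-redis-host:6380,password=YOUR_ACCESS_KEY,ssl=True,abortConnect=False" />
```

The connection string uses the StackExchange.Redis configuration format, so options such as ssl and abortConnect are passed straight through to that client.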

How to use Redis in place of Sitecore in-memory caches

In this previous post I discussed the case for separating functions typically combined into a single out-of-the-box Sitecore instance into distinct parts, supporting better scalability, performance, availability, etc.  I concluded that piece with a reference to using Redis cache for Sitecore caches, instead of the default IIS in-memory cache.  Prior to Sitecore 8.2, this wasn’t entirely possible (although one could alter some Sitecore caching aspects through extensive customization of certain Sitecore assemblies).  Implementations looking to free up IIS resources, to get more out of their limited Sitecore IIS licenses, might consider swapping the in-memory caches for Redis.

Major caveat: This specific blog post doesn’t analyze the performance implications of doing this; my focus here is to cover how to accomplish the swap of the in-memory cache instead of examining all the nuances of why or why not to do it.  That being said, I do expect the use of Redis for Sitecore caches to be a surgical decision made based on the particulars of each project.  It’s not a one-size-fits-all solution.  There are also questions of which caches to put into Redis, and where Redis should be running (considerations of connectivity, Redis perf, and the whole landscape of Redis can also be brought into this conversation).  This will make for future material I cover, probably in conjunction with the big time Redis pros at ObjectRocket (who, conveniently, are part of my family at Rackspace).  I hope and expect others in the greater Sitecore community will expand on this topic, too!

Customizing Cache Configuration with Sitecore 8.2+

Sitecore 8.2 includes a new .config file at App_Config\Include\CacheContainers.config — as the name suggests, this is where one can identify cache containers that will replace the default ones Sitecore uses.  For my example, I’m going to specify a custom container for the website[html] cache; any cache can be included here (I refer you to the old reliable /sitecore/admin/cache.aspx page — the name column enumerates all the Sitecore caches):

[Image: cache]

With my customization, my CacheContainers.config file looks like this:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
 <sitecore>
  <cacheContainerConfiguration>
   <cacheContainer name="website[html]" type="Rackspace.Sc.Caching.ReferenceRedisCache, Rackspace.Sc.Caching">
    <param desc="name">$(0)</param>
    <param desc="maxSize">$(1)</param>
    <cacheSizeCalculationStrategy type="Sitecore.Caching.DefaultCacheSizeCalculationStrategy, Sitecore.Kernel" />
   </cacheContainer>
  </cacheContainerConfiguration>
 </sitecore>
</configuration>

I’m not going to regurgitate the comments from the default CacheContainers.config file, but it’s certainly worth reviewing.

Once the CacheContainers.config is pointing to the custom class and assembly (in my case type="Rackspace.Sc.Caching.ReferenceRedisCache, Rackspace.Sc.Caching"), it’s necessary to create the custom assembly to provide the Redis cache behaviour.  The foundation for this is to implement the new Sitecore.Caching.ICache interface (which requires the Sitecore.Caching.ICacheInfo and Sitecore.Caching.Generics.ICache<string> interfaces).  It can look like a lot of work, but it actually isn’t so bad.

One way I found to make it easier was to refer to the out-of-the-box Sitecore.Caching.MemoryCacheAdapter in the Sitecore.Kernel assembly.  One can treat the MemoryCacheAdapter as an example of how to implement these interfaces:

[Image: cache2]

Reflector, or whatever your decompilation tool of choice, for the win!

In my case, I create a ReferenceRedisCache for this. . .

public class ReferenceRedisCache : Sitecore.Caching.ICache

. . . and the nuts and bolts come down to swapping Redis in place of the System.Runtime.Caching.MemoryCache object from the MemoryCacheAdapter implementation.  There are other considerations, of course, but this is the big picture.
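To make the “swap Redis in for MemoryCache” idea a little more concrete, here is a minimal sketch of the storage calls only. It is not my ReferenceRedisCache and it does not implement Sitecore.Caching.ICache; the RedisStringStore class name, its method names, and the key-prefix scheme are hypothetical, and it assumes the StackExchange.Redis NuGet package plus a reachable Redis instance.

```csharp
using System;
using StackExchange.Redis;

// Hypothetical helper, not part of Sitecore. It covers only the storage calls
// that would stand in for System.Runtime.Caching.MemoryCache; a real
// Sitecore.Caching.ICache implementation needs the remaining interface members.
public sealed class RedisStringStore : IDisposable
{
    private readonly ConnectionMultiplexer _connection;
    private readonly IDatabase _database;
    private readonly string _prefix;

    // "configuration" is a StackExchange.Redis connection string, e.g. "localhost:6379".
    // "cacheName" namespaces the keys so several caches can share one Redis instance.
    public RedisStringStore(string configuration, string cacheName)
    {
        _connection = ConnectionMultiplexer.Connect(configuration);
        _database = _connection.GetDatabase();
        _prefix = cacheName + ":";
    }

    // Stores a value, optionally with a time-to-live.
    public void Add(string key, string value, TimeSpan? expiry = null)
    {
        _database.StringSet(_prefix + key, value, expiry);
    }

    // Returns the cached value, or null when the key is absent or has expired.
    public string GetValue(string key)
    {
        RedisValue value = _database.StringGet(_prefix + key);
        return value.HasValue ? (string)value : null;
    }

    public bool ContainsKey(string key)
    {
        return _database.KeyExists(_prefix + key);
    }

    public bool Remove(string key)
    {
        return _database.KeyDelete(_prefix + key);
    }

    public void Dispose()
    {
        _connection.Dispose();
    }
}
```

One design note: ConnectionMultiplexer is intended to be created once and shared, so a helper like this would normally be held as a long-lived instance rather than constructed per request.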

I’m not going to post all the code here as mine is still a bit of a work in progress, and I need to evaluate it in terms of performance etc.  You’re much safer using the decompiled Sitecore source as a guide, at this point.  With this post, I did want to share how one can approach this powerful Sitecore customization; you can selectively change from Sitecore’s in-memory caching to just about any option you’d like this way.

There is another avenue that I’ve not explored with this.  Sitecore provides a Sitecore.Caching.BaseCacheConfiguration class that can be overridden to introduce customizations.  Sitecore.Caching.DefaultCacheConfiguration is the standard approach you could use as an example; this is an alternative way of substituting your own caching logic, but I haven’t dug into it yet.

I’ll clean up my code, do some profiling and evaluations, and come back with the full details of my custom Rackspace.Sc.Caching.ReferenceRedisCache implementation soon.

The Sitecore Pie: strategic slicing for better implementations

The Sitecore Pie

[Image: pie]
At some point I want to catalog the full set of features a Sitecore Content Delivery and Content Management server can manage in a project.  My goal would be to identify all the elements that can be split apart into independent services.  This post is not a comprehensive list of those features, but serves to introduce the concept.

Think of Sitecore as a big blueberry pie that can be sliced into constituent parts.  Some Sitecore sites can really benefit from slicing the pie into small pieces and letting dedicated servers or services manage that specific aspect of the pie.  Too often, companies don’t strategize around how much different work their Sitecore solution is doing.

An example will help communicate my point: consider IIS and how it serves as the execution context for Sitecore.   Many implementations will run logic for the following through the same IIS server that is handling the Sitecore request for rendering a web page.  These are all slices of the Sitecore pie for a Sitecore Content Delivery server:

  1. URL redirection through Sitecore
  2. Securing HTTP traffic with SSL
  3. Image resizing for requests using low-bandwidth or alternative devices
  4. Serving static assets like CSS, JS, graphics, etc
  5. Search indexing and query processing (if one is using Lucene)

If you wanted to cast a broader net, you could include HTTP Session state for when InProc mode is chosen, Geo-IP look-ups for certain CD servers, and others to this list of pie slices.  Remember, I didn’t claim this was an exhaustive list.  The point is: IIS is enlisted in all this other work besides processing content into HTML output for Sitecore website visitors.

Given our specific pie slices above, one could employ the following alternatives to relieve IIS of the processing:

  1. URL Redirection at the load balancer level can be more performant than having Sitecore process redirects
  2. Apply SSL between the load balancer and the public internet, but not between the load balancer and the IIS nodes behind it (called “SSL Offloading” or “SSL Termination”)
  3. There are services like Akamai that fold in dynamic image processing as part of their suite of products
  4. Serving static assets from a CDN is common practice for Sitecore
  5. Coveo for Sitecore is an alternative search provider that can take a lot of customer-facing search aspects and shift them to dedicated search servers or even Coveo’s Cloud.  One can go even further with Solr for Sitecore or still other search tiers if you’re really adventurous

My point is, just like how we hear a lot this election season about “let Candidate X be Candidate X” — we can let Sitecore be Sitecore and allow it to focus on rendering content created and edited by content authors and presenting it as HTML responses.  That’s what Sitecore is extremely valuable for.

Enter the Cache

I’ve engaged with a lot of Sitecore implementations who were examining their Sitecore pie and determining what slices belong where . . . and frequently we’d make the observation that the caching layer of Sitecore was tightly coupled with the rest of the Sitecore system and caching wasn’t a good candidate for slicing off.  There wasn’t a slick provider model for Sitecore caches, and while certain caches could be partially moved to other servers, it wasn’t clean, complete, or convenient.

That all changed officially with the initial release of Sitecore 8.2 last month.  Now there is a Sitecore.Caching.DefaultCacheManager class, a Sitecore.Caching.ICache interface, and other key extension points as part of the standard Sitecore API.  One can genuinely add the Sitecore cache to the list of pie slices one can consider for off-loading.

In my next post, I will explore using this new API to use Redis as the cache provider for Sitecore instead of the standard in-memory IIS cache.