Tuning ClayTablet integration with Sitecore

I’ve had a couple of opportunities to work with ClayTablet – Sitecore integrations.  For the most part, it’s very straightforward and the ClayTablet package does the hard work.  Here’s an intro to what ClayTablet does with Sitecore, in case you’re not familiar.  It can be very convenient for organizations wanting professional translations of their web content without staffing their own native speakers in various languages!

Sync Translation

For starters, one should generally use the “Sync Translation” option on the Translation Options screen  in ClayTablet (either in the Bulk Translation wizard or the Automatic Item Export for Translation dialog).

This sets ClayTablet to only send content fields that have changed since the last time an item was sent out for translation.  It can keep translation costs low and will limit the volume of information sent back-and-forth with ClayTablet.

Speaking of volume of information, the rest of these notes relate to how ClayTablet handles that Sitecore data.  If one has a lot of translations, hundreds or thousands of items, it’s important to optimize the integration to accommodate that data flow.

SQL Server

ClayTablet runs from its own SQL Server database, so common-sense guidance regarding SQL Server applies here.  A periodic maintenance plan (consistent with Sitecore’s tuning recommendations for the core Sitecore databases) is in order, which means setting up the following tasks to run weekly or monthly:

  • Check Database Integrity task
  • Rebuild Index task
  • Update Statistics task

Setting the recovery model to Simple for the ClayTablet database is also recommended, as it keeps the disk footprint to a minimum.  When ClayTablet is churning through 1000 or more translations, our Rackspace team has observed SQL Server transaction logs growing out of control when using the Full recovery model in SQL Server.
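
As a rough sketch of that housekeeping, the Python helper below only builds the T-SQL statements one might schedule; it does not connect to anything.  The database name ClayTablet_Example is a placeholder I made up (the real name depends on your installation), and the index rebuild is left out because a maintenance plan typically handles it table by table.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def maintenance_statements(database: str) -> list[str]:
    """Return T-SQL for the Simple recovery model plus two periodic checks."""
    db = quote_name(database)
    return [
        f"ALTER DATABASE {db} SET RECOVERY SIMPLE;",  # keep the transaction log small
        f"DBCC CHECKDB ({db}) WITH NO_INFOMSGS;",     # database integrity check
        f"USE {db}; EXEC sp_updatestats;",            # refresh statistics
    ]


# Placeholder database name, purely for illustration.
for statement in maintenance_statements("ClayTablet_Example"):
    print(statement)
```

Review the generated statements with your DBA before scheduling them; the point is only to show how little is involved.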

Sitecore Agents

ClayTablet uses a background process to handle moving translations both to and from the ClayTablet mother ship.  In CT3Translation.config on Sitecore CM servers, there are UploadService and DownloadService agents defined something like this: [image: ct.JPG]

The above image shows the default definitions for those agents, which means they execute every 2 minutes to handle moving data to or from Sitecore.  For a busy enterprise running Sitecore, the potential for 1000 or more translation items to be processed on your CM server at any time of day can be a concern, so it’s smart to consider altering that schedule so that the agents only run after hours or at a specific time (here’s a good summary of how to do that with a Sitecore agent).
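
However the schedule gets wired up, the after-hours idea boils down to a time-window check that runs before the agent does any work.  Here is a minimal sketch of that check in Python (a real Sitecore agent would be .NET code); the 22:00 to 05:00 window and the function names are my own assumptions, not anything ClayTablet ships.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """True if `now` falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 05:00.
    return now >= start or now < end


# Assumed after-hours window; pick whatever suits your content authors.
AFTER_HOURS_START = time(22, 0)
AFTER_HOURS_END = time(5, 0)


def should_run_agent(now: time) -> bool:
    """Gate the (hypothetical) translation transfer on the after-hours window."""
    return in_window(now, AFTER_HOURS_START, AFTER_HOURS_END)
```

The only subtle part is the window that wraps past midnight, which is why the comparison flips when the start time is later than the end time.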

Run ClayTablet Logic On Demand

In a case where one needs the ClayTablet agents to run during regular working hours, maybe in the situation where a high priority translation needs to be integrated ASAP, one could add a custom button in Sitecore to trigger a specific agent, using an approach such as this.  This way ClayTablet wouldn’t casually run during busy periods on the CM, but you could kick off the UploadService and DownloadService on demand in special circumstances.

Building on the Run ClayTablet Logic on Demand idea, I think the slickest approach would be to use the ClayTablet SDK to query the ClayTablet message queues and determine if any translations are available for download.  This “peek ahead” into what ClayTablet would be bringing down into your Sitecore environment is only feasible through the SDK for ClayTablet.  This peek operation could run in the background every 2 minutes, say, and alert content authors when translations are available.  Maybe we add a button to the Sitecore ribbon that lets content authors run the ClayTablet DownloadService . . . and this button turns orange when the background agent peeks into the queue and finds translations are ready?  Content authors could then choose whether to initiate the DownloadService based on current business circumstances, or to wait and let the process run during the after-hours execution.
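
To make the button idea a little more concrete, here is a tiny sketch of the decision logic in Python.  I am not showing any real ClayTablet SDK call here: peek_pending is a hypothetical callable standing in for whatever the SDK offers to count waiting translations, and the color names are just labels.

```python
from typing import Callable


def button_state(peek_pending: Callable[[], int]) -> str:
    """Return the ribbon button color for the current queue depth.

    `peek_pending` is a hypothetical stand-in for an SDK call that reports
    how many translations are waiting without downloading any of them.
    """
    try:
        pending = peek_pending()
    except Exception:
        # If the peek fails, leave the button in its normal state.
        return "default"
    return "orange" if pending > 0 else "default"
```

The background agent would call something like this every couple of minutes and store the result for the ribbon to read; the download itself still only happens when an author clicks.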

Evaluate for Yourself

ClayTablet keeps a log of processing activities alongside the standard Sitecore logs, so I suggest familiarizing yourself with that log.  When troubleshooting a period of slow responsiveness on a Sitecore CM Server, we dug into the ClayTablet log and found a pattern such as this for the time period of the reported slowness:

16:42:38 INFO  5108 TranslatedSitecoreItem(s) need to handle
. . . much lower in the file . . .
18:29:04 INFO  3450 TranslatedSitecoreItem(s) need to handle
. . . much lower in the file . . .
20:59:10 INFO  1191 TranslatedSitecoreItem(s) need to handle
. . . you get the idea

There were many, many log entries to sift through, so the above is just a tiny digest.  It worked out that ClayTablet was crunching through about 1 TranslatedSitecoreItem every 5 seconds.  For 5 hours.  A series of activities happens when an item is updated in Sitecore: index updates, workflow checks, save operations, etc.  It appeared that this steady stream of item manipulations and the cascade of those supporting events contributed to the load on the Sitecore CM server.
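
If you want to run the same arithmetic on your own log, something like the Python sketch below will do.  It assumes the queue-depth lines look exactly like the three in the digest above and that all timestamps fall on the same day (the lines carry no date), so adjust the pattern if your log differs.

```python
import re
from datetime import datetime

# Matches lines such as "16:42:38 INFO  5108 TranslatedSitecoreItem(s) need to handle".
LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})\s+INFO\s+(\d+)\s+TranslatedSitecoreItem\(s\) need to handle"
)


def seconds_per_item(log_lines):
    """Estimate seconds per item from the first and last queue-depth lines."""
    samples = []
    for line in log_lines:
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            samples.append((stamp, int(match.group(2))))
    if len(samples) < 2:
        raise ValueError("need at least two matching log lines")
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    processed = n0 - n1
    if processed <= 0:
        raise ValueError("queue depth did not shrink between samples")
    return (t1 - t0).total_seconds() / processed


digest = [
    "16:42:38 INFO  5108 TranslatedSitecoreItem(s) need to handle",
    "18:29:04 INFO  3450 TranslatedSitecoreItem(s) need to handle",
    "20:59:10 INFO  1191 TranslatedSitecoreItem(s) need to handle",
]
print(round(seconds_per_item(digest), 1))  # 3.9
```

On just these three lines that is 3,917 items in 15,392 seconds, or roughly 4 seconds per item, which is in the same ballpark as the figure quoted above.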

Like everything in Sitecore, this must be taken in context.  If an organization has small or infrequent translation activity, a trickle of ClayTablet updates isn’t much to worry about.  If there are 1,000 translated items coming in, however, and it’s a pattern that can repeat itself, it’s worth investing the energy in isolating those workloads to proper time windows.

Sitecore Data Cache Tuning Crash Course

I’ve been out of the blogging space for a while, due to a number of reasons that aren’t too relevant here . . . suffice it to say that I’ve never left the Sitecore space, just got invested in some very demanding projects.  I’m feeling the itch to write up some more of my notes about Sitecore, and the fact is it helps me to process, retain, and better understand the subject when I shape it into a few paragraphs on this blog.

In my time at Sitecore, every couple weeks I would encounter an implementation intent on running with their cache setting Caching.DisableCacheSizeLimits configured to true.  It is a fact that the Sitecore documentation suggests this is a good idea (they say so in the Sitecore CMS Tuning Guide) for 64-bit environments with plenty of memory.  Unfortunately, this nugget of guidance can bring about a lot of problems as the “one size fits all” cache tuning logic within Sitecore leaves something to be desired.  I ran into such a situation this week, which prompted me to write this post.

I’ve seen higher CPU usage, publishing errors, and out-of-memory exceptions caused by setting Caching.DisableCacheSizeLimits to true.  The cache sizes could regularly exceed the memory of the environment (out-of-memory exceptions!), or a high-memory machine allowed so much memory to be set aside for the cache that it would take a really long time to evict entries . . . there is no substitute for a well-tuned Sitecore cache, tailored to the needs of each organization.  This is a case where the quick fix (letting Sitecore manage your cache automatically courtesy of the Caching.DisableCacheSizeLimits setting) can be vastly inferior to a well thought out solution.  The tortoise can beat the hare, in this case.

So, I almost always recommend setting Caching.DisableCacheSizeLimits to false and tuning the Sitecore data caches according to the CMS Tuning guide (https://sdn.sitecore.net/Reference/Sitecore%207/CMS%20Performance%20Tuning%20Guide.aspx).  Those instructions aren’t as detailed as I would like, so let me share some additional notes on how to do this:

    1. Work on a backup of your real Sitecore databases and use a hardware profile that matches your production environment . . . running against a fraction of the real content will give you incorrect results, as will tuning a cache on a 4 GB machine when your prod servers are 32 GB!
    2. Be sure to disable the configuration setting Query.MaxItems (a value of 0 removes the limit) so you’re not artificially limiting how many items the Sitecore API returns.
    3. If it’s inconvenient to run a load generator, as the CMS Tuning guide suggests you should do as part of the testing, you can just run the “give me the kitchen sink” XPath query in the old XPath Builder tool.  It’s located at http://[ServerName]/sitecore/shell/default.aspx?xmlcontrol=IDE.XPath.Builder.  This used to be a core part of developer life with Sitecore, but now this XPath builder is a vestigial tail that doesn’t get too much use.  It’s very handy for this cache flooding exercise!  Run an XPath query like /sitecore/content//* that should flood your Sitecore caches with the entire content tree.
    4. This video is old, but technically an excellent resource for Sitecore cache tuning: http://mediacontent.sitecore.net/wmv/CacheRecordingWebinar.mp4.
      1. Among other things, the video covers what to put in the prefetch cache, and a methodology for determining settings such as Caching.AverageItemSize using System.GC.GetTotalMemory() method calls.  It’s all about how to calculate the right values for your implementation, and I can’t recommend this video highly enough.
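
The measurement idea, as I understand it (this is my paraphrase, not a transcript of the video), is to take a memory reading, load a known number of items, take another reading, and divide.  In .NET the readings would come from System.GC.GetTotalMemory(); the Python sketch below only shows the arithmetic, and the numbers are made up for illustration.

```python
def average_item_size(bytes_before: int, bytes_after: int, items_loaded: int) -> int:
    """Approximate bytes per cached item from two memory readings."""
    if items_loaded <= 0:
        raise ValueError("items_loaded must be positive")
    delta = bytes_after - bytes_before
    if delta < 0:
        raise ValueError("memory shrank between readings; take the readings again")
    return delta // items_loaded


# Made-up readings: 60,000,000 extra bytes after loading 12,000 items.
print(average_item_size(200_000_000, 260_000_000, 12_000))  # 5000
```

Repeat the measurement a few times against your real content, since a single reading can be skewed by whatever else the process happens to be doing.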

With all that said, there is a situation where setting Caching.DisableCacheSizeLimits to true makes sense.  The most common rationale I can think of is when one encounters a production Sitecore system with no changes to Sitecore out-of-the-box in terms of data caching.  I’ve seen this situation due to an implementation team that didn’t know how important this can be for perf, for example.  In one case, I was in this situation in mid-November, just a few days before Thanksgiving, and the customer needed a quick caching solution that didn’t require careful analysis etc.  In this case, as a temporary measure to improve the data cache landscape for the client, I suggested setting Caching.DisableCacheSizeLimits to true until we could assist them with a thorough cache tuning effort.  This got their system beyond the very limited data caching configuration Sitecore comes with out of the box (probably dating back to the Win32 days!).  This wasn’t in place long term, but it was sufficient to improve their performance ahead of a busy online shopping period.