OPTIONS and PUT Verbs for the Sitecore Publishing Service

I’ve been involved with a lot of implementations of the new Sitecore Publishing Service lately and they’re starting to all run together. Before this episode fades too far into history, I want to record what I learned in case it’s useful to others (or myself!) at some point.

I’m going to skip all the preamble about .NET Core and how the new Publishing Service is designed to be lean and mean . . . that’s covered adequately in other places by now. I’ll jump right into this scenario where the Publishing Service was installed but all publishing operations would fail from the Sitecore Content Management server. It seemed like the CM wasn’t able to reach the .NET Core Publishing Service endpoint, but I confirmed the following were all in place:

  1. the host file included an entry for 127.0.0.1 pointing to Sitecore.Publishing.Service
  2. the Publishing Service was hosted through IIS with the Sitecore.Publishing.Service binding
  3. http://sitecore.publishing.service/api/publishing/operations/status responded with the expected JSON response “status 0” — so I concluded the service itself was OK

The Sitecore logs did have this cryptic “NotFound Not Found” exception, which was all I had to go on at first:

ERROR [Sitecore Services]: HTTP POST
URL http://cm.website.com/sitecore/api/ssc/publishing/jobs/0/ItemPublish

Exception System.AggregateException: One or more errors occurred. ---> Sitecore.Framework.Publishing.Client.Http.HttpServiceResponseException: The remote service encountered a problem processing the request:
NotFound Not Found
System.Net.Http.StreamContent
   at Sitecore.Framework.Publishing.Client.Http.JsonClient.<SendRequest>d__14.MoveNext()
--- End of stack trace from previous location where exception was thrown ---

I envisioned a frustrated IIS server yelling “NotFound Not Found” — and, actually, that wasn’t so far from the truth. To discover the truth, I needed Fiddler.

There’s a ton of API calls going on when Sitecore CM servers interact with the Publishing Service, so Fiddler was the way to get a handle on all that communication. Sitecore has a KB article on how to wire up Fiddler to your Sitecore implementation, which is handy, and I put it to good use.

It was also necessary to dial up the verbosity of the Publishing Service instrumentation so we can see the full picture. One does this through the \config\sitecore\sc.logging.xml file under the Publishing Service installation. Set the filters like this . . .

 <Filters>
<Sitecore>Debug</Sitecore>
<Default>Debug</Default>
</Filters>

. . . and restart the .NET Core service to make sure these take effect.

Fiddler can provide an overwhelming volume of information; imagine hundreds of entries like the following:

Fiddler1

One technique is to carefully clear the Fiddler traces until just before you interact with the browser to perform your action, in my case just before I press the final Publish button in the Sitecore CM environment. I still had a lot of noisy Fiddler data to ignore, which I’ll remove from the screenshot below, but finally it revealed this red entry meaning failure.Fiddler2

And if you click on the details of that Fiddler trace you can inspect all the particulars. In this case it was pretty obvious the HTTP Verb Put wasn’t allowed here:

Fiddler3

The fix was fairly complicated because there are good reasons for servers to block HTTP Verbs that aren’t in use, so we had to make the case PUT was required for the Publishing Service to work (and then add HTTP Verb blocking at a different layer of the implementation). We also found the OPTIONS HTTP verb was crucial for Publishing Service work using the same Fiddler technique described above. We ended up needing to permit both those 2 additional HTTP Verbs to enable proper Sitecore Publishing Service activity.

This isn’t really the final way to permit those verbs in a real production implementation, but it’s close enough for the purposes of this blog . . . check out the IIS Request Filtering options we needed to explicitly permit for this CM server to communicate to the Publishing Service running on the same VM host:

Fiddler4

Sitecore’s had API service layers that depended on these HTTP Verbs for many years, so I shouldn’t be surprised to find the Publishing Service is also making use of them. I just never connected the two together. This customer, too, is very security conscious and features are thoroughly locked-down unless you explicitly enable them. This Sitecore 8 project isn’t otherwise using the Sitecore Sevices Client, for example, or the Item API. The good news is these PUT and OPTIONS exceptions are only internal to the system and not opened up outside to the public internet thanks to firewalls and other networking.

I will also note that we briefly explored the idea of customizing the HTTP Verbs the Publishing Service made use of  . . . but you can imagine threading the needle of dependencies through that stack of libraries would be a very tough challenge.

Advertisements

New Sitecore 8.2 & Sitecore 9 Security Patch

I guess I’m on a security hardening binge for 2019, since I’ll share a hot-off-the-presses security hardening measure from Sitecore today. I don’t want to say too much about the vulnerability, but this article explains in general terms and applying it to all Sitecore server roles for version 8.2 through any current 9 releases is emphasized as the best practice. I’d do it at the next opportunity.

I created a gist with the PowerShell necessary to apply this patch, just update line #8 with the path to your Sitecore website:

Our team has internal automation taking the above a bit further and using another layer of abstraction, and that’s secret Rackspace sauce I won’t share publicly,  but the snippet above should have your environment patched in just a few seconds.

One of the key elements to the patch involves an update to the /sitecore/admin/logs.aspx page which, if you dig into it, reveals a grip-load of additional C# validation logic and other stuff going on . . .

secpatch

There’s a lot to unpack in there if you’re curious, but suffice it to say that Sitecore’s keeping all your bases covered and isn’t trusting user input (using a broad interpretation of that principle).

Sitecore Commerce security hardening note

Let’s start the New Year off with a fun Sitecore Commerce note. Using the latest Sitecore Commerce available today, that is running Sitecore 9.0 update-2 with Sitecore Commerce update-3 (you have to cross-reference https://dev.sitecore.net/Downloads/Sitecore_Commerce.aspx and https://kb.sitecore.net/articles/804595 to really sort this out), we’re applying routine security hardening.

Now that Sitecore is truly built on a hybrid of “plain” .Net and .Net Core, this security hardening effort is more nuanced.

Sitecore is still updating their documentation for the Sitecore 9 space and one can end up at dead-ends like https://doc.sitecore.com/developers/90/platform-administration-and-architecture/en/security-guide-251908.html that leads you over to the .Net Framework documentation when there are better notes with 100% relevancy to Sitecore elsewhere on the Sitecore site. I persevered and eventually found Sitecore’s updated information like this on the hash algorithm https://doc.sitecore.com/developers/90/platform-administration-and-architecture/en/change-the-hash-algorithm-for-password-encryption.html. Still, this documentation overlooks the .Net Core details and given that this Sitecore Commerce project we’re working on will use the latest and greatest, we had to do our own research.

Fortunately, we have some history with this having published https://developer.rackspace.com/blog/Updated-Security-Hardening-For-Sitecore-8.2 or earlier versions going back several years. The PowerShell we’ve used for ages to automate this work, however, wasn’t going to cut it with this new Commerce and .Net Core dimension:

snippet

Instead, we need to do something like this to update the JSON configuration for the Sitecore Identity Server. While you could get fancy and parse the JSON, I used a more direct replace approach to knock this out quickly:

$siteNamePrompt = Read-Host "enter Identity Server website name"
$site = get-website -name $siteNamePrompt
$appSettingsPath = "{0}\wwwroot\appsettings.json" -f $site.physicalPath
Get-Content $appSettingsPath).replace("""PasswordHashAlgorithm"":""SHA1""},", """PasswordHashAlgorithm"":""SHA512""},") | Set-Content $appSettingsPath

The end result is  that SitecoreIdentityServer\wwwroot\appsettings.json file needs an updated PasswordHashAlgorithm value:

        “IDServerCertificateStoreLocation”: “LocalMachine”,
“IDServerCertificateStoreName”: “My”,
        “PasswordHashAlgorithm”: “SHA512”
}

Given the distributed nature of Sitecore 9 with Commerce, I think a discrete change like this just for the IdentityServer doesn’t warrant a lot of effort to integrate into the bigger security hardening Powershell script we use. It may be worthwhile to just update SIF at this point instead of applying security hardening after the Sitecore installation is complete. We’re also talking about SIF extension modules to run this type of logic after SIF is complete. For now, I’ll probably just keep this note handy for the foreseeable future and see whether Sitecore integrates the security hardening guidance directly into SIF in a future release (hint hint!) — or, over time we may collect a set of these best practice adjustments that deserves more effort to automate into a scripted deployment. For now, I think I’ve taken it as far as it deserves.

How I Add Custom Sitecore Publishing Service Targets

At this point, I think I’ve installed, configured, or customized the new Sitecore Publishing Service at least a dozen times for various projects. Sometimes it’s on PaaS, sometimes on IaaS . . . I’ve used a variety of different versions depending on the compatibility matrix (see below as of Dec 16, 2018):

PubSvcVisual

I’m going to skip all the preamble about how the new Sitecore Publishing Service works, about .Net core being the new hotness, why this component can be a great addition to many distributed Sitecore implementations, etc — smart people have written a lot about this already. For example, check out Stephen Pope’s no-holds-barred look at the Publishing Service at http://www.stephenpope.co.uk/publishing or Jonathan Robbins has a nice overview piece at https://jonathanrobbins.co.uk/2016/09/02/setting-up-sitecore-publishing-service/.

I’ve learned a good bit from all the iterations of working with the component and I think consistently the most error-prone part of the setup is aligning any additional custom Sitecore publishing targets one is using in an implementation. This write-up from Geykel Moreno at AlphaSolutions has all the good information, but it’s not as easy to follow because it doesn’t post a comprehensive sc.publishing.xml file — it took a bit of trial and error for me, so to simplify for posterity I’m going to share a reference sample Gist at https://gist.github.com/grant-killian/d2fe8d3e89c5d7b15f47464dd1809d62 that includes 2 additional custom publishing targets. I’ve inserted XML comments for the 3 locations one must update in the config\sitecore\publishing\sc.publishing.xml file:

  1. You need to add your ConnectionString entry for each database to the Publishing/ConnectionStrings XML
  2. You need to add your Services/DefaultConnectionFactory/Options/Connections XML definition for each custom target
  3. You need to add entries for each target to the StoreFactory/Options/Stores/Targets XML that will include the GUID of the Sitecore item that defines each publishing target, along with the Name of the item and additional details

Here’s the gist with the full XML for reference:

Sitecore artifact table patch config

I’ve patched EventQueues several times through the years, so I saved this Gist to make it easier for future opportunities.  There’s not much new to share in terms of introducing why one does this, refer to this blog about Sitecore artifact tables (or the old reliable Sitecore CMS Tuning Guide). You also have to take care around the order of configuration file processing, so zzzzzArtifactTableRetention.config this sucker if you really must 🙂

Here’s the Gist – https://gist.github.com/grant-killian/ffa1e84770b10a90e2454e241986b911  and here’s the expanded XML:

Walkthrough of Solr Query Analysis for Sitecore

Anton Tishchenko wrote a good quick piece on “stemming” for Solr, and I wanted to build on what he shared.

Solr is generally way underutilized by the Sitecore implementations I’ve seen. It makes for plenty of territory for blogging, though, so maybe bit-by-bit the word will get out about what a powerful distributed system Solr really is.

Let me setup a specific example using Sitecore 9 update-2 with Commerce. I’ll use the CatalogItemsScope Solr core, but you could review most any Solr core with the standard Sitecore schema.

Let me pause here to explain where the default Sitecore Commerce Solr Configuration defines this search index, because it took some digging. In the  . . . SIF\Configuration\Commerce\Solr\sitecore-commerce-solr.json file you’ll find:

// The names of the cores to create
“Customers.Name”: “[concat(parameter(‘CorePrefix’), ‘CustomersScope’)]”,
“Orders.Name”: “[concat(parameter(‘CorePrefix’), ‘OrdersScope’)]”,
“CatalogItems.Name”: “[concat(parameter(‘CorePrefix’), ‘CatalogItemsScope’)]”,

Using a fresh IaaS install of Sitecore 9 with Commerce, I’ll go to the Solr admin interface. Select the CatalogItemsScope Solr core from the dropdown list on the left side navigation. Choose the “Analysis” option to access this powerful way of evaluating two key operations one does with Solr: indexing and querying.  Solr defines different sets of analyzers for these operations, sort of like how Sitecore exposes the httpRequestBegin pipeline or other extensibility points that projects are always customizing. For this exercise, I’ll focus on the Query operation. The documentation on this Analysis screen has a lot more information on this, if you’re interested.

Enter the search phrase “an AWESOME television!” in the textbox for the Field Value(Query), and then specify the text_general as the Fieldname to Analyse: step1

Solr is going to run “an AWESOME television!” through the analysis process and show the results as they progress through each step of that analysis. For this write-up, I’ll uncheck the “Verbose Output” checkbox — but definitely play around with that feature as it shows the offsets and ordinal positions as Solr works it’s magic through each transformation.

After clicking the blue “Analyse Values” button you’ll see output like the following:

step2

Each row of output starts with an abbreviation (ST, SF, etc). That’s the key to which process of the “analyzer” has run; the pipe-separated list to the right of the abbreviation shows the search phrase as it’s progressing through Solr’s transformation logic.

  1. ST is the StandardTokenizerFactory. I removes the “!” punctuation mark, but otherwise doesn’t make any changes to the search query.
  2. SF is the StopFilterFactory. This would remove words listed in the CatalogItemsScope\conf\stopwords.txt file. In this case, with an out-of-the-box Sitecore 9 Commerce installation, there are no stopwords specified. This doesn’t change our search query in any way.
  3. SGF is the SynonymGraphFilterFactory. This component reads a list of synonyms from CatalogItemsScope\conf\synonyms.txt . . . and “television” is one of the samples they provide in that file, so now our query includes those synonyms
    • synonyms.txt includes this entry for example purposes:
      • Television, Televisions, TV, TVs
  4. The final step, LCF, is to run through LowerCaseFilterFactory which is takes “AWESOME” to “awesome” to ensure case isn’t evaluated in the query.

To drive home the point of this example, change the Fieldname to “text_en” from “text_general” and click the blue “Analyse Values” button. The results are different because the text_general and text_en Solr fields have a different set of components defined for the Query operation. Specifically, text_en adds:

  • EPF (EnglishPossessiveFilterFactory)
  • SKF (KeywordMarkerFilterFactory)
  • PSF (PorterStemFilterFactory)

Here’s what the Solr Analysis screen looks like:

step3

There is a lot that could be said about all this, and I’ll probably build on this in a future blog post . . . but I want to return to Anton’s original point about the Porter stemmer and highlight how powerful the Porter stemmer can be. Solr’s text_general is significantly different than text_en, and hopefully I’ve shed light on precisely how they differ on the query side. The docs at http://snowball.tartarus.org/algorithms/porter/stemmer.html review this in some detail. I’ve also used this online Porter stemmer to quickly see how words are decomposed into their key fragments for search relevancy. You don’t really need that online stemming tool, however, since you now see how to use the Solr administrative UI with the standard “text_en” field to have your own Porter stemming sandbox.

Quick note on onPublishEndAsyncSingleInstance vs onPublishEndAsync

This is more a note for my benefit — for search index update strategies, the onPublishEndAsyncSingleInstance’ makes the ‘onPublishEndAsync’ a deprecated option.
The legacy onPublishEndAsync remains to ensure backwards compatibility, but from Sitecore 9.0 onward it’s the default index update strategy used by Sitecore.
With that said, it appears in Sitecore 9.0 update-2 there’s a major defect with OnPublishEndAsynchronousSingleInstanceStrategy. The ContentSearch.ParallelIndexing.MaxThreadLimit setting is ignored by the onPublishEndAsyncSingleInstance strategy — so incorrect thread limits can be used (slow perf!). Sitecore’s patch reference # 285903 can be requested through Sitecore Support to address this.
I suppose it’s a consequence of the new  onPublishEndAsyncSingleInstance not having a mature and well-tested codebase surrounding it (onPublishEndAsync has been around for ages!).