The Deal With Reverse Proxies and Sitecore

Once in a while the topic of a reverse proxy with Sitecore comes up in client conversations. I think it’s not out of the question with Sitecore, but it can certainly complicate an implementation. Let me share a few of the complications I’m aware of, and some remedies.

Reverse Proxy for Media Requests

I know anecdotally of customers using a reverse proxy just for media items, where they customized the publishing pipeline to clear the reverse proxy cache for the specific item. This blog post explains it further: http://sitecoreblog.patelyogesh.in/2014/05/improve-sitecore-media-performance.html. There are problems with this approach, however, if the request contains a language in the file path (en-us/item/…) and in some other cases. I would consider this experimental, as I haven’t personally seen it working in a real implementation, although a few blogs out there claim it’s 100% reliable for them.

I know of one Sitecore implementation pursuing an approach somewhat like this, where they farm out media requests to dedicated Sitecore servers that only handle media. This is one way they’re working to pull the overhead of returning media off of the main Sitecore IIS servers and onto another set of servers tuned for media requests. A key difference here is that instead of this being a true reverse proxy, they’re using load balancer rules to steer media requests to specific Sitecore servers. From 1 million miles up, one could consider this a reverse proxy, but on closer inspection it’s really a load balancer connecting to dedicated Sitecore media servers.

Reverse Proxy for Everything

I also know of customers using a reverse proxy more generally for content. They must set ‘Analytics.ForwardedRequestHttpHeader’ to ‘X-Forwarded-For’ or ‘X-Real-IP’, depending on how their proxy represents the original IP address. Through this setting, Analytics can make use of the client IP and provide content as one would expect from Sitecore. This is the standard response to “how does one configure Sitecore to work with a reverse proxy.”
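
As a rough illustration, the setting can be changed with a Sitecore include (patch) file rather than by editing the stock config. This is a minimal sketch: the file name is hypothetical, and the header value is an assumption that has to match whatever your proxy actually sends.

```xml
<!-- Hypothetical file: App_Config/Include/ReverseProxy.config -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Assumed value; use X-Real-IP instead if that is what your proxy emits -->
      <setting name="Analytics.ForwardedRequestHttpHeader">
        <patch:attribute name="value">X-Forwarded-For</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```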

One Known Sitecore Gotcha & Resolution

Amazon CloudFront does what many reverse proxies do and pushes a comma-separated list of IPs into its HTTP header (see the Client IP Address section in the Amazon docs), so it comes through as X-Forwarded-For: client-IP-address, proxy-IP-address, another-proxy-IP-address.

By default, when ForwardedRequestHttpHeader is set, Sitecore pulls the last address from the comma-delimited string in the forwarded request header, which for a header like the one above is a proxy address rather than the client address. The logic lives in the Process method of the Sitecore.Analytics.Pipelines.CreateVisits.XForwardedFor processor; the line in question is the one that calls Split(...).Last():

        public override void Process(CreateVisitArgs args)
        {
            string forwardedRequestHttpHeader = AnalyticsSettings.ForwardedRequestHttpHeader;
            if (!string.IsNullOrEmpty(forwardedRequestHttpHeader))
            {
                string str2 = args.Request.Headers[forwardedRequestHttpHeader];
                if (!string.IsNullOrEmpty(str2))
                {
                    // This is the logic in question: the LAST entry of the comma-separated list is used.
                    string str3 = str2.Split(new char[] { ',' }).Last().Trim();
                    if (string.IsNullOrEmpty(str3))
                    {
                        this.LogWrongIp(forwardedRequestHttpHeader, str2);
                    }
                    else
                    {
                        IPAddress address;
                        try
                        {
                            address = IPAddress.Parse(str3);
                        }
                        catch (FormatException)
                        {
                            this.LogWrongIp(forwardedRequestHttpHeader, str2);
                            return;
                        }
                        args.Visit.Ip = address.GetAddressBytes();
                    }
                }
            }
        }

The resolution for this is to swap in a custom assembly and config file (Sitecore support can hook you up with that if you need it; reference issue #421555) that changes this use-the-last-IP logic through a new Analytics.ForwardedRequestHttpHeaderGetFirstIP setting. This may be folded into the main Sitecore product at some point, since it’s recognized as a common challenge for some reverse proxies.
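
To make the difference concrete, here is a minimal sketch of “use the first IP” parsing. It is not the code from the Sitecore support assembly, which I haven’t reproduced here; ForwardedIpParser and TryGetClientIp are hypothetical names, and the class stands on its own so it can be tried outside of Sitecore.

```csharp
using System;
using System.Linq;
using System.Net;

// Hypothetical helper, not part of Sitecore: reads the FIRST entry of a
// comma-separated forwarded header instead of the last one.
public static class ForwardedIpParser
{
    public static bool TryGetClientIp(string headerValue, out IPAddress address)
    {
        address = null;
        if (string.IsNullOrWhiteSpace(headerValue))
        {
            return false;
        }

        // "client, proxy1, proxy2" -> "client"
        string first = headerValue.Split(',').First().Trim();

        // TryParse returns false on bad input instead of throwing FormatException.
        return IPAddress.TryParse(first, out address);
    }
}
```

Inside a custom processor, the parsed address would presumably be assigned the same way the stock code does it, via address.GetAddressBytes(). Keep in mind that the first entry in the header is supplied by the client side of the chain, so it is only as trustworthy as the proxies in front of Sitecore make it.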

One Known Sitecore Gotcha Without Resolution

If you’re running with SSL, so that a URL such as https://theSite.com/Sitecore goes to your reverse proxy and is transformed into a non-SSL request, as in http://theSite.com/Sitecore, then once it reaches Sitecore some client dialogs using IFrames won’t work because the src value is http and not https. I think the main point here is to be careful around SSL and a reverse proxy with your site, especially the Sitecore Authoring environment. I’m no reverse proxy expert, so I don’t know how widespread a challenge this is, but I would thoroughly test any Sitecore configuration using a reverse proxy with SSL certificates in place.

Conclusion

With these complications in mind, I would suggest one avoid the reverse proxy for Sitecore requests as a general rule, but if security or other external pressures drive the decision, it’s not out of the question. Just test it thoroughly and consider engaging Sitecore services or support if you run into obstacles. The answer from Sitecore could be that what you’re doing is not officially supported, and that’s my main reservation. The reverse proxy is not a fully supported technology for Sitecore. Sitecore QA doesn’t thoroughly test reverse proxy configurations (from what I’ve seen), and there are questions around how sticky sessions and other pieces of the HTTP request will work when a reverse proxy is introduced.