Sitecore Logging to the database (part 3 – the performance picture)

This is part 3 of a series looking at customizing Sitecore to use logging to the database instead of the default file-based logging.  Refer to the first and second posts to get a full grasp of what I’m talking about here.

Asynchronicity

I alluded to this earlier, but the Log4Net ADONetAppender is not asynchronous, despite the buffering behavior discussed briefly in the second post in this series.   It stands to reason that logging to a database could negatively impact performance in a high-traffic situation with lots of logging. I recommend load testing with the same level of logging (WARN, ERROR, etc) to confirm the database logging does not introduce a performance bottleneck. Furthermore, it is worthwhile to run load tests with more verbose logging levels (such as INFO) to gauge any potential impact on the implementation. If a situation develops that requires a short-term change to the logging level, it’s impotant to understand repurcussions.

To provide for asynchronous logging through a custom ADONetAppender, one can build an asynchronous ADONetAppender for log4net such as https://github.com/cjbhaines/Log4Net.Async. This is not officially part of log4net nor officially Sitecore supported code, but is an open-source library for asynchronous logging based on log4net.  It’s one approach.  In preliminary tests, this project is written using a different version of log4net but updating it to be compatible with the log4net library referenced in Sitecore.Logging.dll should be possible (but beyond the scope of this write-up — but I may revisit this subject in a follow-up post).  This post has another promising example of building the asynchronous support to log4net: https://issues.apache.org/jira/browse/LOG4NET-407?jql=project%20%3D%20LOG4NET%20AND%20text%20~%20asynchronous.

Performance Analysis

Given the synchronous nature of the ADONetAppender in log4net, an examination of the performance impact of logging to the database is something we need to pay close attention to.

I did a modest analysis of page load times using the configurations described in this blog series. For this analysis, tests were run on a single computer (IIS combined with SQL Server) with JMeter exercising a Sitecore 7.2 site with a basic page writing a randomized INFO, WARN, and ERROR log message through the Sitecore.Diagnostics object. The code ran in the code-behind of the page_load event of a SubLayout with no component caching in effect. The BufferSize setting for the database appender was set to “1” to avoid any buffering effect (except in the specific test as noted below).

A pattern emerges showing a consistently faster page response time, albeit only slightly, for the Stored Procedure approach over the plain SQL text alternative.
Basic JMeter Page Load Analysis

  All Logging Disabled Plain SQL Logging Stored Proc SQL Logging File System Logging File & Stored Procedure Stored Proc with default BufferSize
20 Users (4,000 samples) 149 ms 180 ms 169 ms 158 ms 150 ms 147 ms
100 Users (40,000 samples) 840 ms 940 ms 884 ms 788 ms 936 ms 832 ms

The results show page load time in milliseconds. The data may say more about the workstation they were run on than about the performance of one logging implementation over another in a true high traffic situation, but the relative comparison is informative. Implementations should run their own tests to evaluate the particulars of their situation and environment, of course.

The basic analysis done here supports the concept of using a Stored Procedure when logging to a database instead of plain SQL, and taking advantage of the BufferSize setting. It’s interesting that having logging entirely disabled is nearly the same as logging to the database using a Stored Procedure and the default BufferSize value (512 messages buffered at a time). Again, these are conclusions drawn from a limited set of test data.

A general rule drawn from this analysis is that compared with the Sitecore installation defaults (logging to the local file system), logging to a database as we’ve gone over in this blog series may add .05 of a second (50 milliseconds) to the overall page processing time.  This is just a rough figure, but can serve as a starting point for planning a Sitecore logging strategy, and of course each implementation is different so running your own perf tests is absolutely critical!

Data Clean-Up

Typically, Sitecore automatically cleans up the text-based log files on a regular basis. One should have a clean-up routine in place for when logs are stored in the database, too. Consider adding a scheduled SQL Server Task to run clean-up logic such as:

DELETE CustomLog
WHERE [Date]<DATEADD(day,DATEDIFF(day,0,GETDATE()-30),0)

This example would remove data over 30 days old from the log file. One could schedule the frequency of the clean-up and the exact SQL to run based around the parameters of the implementation (and the SQL Server capacity and environment). Taking backups of the logged data before cleaning up could be in order, for example, if a responsibility exists to keep records for long-term audit purposes or just to be able to view changes over time.

Conclusion

This exploration of logging to the database in Sitecore is representative of what my group does a lot of at Sitecore.  A customer might have a single line on a requirements document stating: “log to the database” and then we get involved to understand why the customer wants to do it and how a customer might best make it happen.  Like nearly every facet of software development, things can get complicated beneath the surface.  Be suspicious of quick and easy answers, and this logging conversation is packed with nuance.  At some point (especially if a customer needs it!), I may elaborate on the asynchronous logging approach I sketched out above — but with the performance numbers I generated, I feel confident that database logging isn’t a non-starter for a high-traffic site.  The potential is there, of course, for database logging to be a bottleneck so one must be very deliberate in how or when to pursue it.

Sitecore Logging to the database (part 2 – earning the good developer badge)

This is a follow-up to my introduction on customizing Sitecore logging to use a database instead of a flat file.  This post covers some advancements over the basics in the first post, and sets the table for a conversation about performance and Sitecore database logging in the follow-up.

Centralizing the Connection String

If the database to write log output to is referenced in other capacities in the implementation, a database connection string pointing to the database will likely reside in app_config/ConnectionStrings.config. To avoid having multiple connection string configurations to the same database (one in the <log4net> definition and one in the <connectionStrings> definition), you may want to centralize the log4net connection string information and make exclusive use the ConnectionStrings.config entry. This eases maintenance and earns you the “good developer” badge for this functionality.

The following node should be added to the <appender> XML configuration, using our CustomDBAppender from the earlier post as the example:

<connectionStringName value="logRepository"/>

This can replace the entry for <param name=”connectionString” …

And in the App_Config/ConnectionStrings.config file one needs to add something such as the following:

<add name="logRepository" connectionString=" user id=[login];password=[pwd];Data Source=[server];Database=[CustomLogDB]" />

The DBAppender class in our example needs modifications to support this centralization of the connection string. Specifically, add a simple string property to the class to support the custom <connectionStringName/> node we added previously:

private string _cnString = String.Empty;
public string connectionStringName
{
  get
  {
    return _cnString;
  }
  set
  {
    _cnString = value;
  }
}

And to bring in the new connection string source to the DBAppender class, one should add an override of the ActivateOptions method as follows:

public override void ActivateOptions()
{
  this.ConnectionString = System.Configuration.ConfigurationManager.ConnectionStrings[connectionStringName].ConnectionString;
  base.ActivateOptions();
}

Using a Stored Procedure

The general performance benefits of using a Stored Procedure over plain SQL Command Text are well-known. When logging to the database, the potential for easier execution plan re-use in SQL Server is one reason to use a stored procedure; another good reason is the ability to limit the amount of “logic” within web.config – using a stored procedure can make maintenance easier. Again, let’s earn that “good developer practice” badge!

The steps to alter the configuration to use a stored procedure are straight forward:
1. Create a stored procedure in the database that will encapsulate the SQL INSERT statement:

CREATE PROCEDURE logFromSitecore
	@log_date		DateTime
	, @thread		varchar(255)
	, @log_level	varchar(255)
	, @logger		varchar(255)
	, @machine		varchar(4000)
	, @message		varchar(4000)
	, @custom		varchar(4000)
AS
BEGIN
	INSERT INTO CustomLog ([Date],[Thread],[Level],[Logger],[Message],[Machine],[Custom]) 
	VALUES (@log_date, @thread, @log_level, @logger, @message, @machine, @custom)
END

2. Change the XML configuration for the node in web.config to reference the stored procedure:

<appender name="DatabaseAppender" type="CustomDBAppender.DBAppenderWrapper, CustomDBAppender" >
<param name="CommandText" value="logFromSitecore" />
<param name="CommandType" value="StoredProcedure" />
...

This is all that’s required for the logging to now run through a stored procedure instead of using plain SQL command text. In the follow-up post about performance, I will return to this topic in more detail . . . but this covers the mechanics of using a stored procedure with your database logging.

Data Volume

No matter which approach one takes to managing the connection string or the question of stored procedures vs in-line SQL, managing the volume of data logged to the database is an important consideration.
Care should be taken in configuring the amount of data logged, just to guard against an overwhelming volume of database writes becoming a bottleneck. Log4net supports this in a number of ways, one method is the “priority” setting that is under the node in the log4net web.config definition. The “priority” setting filters out log messages below that value, so a < priority value=”WARN”/> would only write WARN messages and higher. You may want to consider using the ERROR value, even, to be more conservative about this. For production environments, I recommend updating the priority to WARN or ERROR for the configuration defined in section above. By default, Sitecore sets this to INFO. Change it to something like the following:

<root>
  <priority value="WARN"/>
  ...

BufferSize Setting

The ADONetAppender keeps a buffer of log messages in memory, and persists the messages to the database once the buffer is full. This buffer can be configured via the “bufferSize” setting () under the node in the log4net web.config definition. By default, Reflector tells me the value is 512. When checking the database table for a specific message, it’s important to know that a specific recent message might be buffered and not written to the database because of this setting. In a production environment, Sitecore recommends against setting a bufferSize value lower than the default (512), for performance reasons; a value of “1,” for example, would write every log message to the database as soon as it is generated and could create considerable traffic to the database. Depending on the environment, one may consider setting this bufferSize value higher than the default; in a development environment, however, a value of “1” may be desireable to have messages written immediately to the database.

Recognizing that the database appender is buffering messages, it’s conceivable that a server outage could result in a loss of any buffered messages that haven’t been written to the database. In such a circumstance, the logged activity could be vital to diagnosing the circumstances of the outage. With this scenario in mind, I recommend the following:

Maintain a minimal file-based Sitecore log on the server (keeping only the most recent 2 days, for example); this would have a better chance of capturing any log messages potentially lost due to a sudden outage.  In a hypothetical outage situation, one could resort to checking the file-based log on the server in question.  

ReconnectOnError Setting

The ADONetAppender has a “ReconnectOnError” setting which controls if this appender will try to reconnect to the database on error. The default behaviour is for the appender to not try to reconnect to the database if an error occurs (ReconnectOnError is false by default). Given a scenario where there’s a temporary database connection problem, subsequent logging events are discarded if the ReconnectOnError setting is set to false. To force the appender to attempt to reconnect to the database, set this property to true in the definition in web.config:

  <ReconnectOnError value="true"/>

Note: When the appender attempts to connect to the database there may be a delay of up to the connection timeout specified in the connection string. This delay will block the calling application’s thread. Until the connection can be reestablished this potential delay may occur multiple times. To mitigate such a scenario, one might consider introducing true asynchronicity into the ADONetAppender (more on this later).

OK, to recap these two posts on logging to the database with Sitecore:

  • Post one (enough to be dangerous):
    • The custom ADONetAppender Class
    • Log Database and Table
    • XML Configuration
  • Post two (earning the good developer badge):
    • Centralizing your ConnectionString
    • Using a Stored Procedure
    • Advanced settings to consider

In the third installment, I’ll address performance in some detail and asynchronicity.

Sitecore Logging to the database (part 1 — enough to be dangerous)

By default, Sitecore supports instrumentation/logging using the Log4Net library, and by default that output is written to the file system. This can be undesirable when it comes to troubleshooting or reviewing logged activity on secured environments, particularly in scaled multi-server implementations.  I was working with a Sitecore customer recently who was determined to use the database as the destination for logging (despite certain perf implications, more on this later).  They had a number of Content Delivery servers and having a consolidated view of the logged activity, easily accessible to Sitecore developers and queryable, drove this decision.

Note that in my capacity at Sitecore, I’m not advocating for logging to the database, but I am in a position to respond to the customer on how an implementation might tackle database logging if they were determined to do it.  Hopefully that’s enough caveats to satisfy the likes of Sitecore’s Chris Castle 🙂

ADONetAppender Class

To begin, the log4net.Appender.ADONetAppender is the standard log4Net object used to write log output to the database. Instead of using the default ADONetAppender, I recommend creating a custom class to extend the ADONetAppender. This provides a powerful mechanism for implementation specific logging logic, so that details not typically captured in a Sitecore log can be tracked (such as the executing machine, Sitecore user, URL, etc).

A basic customized ADONetAppender might look as follows:

namespace CustomDBAppender
{
  public class DBAppenderWrapper: log4net.Appender.ADONetAppender
  {
    protected override void Append(log4net.spi.LoggingEvent theEvent)
    {
      log4net.helpers.PropertiesCollection col = theEvent.Properties;
      if (!object.ReferenceEquals(Environment.MachineName, null))
      {
        col["machine"] = Environment.MachineName;
      }
      col["custom"] = "Best NHL team: Avalanche";
      base.Append(theEvent); 
   } 
  } 
}

(I will improve the code formatting on future posts, so just bare with me).

The expectation is that the customer would customise the Append method heavily.

One could add conditional logic based on the LoggingEvent.Level (Info, Error, etc), or tap into the Sitecore.Context object by referencing Sitecore.Kernel.dll with the project. Note: if one does reference Sitecore.Kernel.dll, be sure to reference the exact version used in your implementation to avoid conflicts.

CustomDBAppender Assembly

The previous section defines a CustomDBAppender assembly that compiles into CustomDBAppender.dll. There is one class in the assembly: DBAppenderWrapper. This .dll should be placed in the /bin directory of the Sitecore implementation.

Create log database and table

One needs a database and a table to store the logged output. Sitecore recommends against using a Sitecore Core, Master, or Web database for this purpose as those should be reserved for Sitecore application internals. Assuming one creates a database for this purpose named “CustomLogDB”, a table to store the logged output could be created with the following SQL script:

USE [CustomLogDB]
GO
CREATE TABLE [dbo].[CustomLog](
  [ID] [int] IDENTITY(1,1) NOT NULL,
  [Date] [datetime] NOT NULL,
  [Thread] [varchar](255) NOT NULL,
  [Level] [varchar](20) NOT NULL,
  [Logger] [varchar](255) NOT NULL,
  [Message] [varchar](4000) NOT NULL,
  [Machine] [varchar](4000) NULL,
  [Custom] [varchar](4000) NULL
) ON [PRIMARY]

The ID, Date, Thread, Level, Logger, and Message columns are defined to match the default Sitecore logged output. To support a custom implementation (any properties set in the CustomDBAppender.DBAppenderWrapper class), additional columns are warranted. In this example, “Machine” and “Custom” are defined in addition to the Sitecore standard logging columns.

One could build in extensibility here with an XML column type (and support very flexible logging schema with such an “CustomLoggedOutputAsXML” column), but I recommend a simple column structure with 1:1 mapping column to custom attribute unless circumstances warrant a more complicated table. Always evaluate the performance implications of complicating a table definition with XML column types, for example.

XML Configuration

Web.config, by default, defines the log4net settings that one must modify to output logging to a database. The log4net.Appender.SitecoreLogFileAppender Is defined by default under the <log4net> node. Add the following configuration just under the <log4net> node for the CustomDBAppender built in the step above:

<appender name="DatabaseAppender"
type="CustomDBAppender.DBAppenderWrapper, CustomDBAppender" >
<param name="ConnectionType"
value="System.Data.SqlClient.SqlConnection, System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
<param name="connectionString"
value="user id=[login];password=[pwd];Data Source=[server];Database=[CustomLogDB]" />
<param name="CommandText" value="INSERT INTO CustomLog ([Date],[Thread],[Level],[Logger],[Message],[Machine],[Custom]) VALUES (@log_date, @thread, @log_level, @logger, @message, @machine, @custom)" />
<param name="Parameter">
<param name="ParameterName" value="@log_date" />
<param name="DbType" value="DateTime" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%d{yyyy&apos;-&apos;MM&apos;-&apos;dd HH&apos;:&apos;mm&apos;:&apos;ss&apos;.&apos;fff}" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@thread" />
<param name="DbType" value="String" />
<param name="Size" value="255" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%t" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@log_level" />
<param name="DbType" value="String" />
<param name="Size" value="50" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%p" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@logger" />
<param name="DbType" value="String" />
<param name="Size" value="255" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%c" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@message" />
<param name="DbType" value="String" />
<param name="Size" value="4000" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%m" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@machine" />
<param name="DbType" value="String" />
<param name="Size" value="4000" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%P{machine}" />
</param>
</param>
<param name="Parameter">
<param name="ParameterName" value="@custom" />
<param name="DbType" value="String" />
<param name="Size" value="4000" />
<param name="Layout" type="log4net.Layout.PatternLayout">
<param name="ConversionPattern" value="%P{custom}" />
</param>
</param>
</appender>

Note: Be sure to update the connectionstring property to the appropriate value for your environment.

The two parameter definitions for @machine and @custom use a different parameter definition (log4net.Layout.PatternLayout).

In addition to the Appender node, one must edit the <root> node for log4net that indicates where root logging output is directed. Out-of-the-box, Sitecore defines a root node as follows:

<root>
<priority value="INFO"/>
<appender-ref ref="LogFileAppender"/>
</root>

Using the example we’ve built through this document, the updated root node would be defined as follows:

<root>
<priority value="INFO"/>
<!-- custom db appender info:-->
<appender-ref ref="DatabaseAppender" />
<!--<appender-ref ref="LogFileAppender"/>-->
</root>

This completes the modifications necessary to the XML configuration.

This is just the end of the beginning.  This is enough to be dangerous with database logging in Sitecore, and I will shortly post follow-ups to this elaborating on considerations for the connection string, custom settings, asynchronicity, etc.

Who is this guy?

Introductory blog post . . . here goes . . . I’m Grant Killian and I used to blog a lot back in the olden days when .Net was new and there was not much documentation or true consensus on web development and the Microsoft platform.  16 years ago C# was new, we could barely spell JIT, and the question “Java or Microsoft” was something we all legitimately lost sleep over.  No Microsoft web developers knew what a virtual method was, much less a virtual machine.  Azure was the name some Crayola marketer came up with to sell a different shade of blue, not a hosting environment nirvana where they serve chocolate milk at every meal and you setup production environments in minutes.
grant

I was a computer nerd working as a programmer with Microsoft and Java technologies.  The day the .Net platform became generally available, I started writing production code with it and for a good year prior to that I had teammates working on beta versions of the new Microsoft technology; this was one of those showcase projects where Microsoft swoops in at the last minute to make it look all shiny and perfect before it was used as a developer conference demo and case study.

I remember the project using PocketPC and IIS in cutting edge fashion, and I also recall the project never actually seeing real use in production because of changing scope and corporate restructuring.

In about 2002, I started a .Net user group in Virginia with the clever name and URL WeProgram.Net (it’s now some Chinese ecommerce site — I should’ve kept the domain!).  That user group was a great vehicle for obtaining technical books from publishers who would give them to us to promote their label, and man were we reading a lot of technical books in those days.  There was tons of collaboration through blogs, newsgroups, and conferences . . . there was so much to learn and so little documentation.

Fast forward to now, and software development has evolved and considerably matured.  I’ve shifted employers and projects a few significant times, including an awesome 4 year stint in Switzerland with a tiny little bank and towering snow-capped mountains.  All along, I’ve worked on web technology projects, but I left the blogging world about a decade ago (while my friends at CodeBetter.com are still going strong, amazingly!).

Life can be circular.  I side with Toynbee.  So here we go again.

I’m now employed at Rackspace; I previously worked at Sitecore for a couple years on their professional services team.  I now assist Sitecore customers at Rackspace, helping them make the most of the platform.  It’s always something new, and challenging, and usually fun in a nerdy way

Like with the early days of the .Net framework, development for Sitecore involves a fluid ecosystem where smart people innovate and share their ideas with the community.  There are libraries and customizations of all shapes and sizes.

Sitecore is like a new planet just discovered, with various colonies of hippie Marxists (the MVC/Glass cognoscenti?), back-to-the-land minimalists (the Sitecore scalability/perf purists?), and religious zealots (the Sitecore Analytics true-believers?).  Technologists are doing very interesting and creative things with Sitecore.

This blog will be where I catalog observations and experiences while working on these assorted Sitecore projects.  A field log.  Hopefully, others can learn through my scribblings.  I learn a lot from the online Sitecore community, so quid pro quo right?  I can’t promise to contribute with great regularity, but I’ll share notes on this blog based on my experiences as time permits.