Review of Minion Reindex

Back in October 2014 Midnight SQL released v1.0 of Minion Reindex, a free, open-source index maintenance solution. I’m all for making my day job easier, so once I heard about this I wanted to get it into a development environment asap to see what it could do. Unfortunately other things got in the way, so I’m only getting around to reviewing it now.

I’m going to go through the deployment, the main objects and an initial test of the solution, but first off, let’s address the massive elephant in the room: Ola Hallengren’s maintenance solution. Ola’s scripts are tried and tested, and I’ve been running them in my production environment for god knows how long, so it’s going to take something pretty special to make me change. Midnight SQL are aware of this and have detailed the differences between the two solutions here.

Going through the list, the features of Minion Reindex that caught my eye were:-

  • Minimal job configuration – reduced amount of job/job steps needed
  • Exclude databases/tables without altering jobs
  • Configure database/table thresholds and settings without altering jobs
  • Monitoring/Collecting initial and post fragmentation

Ola’s solution does require a fair amount of configuration in the job steps. It also logs all of its information into one table, dbo.CommandLog, which stores some of that information as XML. That’s fine to be honest, but I’m looking to see if this solution can make my life simpler. Can it reduce the amount of time I spend making adjustments to index maintenance (i.e. – the jobs) on my servers? Will it give me more information than I have now? Will that information be easier to retrieve (i.e. – not having to parse XML)?

Deployment
Deployment is a cinch: just download the .zip file, extract it and then run the MinionReindexInstall.sql script in the database you wish to deploy to. One thing I don’t like is the ASCII art at the top of the script – it looks a bit tacky for my taste, but each to their own.

Another thing to mention is that the script will enable xp_cmdshell, something I’m not particularly keen on because of the security concerns surrounding it (see here https://msdn.microsoft.com/en-us/library/ms175046.aspx), so I’d rather have it turned off unless it is absolutely necessary. I’m only testing on one instance of SQL Server 2012 Developer Edition and I am not using the centralised logging feature (which is why the script enables xp_cmdshell), so I removed the code that enables xp_cmdshell from the deployment script.
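If you’d rather not edit the deployment script, you can also check and disable xp_cmdshell after installation using sp_configure. A quick sketch:

```
-- Check whether xp_cmdshell is currently enabled (run_value = 1 means on)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell';

-- Turn it back off if the install script enabled it
EXEC sp_configure 'xp_cmdshell', 0;
RECONFIGURE;
```

Obviously don’t do this if you do want the centralised logging feature, as it depends on xp_cmdshell being on.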

Objects Created
The script creates its own schema, a scalar function, three stored procs and nine tables. I’m not going to go through what each of them does as they are all detailed in the documentation that comes in the .zip file, but here are the main ones:-

Tables
IndexMaintLog – Index operations breakdown by database
IndexMaintLogDetails – Index operations breakdown by individual index
IndexSettingsDB – Maintenance settings at database level
IndexSettingsTable – Maintenance settings at index level

Stored Procedures
IndexMaintMaster – Main procedure which analyses databases and then calls IndexMaintDB
IndexMaintDB – Performs the actual index maintenance in the databases

SQL Server Agent Jobs
MinionReindexDBs-All-All – Performs the heavy work i.e. – rebuilding indexes
MinionReindexDBs-All-REORG – Performs light weight maintenance (reorganising)

Fairly straightforward, right?

Initial Run
So the table that’ll start you off is IndexSettingsDB. Here you’ll add a row for each database in your SQL instance OR just leave the default row. There are loads of settings here, allowing you to finely tune your index maintenance for each database without having to touch the Agent jobs. Initially I just inserted a row for each database (using the default settings) and gave the MinionReindexDBs Agent job a whirl.

Whilst the job is running, have a look at the IndexMaintLog and IndexMaintLogDetails tables. These will show you what operations the job has performed, what it is currently doing and, in the case of IndexMaintLogDetails, what is queued up.

The level of detail in these tables is fantastic. The IndexMaintLog table will give you a good overview of what has been done in each database, the execution time, the number of tables processed and not only the number of indexes processed but the number of indexes reorganised vs rebuilt.

Moving on to IndexMaintLogDetails, we’re now getting down to the individual index stats. So we’ve got the execution time of both the reindex and the stats update, initial fragmentation, post fragmentation (if you’ve enabled that option in IndexSettingsDB) and the index usage stats (seeks, scans, lookups etc.). I’ve always separated out my collection of stats from my maintenance tasks, but I’m definitely going to switch on the post-frag collection and see how it performs.

These two tables are absolutely screaming out for an SSRS report to be slapped over the top of them. And it would be pretty easy to write: with only two tables to query, you could track trends in your databases and see changes in fragmentation patterns over time.

Further Testing
The next things for me will be setting up table-level exclusions in certain databases and setting the order in which databases and tables are reindexed. Once that’s all set up I’ll get both Agent jobs on a schedule, start fragmenting my test databases (I’ll shrink them) and then analyse the stats.

Conclusion
I like this solution: it’s easy to deploy, configure and run. I’m going to let it run in my dev environments for a while to see how it performs, with a view to deploying it to production.

I’d definitely recommend that DBAs download this, install it in a development environment and have a play around.

What’s also encouraging about this solution is that Midnight SQL list on their website the features that will be included at a later date. There are some features that I’d like to see – system table maintenance and the ability to rebuild individual partitions would probably be at the top of my list – but it’s good to know that there will be future releases (and with that, hopefully support).

Minion Reindex can be downloaded from here:- http://www.midnightsql.com/minion/

Error Log Management

Thought I’d write a quick post about managing error logs within SQL Server. Good error log management is essential for investigating issues that occur in SQL Server. The last thing you want is a very large error log file that takes ages to open when you are trying to determine the cause of a problem.

I always change the maximum number of error logs to 30 (at a minimum) and then recycle the error log each night. This way I can keep a month’s worth of logs in easily manageable portions (this does depend on the system but one day’s worth of logs in each file is generally OK).
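If you want to check what a server is currently set to before changing anything, the companion procedure xp_instance_regread will read the same registry value back. A quick sketch:

```
-- Read the current NumErrorLogs setting back from the registry
DECLARE @NumErrorLogs INT;

EXEC master.dbo.xp_instance_regread
	N'HKEY_LOCAL_MACHINE',
	N'Software\Microsoft\MSSQLServer\MSSQLServer',
	N'NumErrorLogs',
	@NumErrorLogs OUTPUT;

-- NULL here means the value has never been changed from the default of 6
SELECT @NumErrorLogs AS [NumErrorLogs];
```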

So to change the maximum number of error logs, go to Management > SQL Server Logs, right-click and select Configure. Change the maximum number of error logs (I set it to 30) and then hit the Script button, which will give you the following:-

USE [master];
GO

EXEC xp_instance_regwrite 
	N'HKEY_LOCAL_MACHINE', 
	N'Software\Microsoft\MSSQLServer\MSSQLServer', 
	N'NumErrorLogs', 
	REG_DWORD, 
	30;
GO

Microsoft has provided the stored procedure sys.sp_cycle_errorlog, so this can be set up to run nightly as a SQL Server Agent job. However, I have found that the following error can occur on busy systems (running SQL Server 2005, I’m afraid):-

“Unable to cycle error log file from ‘…\ERRORLOG’ to ‘…\ERRORLOG.1’ due to OS error 32 (The process cannot access the file because it is being used by another process.)”

When this happens the old log is archived but the new log is not initialised, meaning that no information will be written to the log until it is recycled manually (once you’re in the office). Not good, especially if an issue has occurred in that time.

So to try and combat this, I created the following stored procedure to try to recycle the error log a maximum of three times if an error occurs:-

CREATE PROCEDURE [dbo].[RecycleErrorLog]
AS
BEGIN
	SET NOCOUNT ON;

	DECLARE @Tries TINYINT;
	SET @Tries = 1;

	WHILE @Tries <= 3
	BEGIN
		BEGIN TRY
			EXEC master.sys.sp_cycle_errorlog;
			BREAK;
		END TRY
		BEGIN CATCH
			SELECT ERROR_NUMBER() AS ErrorNumber;
			SET @Tries = @Tries + 1;
			CONTINUE;
		END CATCH;
	END;
END;
GO

I set the maximum number of retries to 3 because if the job fails three times in a row, I would say that there is something more severe going on which needs to be investigated.

So I’m feeling pretty clever, and the stored procedure is running fine until one day:-

“Unable to cycle error log file from ‘…\ERRORLOG’ to ‘…\ERRORLOG.1’ due to OS error 32 (The process cannot access the file because it is being used by another process.)”

Ah, no problem, I tell myself. The error will be generated but the procedure will re-run and recycle the SQL Server Error Log. So… I stroll into work, check the logs and… they haven’t recycled. Not only that, nothing has been recorded since the attempt to cycle. I’m back to where I started!

I check the SQL Server Agent job’s history and the job is telling me that it ran successfully. It seems that the error is not returned to the procedure; all there is in the SQL Server Error Log is this:-

“Attempting to cycle error log. This is an informational message only; no user action is required.”

I look back at the original error, there was an important part that I missed!

“…due to OS error 32…”

It’s an OS error, so it’s not going to be recorded in SQL Server. The only place the error is recorded is in the Application Event Log. So it looks like I’m going to have to run the stored procedure and then check the Application Event Log for errors. The best way I can think of doing this is using PowerShell, so here’s the script I wrote:-

#Set counters
$MaximumRetries = 3
$RetryCount = 0
$ErrorChecker = 0

#Open connection to SQL Server
$SQLServer = "MYSERVER"
$SQLDBName = "master"

$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server=$SQLServer;Database=$SQLDBName;Integrated Security=True"
$SqlConnection.Open()

$SqlCmd = New-Object System.Data.SqlClient.SqlCommand("sp_cycle_errorlog", $SqlConnection)
$SqlCmd.CommandType = [System.Data.CommandType]'StoredProcedure'

#Run sp_cycle_errorlog (ExecuteNonQuery - we don't need a result set back)
$SqlCmd.ExecuteNonQuery() | Out-Null

#Check the application log for SQL Server errors raised in the last few seconds
$datetime = (Get-Date).AddSeconds(-5)
$events = Get-EventLog application -EntryType Error -After $datetime | Where-Object {$_.Source -like "MSSQL*"}

do
{
	#If the application log has an error, retry up to the maximum
	if ($events)
	{
		if ($RetryCount -lt $MaximumRetries)
		{
			#Re-execute the stored procedure
			$SqlCmd.ExecuteNonQuery() | Out-Null

			#Re-check the application log
			$datetime = (Get-Date).AddSeconds(-5)
			$events = Get-EventLog application -EntryType Error -After $datetime | Where-Object {$_.Source -like "MSSQL*"}

			$RetryCount = $RetryCount + 1
		}
		else
		{
			#Out of retries - give up and let the job alert fire
			$ErrorChecker = 1
		}
	}
	#If the application log has no errors, we're done
	else
	{
		$ErrorChecker = 1
	}
}
while ($ErrorChecker -eq 0)

$SqlConnection.Close()

I saved this as a .ps1 file and then created a SQL Server Agent Job to run it. I still get alerts for the error but each time the script has done its job and re-run.

Has anyone else seen this behaviour? If so, I’d be interested in hearing how you dealt with it, my email is dbafromthecold@gmail.com

Index Fragmentation

So you’ve setup a bunch of indexes in your database to tune the queries coming in. They’re all running much faster than before and you’re leaning back in your chair feeling pretty pleased with yourself. You’re done right?

Well…no. As data gets inserted into and deleted from tables within a database, the indexes on the tables become fragmented. Fragmentation is defined as:-

“Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file.”

But does it matter that the logical ordering does not match the physical ordering? Microsoft published a white paper here which showed that, as fragmentation increased, the time taken for each read from disk increased whilst the average size of each read decreased. I know the tests were performed using SQL Server 2000 but the principle is still relevant. They summarise the results as:-

“The workload performance increase realized in the small-scale environment ranged from 60 percent at the low level of fragmentation to more than 460 percent at the highest level of fragmentation. The workload performance increase realized for the large-scale environment ranged from 13 percent at the low fragmentation level to 40 percent at the medium fragmentation level”

Note – This is the summary of the DSS workload test, complex queries against multiple levels of fragmentation

To summarise:-
Small-scale environment – defragmenting the indexes gave a performance increase of 60% at a low level of fragmentation and 460% at a high level of fragmentation
Large-scale environment – defragmenting the indexes gave a performance increase of 13% at a low level of fragmentation and 40% at a medium level of fragmentation

Wow, removing the fragmentation in the indexes gave a performance increase of between 13% and 460% (across the two environments).

The paper also discusses why performance in the large-scale environment was less affected by fragmentation – basically, that environment had a better IO subsystem – which means that throwing money at hardware can appear to solve a fragmentation problem but you are really just papering over the cracks.

But, what DBA doesn’t like improving the performance of a system without having to spend any money? “Look, I’ve improved the performance by X%!”. (cue smug face)

So a plan to reduce the fragmentation of the indexes needs to be implemented. But how do you approach this? Rebuild every index nightly? Will your system allow you to do that? What if you are running Standard Edition and you cannot rebuild indexes online?
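For reference, the two operations look like this (the table and index names here are just placeholders). REBUILD WITH (ONLINE = ON) is the Enterprise Edition-only option, whilst REORGANIZE is always an online operation, in every edition:

```
-- Reorganise: always online, available in every edition, can be stopped safely
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REORGANIZE;

-- Rebuild online: Enterprise Edition only
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REBUILD WITH (ONLINE = ON);

-- Rebuild offline: works in Standard Edition but takes a schema modification
-- lock, blocking access to the table for the duration
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REBUILD;
```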

The way I approach index maintenance is to customise it for each database I manage. To do this, the indexes that require either a rebuild or a reorganise need identifying. Microsoft have provided the dynamic management function sys.dm_db_index_physical_stats which allows us to do just that:-

SELECT
	[schemas].name AS [SchemaName],
	[objects].name AS [TableName], 
	[indexes].name AS [IndexName],
	[dm_db_index_physical_stats].index_depth,
	[dm_db_index_physical_stats].avg_fragmentation_in_percent,
	[dm_db_index_physical_stats].fragment_count,
	[dm_db_index_physical_stats].avg_fragment_size_in_pages,
	[dm_db_index_physical_stats].page_count,
	GETUTCDATE() AS [CaptureDate]
FROM 
	[sys].[dm_db_index_physical_stats](DB_ID(),NULL,NULL,NULL,'LIMITED' ) AS [dm_db_index_physical_stats]
INNER JOIN 
	[sys].[objects] AS [objects] ON ([dm_db_index_physical_stats].object_id = [objects].object_id)
INNER JOIN 
	[sys].[schemas] AS [schemas] ON ([objects].[schema_id] = [schemas].[schema_id])
INNER JOIN 
	[sys].[indexes] AS [indexes] ON ([dm_db_index_physical_stats].object_id = [indexes].object_id
								AND  [dm_db_index_physical_stats].index_id = [indexes].index_id )
WHERE 
	[dm_db_index_physical_stats].index_type_desc <> 'HEAP'
AND 
	[dm_db_index_physical_stats].avg_fragmentation_in_percent > 20;
GO

Note – this will pick up data on all indexes that are over 20% fragmented.

By adding GETUTCDATE() we can monitor and track the fragmentation over time, allowing us to identify which tables’ indexes are becoming fragmented the quickest. It’s nice and simple to set up: first, create a table to hold the data:-

CREATE TABLE dbo.[IndexAnalysis]
		(AnalysisID					INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
		SchemaName					SYSNAME,
		TableName					SYSNAME,
		IndexName					SYSNAME,
		IndexDepth					INT,
		AvgFragmentationInPercent	FLOAT,
		FragmentCount				BIGINT,
		AvgFragmentSizeInPages		FLOAT,
		PageCount					BIGINT,
		CaptureDate					DATETIME);
GO

Then create a stored procedure to capture the data and insert it into the table:-

CREATE PROCEDURE dbo.[CaptureIndexFragmentation] AS

INSERT INTO dbo.[IndexAnalysis]
SELECT
	[schemas].name AS [SchemaName],
	[objects].name AS [TableName], 
	[indexes].name AS [IndexName],
	[dm_db_index_physical_stats].index_depth,
	[dm_db_index_physical_stats].avg_fragmentation_in_percent,
	[dm_db_index_physical_stats].fragment_count,
	[dm_db_index_physical_stats].avg_fragment_size_in_pages,
	[dm_db_index_physical_stats].page_count,
	GETUTCDATE() AS [CaptureDate]
FROM 
	[sys].[dm_db_index_physical_stats](DB_ID(),NULL,NULL,NULL,'LIMITED' ) AS [dm_db_index_physical_stats]
INNER JOIN 
	[sys].[objects] AS [objects] ON ([dm_db_index_physical_stats].object_id = [objects].object_id)
INNER JOIN 
	[sys].[schemas] AS [schemas] ON ([objects].[schema_id] = [schemas].[schema_id])
INNER JOIN 
	[sys].[indexes] AS [indexes] ON ([dm_db_index_physical_stats].object_id = [indexes].object_id
								AND  [dm_db_index_physical_stats].index_id = [indexes].index_id )
WHERE 
	[dm_db_index_physical_stats].index_type_desc <> 'HEAP'
AND 
	[dm_db_index_physical_stats].avg_fragmentation_in_percent > 20;
GO

This can then be run as a nightly job so that you can track the fragmentation of the indexes over time.
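The nightly job itself can also be scripted. A minimal sketch using the msdb stored procedures (the job name, target database and schedule here are just examples – adjust to taste):

```
USE [msdb];
GO

-- Create the job
EXEC dbo.sp_add_job
	@job_name = N'CaptureIndexFragmentation';

-- Add a step that runs the capture procedure in the target database
EXEC dbo.sp_add_jobstep
	@job_name = N'CaptureIndexFragmentation',
	@step_name = N'Capture fragmentation',
	@subsystem = N'TSQL',
	@database_name = N'MyDatabase',
	@command = N'EXEC dbo.CaptureIndexFragmentation;';

-- Schedule it to run every night at 01:00
EXEC dbo.sp_add_jobschedule
	@job_name = N'CaptureIndexFragmentation',
	@name = N'Nightly',
	@freq_type = 4,			-- daily
	@freq_interval = 1,		-- every day
	@active_start_time = 010000;

-- Target the local server
EXEC dbo.sp_add_jobserver
	@job_name = N'CaptureIndexFragmentation';
GO
```

Once the data is coming in, the following query will show the trend for each index:-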

SELECT 
	SchemaName AS [Schema Name],
	TableName AS [Table Name],
	IndexName AS [Index Name],
	AvgFragmentationInPercent AS [Current % Avg Fragmentation],
	LAG(AvgFragmentationInPercent,1) OVER (PARTITION BY TableName, IndexName ORDER BY CaptureDate) AS [Previous % Avg Fragmentation],
	AvgFragmentationInPercent -
	LAG(AvgFragmentationInPercent,1) OVER (PARTITION BY TableName, IndexName ORDER BY CaptureDate) AS [Difference],
	CaptureDate AS [Capture Date]
FROM 
	dbo.IndexAnalysis

Note – This SELECT statement uses the LAG() function which is only available in SQL Server 2012 onwards.
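If you’re on an earlier version, the same result can be had with ROW_NUMBER() (available from SQL Server 2005) and a self-join. A sketch, assuming the dbo.IndexAnalysis table above:

```
-- Number each capture per index, then join each row to the previous capture
WITH NumberedCaptures AS
(
	SELECT
		SchemaName, TableName, IndexName,
		AvgFragmentationInPercent, CaptureDate,
		ROW_NUMBER() OVER (PARTITION BY TableName, IndexName
						   ORDER BY CaptureDate) AS RowNum
	FROM dbo.IndexAnalysis
)
SELECT
	[Current].SchemaName AS [Schema Name],
	[Current].TableName AS [Table Name],
	[Current].IndexName AS [Index Name],
	[Current].AvgFragmentationInPercent AS [Current % Avg Fragmentation],
	[Previous].AvgFragmentationInPercent AS [Previous % Avg Fragmentation],
	[Current].AvgFragmentationInPercent
		- [Previous].AvgFragmentationInPercent AS [Difference],
	[Current].CaptureDate AS [Capture Date]
FROM NumberedCaptures AS [Current]
LEFT JOIN NumberedCaptures AS [Previous]
	ON  [Previous].TableName = [Current].TableName
	AND [Previous].IndexName = [Current].IndexName
	AND [Previous].RowNum = [Current].RowNum - 1;
```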

Once you have the data you can then decide how to approach the maintenance. You could use something like:-

SELECT
	'ALTER INDEX [' + IndexName + '] ON [' + SchemaName + '].[' + TableName + '] ' +
	(CASE
		WHEN (AvgFragmentationInPercent >= 20 AND AvgFragmentationInPercent < 40) THEN 'REORGANIZE'
		WHEN AvgFragmentationInPercent >= 40 THEN 'REBUILD'
	 END) AS [SQL Statement]
FROM
	dbo.[IndexAnalysis]
This will generate the required ALTER INDEX statements; you could then use sp_executesql to perform each operation. Once you have the data, you can proceed with whatever approach best suits your environment. One thing that I have found is that, with large indexes in a busy system, catching fragmentation early and performing frequent REORGANIZE operations instead of less frequent REBUILDs is preferable, as REORGANIZE operations are less resource intensive.
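To execute the generated statements directly, one option is a simple cursor over the same query. A sketch (in practice you’d also want to restrict this to the most recent CaptureDate, so you’re not acting on stale fragmentation numbers):

```
-- Build and execute each ALTER INDEX statement in turn
DECLARE @Sql NVARCHAR(MAX);

DECLARE MaintenanceCursor CURSOR LOCAL FAST_FORWARD FOR
	SELECT
		'ALTER INDEX [' + IndexName + '] ON [' + SchemaName + '].[' + TableName + '] ' +
		CASE
			WHEN AvgFragmentationInPercent >= 40 THEN 'REBUILD'
			ELSE 'REORGANIZE'
		END
	FROM dbo.[IndexAnalysis]
	WHERE AvgFragmentationInPercent >= 20;

OPEN MaintenanceCursor;
FETCH NEXT FROM MaintenanceCursor INTO @Sql;

WHILE @@FETCH_STATUS = 0
BEGIN
	EXEC sys.sp_executesql @Sql;
	FETCH NEXT FROM MaintenanceCursor INTO @Sql;
END;

CLOSE MaintenanceCursor;
DEALLOCATE MaintenanceCursor;
```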