Where is my data stored?

Tables within SQL Server can contain a maximum of 8060 bytes per row. However even though columns are limited to a maximum of 8000 bytes in size, the combined size of all the columns in a table can exceed the 8060 limit. But what happens when this limit is exceeded?

Let’s create a database and have a look:-

USE [master];
GO

CREATE DATABASE [StorageTest];
GO

Now create a table to test with:-

USE [StorageTest];
GO

CREATE TABLE dbo.[Test1]
( PKID		 INT,
  FirstName	 CHAR(24),
  MiddleName CHAR(24),
  LastName	 CHAR(24));
GO

The length of each row within this table will be:-

Column Data Type Size
PKID INT 4 bytes
FirstName CHAR(24) 24 bytes
MiddleName CHAR(24) 24 bytes
LastName CHAR(24) 24 bytes
Total 76 bytes

More info on data types and their sizes

Let’s insert one row and have a look at the size:-

INSERT INTO dbo.[Test1]
(PKID, FirstName, MiddleName, LastName)
VALUES
(1, 'Andrew', 'Philip', 'Pruski');
GO
EXEC sp_spaceused 'Test1';
GO

InitialTableSize1

We can see that 1 row is in the table which is 8KB in size. This is because SQL stores data on 8KB pages so 8KB is the minimum size for a table which has data in it! We can see this by running the undocumented DBCC command, DBCC IND:-

DBCC IND('StorageTest','Test1',1);
GO

TableAllocations1

Not going to go into what each of the columns mean but you can see that this table has two rows. Both classified as “in-row data”, the first row is page type 10 which is an IAM page and the second row is page type 1 (a data page), which contains the row we inserted.

More info on DBCC IND and page types

“In-row data” is exactly what it sounds like, data in the rows on the page.

What happens when we add a column that will take the size of the row over the 8060 limit? I commonly see large columns added to tables with the reason of “we don’t know how big the data will be so we’re making this column as big as SQL will allow” *sigh*. For example:-

ALTER TABLE dbo.[Test1]
ADD [Address] VARCHAR(8000);
GO

Let’s populate that column with some data:-

UPDATE dbo.[Test1]
SET [Address] = 'zz'
WHERE PKID = 1;
GO

What’s the size of the table now?

EXEC sp_spaceused 'Test1';
GO

InitialTableSize1

Still the same, good, but if this column is completely filled to its maximum so the size of a row will be over the limit, what will happen to the size of the table and where will the data be stored?

Let’s fill the column up for the row that is already in the table:-

UPDATE dbo.[Test1]
SET [Address] = REPLICATE('zz',4000)
WHERE PKID = 1;
GO

Now we have a row that has exceeded the maximum size allowed, what has that done to the size of the table?

EXEC sp_spaceused 'Test1';
GO

TableSize2

Huh? The size of the table has tripled from 8KB to 24KB! What’s going? Let’s run DBCC IND again and have a look:-

DBCC IND('StorageTest','Test1',1);
GO

TableAllocations2

There are now two new pages allocated to the table, classified as “Row-overflow data”. Because we’ve gone over the row limit, SQL has allocated a new IAM page and a new page to store the data that has gone over the limit. This means that just by adding that column and populating it to its maximum limit the table has noticeably increased in size. If there were thousands/millions of rows in this table there would be a big impact on the performance of queries SELECTing from it due to the extra amount of page reads that would have to be performed.

I know this is a very contrived example but it’s good to see how SQL reacts to columns being added to tables that can make a row exceed its maximum size.

I have to admit though, I rarely see VARCHAR(4000) added to a table. What’s far more common is VARCHAR(MAX). How does SQL react to having a column with this datatype added to a table? Well, let’s find out.

First reset our table (dropping the column and rebuilding to bring it back down to its original size):-

ALTER TABLE [Test1] DROP COLUMN [Address];

ALTER TABLE [Test1] REBUILD;
GO

Add the column:-

ALTER TABLE dbo.[Test1]
ADD [Address] VARCHAR(MAX);
GO

Populate the row with data:-

UPDATE dbo.[Test1]
SET [Address] = REPLICATE('zz',4000)
WHERE PKID = 1;
GO

Check the size of the table:-

EXEC sp_spaceused 'Test1';
GO

TableSize3

Again the table has tripled in size (ignore the reserved size, it’s larger as we’ve been messing about). If we look at the page allocations we can see why:-

DBCC IND('StorageTest','Test1',1);
GO

TableAllocations3

SQL has allocated two new pages to the table, classified as”LOB data”. This time SQL has pushed out the data in the NVARCHAR(MAX) column to a special page type designed for large data types. Other data types that this applies to are nvarchar(max) and varbinary(max) (also text, ntext and image in previous versions).

So in summary, most data within SQL Server is stored on pages classified as “In-row data”. If the combined size of columns in a row exceeds 8060 bytes, then SQL will push data onto pages classified as “Row-overflow data”. Finally if certain LOB data types are used, then SQL stores that data on pages classifed as “LOB data”.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s