04/09/2018

The Hidden Danger of Readable Secondaries in SQL Server AlwaysOn Availability Groups

by Dmitri Korotkevitch

The following post is an excerpt from the author’s upcoming book on Expert SQL Server Transactions and Locking.

SQL Server AlwaysOn Availability Groups have become the most common High Availability technology used with SQL Server. This technology provides protection at the database group level, and it stores a separate copy of each databases on each server or node, which eliminates storage from being a single point of failure in the infrastructure.

Moreover, the Enterprise Edition of SQL Server allows you to configure read-only access to the secondary nodes to scale the read-only workload in the system. Doing so however, may lead to unexpected side effects on the primary node in the group.

When you run the queries against secondary nodes, SQL Server always uses the snapshot isolation level, ignoring the SET TRANSACTION ISOLATION LEVEL statement and any locking hints. Snapshot isolation eliminates possible readers/writers blocking, and that isolation level is used even if you do not enable ALLOW_SNAPSHOT_ISOLATION database option.

Snapshot isolation also means that SQL Server will use row versioning on the primary node. You may not be able to use optimistic isolation levels programmatically when they are not enabled; nevertheless, SQL Server will use row versioning internally. The databases on the primary and secondary nodes are the same, and it is impossible to use row versioning only on the secondary nodes.

There are several obvious implications of row versioning in the system. First, it introduces additional tempdb load on the servers. SQL Server copies old versions of rows to the version store in tempdb during data modifications, appending 14-byte version store pointers to the modified or deleted rows in the database. These pointers, in turn, may lead to page splits on fully populated data pages, and increase index fragmentation. As the tip, do not use FILLFACTOR = 100 for indexes when you have row versioning enabled.

However, row versioning also leads to another, less known phenomenon. Long running snapshot transactions on secondary nodes may defer ghost and version store cleanup on the primary node. Such transactions work with snapshots of the data as of the time when a given transaction began. Therefore, SQL Server cannot remove deleted rows and reuse their space because of the possibility that a snapshot transaction needs to access the old versions of the rows.

Let's look at an example. Begin by creating two tables in the database as shown in Listing 1. The table dbo.T1 will have 65,536 rows and will use 65,536 pages - one row per data page. Table dbo.T2 is empty.

Listing 1. Table creation (Run on primary node)

create table dbo.T1
(
ID int not null,
Placeholder char(8000) null,

constraint PK_T1
primary key clustered(ID)
);

create table dbo.T2
(
Col int
);

;with N1(C) as (select 0 union all select 0) -- 2 rows
,N2(C) as (select 0 from N1 as T1 cross join N1 as T2) -- 4 rows
,N3(C) as (select 0 from N2 as T1 cross join N2 as T2) -- 16 rows
,N4(C) as (select 0 from N3 as T1 cross join N3 as T2) -- 256 rows
,N5(C) as (select 0 from N4 as T1 cross join N4 as T2 ) -- 65,536 rows
,IDs(ID) as (select row_number() over (order by (select null)) from N5)
insert into dbo.T1(ID)
select ID from IDs;

As the next step, start a transaction on the secondary node and run a query against the dbo.T2 table as shown in Listing 2. Even though you are using an explicit transaction, the same behavior will occur in the case of a long-running statement in an autocommitted transaction – think about complex or non-optimized SELECT statements that run for a long time.

Listing 2. Starting a transaction on the secondary node

begin tran
select * from dbo.T2;

Next, delete all data from dbo.T1, and run a query that will trigger Clustered Index Scan of the table on the primary node. The code for such a query is shown in Listing 3.

Listing 3. Deleting data and performing a table scan (Run on primary node)

delete from dbo.T1;
go

-- Waiting 1 minute
wait for delay '00:01:00.000';

set statistics io on
select count(*) from dbo.T1;
set statistics io off

--Output: Table 'T1'. Scan count 1, logical reads 65781

Despite that the table is empty, the data pages have not been deallocated. That those pages are still allocated leads to significant I/O overhead on the primary node.

Now let's look at index statistics using the code from Listing 4.

Listing 4. Checking index statistics (Run on primary node)

select index_id, index_level, page_count, record_count, version_ghost_record_count
from sys.dm_db_index_physical_stats(db_id(),object_id(N'dbo.T1'),1,NULL,'DETAILED');

Figure 1 shows the output of the query. As you can see, the leaf index level shows 65,536 rows in version_ghost_record_count column. This column contains the number of ghosted rows that cannot be removed due to active transactions that rely on row versioning in the system. In our case, the transaction requiring the ghosted rows runs on the secondary node.

New Content Item

^{Figure 1. Index statistics}

There is nothing special about this behavior of retaining ghosted rows. The ghost and version store cleanup tasks would behave the same if snapshot transactions were running on the primary node. What you might not have expected though, is to see that the same impact is introduced by a transaction on the secondary node.

Remember this behavior of retaining ghosted rows when you plan to use readable secondaries in a “set it and forget it” manner. It is common to see setups in which database teams have built readable, AlwaysOn Availability Group secondaries for reporting purposes while completely ignoring the workload on those secondaries after creating them. Such an approach may backfire and lead to unexpected performance issues on primary nodes in the system.

On the flip side, there is absolutely no reason to avoid using optimistic isolation levels when you have readable secondaries enabled. SQL Server already uses row versioning internally even if you do not enable it in the database. The overhead is there regardless, and you might as well benefit from it.

About the Author

Dmitri Korotkevitch is a Microsoft Data Platform MVP and Microsoft Certified Master (SQL Server 2008) specializing in the design, development, and performance tuning of complex OLTP systems that handle thousands of transactions per second around the clock. Dmitri blogs at AboutSQLServer.com, and is author of the following recent books on SQL Server:

This article was contributed by Dmitri Korotkevitch, author of Expert SQL Server Transactions and Locking.