INFO: Why’d we get rid of Single Instance Storage in Exchange Server 2010?

Posted by: kurtsh | December 15, 2011


A customer asked the question: Doesn’t the elimination of Single Instance Storage in Exchange Server 2010 mean that our Exchange stores are going to balloon in size?

We addressed this back in February 2010.
We HAD to change our storage architecture to provide dramatically better IO performance, huge increases in scalability, and greater flexibility.

One of our main goals for Exchange 2010 was to provide very large mailboxes at a low cost. Disk capacity is no longer at a premium; disk space is very inexpensive and IT shops can take advantage of larger, cheaper disks to reduce their overall cost. In order to leverage those larger capacity disks, you also need to increase mailbox sizes (and remove PSTs and leverage the personal archive and records management capabilities) so that you can ensure that you are designing your storage to be both IO efficient and capacity efficient.

During the development of Exchange 2010, we realized that having a table structure optimized for SIS was holding us back from making the storage innovations that were necessary to achieve our goals. In order to improve the store and ESE, to change our IO profile (from many, small, random IOs to larger, fewer, more sequential IOs), and to resolve our inefficiencies around item count, we had to change the store schema. Specifically, we moved away from a per-database table structure to a per-mailbox table structure.
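To make the space trade-off concrete, here is a minimal sketch of the two layouts. It is a toy model only, not Exchange's actual schema: the function names, the `deliveries` tuples, and the `ref_bytes` pointer size are all hypothetical, chosen just to show why single instancing saves space and why a per-mailbox layout gives that saving up.

```python
from collections import defaultdict

def per_database_size(deliveries, ref_bytes=16):
    """Toy single-instance model: each distinct message is stored once
    per database, and every recipient holds only a small reference.
    ref_bytes is a made-up pointer size, not an Exchange value."""
    bodies = {}
    refs = 0
    for _mailbox, msg_id, size in deliveries:
        bodies[msg_id] = size   # one shared copy per message
        refs += ref_bytes       # one reference per delivery
    return sum(bodies.values()) + refs

def per_mailbox_size(deliveries):
    """Toy per-mailbox model: every mailbox keeps its own full copy
    in its own table, so nothing is shared between mailboxes."""
    tables = defaultdict(int)
    for mailbox, _msg_id, size in deliveries:
        tables[mailbox] += size
    return sum(tables.values())

# One 10,000-byte message sent to three people, plus one 4,000-byte
# message sent to a single person (hypothetical sizes).
deliveries = [
    ("alice", "m1", 10_000),
    ("bob", "m1", 10_000),
    ("carol", "m1", 10_000),
    ("alice", "m2", 4_000),
]

shared_bytes = per_database_size(deliveries)
private_bytes = per_mailbox_size(deliveries)
```

In this toy case the per-mailbox layout stores more bytes, but each mailbox's data sits together in its own table, which is the kind of locality that helps turn many small random IOs into fewer, more sequential ones.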

This architecture, along with other changes to the ESE and store engines (lazy view updates, space hints, page size increase, B+ tree defrag, etc.), not only netted us a 70% reduction in IO over Exchange 2007, but also substantially increased our ability to store more items in critical path folders.

The outcome was a bit of Exchange database growth. To compensate, we implemented targeted compression, which eliminated the growth side effect.

As a result of the new architecture and the other changes to the store and ESE, we had to deal with an unintended side effect. While these changes greatly improved our IO efficiency, they made our space efficiency worse. In fact, on average they increased the size of the Exchange database by about 20% over Exchange 2007. To overcome this bloating effect, we implemented a targeted compression mechanism (using either 7-bit or XPRESS, which is the Microsoft implementation of the LZ77 algorithm) that specifically compresses message headers and bodies that are either text or HTML-based (attachments are not compressed as typically they exist in their most compressed state already). The result of this work is that we see database sizes on par with Exchange 2007.
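Here is a rough sketch of why compressing only text and HTML makes sense. Python has no XPRESS codec in its standard library, so this uses `zlib` (DEFLATE, which is also LZ77-based) purely as a stand-in. The sample HTML body is made up, and random bytes stand in for an attachment that is already compressed.

```python
import os
import zlib

def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)

# Hypothetical HTML message body: repetitive markup and text.
html_body = (
    b"<html><body>"
    + b"<p>Quarterly status update for the team.</p>" * 200
    + b"</body></html>"
)

# Random bytes approximate an already-compressed attachment
# (for example a ZIP or JPEG): there is no redundancy left to remove.
attachment = os.urandom(len(html_body))

text_ratio = compressed_ratio(html_body)         # shrinks dramatically
attachment_ratio = compressed_ratio(attachment)  # stays at roughly full size
```

For scale: to cancel out a 20% increase in database size, compression has to save about 1 - 1/1.2, or roughly 17%, of the grown database overall.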

Here are the articles from the Exchange Team Blog that explain why SIS’s removal from Exchange 2010 ends up being a really good thing!

Dude, Where’s My Single Instance? (February 2010)
http://blogs.technet.com/b/exchange/archive/2010/02/22/dude-where-s-my-single-instance.aspx
Top 10 Exchange Storage Myths (March 2010)
http://blogs.technet.com/b/exchange/archive/2010/03/29/3409629.aspx


Kurt Shintaku's Blog