Shredded Storage Whitepaper

This is the data and information from the CodePlex project on the Shredded Storage testing framework.  It was to be released several months ago, but things just didn't work out the way we all planned, so I'm posting it here:

Purpose of this whitepaper

Shredded Storage has up to this point been a very elusive feature of SharePoint 2013.  There have been several posts in the community that have attempted to tackle this incredibly advanced topic, and although some have been spot on in most regards, they were far from complete when describing the intricate dance that occurs throughout the many layers of SharePoint and Office.  Each of the authors had different goals and motivations for creating this whitepaper, ranging from simple curiosity to the much more directed and measurable product-vendor implications for features like RBS and deduplication.  The result of our ambitions is this highly technical white paper and supporting code, which uncovers how Shredded Storage works in various scenarios and how it can be improved in the future.  We have also provided a guide on how you can reach the same conclusions that we have here.

Introduction to Shredded Storage

Shredded Storage is a new data platform improvement in SharePoint 2013 related to the management of large binary objects.  Shredded Storage is designed to accomplish three tasks: reduce storage, optimize bandwidth, and optimize file I/O.  The idea behind Shredded Storage is that if parts of subsequent writes back to the database are the same, there is no need to save those parts again, and you therefore save on storage, network, and disk I/O.  This particular feature was originally designed to manage heavy write scenarios such as co-authoring.

All of this is accomplished by simply breaking the blobs apart into smaller pieces.  It really is as simple as that, but the details are much more complex.  Imagine manually taking a piece of paper and ripping it into a bunch of pieces.  If the paper has an image on it, it makes sense that you should be able to piece it back together easily.  The difference here is that your brain is a pretty amazing machine and can recognize the pieces.  Now take a blank piece of paper.  How easy would it be to put it back together if you randomly rip it apart?  Actually, it would be quite easy, as the pieces will fit together nicely.  Now say you put the blank piece of paper into a shredder that cuts everything into pieces of the same size.  How easy would it be to put back together?  Not very easy.  You would need something to guide you in putting it back together, maybe some ultraviolet markers.  That is where the implementation of Shredded Storage comes into play.

Shredded Storage is much more than simply a set of stored procedures in a SQL Server database.  It is actually a tiered solution made up of three layers: the SharePoint layer (which includes the basic object model and other services like CellStorage.svc and OneNote.svc that support Office clients and Office Web Apps), the Cobalt layer, and the database backend.  The combination of these three parts is what defines Shredded Storage.  However, as you will learn, the Windows Operating System has a MAJOR role to play in everything.

This white paper will dive very deeply into all three layers
and how they work with each other. 

SharePoint Layer (Object Model)

When it comes to downloading and uploading files, SharePoint uses the SharePoint Object Model to make calls to the SPFile methods (SaveBinary and OpenBinary).  These methods in turn call methods on the SPRequest class, and this is where the magic happens.  This code, which is not managed code but lives in the COM portion of the SharePoint application, makes a call to generate an SPFileStreamStore based on the SPFile information.  The SPFileStreamStore is managed by the SPFileStreamManager class, which is the key class for managing the Cobalt layer on the SharePoint side.  You will also see some other Microsoft.SharePoint.Cobalt* classes that help with the interaction.
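
To make the entry points above concrete, the following is a minimal sketch (in C#, the language of the accompanying testing framework) of the two object model calls that trigger the write and read paths described in the next sections.  The site URL, library name and file path are illustrative only.

// Minimal sketch: an SPFile round trip through the object model.
// The upload drives the write path (ComputeStreamsToWrite and shredding);
// the download drives the read path (GetBlobsById / GetBlobsAfterBsn).
using System.IO;
using Microsoft.SharePoint;

class ObjectModelRoundTrip
{
    static void Main()
    {
        using (SPSite site = new SPSite("http://server/sites/test"))      // illustrative URL
        using (SPWeb web = site.OpenWeb())
        {
            byte[] content = File.ReadAllBytes(@"C:\temp\sample.docx");   // illustrative file

            // Upload: SPFolder.Files.Add ultimately calls down through SPRequest
            // to produce the shreds stored in the content database.
            SPFile uploaded = web.GetFolder("Documents").Files.Add("sample.docx", content, true);

            // Download: SPFile.OpenBinary reassembles the shreds back into a single byte array.
            byte[] roundTripped = uploaded.OpenBinary();
        }
    }
}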

Writing Files

When it comes to writing a file to the SharePoint database, the SPFileStreamManager will execute its ComputeStreamsToWrite method.  This method is in charge of creating an SPFileStreamStore object, which is then used to create a CobaltStreamRevokableContainer object.  That object in turn contains a CobaltFilePartition, which contains a schema obtained from the SPHostBlobStore.

Reading Files

When it comes to reading a file from the database, the SPFileStreamStore is again the main class that contains these methods.  The most important methods are GetBlobsById and GetBlobsAfterBsn; these two methods are where blobs are retrieved from the database.  A set of stream ids is passed to the proc_GetStreamsById stored procedure and the first chunk of data is returned.  A data reader is opened and each blob is converted to an SPCoordinatedStreamBuffer.  So where does the FileReadChunkSize come into play?  It is used as an input parameter to the proc_GetStreamsById stored procedure to tell SQL Server to break the file content up into smaller parts (the procedure is called only once, and for each requested shred it returns no more than the FileReadChunkSize).  After the first return of data, if there is more to be returned, subsequent calls are made using the GetStream stored procedure, passing in the offset for the content and yet again returning no more than the FileReadChunkSize per call.  These parts are aggregated by the SPCoordinatedStreamBuffer, and once all parts of the shreds have been retrieved they are put back together into the actual shreds for consumption by the Cobalt layer.

These fully reassembled shreds are turned into Cobalt HostBlobStore.Blob objects, and a list of these is returned from the method.  The data of each shred is converted to a Cobalt Atom.

The net result is that the value of the FileReadChunkSize does have an effect on the number and size of the result sets that are returned to the web front end.  If this setting is several multiples smaller than the FileWriteChunkSize, you will see a roughly corresponding multiple in the time it takes to download the file.
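
As an illustration of the read behavior described above, the database round trips needed to reassemble a single shred can be approximated as the shred size divided by the FileReadChunkSize, rounded up.  This is a simple arithmetic sketch, not SharePoint API; the method and parameter names are our own.

// Illustrative only: estimate how many stored procedure calls are needed to pull back
// one shred, given the first proc_GetStreamsById call and the follow-up GetStream calls,
// each of which returns at most FileReadChunkSize bytes.
static int EstimateReadRoundTrips(long shredSizeBytes, long fileReadChunkSizeBytes)
{
    return (int)((shredSizeBytes + fileReadChunkSizeBytes - 1) / fileReadChunkSizeBytes);
}

// Example: a 640 KB shred (FileWriteChunkSize raised to 640 KB) read with a 64 KB
// FileReadChunkSize needs roughly 10 round trips, which is why a FileReadChunkSize
// much smaller than the FileWriteChunkSize slows downloads.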

SharePoint Layer (CellStorage.svc)

A separate but related feature to Shredded Storage is the ability of Office clients to do delta updates when working with Office XML based documents that reside in SharePoint.  As part of the process, Office clients will attempt to make a call to the CellStorage WCF service.  If the service is available, the WFE will send only the requested pieces of a file to the client.  On a save operation, only the delta changes to a file are sent back to the WFE.  Note that when this service is not available or an error occurs, the Office client falls back to the normal HTTP PUT mechanism as if it were a regular file share, and the entire file is sent on each save.  When both this delta update mechanism and Shredded Storage are being utilized, you see an optimization both from the client to the WFE and from the WFE to SQL Server.

Delta updates only support Office XML documents.  The reason is that deep inside CellStorage.svc a call is made to create a pointer to an XmlReader object.  The older file formats are obviously not based on XML, so an attempt to read them this way is futile; for those files you will not see any calls to CellStorage.svc, only the regular HTTP PUT calls.  The CellStorage protocol (which is part of the Cobalt assemblies) supports various command types:

·         GetDocMetaInfo
·         WhoAmI
·         ServerTime
·         Cell (get and set)
·         Coauth
·         SchemaLock
·         ReleaseLock

There is a standard process to this:

·         First, a request for the Document MetaInfo is sent.
·         Second, the parts of the document that are being viewed/edited at that moment are retrieved (cell get).
·         Third, a request is made to start editing the document (requesting a schema lock).
·         Fourth, any changes the user makes are sent back (cell set).
·         Last, the client tells the server it is done (releasing the schema lock).

During this entire process, the client pings the sharedaccess.asmx web service to ensure that it is the only one editing the document.  This happens about every 20 seconds.  As part of the request, it looks for the ETag to change.  If the ETag has changed, someone else updated the document, the version you have is now stale, and you will need to refresh your copy or overwrite what they did.  This scenario should never happen, but the client checks for it in case it somehow does.

CellStorage.svc requests and responses use the older XML format for their messages.  This is a costly approach, because XML is a bloated format compared to the newer JSON format used in Office Web Apps communications.  We hope one day this protocol will change to support the newer format and increase performance even further.

SharePoint 2013 has several WCF services aside from the CellStorage service.  Many of these services reside in the ISAPI directory of the SharePoint Root (C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\ISAPI).  If you open the web.config file of the ISAPI directory, you will find that all the WCF service bindings are configured with a maxBufferSize of “4194304”.

This limits the size of the files from which the Office clients will be able to download file shreds.  As you will see later, IO performance decreases dramatically when you set a FileWriteChunkSize higher than 4 MB.
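
If you want to confirm the binding limits on your own farm, a small sketch like the following can enumerate the maxBufferSize values in the ISAPI web.config.  It uses only standard .NET XML APIs; adjust the path for your installation drive.

// Sketch: list every element in the ISAPI web.config that carries a maxBufferSize attribute.
using System;
using System.Xml;

class BindingCheck
{
    static void Main()
    {
        string path = @"C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\ISAPI\web.config";
        XmlDocument config = new XmlDocument();
        config.Load(path);

        foreach (XmlNode node in config.SelectNodes("//*[@maxBufferSize]"))
        {
            XmlAttribute name = node.Attributes["name"];
            Console.WriteLine("{0}: maxBufferSize={1}",
                name != null ? name.Value : node.Name,
                node.Attributes["maxBufferSize"].Value);
        }
    }
}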

Office Web Apps 2013 – (Onenote.ashx)

Office Web Apps has an entirely different set of multi-user editing features.  These features use the endpoint OneNote.ashx.  This API works hand in hand with Cobalt to “lock” sections of a document when multiple users are using it.  This multi-user locking operation generates “partitions” in the shredded storage database store.  Partitions are sets of shreds in the content database that correspond to what each user is changing in a file during their editing session.

When using regular Object Model calls, shreds are normally broken apart at the boundaries defined by the Cobalt schema of the file being edited.  When using multi-user editing, an even more granular breakdown of the shreds is performed (down to the separate XML files in the Office XML document) and mapped to the internal XML components being edited.  When the editing sessions are closed, the partitions are collapsed back into a single partition.  Having an editing session “end” is a very important part of the process.  If a user doesn’t close their session, the partitions are left in the database until the service cleans them up.  In our code testing, we found it possible to lock yourself out of editing when you had another session open.  From a coding standpoint, this is the correct behavior.  The default timeout for an Office Web Apps session is 5 minutes.

The process of starting an editing session works like this:

·         A user clicks on an Office XML document in a library.
·         A SharePoint page called WopiFrame.aspx is called.  This page sets up the necessary access tokens that OWA will use for communicating with SharePoint.
·         At this point the IFrame redirects you to the OWA server and initializes the session via the WordViewerFrame.aspx page and the docdatahandler.ashx HTTP handler.
·         A JSON object that fully describes the document is sent to the browser.  This JSON notation is a very complex set of JSON objects that map to the XML elements inside the Office XML document.
·         The user starts an editing session with OWA, which initiates the first call to OneNote.ashx.  From this point on, all operations are done through OneNote.ashx.
·         Every few seconds the browser pings OneNote.ashx to request any changes.  This JSON format and protocol could be the subject of a whitepaper of its own.

As the above conversation between the client browser and Office Web Apps takes place, the Office Web Apps server is talking to CellStorage.svc through the WordCompanionServer class.  This class references the HostEnvironment class for information on where the file is located and what request adapters are implemented to proxy the calls between OWA and the target.  The two most important adapters are the ICellStorageAdapter and the ICoauthAdapter.  ICellStorageAdapter is an interface implemented by the CellStorage.svc WCF service mentioned above.

Now that you have some background on the process and interaction of these components, it’s time to look at the Cobalt layer, which is where the majority of the code behind CellStorage.svc resides.

Cobalt

The shredding part of the process is implemented by a much improved Cobalt API (Microsoft.Cobalt).  Cobalt is the layer responsible for implementing a break-up schema that dissects files into smaller parts, commonly called shreds in SharePoint.  Cobalt is designed to be used with any type of file, not just Office file types.  The Cobalt assembly also contains the code that handles all the FSSHTTP requests discussed earlier.  Cobalt operations include:

·         TBD

Cobalt supports several different schemas that define how a file is broken apart.  Cobalt will accept a file, determine what type of file it is (similar to how search determines a file’s format), and then apply an algorithm based on that file type.  This means that the type of file matters: you cannot simply upload an Office XML file and expect to get the same results as for a simple image file of the same size.  Not only does the algorithm change based on file type, it also changes based on the size of the file.  Table 1 shows the paths that can be taken and the algorithm that is applied when each path is picked.

TABLE 1:

TBD

Each algorithm will break apart the blobs in a different way.  Therefore, a Word Office XML document will break apart differently than an older, purely binary Word document of the same size.  You may also find that images (blobs within blobs, in the case of Office XML documents) are treated differently and may get their own shred.

As Cobalt breaks apart the blob, there is the possibility of two extra shreds being added to the set that store configuration information about the other shreds.  These configuration shreds help the web front end determine whether the shreds have changed and new shreds need to be committed to the database.

 

Database

SPFarm Configuration

In SharePoint 2013, the out-of-box default for the maximum size of these parts is 64 KB; however, this can be configured to be larger.  This FileWriteChunkSize setting is a major focus of this whitepaper.

It is again important to note that the shreds will not simply equal the file size divided by the write chunk size.  The setting is a watermark that is used as a guide by the Cobalt algorithm involved.
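
FileWriteChunkSize is a farm-wide setting.  The sketch below shows how it might be changed through the object model, assuming the FileWriteChunkSize property exposed on SPWebService.ContentService; the 1 MB value is purely illustrative and not a recommendation.

// Hedged sketch: raise the shredding watermark from the 64 KB default (value is illustrative).
using Microsoft.SharePoint.Administration;

class ChunkSizeConfig
{
    static void Main()
    {
        SPWebService contentService = SPWebService.ContentService;
        contentService.FileWriteChunkSize = 1 * 1024 * 1024;  // bytes; out-of-box default is 64 KB
        contentService.Update();                               // persist the farm setting
    }
}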

These configuration shreds are used by the WFE to determine which shreds need to be saved back to the content database.  This keeps the WFE from having to download the entire set of shreds to do a comparison.  These configuration shreds add a small amount to each blob added to the database.  Note, however, that in some circumstances only one shred is created, containing both configuration and data.

This break-up of the blob into shreds doesn’t achieve any storage optimization until you turn on SharePoint versioning on a document library.  When versioning has been enabled, each time a user saves a file the SharePoint WFE breaks the blob apart into segments using the same method it used before.  Each of these shreds is then compared with the shreds already in the database.  If a shred did not change, there is no need to send it to the database for persistence.  This database-level de-duping operation has been proven to achieve 30-40% storage savings.  This is where you gain network optimization between the WFE and the SQL Server.  You also gain optimization from the standpoint of file IO, because you are now only sending and saving a small part of the file rather than the entire file.  This also matters because if you had done a similar operation in SharePoint 2010 but only updated the metadata for a file, you would end up with another full version of the file in the database.  With Shredded Storage this is not the case, and your storage is optimized greatly.  Lastly, if you have implemented a log shipping disaster recovery strategy, you will notice that smaller writes produce smaller transaction logs and drive a more efficient log shipping experience.
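
Because the de-duplication benefit only appears once versioning is on, a test library should have versioning enabled before uploads are measured.  A minimal sketch using standard object model calls (site URL and library name are illustrative):

// Enable versioning on a document library so that subsequent saves can be
// de-duplicated shred-by-shred as described above.
using Microsoft.SharePoint;

class EnableVersioning
{
    static void Main()
    {
        using (SPSite site = new SPSite("http://server/sites/test"))
        using (SPWeb web = site.OpenWeb())
        {
            SPList library = web.Lists["Documents"];
            library.EnableVersioning = true;   // keep major versions
            library.Update();
        }
    }
}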

Content Database – Overview

Once Cobalt and SharePoint have determined the Cobalt
HostBlobStore.Blob(s) that need to be saved, each one is passed to the content
database via the .

Shredded Storage optimization is document focused, which requires a quick description of how SharePoint saves files.  When you upload a document to a document library or add an item with an attachment, SharePoint creates a new list item with a unique id assigned to it.  A unique document id is also created for the file (or for the attachment, when the target is not a document library).  The two are tied to each other and both are unique GUIDs.  When you upload the same file to a second list, a new list item and document id are created that have no relation to the first set.  When it comes to Shredded Storage, it is this document id that is the important part.  So, in summary, if you upload the same document to two different document libraries, you will gain no Shredded Storage benefits.

Content Database – Tables

There are four tables in the database that manage the shredded
storage blobs.  This includes:

·         dbo.DocStreams
·         dbo.DocsToStreams
·         dbo.AllDocs
·         dbo.AllDocVersions

In SharePoint 2010, dbo.AllDocStreams stored the document stream and related data for documents with content streams.  In SharePoint Server 2013, dbo.DocStreams replaces dbo.AllDocStreams, and each row stores a portion of the BLOB.

The improved protocols associated with Shredded Storage identify the rows in the new DocStreams table that need to be updated to support a change, and update the BLOB associated with that change in the corresponding rows.  Several new columns are present in the DocStreams table that represent a shredded BLOB, including:

·         BSN: The BSN of the stream binary piece.
·         Data: Contains a subset of the binary data of the stream binary piece, unless the stream binary piece is stored in Remote BLOB Storage.
·         Offset: The offset into the stream binary piece where this subset data belongs.
·         Length: The size, in bytes, of this subset data of the stream binary piece.
·         RbsId: If this stream binary piece is stored in remote BLOB storage, this value MUST contain the remote BLOB storage identifier of a subset of the binary data of the stream binary piece.  Otherwise it MUST be NULL.

The new DocsToStreams table contains a pointer to the corresponding rows in dbo.DocStreams.  The BLOB Sequence Number (BSN) is used to manage the BLOB sequence across dbo.AllDocVersions, dbo.DocsToStreams, and dbo.DocStreams.  NextBSN is used to manage the last BSN for each BLOB.

dbo.AllDocs contains a single row per file, similar to SharePoint Server 2010.

dbo.AllDocVersions contains one or more rows per file: one row per file version.
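
For exploratory work, the shred layout can be inspected with a simple query that uses only the DocStreams columns named above.  The sketch below runs such a query from C# with ADO.NET; the connection string is hypothetical, and it should only be run against a restored copy of a content database, since querying live content databases directly is unsupported.

// Hedged sketch: summarize shred pieces per BSN in a restored content database copy.
using System;
using System.Data.SqlClient;

class ShredInspector
{
    static void Main()
    {
        string connectionString = "Server=SQL01;Database=WSS_Content_Copy;Integrated Security=true"; // hypothetical
        string query = @"SELECT BSN, COUNT(*) AS Pieces, SUM([Length]) AS TotalBytes
                         FROM dbo.DocStreams
                         GROUP BY BSN
                         ORDER BY BSN";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(query, connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("BSN {0}: {1} piece(s), {2} bytes", reader[0], reader[1], reader[2]);
                }
            }
        }
    }
}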

Content Database – Stored Procedures

The main stored procedures in the process include:

·         TBD

Read and Write Scenarios

Client to WFE (Read)

TBD

WFE to Database (Read)

The following table outlines some of the generic results from our testing when varying the FileReadChunkSize relative to the size of the file (not relative to the size of the FileWriteChunkSize):

FileReadChunkSize (% of file size)     Performance hit
> 12.5%                                Normal read operation
6% - 12.5%                             ~10% hit on read operation
3% - 6%                                ~20% hit on read operation
< 3%                                   ~50% hit on read operation

These values suggest that the default setting anticipates your average file size being under 512 KB.  It should be noted that when you install SharePoint 2013, all files that accompany the software are under this limit.
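
The table above can be codified as a simple lookup for planning purposes.  The thresholds and figures come straight from the table; everything else in this sketch is our own naming.

// Approximate read penalty as a function of FileReadChunkSize expressed as a
// percentage of the average file size (thresholds taken from the table above).
static string EstimateReadPenalty(double chunkSizeAsPercentOfFileSize)
{
    if (chunkSizeAsPercentOfFileSize > 12.5) return "normal read operation";
    if (chunkSizeAsPercentOfFileSize > 6.0)  return "~10% hit on read operation";
    if (chunkSizeAsPercentOfFileSize > 3.0)  return "~20% hit on read operation";
    return "~50% hit on read operation";
}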

Client to WFE (Write)

These calls can be measured using Fiddler tracing.  You can simulate an Office 2010 client by
disabling Shredded Storage and attempting to download a file from SharePoint.

WFE to Database (Write)

TBD…

Remote Blob Storage

Two quick facts: Shredded Storage does not require RBS, and RBS does not require Shredded Storage.  However, you should also be aware that when Shredded Storage is used with RBS, there are side effects to having both enabled.  RBS’s real value is realized only when you are working with larger files.  When combined with Shredded Storage’s default maximum shred size of 64 KB, the actual implementation of RBS can have a negative impact on the performance of retrieving a complete blob from the storage subsystem.  However, some RBS implementations present capabilities that Shredded Storage does not.  One such feature is the ability to actually de-duplicate the shreds.  De-duplication monitors for the same blob being saved and then creates a pointer to a single instance of the blob.  This solves the problem of uploading the same document to multiple libraries and having it use just as much space again in the disk subsystem.  As previously mentioned, this is something that Shredded Storage does not do in its current implementation but, depending on your RBS provider, RBS can.

 

Deduplication

In testing this feature when enabled alongside Shredded Storage, we found that the de-duplication feature actually causes a performance hit when writing files with a target shred size of 64 KB, and that it loses a majority of its efficiency when the Shredded Storage watermark is maxed out.  Proper testing should be done to ensure that any changes you make to the Shredded Storage settings, or features you have implemented on your disk subsystem, are compatible with your performance requirements.

 

Simple Shredded Storage Facts

Shredded Storage can provide network optimizations beyond WFE to SQL when configured with the proper set of tools.  Thus far, you have been presented with the utilization of Cobalt on the WFE to SQL Server side of the wire.  Originally, Cobalt was designed with multi-user editing in mind.  This goes hand in hand with SharePoint perfectly, as it is a collaboration platform, and a common pattern is allowing users to collaborate on documents at the same time, whether that collaboration is in a viewing or editing capacity.

Disabling Shredded Storage

There are properties that allow you to attempt to turn off shredded storage.  This can be attempted by setting the FileWriteChunkSize so large, up to 2 GB, that in all likelihood only one shred will be created.  Be warned that raising this setting above 4 MB will incur a large I/O hit for larger files; it is not recommended that you set this value above 4 MB.

It should be noted that you CAN disable shredded storage.  This is accomplished NOT by changing the FileWriteChunkSize, but by the *FILTERED OUT*

This operation is not something you should proceed with unless you are doing it in a test farm.  If you have installed SharePoint 2013 and let users upload information and files, and you subsequently disable Shredded Storage, NONE of the files will be modifiable again unless you delete the file and add it back, OR re-enable shredded storage.

Shredded Storage Testing Framework

As part of this whitepaper we have published a complete
Shredded Storage testing framework (including the source code) that will allow
you to build your own tests and to validate our tests to confirm our
results.  The following section is
designed to walk you through how to use the tool and analyze the results.

Installing the Tool

Follow these steps for installing the tool:

·         Download the tool from http://shreddedstorage.codeplex.com
·         Run the db script or restore the test database to SQL Server
·         Install SQL Server 2008 R2 or later with the SQL Profiler tools
·         Install the Office SDK 2.0 or higher
·         Install Fiddler
·         Copy SPProfiler.exe, SPProfiler.exe.config and MonitorSharePoint.tdf to the C:\temp\system32 directory

How the tool works

First and foremost: the tool IS NOT MULTITHREADED.  Please take this into consideration when modifying the code, as it is not possible to modify it to support multi-threaded calls with the various logging components used.

Fiddler integration – You can wrap any call in the StartProxy method to record the HTTP requests and responses and see what kind of traffic is being generated.  These tests can help analyze the difference between older non-Cobalt calls and Cobalt calls.

SQL Profiler integration – You can start the SQL Profiler tool at any time by calling StartTrace().

Time monitoring – You can record the time it takes for any method call by calling StartTimer().

TestResult class – All results should be saved to a TestResult object.  Once the object is populated, you can simply save it to the database with the SaveTestResult() method.

Generating content – There are several methods that support the dynamic generation of Word and PowerPoint files.  You can use these methods to create a directory of a specific size, then call UploadDirectory (described below) to test the upload of those files.  A sketch of how these pieces fit together follows.
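
The sketch below shows how the pieces above might be combined into a simple custom test.  The whitepaper and tool only name these helpers (StartProxy, StartTrace, StartTimer, TestResult, SaveTestResult, UploadDirectory); the signatures, parameters and paths used here are assumptions, so treat this as a starting point to be adapted to the actual framework source rather than a drop-in method.

// Hedged sketch (assumed signatures): a single-threaded custom test built from the
// framework helpers described above.  Compile it into the tool's project; it is not
// runnable on its own.
class CustomUploadTest
{
    static void Run()
    {
        StartProxy();   // assumed: begin Fiddler capture of HTTP traffic
        StartTrace();   // assumed: begin the SQL Profiler trace
        StartTimer();   // assumed: begin timing the operation

        // assumed signature: upload a locally generated directory of test files
        UploadDirectory(@"C:\temp\testfiles", "http://server/sites/test/Documents");

        TestResult result = new TestResult();   // populate with the captured metrics
        SaveTestResult(result);                  // persist the run to the results database
    }
}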

Reading Results

The following SQL queries, provided with the tool, will help you analyze your data:

·         TBD

Performance Tests

The most basic tests revolve around uploading and downloading files.  This can be easily tested by creating sets of files of varying sizes and uploading and downloading them several times.  This is but one piece of the testing that is needed for shredded storage.

The following table outlines some of the methods you can call to run various tests in your environment.  You can build your own calls using PowerShell or by modifying the tool’s code.

 

Method Name: RunDynamicUploadTest
Shredded Storage On? Yes
What does it test? This test uploads a file starting at a target file size and then halves the FileWriteChunkSize as many times as you tell it.
What should you see? The shred sizes will increase as the FileWriteChunkSize decreases.

Method Name: RunDynamicDownloadTest
Shredded Storage On? Yes
What does it test? This test downloads a file starting at a target FileReadChunkSize and halves the value as many times as you tell it.
What should you see? The number of read stored procedure calls will increase and the file will take longer to download.

Method Name: RunImageUploadTest
Shredded Storage On? Yes
What does it test? This test generates a Word document with images and then uploads it.
What should you see? The entire file is shredded and the image is not separated out into its own shred.

Method Name: RunOfficeDownload
Shredded Storage On? Yes/No
What does it test? This test uses the Office client to open a file stored in SharePoint.
What should you see? In the HTTP calls, the downloaded size is almost exactly the same as the file itself.

Method Name: UploadDirectory
Shredded Storage On? Yes/No
What does it test? This test method uploads a local directory to the SharePoint server.
What should you see? You can use this to test a batch of file uploads for comparison between SP2010 and SP2013.  It is also a good way to test files of various sizes against FileWriteChunkSize settings.

Method Name: RunOWATest
Shredded Storage On? Yes
What does it test? This method tests the HTTP calls between the client and the OWA server; you also see the SQL IO on the backend.
What should you see? This method is not fully implemented, as the OneNote.ashx protocol is VERY difficult to reverse engineer.

Method Name: RunShreddedStorageDisableTest
Shredded Storage On? Both
What does it test? This test demonstrates that it is very bad to disable shredded storage using the ServerDebugFlags.
What should you see? Any subsequent uploads of files will break after disabling shredded storage.

Method Name: AnalyzeFarm
Shredded Storage On? N/A
What does it test? This will analyze all the files in your farm and give you stats on the files contained in all your content databases.
What should you see? You will need these stats to pick the proper setting for your FileReadChunkSize.

Method Name: TestOpenLargeFileFromOffice
Shredded Storage On? Yes
What does it test? This test analyzes the effect the WCF 4MB limit has on the Cobalt layer in SharePoint.
What should you see? The SQL IO takes a HUGE hit when the FileWriteChunkSize is set above 4MB.  However, for smaller files the amount of time it takes for the client to render is not affected.

Method Name: RunMaxWriteSettingTest
Shredded Storage On? Yes
What does it test? This test sets the FileWriteChunkSize to its maximum setting of 2GB and then compares how the system runs against the default setting of 64KB.
What should you see? All files become a single shred versus multiple shreds with the lower value.

Method Name: TBD – modify the WCF settings and run a large file test

Method Name: TBD – enable de-duplication and monitor file size

Method Name: TBD – enable RBS and monitor upload speeds

Method Name: TBD – enable RBS and monitor download speeds

Method Name: RunManualTest
Shredded Storage On? Yes/No
What does it test? This test starts recording any of the actions you perform and then saves the results to the database when you are finished.
What should you see? This will assist with any custom testing you want to do without writing any code.

Method Name: RunModifiedPropertyVersionTest
Shredded Storage On? Yes
What does it test? This test looks at what happens when you simply change a property like “Title” and don’t change the file bytes.
What should you see? The entire file gets written back when a single shred exists.  This is UNEXPECTED behavior.

Method Name: RunModifiedVersionTest
Shredded Storage On? Yes
What does it test? This test generates a Word file, uploads it, then modifies it by adding text to the end and uploads it again.
What should you see? This is the best-case scenario for shredded storage.  The count and total size of the shreds do not change drastically, unlike in the worst-case scenario test.

Method Name: RunChangingVersionTest
Shredded Storage On? Yes
What does it test? This test generates a file, uploads it, then generates a completely new file and uploads it with the same name.
What should you see? This is a worst-case scenario test.  The total number of shreds in the database increases with each version uploaded, and you will see the overhead that shredded storage adds to the base file.  By the time you get to version 10, you should have 10x the number of shreds and size.

 

Comparison to SharePoint 2010

TBD