Monday 25 June 2012

Utilising the Content Type Hub within a document management solution

This post presents one possible document management solution that uses the Content Type Hub. The solution is based on a real-world project for a client. The post outlines the approach from a technical perspective; it does not discuss the requirements from a business perspective.

Background client information
  • Intranet is based on SharePoint 2010.
  • Intranet is stored in one site collection.
  • Content Types are not currently utilised for document management.
Client requirements overview
  1. Areas to create and manage documents are required throughout the organisation.
  2. Content needs to be searchable from within the intranet.
  3. SharePoint is to provide a document management solution which over time replaces the current storage solution (a shared drive).
  4. Content is expected to grow significantly over time so the solution should scale from a content storage size perspective.
Solution

The requirements above are high level but provide enough detail to outline the solution. The sections below describe how each requirement was met.

Site templates
Business analysis determined that there were two main types of site which would cover the core document management needs of the business. Site templates were created for "Project" and "Programme" sites. Each of these site templates contained document libraries (and other asset libraries) where documents could be stored.

Content types

Business analysis identified document content types and their respective metadata requirements. These content types were associated with the document libraries contained in sites created by the site templates.

Multiple site collections and databases (still one SharePoint web application)

The intranet existed in a single site collection which had a database associated with it. The document management and scalability requirements indicated that one database was unlikely to be sufficient going forward, given the maximum recommended database size.
The design decision was taken to create at least two new databases. One would store "Project" sites and the other would store "Programme" sites.

The decision was also made that "Projects" and "Programmes" would be stored in their own site collections. A "Project" site would thus be created from a template in the "Projects" site collection. Each site collection would have its own database, which would take some of the pressure off the upper operational limit (splitting the documents across two databases instead of one). Whilst this was not the ideal scalable solution (because you could still hit a bottleneck if the "Projects" / "Programmes" site collections grew too large), it was deemed suitable by the business.

As an aside, from a scalability perspective it would have been preferable to have each "Project" and "Programme" as its own site collection. This would have the added benefit of allowing administrators to add new databases to the web application as required. For example, if there were six "Project" site collections in one database and the database was approaching an operational limit, a new database could be attached and new site collections could be created in this database. In addition, due to the site collection boundary it would be easy to move an existing site collection to a new database if one specific site collection was becoming too large.
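
To make the reasoning above concrete, here is a rough sketch in Python of how site collections could be spread across content databases. It is only a model of the idea, not SharePoint code: the class and function names (ContentDatabase, place_site_collection, move_site_collection), the 200 GB limit and the 30 GB site collection sizes are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentDatabase:
    """Hypothetical model of a content database (not a SharePoint API)."""
    name: str
    limit_gb: float  # assumed operational limit, for illustration only
    site_collections: dict = field(default_factory=dict)  # name -> size in GB

    @property
    def used_gb(self) -> float:
        return sum(self.site_collections.values())

    def has_room_for(self, size_gb: float) -> bool:
        return self.used_gb + size_gb <= self.limit_gb


def place_site_collection(databases, name, size_gb, limit_gb):
    """Use the first database with room; otherwise attach a new database."""
    for db in databases:
        if db.has_room_for(size_gb):
            db.site_collections[name] = size_gb
            return db
    new_db = ContentDatabase(name=f"ContentDB_{len(databases) + 1}", limit_gb=limit_gb)
    new_db.site_collections[name] = size_gb
    databases.append(new_db)
    return new_db


def move_site_collection(source, target, name):
    """Move a whole site collection; the site collection is the unit of movement."""
    size_gb = source.site_collections[name]
    if not target.has_room_for(size_gb):
        raise ValueError(f"{target.name} has no room for {name}")
    target.site_collections[name] = source.site_collections.pop(name)


if __name__ == "__main__":
    databases = []
    # Six "Project" site collections fit in the first database...
    for i in range(1, 7):
        place_site_collection(databases, f"Project {i}", 30.0, limit_gb=200.0)
    # ...the seventh would exceed the assumed limit, so a new database is attached.
    place_site_collection(databases, "Project 7", 30.0, limit_gb=200.0)
    for db in databases:
        print(db.name, db.used_gb, sorted(db.site_collections))
```

The point of the sketch is that the site collection is the smallest thing you can place in or move between databases. With one site collection per "Project" there are many small units to distribute; with a single "Projects" site collection there is only one large unit, and it cannot be split.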

The content type hub

One limitation of multiple site collections is that each site collection has its own "Site Content Type Gallery". This means that adding a new content type or updating an existing one is expensive: a manual configuration task in each site collection, or the creation of a script, would be required to make any modification. This is where the "Content Type Hub" comes into play. This is a feature available at the site collection level which allows the site collection to "Publish" its content types to other site collections that are "Subscribed" to it. This provides a central location for storing content types to be utilised across multiple site collections. Any changes made in the content type hub site are replicated in the other site collections (after a couple of timer jobs have run).
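
The publish / subscribe relationship described above can be sketched as a small Python model. This is not SharePoint code and does not use any SharePoint API: ContentTypeHub, SubscriberSiteCollection and run_timer_job are hypothetical names, and the "Project Document" content type and its columns are invented for the example.

```python
class ContentTypeHub:
    """Hypothetical model of the hub site collection."""

    def __init__(self):
        self.content_types = {}  # working copies: name -> list of columns
        self.published = {}      # what subscribers are allowed to pull

    def define(self, name, columns):
        """Create or update a content type in the hub's own gallery."""
        self.content_types[name] = list(columns)

    def publish(self, name):
        """Make the current definition available to subscribers."""
        self.published[name] = list(self.content_types[name])


class SubscriberSiteCollection:
    """Hypothetical model of a site collection subscribed to the hub."""

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.gallery = {}  # local site content type gallery

    def run_timer_job(self):
        """Stand-in for the timer jobs: copy published types into the gallery."""
        for ct_name, columns in self.hub.published.items():
            self.gallery[ct_name] = list(columns)


if __name__ == "__main__":
    hub = ContentTypeHub()
    projects = SubscriberSiteCollection("Projects", hub)
    programmes = SubscriberSiteCollection("Programmes", hub)

    hub.define("Project Document", ["Title", "Project Code"])
    hub.publish("Project Document")

    # Nothing reaches the subscribers until their timer job has run.
    print(projects.gallery)
    for site_collection in (projects, programmes):
        site_collection.run_timer_job()
    print(projects.gallery, programmes.gallery)
```

The sketch shows the two properties that matter for this solution: the content type is defined once, in the hub, and subscribers only see a change after it has been published and their timer job has run.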

This effectively completes the solution and demonstrates one architecture using the Content Type Hub in SharePoint 2010.

Here is a handy diagram to confuse you all further.