Thursday, April 26, 2007

Storing files uploaded into SharePoint Libraries/Galleries on the File System

One of the problems with the current version (and previous versions) of SharePoint is the fact that all the files uploaded into SharePoint are stored in the DB no matter what. Now for most scenarios this is great. For example you can store metadata alongside your file, have versioning and approval for you file, use SharePoint's check out and check in features, etc.

The fact that you're storing files in the DB also proves itself useful when your site is running on a server farm. Anyone who has developed a web site on a server farm has experienced the problem where users upload a file (and it gets uploaded into one of the servers) and then a user that hits the other servers can see the file (since we are listing files from a table in the DB) but when they click to download the file they get a FileNotFoundException (brings back memories doesn't it?). The fact that SharePoint stores all files uploaded into the site in the DB completely takes away all server synchronization issues when developing for a server farm.

Enough story telling let's get on with what this post is all about:

Now in certain scenarios (even on a server farm) you might need/want the files that the end user uploads into a SharePoint Library or Gallery to be stored in the file system. In some cases you only want them stored in the file system and in other cases you might just want them replicated in a specific file share on a server. Now a very classic example of this is when your site's content contributors are going to be uploading video/audio files into a secure library (where they only have access to) but you want to make all these files available for online usage through a Streaming Server. In other words you want the public users of the site to be able to playback the video/audio files through an embedded media player that is connected to your sites Streaming Server.

Now as far as I know most streaming servers can stream video/audio that is provided for them in a specific directory (they can't just go poking around in the SharePoint DB for traces of files!) so what can we do. As far as SharePoint support goes well there is none for this requirement. Hopefully the guys over at Microsoft will think of something by the next version/release of SharePoint but for now were stuck with some temporary solution:

Solution: to solve this problem I wrote a list event handler that would sit on the appropriate events of the specific Library. Once a file was added (or updated/deleted) I would perform the appropriate operation in the file system. As an example let's say we want all files added to be copied into our server's c:\StreamSource directory and also deleted from that directory once they are moved off of SharePoint. Before we get into code I should point out that when you upload a file into a SharePoint library the actual upload happens as a two part process. In the first step the file is uploaded and an ItemAdding/ItemAdded is raised then the user is presented with the Meta Data screen for that file and when they enter the meta data and click on the 'Check In' button the ItemUpdating/ItemUpdated event is raised. So if your replication of the file depends on the files meta data the replication has to happen in ItemUpdating (or ItemUpdated). Now for this example I'm going to use ItemUpdating to take care of the file replication:

public class FileReplicatorEventHandler : SPItemEventReceiver

{

public override void ItemUpdating(SPItemEventProperties properties)

{

byte[] content = properties.ListItem.File.OpenBinary();

string[] nameParts = properties.AfterUrl.Split('/');


string fileName = nameParts[nameParts.Length - 1]; //get the filename

using (FileStream fs = new FileStream(@"c:\StreamingSource\" + fileName,

FileMode.Create, FileAccess.Write, FileShare.None))

{

fs.Write(content, 0, content.Length);

}

}

}


Now if our end user goes and deletes an item from the library we want to make sure that the associated file in our "c:\streamingSource\" directory is also deleted. So here is the code to take care of that:


public override void ItemDeleting(SPItemEventProperties properties)

{

string[] nameParts = properties.BeforeUrl.Split('/');


string fileName = nameParts[nameParts.Length - 1];


File.Delete(@"c:\StreamingSource\" + fileName);

}


Based on this solution you could write a general component that can take care of any type of file replication on SharePoint. As a matter of fact if I find the time I'll try to create a general component and uploaded it here (or if anyone does it before me comment here so I won't waste my time. J ).

Wednesday, April 25, 2007

Advanced SharePoint Queries! :)

In the previous post I described two methods where you could fetch any data you needed from a SharePoint list, but sometimes we need more than a simple fetch from a list. One of the situations that I had to deal with in the most recent project I was involved in was fetching data from multiple lists scattered across different sub-sites and this had to be done in a performance optimized method (no recursively looping through all sub sites and searching each list separately!).

Let's consider a concrete example:

Assume we have multiple SharePoint blogs setup under a sub-site called 'blogs.' Now as we all know SharePoint blogs are sub-sites in themselves and posts/comments made to a blog are stored in lists under that sub-site. So if we have multiple blogs setup under the imaginary 'blogs' sub-site we will have a site hierarchy similar to this:

/blogs

/blogs/My Blog

  • Posts list
  • Comments list

/blogs/Some Other Blog

  • Posts list
  • Comments list

/blogs/XYZ Blog

  • Posts list
  • Comments list


Now suppose we want to display a 'master moderation list' where an administrator can view all posts/comments that haven't been approved yet (are pending) and then do whatever he/she would do with them (we only care about the first part).

This is where SharePoint Cross List Queries can come to the rescue. By creating a Cross List Query you can retrieve all items from all lists under a specific sub-site in one data table and then perform all needed operations on the result:

SPSite site = new
SPSite("http://mysite");

SPWeb web = site.OpenWeb();

CrossListQueryInfo qi = new CrossListQueryInfo();

qi.ViewFields = "<FieldRef Name=\"Title\" />" +
"<FieldRef Name=\"Body\" />" +
"<FieldRef Name=\"ID\" />" +
"<FieldRef Name=\"PublishedDate\" />";
qi.Query = "<Where><Eq>" +
"<FieldRef Name=\"_ModerationStatus\" />" +
"<Value Type=\"ModStat\">" +
"Pending" +
"</Value>" +
"</Eq></Where>" +
"<OrderBy>" +
"<FieldRef Name=\"Modified\" Ascending=\"FALSE\" />" +
"</OrderBy>";
qi.RowLimit = 100;
qi.Webs = "<Webs Scope=\"Recursive\" />";
//101 is the id for blog lists. Obviously you have to change this
//for any other list type that you are targeting
qi.Lists = "<Lists ServerTemplate=\"101\" >";
qi.WebUrl = "/blogs";
qi.ShowUntargetedItems = false;

CrossListQueryCache qCache = new CrossListQueryCache(qi);
DataTable dt = qCache.GetSiteData(web);

foreach (DataRow dr in dt.Rows)
{
//do whatever you need to do with each item
}

A couple of useful hints:

  • The CrossListQueryInfo class is located in the Microsoft.SharePoint.Publishing namespace so don't forget to add a using for it and a reference to the micrososft.sharepoint.publishing.dll
  • Cross list queries are very limited in which fields can be fetched or used in the where clause. I've had problems fetching the author of a list item or filtering based on similar fields. A solution that I had to use once was to handle events on the list so when items were added I would copy all the data that I needed into custom hidden fields and then use those fields in the cross list query.
  • Adding indexed fields will help with the performance of cross list queries.

SharePoint Queries

We all know that SharePoint lists can provide a simple, user friendly and straight forward way of storing & retrieving user editable data. Once you start developing anything using SharePoint it would be very hard to escape SharePoint lists. Other than the fact that most SharePoint features rely on them (like blogs, discussion boards, galleries, etc.) they will save you a lot of time when you want data stored in a table format and the end user needs to edit that data, the administrator needs to secure the data, you want workflow and approval features, change history and all the other little features that SharePoint provides on list items.

But the problem arises when you as the developer need to use that data in other parts of your program and would like to access it just as you would a regular table in the database.

To be more specific let's assume we have decided to create a public (Internet) facing site which is going to among other features display a list of our products in different ways on separate pages. Let's say we want to show the latest products, the cheapest priced products, etc.

Now let's make a few assumptions: (1) we don't have this information in a standard DBMS (2) we want our team of "product administrators" to be able to introduce new products, modify previous specs etc. through a simple web UI and (3) the lists displayed on the public site will be formatted with the public sites branding.

OK let's see how we can solve this using SharePoint features and minimal coding effort:

I would create a SharePoint list and call it 'Products' (what a surprise!). This list will be setup to have all the necessary fields such as product name, description, price, …

Now if I was going to provide a view of this list showing the cheapest priced products, the first thing would be to create a view on the list which has the right ordering and filtering to achieve the criteria for cheapest priced product. Creating this view is very simple and can be easily achieved through the standard SharePoint UI. But we are looking for a way to fetch this list of products so we can display them in our own special web part/user control with our own special formatting, rules, etc.

OK how to get to this data using the SharePoint API:

A couple of different approaches can be taken to get to this data and I'm going to touch on two of them here:

  1. You can access the same view that we described above using the backend API and fetch the items from the view.
  2. You can create a query on the fly and execute it against the list to fetch the list of items you need and then work on them as needed.

Before I get into the details of each approach let's do a little comparison of each approach:

The first approach is much easier to program. You create a view through the standard SharePoint UI and give it a specific name and then start using it in your code.

The second approach requires more code but it is more flexible since you can decide about what you fetch at runtime (as opposed to working on a fixed view). I also like the second approach a lot better since all the things that I need are encapsulated in my code and I'm not relying on any external elements that for any reason might not exist (or might get deleted) and cause problems for my code.

So let's look at some code.

1st approach: "You can access the same view that we described above using the backend API and fetch the items from the view"

SPSite site = new
SPSite("http://mysite");

SPWeb web = site.OpenWeb();

SPList productList = web.Lists["Products"];

SPView cheapestView = productList.Views["CheapestView"];


foreach (SPListItem item in productList.GetItems(cheapestView))

{

//do what ever you want to do with the item

}


2nd approach: "You can create a query on the fly and execute it against the list to fetch the list of items you need and then work on them as needed"

SPSite site = new
SPSite("http://mysite");

SPWeb web = site.OpenWeb();

SPList productList = web.Lists["Products"];

SPQuery qry = new
SPQuery();

int maximumCheapPrice = 100; //what ever the maximum price criteria is


qry.Query = "<Where><Leq>" +

"<FieldRef Name=\"Price\" /><Value Type=\"Number\">" +

maximumCheapPrice.ToString() +

"</Value>" +

"</Leq></Where>" +

"<OrderBy>" +

"<FieldRef Name=\"Price\" Ascending=\"TRUE\" />" +

"</OrderBy>";

qry.RowLimit = 10; //only fetch the top 10 cheapest products


foreach (SPListItem item in productList.GetItems(qry))

{


//do what ever you want to do with the item

}


The weird way of defining a filter and order by clause for our query is called CAML. It's a XML based language that you can use to define whatever filter you need to fetch the data. There are some good free editors (try this) that you can use to define the filter and then copy paste the result into your code.

Hello everyone,
This is my first attempt at blogging and I've aimed it at sharing my day-to-day problems, experiences and thoughts on different software development issues.
Some of the stuff might be really abstract while others might be quite concrete and obviously the content will vary depending on what project I'm currently working on.
Looking forward to reading everyone's comments and feedbacks.

Ehsan