Rapid Solution Deployment – SharePoint 2010

Blog #5 in How I Successfully Upgraded eBay to SharePoint 2010 – See Previous Blog In Series

As part of the eBay upgrade, we had several code changes that had to be made and tested.  This included Solution Deployment in all its glory.  One of the things I really got tired of doing was going into Central Administration and undeploying solutions.  It is a very painful process.  You may ask, why didn't you do it through Visual Studio…yeah right…it errors out half the time and never really gives you a great message on how to resolve it.  Therefore, I wrote a script for that!  I'm going to be posting several of the scripts we utilized for very common processes during the upgrade.  Here is the script that we built to deploy the solutions:

$minutes = $args[0]

if (-not $minutes)
{
    $minutes = 15
}

$di = new-object system.io.directoryinfo("c:\WSPDirectory")

# only process WSP files dropped in the folder within the last $minutes minutes
foreach ($fi in $di.getfiles("*.wsp"))
{
    $date = get-date
    $date = $date.addminutes($minutes * -1)

    if ($date -lt $fi.lastwritetime)
    {
        $fi.name
        $sol = $null

        try
        {
            $sol = get-spsolution $fi.name -erroraction silentlycontinue
        }
        catch {}

        if ($sol)
        {
            # retract from all web applications (globally deployed solutions use the second form)
            "Retracting $fi"
            uninstall-spsolution $fi.name -confirm:$false -allwebapplications -erroraction silentlycontinue
            uninstall-spsolution $fi.name -confirm:$false -erroraction silentlycontinue

            # wait for the retraction timer job to finish
            while ($sol.jobexists -eq $true)
            {
                start-sleep 3
                $sol.jobstatus
                $sol = get-spsolution $fi.name
            }

            "Deleting $fi"
            $sol.delete()
        }

        "Adding $fi"
        add-spsolution $fi.fullname

        $sol = get-spsolution $fi.name

        # deploy to all web applications (globally deployed solutions use the second form)
        "Installing $fi"
        install-spsolution $fi.name -allwebapplications -gacdeployment -erroraction silentlycontinue -force
        install-spsolution $fi.name -gacdeployment -erroraction silentlycontinue -force

        # wait for the deployment timer job to finish
        while ($sol.jobexists -eq $true)
        {
            start-sleep 3
            $sol.jobstatus
            $sol = get-spsolution $fi.name
        }

        "Deployed $fi"
    }
}

Note that one thing that can be added to this script is an interrogation of the solution ID.  That is better than going off the name of the solution, which actually changes in production deployments (they use YYYY-MM-DD notation), so this script as written would not work for production.  Only in Development and QA was this handy.
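Here is a minimal sketch of that idea, assuming you keep a simple mapping file of base solution names to the SolutionId GUIDs from each manifest (the file name, layout and GUID lookup are hypothetical, and it assumes the deployed solution's Id matches the manifest SolutionId):

# SolutionIds.txt (hypothetical format): <base solution name><TAB><solution GUID from manifest.xml>
$map = @{}
foreach ($line in get-content "c:\WSPDirectory\SolutionIds.txt")
{
    $parts = $line -split "`t"
    $map[$parts[0]] = [guid]$parts[1]
}

# look the deployed solution up by its ID instead of its (date-stamped) file name
$baseName = "mysolution.wsp"
$sol = get-spsolution | where-object { $_.Id -eq $map[$baseName] }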

Enjoy!
Chris

See the next blog post in this series HERE

Increasing SharePoint Performance – CSS, Javascript and Pictures

Blog #6 in How I Successfully Upgraded eBay to SharePoint 2010 – See Previous Blog In Series

One of the requirements of our upgrade was that the upgraded system would perform better than the current 2007 production environment.  This included the latency to Europe and other non-US countries.  There are many options available to do this, including the following:

  • Increase bandwidth
  • Decrease latency
  • Decrease page weight
  • Caching (Server, F5, Proxy, CDN)

Increased bandwidth:

In our case, bandwidth was not an issue.  In your circumstances, that may or may not be the case.  If you have a small pipe, you won't be able to get too many requests going through it.  In the end, this really isn't a big deal.

Decrease Latency:

SUPER SUPER important.  Latency is the time it takes for the HTML packets to get to the client.  If you have high latencies, your users are going to hate your intranet/SharePoint and call it slow.  Even though you spent millions on hardware and it is able to generate the HTML faster than any other system, it can't get it to them fast enough.

Decrease Page Weight:

If you were to fire up Fiddler and watch the typical traffic flow for a request to SharePoint 2010, you would see several requests being made to the server.  Because there is an order to the madness, there is a multiplier effect on the time it will take for a user to fully load a page (even with async requests occurring).  YOUR NUMBER ONE GOAL IS TO MINIMIZE THE HTTP REQUESTS FOR A PAGE.  How do you do this?  You can do this in several ways:

  • Compress CSS
  • Minify Javascript
  • Evaluate Picture Usage

Compress CSS:

The CSS classes between 2007 and 2010 changed A LOT!  If you had custom CSS that overrode the out-of-the-box CSS, it very likely will not have any effect in 2010.  It should go without saying…you should remove this CSS.  You should also attempt to determine how many different CSS files are being downloaded from your pages.  If you can consolidate those into one CSS file, then you are going to be in a very good position moving forward.  Steps:

  • Find all CSS classes used in your pages
  • Evaluate if they can be combined into one CSS file
  • Determine what CSS is used 
  • Remove un-used CSS from your CSS files
  • Compress the CSS (remove all white space)

How does one find out if CSS is used or not?  Here are the steps:

  1. Get all your CSS classes used in your CSS files 
  2. Export all ASPX files from SharePoint to the SharePoint Root
  3. Run a search on all the files for each CSS
  4. Any CSS that is not found can be deleted

1.  How does one export all the ASPX files?  Here's a handy script:

$templateDir = "c:\program files\common files\microsoft shared\web server extensions\14\template"

# clean out and recreate the export folder
del "$templateDir\ASPX" -recurse -force
mkdir "$templateDir\ASPX"

#export all aspx files from every content database to the temp directory
$cdbs = get-spcontentdatabase
$count = 0

foreach ($cdb in $cdbs)
{
    "Processing " + $cdb.Name

    $conn = new-object system.data.sqlclient.sqlconnection $cdb.legacydatabaseconnectionstring.replace("Timeout=15","")
    $conn.open()
    $cmd = $conn.CreateCommand()
    $cmd.CommandTimeout = 4000
    $cmd.commandtext = "select dirname, leafname, content from alldocs ad, alldocstreams ads where ad.Id = ads.id and ads.id in (select id from alldocs ad where extensionforfile in ('aspx')) and ad.internalversion = ads.InternalVersion";
    $reader = $cmd.executereader()

    while ($reader.read())
    {
        $filename = $reader["leafname"].replace("/","_")
        $fileContent = $reader["content"]
        $dirname = $reader["dirname"]

        # log where each exported file came from
        $temp = $count.tostring() + "`t" + $cdb.Name + "`t" + $dirname + "`t" + $filename
        add-content "Files_ASPX.txt" $temp

        [System.IO.File]::WriteAllBytes($templateDir + "\ASPX\" + $count.tostring() + "_" + $filename, $fileContent)
        $count++
    }

    $reader.close()
    $conn.close()
}

2. How does one search for the CSS?  Here's another handy script (note it will take a while for this to finish):

$classes = get-content "Css.txt"
cd "c:\program files\common files\microsoft shared\web server extensions\14"

# search every aspx, ascx and css file for each class and log the hits to <class>.txt
foreach ($class in $classes)
{
    dir -recurse -include *.aspx | select-string $class -simplematch | select filename, linenumber >> "$class.txt"
    dir -recurse -include *.ascx | select-string $class -simplematch | select filename, linenumber >> "$class.txt"
    dir -recurse -include *.css  | select-string $class -simplematch | select filename, linenumber >> "$class.txt"
}

3.  Create a list of all the CSS classes used in your CSS files.
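Here is a minimal sketch of one way to build that list, assuming your custom CSS lives under the 14 hive and that simple ".className" selectors are all you need to catch (the output file name matches the Css.txt read by the search script above; anything fancier needs a real CSS parser):

# collect candidate class names from every CSS file and write one per line
$root = "c:\program files\common files\microsoft shared\web server extensions\14"
$classes = @{}

dir $root -recurse -include *.css | foreach-object {
    $text = [System.IO.File]::ReadAllText($_.FullName)
    foreach ($m in [regex]::Matches($text, '\.([A-Za-z][\w-]*)'))
    {
        $classes[$m.Groups[1].Value] = $true
    }
}

$classes.Keys | sort-object | set-content "Css.txt"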

4.  Compare the output of the last script with your CSS list and determine what is not used.  I'll leave it to you to write the script for that.  Just use a hashtable 🙂
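Here is a minimal sketch of that comparison, assuming the search script above wrote one <class>.txt results file per class (a class whose results file is missing or empty was never found in any ASPX, ASCX or CSS file):

# classes whose search results came back empty were never referenced anywhere
$classes = get-content "Css.txt"
$unused = @{}

foreach ($class in $classes)
{
    $resultFile = "$class.txt"
    if (-not (test-path $resultFile) -or ((get-content $resultFile | measure-object).Count -eq 0))
    {
        $unused[$class] = $true
    }
}

"Unused CSS classes:"
$unused.Keys | sort-object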

5. Compress your CSS, tools for that:
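If you would rather script it than use a tool, here is a rough sketch of a naive whitespace/comment stripper (the file paths are just examples; a proper minifier will do a better and safer job):

# read the stylesheet – path is only an example
$css = [System.IO.File]::ReadAllText("c:\temp\custom.css")

# strip /* ... */ comments, then collapse runs of whitespace to a single space
$css = [regex]::Replace($css, '/\*.*?\*/', '', 'Singleline')
$css = [regex]::Replace($css, '\s+', ' ')

# remove spaces around characters that never need them
$css = [regex]::Replace($css, '\s*([{}:;,])\s*', '$1')

[System.IO.File]::WriteAllText("c:\temp\custom.min.css", $css)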

Minify Javascript:

SharePoint 2010 will cause your page weight to go up.  This is because of its heavy use of JavaScript.  They did think ahead and minify the JavaScript files, which ensures you don't take a huge hit on those files.  If you have your own custom JavaScript and you haven't minified it, you are causing a higher page weight and your remote users will not have an optimal experience.

Tools to Minify are here:

Evaluate Picture Usage:

Does your site have a lot of pictures?  Doesn't look like SharePoint anymore?  That's both a good and a bad thing.  The more pictures you have, the larger the page weight.  Here's what you can do to decrease the page weight:

  • Ensure that buffer images are width 1 and height 1 – I can't tell you how many times I have seen an image created that is 10K and is used as a buffer image.  It only needs to be a few bytes and a 1×1 image!!!
  • Use sprites – make lots of small images into one to reduce your HTTP requests

Caching:

There are lots of ways to do caching.  And there is a ranking of the most beneficial and easy caching methods…here they are:

  • Proxy Server
  • Content Delivery Network (CDN)
  • Hardware compression
  • F5 Load Balancer
  • Web Server

Proxy Server – The proxy server method *really* is the best one.  You can put a proxy server in each of your geo locations.  The job of the proxy server is to cache common elements like CSS, images and JavaScript and do it on that side of the "pond".  This makes such a HUGE difference it's not even funny.

Content Delivery Network (CDN) – very similar to the proxy server in that it keeps a local copy of the content, but you have to factor that into your web page design/HTML.  I'm not a fan from an ease-of-manageability standpoint, but it would match the proxy server in the performance increase you would see.

Hardware Compression – By deploying devices on your network edges, you can compress the WAN traffic.  This tends to be a very costly solution and requires a lot of testing.  In larger organizations, good luck with getting the network guys to mess with anything sitting next to the routers!

F5 Load balancer – an F5 can act like a proxy server by caching files, but typically the F5 sits in front of the web servers on the Farm side of the network.  It is very cost prohibitive to have an F5 in each data center in each geo location acting as a proxy server.  That being said, it can still sit on the Farm side and cache the CSS, images and JavaScript.  It is still exposed to the latency of your network, but the files will be in memory and be served up faster to the clients because they won't have to be fetched from the web server file system or content database.

Web Server – You can have the web server cache HTML from the execution of the ASPX pages.  This does NOT help with the latency issues, but it does help with the actual execution performance.  As we will see in the next blog post, Output Caching KILLS SharePoint 2010, so you have to be careful where you implement it.  You can also implement BLOB caching at the web server level.  This is very beneficial for publishing sites as it will keep SharePoint from hitting the content database every time to load images, CSS and JavaScript.
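For reference, here is a rough sketch of turning the BLOB cache on by editing a web application's web.config directly (the paths are examples; in a multi-server farm you would repeat this on every WFE, or use SPWebConfigModification so the change is applied farm-wide):

# example path to the IIS virtual directory's web.config – adjust for your farm
$configPath = "C:\inetpub\wwwroot\wss\VirtualDirectories\80\web.config"

$config = [xml](get-content $configPath)
$blobCache = $config.configuration.SharePoint.BlobCache

# enable the cache and point it at a local drive with enough space
$blobCache.SetAttribute("enabled", "true")
$blobCache.SetAttribute("location", "D:\BlobCache\14")

$config.Save($configPath)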

In the end, we were able to increase the visible and technical user performance for European users by over 30%.  They have made a lot of comments about how much faster it is; even if there are weird quirks with the sites, they are still amazed at how fast it is!  Chalk one up for the CJG-mister!

Enjoy!
Chris

See the next blog post in this series HERE

Output Caching in SharePoint 2010 – It doesn’t work, err it does…

Blog #7 in How I Successfully Upgraded eBay to SharePoint 2010 – See Previous Blog In Series

In my last blog post I made a comment that Output Caching does not work in 2010.  Well, it works, but that's the problem.  SharePoint 2010 has elements that are very poorly designed…but we all knew that didn't we?

What is output caching?  It's an ASP.NET feature that allows the HTML of the page to be cached in the web server's memory and very quickly presented to the clients (though it is still subject to network latency).  Output caching can be turned on for your publishing sites in the Site Actions, Site Settings page for the site collection.  Again, once turned on, the HTML is cached…this is bad.  Yes, I said it is bad.  Why? 

There is this wonderful control called "PersonalSiteActions".  This control emits javascript directly to the page.  This javascript is specific to a user.  Wha????   Specific to a user?  But that means…yep…you got it.  The first person to hit the page will cause the HTML to render and THAT HTML WILL BE CACHED.   What does the rendered HTML look like?  Here it is:

function ctl00_ctl39_SocialNavControl_insertMyProfileMenu() {
    var menus = document.getElementsByTagName('menu');
    if (menus == null) return;
    var menu = null;
    for (var i = 0, len = menus.length; i < len; i++) {
        if (menus[i].id.lastIndexOf('PersonalActionMenu') != -1) {
            menu = menus[i];
            break;
        }
    }
    if (menu == null) return;

    var elm = document.createElement('ie:menuitem');
    elm.setAttribute('menugroupid', '50');
    elm.setAttribute('description', 'View and manage your profile.');
    elm.setAttribute('text', 'My Profile');
    elm.setAttribute('onmenuclick', 'STSNavigate2(event,\'http:\u002f\u002fwww.blah.org:80\u002fsites\u002fmy\u002fPerson.aspx?accountname=CONTOSO\u005cGIVENSCJ\')');
    elm.setAttribute('id', 'ID_MySiteLinksMenu');

    var elm2 = document.createElement('ie:menuitem');
    elm2.setAttribute('menugroupid', '50');
    elm2.setAttribute('description', 'Open your personal homepage');
    elm2.setAttribute('text', 'My Site');
    elm2.setAttribute('onmenuclick', 'STSNavigate2(event,\'http:\u002f\u002fwww.blah.org:80\u002fsites\u002fmy\u002f\')');
    elm2.setAttribute('id', 'ID_MySiteMenu');

What happens when the next user comes along?  The javascript is CACHED.  When they click on the "My Profile" link, they will go to the GIVENSCJ personal site, NOT their personal site.  This is what is known as a non-cache-safe control.  One might think that you could just go into the ASPX page and edit this, but NO.  This is inside the PersonalSiteActions control.  One might think, I'll create my own control…well, that's a start, but upon further investigation via Reflector, you see that it is IMPOSSIBLE.  The control has several permission-based checks for whether a page is in Shared mode or Personal mode, and all kinds of internal methods and properties.  Dead end.

This same problem happens for any control you have written that outputs user-specific content.  You must ensure that ALL of your controls are cache safe for the basic out-of-the-box output caching to work.  If you don't want these to be cached, you are going to have to tell the output cache not to cache those sections.  Now we are customizing the out-of-the-box master pages…YUK.

NOTE:  One other issue with the PersonalSiteActions is the "My Site" link.  Notice how it takes you to the default.aspx page that has the newsfeeds.  How is that your "My Site"???  Really?  The SharePoint team could have kept the branding similar, as it just confused the hell out of users when they are navigating in SharePoint 2010.  So what can you do with this? 

  • Change the link where the javascript points
  • Update the page to have a redirect

The only way to change the link for the PersonalSiteActions is to do HTML capture and rewrite (yes, we even tried JavaScript and fancy jQuery – no go!).  You can write an HTTP handler for this.  What happens if you do this?  Our friend Output Caching is not a friend to HTML rewrite.  It forces the output cache to disable itself.  Output caching, minus the quirks in SharePoint 2010, can cause a huge increase in application performance.  Implementing the HTML rewrite forces it off, but it worked, as I was able to replace the link with the proper My Site link.

The easiest way to get the "branding" right was to put a redirect to Person.aspx on the default.aspx page.  Problem solved, output cache saved.

Enjoy,
Chris

How I Successfully Upgraded eBay to SharePoint 2010

And so it begins!  This is my first blog post in a few months…where have I been?  As some of you know, I have been the Sr. Architect of the eBay SharePoint 2010 Upgrade Project, one of 3 *major* upgrades occurring in the United States in 2011.  I will be presenting a series of blog posts on how we did what we did and what challenges we ran into over the project.  This is the first of about 15-20 blog posts of material you have not seen before on any MVP's, MCM's or anyone else's blog, because I know for 100% fact that we were the first to do many of these things!  Which makes me think we were the first to really do a major upgrade with all the pieces (DR, DMZ, data center move, firewalls, etc.).

First, a little background.  The project started in early March and we were given a target completion date of mid-July.  Several vendors bid on the process, but none had a response as deep and technical as ours, with several other carrots and goodies built in.  Also, all of them but ours said it was impossible to do in 3.5 months.  Many quoted 3x as long and 4x as much in price…WOW.  I was also well known by the eBay team for my work at other companies and the ACS SharePoint Courseware.  So my reputation preceded me, which is good and bad depending on who you are (wink wink to Microsoft).

I started off the project in a very Avanade like manner (I worked at Avanade for about 1.5 years) using the ACM methodology, which is very similar to Microsoft's MOF.  The phases include:

  • Envisioning
  • Planning
  • Executing
  • Stabilizing
  • Deploying

As part of the Envisioning phase, I utilized a series of tools that I had built over the past few months that would do a DIFF of their SharePoint servers.  I will have a whole blog post on this tool in the next couple of weeks, and I can tell you right now, MCS was drooling to get their hands on it.  This tool determined EVERYTHING they had done to the environment.  Using this data, I was able to map it to the code and solutions and determine what items we didn't have code for.  This was a great process step as we learned where the code was, what it mapped to and where it was used.  All very vital elements to making sure the upgrade would be a success.

As part of the Planning phase, we needed to start to get an idea of the reasons behind moving to SharePoint 2010 and what changes MUST be made to the environment moving forward.  This evolved into building out a massive Governance document FROM SCRATCH, utilizing elements from our ACS Governance course.  It is much more detailed than anything out there today (sorry Nuedesic, Slalom et al.).  This document really got the conversations going between the management and us to figure out what really had to happen and what the final Farm needed to look like.  We also had to identify what code needed to be refactored (and let me tell you, don't ever hire an accounting company to do SharePoint code – some of the worst code I have ever seen).

The Executing phase, my favorite phase!  Once we knew what needed to happen, we started to do trial upgrades on the environments.  As part of this, we needed a mock development 2007 environment and a 2010 environment.  This was not an easy task.  There were several elements such as custom code, 3rd party applications and just a lot of random changes to the SharePoint root that weren't really documented.  Once we installed everything we thought we needed, it became impractical for us to click on every page in the farm to determine if there were any issues/errors.  Utilizing another tool I had built from other projects, I modified it to hit every page in the farm and record if any errors were occurring.  It gave us back nearly perfect data.  We were able to find several components that needed to have extra settings or code installed.  As part of this entire process, I started to create the most amazing document of the project…THE PRODUCTION BUILD GUIDE.   This document has every single step for a network/server admin to perform to get the farm installed and up and running with no interaction from us.  This document is pure GOLD.  We utilized this document to build out two more environments, QA and STAGING.  By this time, I and my friend Satya had built out and upgraded the environment 4 times (you should expect at a minimum to do the same).

Stabilizing.  This was probably the most important step of the project.  As a consultant, I can make anything work and fix any error, but when it comes to the functionality of the sites and components, I can't possibly know all of that.  This means that business owners must step in and look at the QA and Staging environments to determine if their components are working correctly.  In the end, everything worked out fine due to the stellar team we had, but honestly, we could have given a lot more time to this vital piece.

Deployment:  Getting the hardware installed was somewhat of an easy part of the process; one of the hard parts was spec'ing out the hardware that would be needed to support 10s of thousands of users.  I did the first configuration of the hardware and then eBay decided to bring in some MCS folks just to validate.  They needlessly added in double the memory to the WFEs and APP servers, but hey…memory is cheap so blah (by the way, utilization of the servers shows my config was spot on).  Another thing we had to deal with was the movement of data centers.  We needed to move a lot of large content databases to a new data center.  Our upgrade window was 60 hours.  I'll tell you, trying to ensure that a massive farm environment is only down for 60 hours is a REAL challenge.  We had to get really creative to make that happen.  I brought a lot to the table in terms of getting things into that window.  As part of the deployment phase, we needed to build a project timeline with all the steps of the build guide in it.  Each step was a task with its respective timing (thank god MS Project can do hourly).  After I had built out this AWESOME document, I realized that tasks had to be started the next Monday (this was Thursday evening of the week).  We had to have a mandatory meeting to make sure everyone knew what needed to happen.  Everything went through perfectly; the build guide was the document that really drove the project timelines.  I'll tell you, every step we did out of order or did not do…bit us in the ASS.  I'll never not follow my build guide again!

In summary, the upgrade is complete.  The system is running great, but now we have hit the training aspect for the business users.  It has been quite a shock (even to me, the writer of the first SP2010 courses for End Users) to learn about all the things that the SharePoint team changed between 2007 and 2010.  Some are really great, some are *really* bad.  We have had numerous tickets coming in asking how to do this, how to do that and what the heck is this "ribbon" thing.  Next phase is to build out custom courseware for the users using the great tools we have built at ACS and have utilized for other customers like Exxon Mobil, Abbott Labs and very large government organizations. 

It has been/was a GREAT project.  And although the title of the blog is "How I Successfully Upgraded eBay to SharePoint 2010", you can't do something like that alone (you need at least 4-5 superstars).  The team at eBay was OUTSTANDING to work with and I have to say, the stars aligned for this project to be successful.  I am confident no one else on the planet could have executed like we did, and I am truly grateful to eBay and the 3rd parties (Corasworks, AvePoint, etc.) that we had to engage with on the project.  Only one person on the team actually has another public blog and that is Maarten Sundman.  He was pivotal in a lot of our code review, code modification and branding aspects of the project.  He was also a "get it done" type of person, which is great to have on a project like this.

Watch for the next blog post in this KILLER series,
Chris

Also, come find me at the SP Conference, we will have a session on our eBay upgrade project, feel free to ask some deep and technical questions at the session!

And lastly, if you need help with your SharePoint Upgrade, there isn't anything I don't know about the process, drop me a line and I'll be happy to help!

Continue to next blog post in this series – Upgrading UserProfiles – the non-database attach method

 

Upgrading UserProfiles to SharePoint 2010 Non Database Attach

Blog Post #2 of How I Successfully Upgraded eBay to SharePoint 2010 – Previous Blog in this series

One of the decisions I made as the Sr. Architect of the eBay upgrade project was to NOT do a database attach upgrade of the SSP.  I viewed the whole process as not very reliable and we didn't want a bunch of junk moving forward to 2010.  So, we went down the path of building it all from scratch.  This presents some interesting problems and this blog post will focus on the UserProfile Service Application component of the SSP.  In summary it entails:

  • Custom properties added in 2007 must be moved to 2010 (definition and data)
  • Custom profile privacy settings
  • All connections must be redone (BDC, Active Directory)
  • Migrate old custom user data

As part of the process, the first thing you have to do is to identify all the custom properties that were added in 2007.  This is actually an easy part as you can simply run a query against the SSP database and the PropertyList table.  Anything over a certain ID value is a custom property.  This definitely gets you the list, but what do you do with the list?  You COULD manually add each one to the UPS service application, or, you could write a lovely PowerShell Script to do the work for you.  How do the scripts work?

  1. We built a custom tab-delimited file of their names, values, descriptions, privacy settings and connection information.
  2. We created a script that reads the data and sets up the properties (a minimal sketch follows this list) – this was a HUGE time saver, as the property editing page in UPS is one of the worst-designed pages in all of Central Administration
  3. Setup all BDC connections using SharePoint Designer (this includes assigning the proper permissions to the UPS service application accounts to be able to execute the BDC and retrieve data)
  4. We created another script that sets up the property connections (after the BDC has been created)
  5. Custom profile privacy settings
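Here is a minimal sketch of the property-creation script mentioned in step 2, assuming a tab-delimited input file with name, display name and type columns (the file name and layout are hypothetical; privacy settings and connections would be wired up the same way, and the new property still has to be added to the profile type/subtype to show up in the UI):

# load the user profiles assembly and get the core property manager for the UPA
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.UserProfiles")
$site = get-spsite "http://portal"      # any site in a web application tied to the UPA
$context = [Microsoft.SharePoint.SPServiceContext]::GetContext($site)
$upcm = new-object Microsoft.Office.Server.UserProfiles.UserProfileConfigManager($context)
$coreProps = $upcm.ProfilePropertyManager.GetCoreProperties()

foreach ($line in get-content "CustomProperties.txt")
{
    $fields = $line -split "`t"

    $prop = $coreProps.Create($false)    # $false = a real property, not a section header
    $prop.Name = $fields[0]
    $prop.DisplayName = $fields[1]
    $prop.Type = $fields[2]              # e.g. "string"
    $prop.Length = 255
    $coreProps.Add($prop)
}

$site.Dispose()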

That gets the basics done, but what about the data of the users in the 2007 farm?  Just so happens the SharePoint team thought very hard about this one.  As part of the SharePoint Admin ToolKit, you have a User Profile Replication tool.  This tool will grab the data in a target SharePoint farm (2007 or 2010) and move the data based on the account name of the user (not the recordid, as that would change in each farm).

The only problem with this tool is that it doesn't like special characters very much.  These would include the main culprit, the ampersand (i.e. '&').  We had to edit all the properties in the production farm that had this in the PropertyVal (of the UserPropertyValue table) to remove it and put 'and' in its place.  Once this was done, the tool was able to use its error log to rerun for just those users.  You could slowly find all the special characters and remove them to have the users' data get moved over successfully.

The last step, and this really is the most important one, is to ensure that all the privacy settings came over.  We were told the Admin Toolkit would do this, but we found out that wasn't the case.  It is very simple to get these values from the older 2007 farm by joining the UserProfile_Full and the UserProfileValue tables on recordid and getting the "privacy" column.  Then all you have to do is update the same values in the 2010 farm and BAM…you are DONE!
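For reference, here is a rough sketch of that 2007-side query, written in the same style as the other scripts in this series (the server/database names are examples and the column names follow the description above – verify them against your own SSP database before trusting the output):

# pull account name, property id and privacy setting from the 2007 SSP database
$conn = new-object system.data.sqlclient.sqlconnection "Server=SQL2007;Database=SharedServices1_DB;Integrated Security=SSPI"
$conn.open()
$cmd = $conn.CreateCommand()
$cmd.commandtext = "select p.NTName, v.PropertyID, v.Privacy from UserProfile_Full p join UserProfileValue v on p.RecordID = v.RecordID"
$reader = $cmd.executereader()

while ($reader.read())
{
    $reader["NTName"] + "`t" + $reader["PropertyID"].tostring() + "`t" + $reader["Privacy"].tostring()
}

$reader.close()
$conn.close()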

Enjoy, 
Chris

Check out the next blog in this series – Changing your Host headers

Upgrading SharePoint 2010 – Changing Your Host Headers – What It Means To Your Consultant

Part 3 of How I Successfully Upgraded eBay – Previous Blog Post

So what does it mean for you to "retire" an older web application for branding purposes?  It is a simple change in Central Administration, but a nasty proposition when it comes to content in the site and most consultants will take the easy way out and say, no you can't do that (because they know how much work it is and don't really know what will break).  What am I talking about?  Let me give you an example:

  • In 2007, the web application host header was http://abc.contoso.com
  • In 2010, you want it to be http://xyz.contoso.com
  • Moving a content database from one web application to another with a different name (for service delivery reasons)
  • Moving a site collection from one content database to another content database in a different web application

Again, it is easy to do via Central Administration.  It is easy to do the DNS and add an AAM, but did you think about content?  Huh, what about the content…here's what you didn't think about:

  • Non relative links used in 3rd Party applications (Corasworks) 
  • ASPX pages with non-relative links
  • Content Editor web parts with non-relative links
  • Update custom web part properties with non-relative links
  • Navigation nodes with existing non-relative links
  • Excel and InfoPath with Data Connections

The problem is non-relative links.  It was a big problem in 2007, and in most vendors' products, that they didn't check the URL of where the code was running and convert the saved content into relative links!  This is a major pain in the ass!  Now you will see in 2010, and in most vendor products, that they have learned a VERY valuable lesson.  ALWAYS USE RELATIVE LINKS!

How does one upgrade 10s of 1000s of pages, 1000s of content editor web parts and navigation nodes with the new URLs?  You build a tool of course!  The tool is simple in its requirements:

  • Take an input file of old url to new url 
  • Update all aspx pages with the new links – CHECK
  • Update Content Editor web parts with the new links – CHECK
  • Update custom web parts with links – CHECK
  • Update Navigation nodes with new links – CHECK
  • Update….wait…how do you update Excel and InfoPath? – OH MY…(the topic of the next blog post)

The tool is easy to build, but add in 1000s of sites, 1000s of pages and web parts…you have some problems:

  1. It will take too long for it to update all your resources before the upgrade is finished.  So what do you do to meet your upgrade window?
  2. You have external links that still point to these older urls

The problem has a simple solution: HttpModules.  All you have to do is intercept the HTTP request and redirect to the proper place.  How do you know where to redirect them?  Same as the tool: use the input file for the tool.  Create a SharePoint list that the HttpModule pulls from that will redirect the request to the proper place.  Here are a couple of examples:

There is an issue here.  You would have two entries in the configuration list:

In this case, the order of replacement matters. If the first entry is applied, they will incorrectly get redirected to http://xyz/HelloWorld.  This is not the desired outcome.  You must redirect in order of most specific first, then to the most generic.  You must also be careful with generic entries.  Suppose you have two web applications:

If you have the following in your configuration list:

You can expect that requests to http://abc will be redirected properly, but http://abcdef will also get redirected, and not the way you want.  Also, in the case of http://abc -> http://abcdef, you would be in a continuous loop of adding http://abc to the redirect, and of course, you would eventually overrun the address line in the browser!
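The ordering rule is easy to get wrong, so here is a minimal sketch of it in PowerShell (the mapping entries are made-up examples; the real logic lived in the HttpModule, but the idea is the same – sort most specific first and stop at the first match):

# example mapping of old url prefixes to new ones (longest/most specific entry must win)
$map = @{
    "http://abc/helloworld" = "http://xyz/newhelloworld"
    "http://abc"            = "http://xyz"
}

function Get-RedirectUrl([string]$requestUrl)
{
    foreach ($oldUrl in ($map.Keys | sort-object Length -descending))
    {
        if ($requestUrl.StartsWith($oldUrl, [System.StringComparison]::OrdinalIgnoreCase))
        {
            # apply only the first (most specific) match so the generic entry never double-fires
            return $map[$oldUrl] + $requestUrl.Substring($oldUrl.Length)
        }
    }
    return $null    # no redirect needed
}

Get-RedirectUrl "http://abc/helloworld/default.aspx"    # -> http://xyz/newhelloworld/default.aspx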

Enjoy,
Chris

See next blog post in this series

Upgrading to SharePoint 2010 – Excel Workbooks And InfoPath Forms With Data Connections

Blog #4 in How I Successfully Upgraded eBay to SharePoint 2010 – See Previous Blog In Series

So you decided to move your content databases around eh?  The urls for the site collections and webs are different?  Did you ever think about all those lonely lost users that created InfoPath Forms and Excel Workbooks with Data Connections to lists?   Hmm…probably not!

So what would be the steps to start this massive endeavour?  Well, the first would be to identify all the Excel and InfoPath workbooks in your farm, then you should download all of them, check if they have a data connection, then update them if they do.  Wha?  What if you have 1000s of them?  Damn…you will spend months doing that…good luck!

Wait…Microsoft built us a commandlet for it.  It is called "Update-SPInfoPathUserFileUrl".  What does it do, you ask?  Well, it will iterate through all the InfoPath forms and data connection libraries in your farm and update them, similar to the URL updater I talked about in the previous post.  The only problem with this commandlet…IT DOESN'T WORK!  Yes, you heard it right, it doesn't work.  After several calls with the product support team, they finally conceded to us this fact.  What is wrong with it, you ask?  Here's a run down:

  • If the data connection library has anything else other than an ODC file, it will fail
  • If the ODC file has a CDATA tag in it (or is not a standard everyday XML file), the tool fails when it tries to create an XMLDocument of the file
  • For no reason at all, it won't update all the files

They submitted some internal bug requests and told us they would try to get us a hotfix (but it would take several weeks).  Unfortunately, that didn't fly very well with us as we had to do the upgrade in two weeks!  So, I wrote our own tool to do it programmatically!  I'll summarize the steps:

  • Find all the InfoPath and Data Connection files in the Farm (across all content databases)
  • Run the tool passing in the same input file of replacement URLs
  • Update the InfoPath Files (which are CAB files by the way) in place in their respective libraries
  • Update the Data Connection (ODC) files, which are simply XML, in place

So finding the files is the easy part, here's the script:

#export a list of all InfoPath (xsn) and data connection (odc) files from every content database
$cdbs = get-spcontentdatabase

$count = 0
foreach ($cdb in $cdbs)
{
    "Exporting file list from " + $cdb.Name

    $conn = new-object system.data.sqlclient.sqlconnection $cdb.legacydatabaseconnectionstring.replace("Timeout=15","")
    $conn.open()
    $cmd = $conn.CreateCommand()
    $cmd.CommandTimeout = 4000
    $cmd.commandtext = "select DirName, leafname from AllDocs where extensionforfile in ('xsn', 'odc')";
    $reader = $cmd.executereader()
    $i = 0

    while ($reader.read())
    {
        $line = $cdb.WebApplication.Url + $reader["dirname"] + "/" + $reader["leafname"]
        add-content "InfoPathFilesToUpDate.txt" $line
        $i++
    }

    $reader.Close()
    $conn.Close()
}

Once you have the list of the files you need to update, the rest is easy…relatively speaking [:D].  So what's the next step?  You gotta loop through all these files and determine if they have the old URLs in them.  For an XML file, that is pretty easy.  It's just a text file, but most ODC files are not valid XML files…YUK.  But that's beside the point; it's easy to replace the ODC file connection info, just look for the <odc:Connection> element.  Inside of it will be a connection string with the URL you are looking for.
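Here is a rough sketch of the ODC pass, treating each file as plain text instead of parsing it as XML (it reads the file list produced by the script above, and assumes a tab-delimited old-url/new-url replacement file like the one described in the previous post):

# UrlReplacements.txt (assumed format): <old url><TAB><new url>
$replacements = get-content "UrlReplacements.txt" | foreach-object { ,($_ -split "`t") }

foreach ($url in get-content "InfoPathFilesToUpDate.txt")
{
    if (-not $url.ToLower().EndsWith(".odc")) { continue }

    # SPSite/OpenWeb resolve the web the file lives in from its full URL
    $site = new-object Microsoft.SharePoint.SPSite($url)
    $web = $site.OpenWeb()
    $file = $web.GetFile($url)

    # ODC files are only "mostly" XML, so do a straight text replace instead of parsing them
    $text = [System.Text.Encoding]::UTF8.GetString($file.OpenBinary())
    $changed = $false

    foreach ($pair in $replacements)
    {
        if ($text.Contains($pair[0])) { $text = $text.Replace($pair[0], $pair[1]); $changed = $true }
    }

    if ($changed) { $file.SaveBinary([System.Text.Encoding]::UTF8.GetBytes($text)) }

    $web.Dispose()
    $site.Dispose()
}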

Now on to the InfoPath files.  How does one update an InfoPath file without opening it in InfoPath and changing the Data Connection?  You could download it and use the "extract.exe" tool to output the file contents, then rebuild it using "makecab.exe", but wow, that is just too much work.  And yes, I tried it, and it sucked.  Basically, Windows still has the ability for a folder to be "tagged".  Did you ever have that problem with hackers hitting your FTP server and "tagging" it?  You couldn't delete their tag without wiping the entire drive.  Guess what…when extracting certain InfoPath files, you can "tag" your directory.  A script with a "delete all files in a folder" command then errors out with more RED than you ever care to see.  So how do you successfully update an InfoPath CAB file?

  • Use the awesome Reflector tool to get the CabinetExtractor class out of the SharePoint dlls for the CommandLet.

This is an entire set of classes specifically designed to update CAB files in memory. PERFECT!

Once I had the code working, I simply passed in the list of files, checked the manifest.xsf file in the InfoPath CAB package for any offending URLs and, if they were found, updated the file and then streamed it back into the CAB file.  Then I just uploaded it back to the library.  Everything worked like a charm!

NOTE:  Site and List templates are CAB files…internally, they have a version called "3" in them.  Magically, if you change this value to "4", a majority of them will work in 2010…HINT HINT…

Enjoy,
Chris

See next blog post in this series here