Upgrading to SharePoint 2010 – Excel Workbooks And InfoPath Forms With Data Connections

Blog #4 in How I Successfully Upgraded eBay to SharePoint 2010 – See Previous Blog In Series

So you decided to move your content databases around eh?  The urls for the site collections and webs are different?  Did you ever think about all those lonely lost users that created InfoPath Forms and Excel Workbooks with Data Connections to lists?   Hmm…probably not!

So what would be the steps to start this massive endeavour?  Well, the first would be to identify all the Excel and InfoPath workbooks in your farm, then you should download all of them, check if they have a data connection, then update them if they do.  Wha?  What if you have 1000s of them?  Damn…you will spend months doing that…good luck!

Wait…Microsoft built us a commandlet for it.  It is called "Update-SPInfoPathUserFileUrl".  What does it do you ask?  Well, it will iterate through all the infopath forms and data connection libraries in your farm and update similar to the Url Updater I talked about in the previous post.  The only problem with this commandlet…IT DOESN"T WORK!  Yes, you heard it right, it doesn't work.  After serveral calls with the product support team, they finally conceded to us this fact.  What is wrong with it you ask, here's a run down:

  • If the data connection library has anything else other than an ODC file, it will fail
  • If the ODC file has a CDATA tag in it (or is not a standard everyday XML file), the tool fails when it tries to create an XMLDocument of the file
  • For no reason at all, it won't update all the files

They submitted some internal bug requests and told us they would try to get us a hotfix (but it would take several weeks).  Unfortunately, that didn't fly very well with us as we had to do the upgrade in two weeks!  So, I wrote our own tool to do it programmatically!  I'll summarize the steps:

  • Find all the InfoPath and Data Connection files in the Farm (across all content databases)
  • Run the tool passing in the same input file of replacement URLs
  • Update the InfoPath Files (which are CAB files by the way) in place in their respective libraries
  • Update the Data Connection (ODC) files, which are simply XML in place

So finding the files is the easy part, here's the script:

#export all aspx files to temp directory
$cdbs = get-spcontentdatabase

$count=0
foreach($cdb in $cdbs)
{
"Exporting file list from " + $cdb.Name

$conn = new-object system.data.sqlclient.sqlconnection $cdb.legacydatabaseconnectionstring.replace("Timeout=15","")
$conn.open()
$cmd = $conn.CreateCommand()
$cmd.CommandTimeout = 4000
$cmd.commandtext = "select DirName, leafname from AllDocs where extensionforfile in ('xsn', 'odc')";
$reader = $cmd.executereader()
$i = 0

while ($reader.read())
{
$line = $cdb.WebApplication.Url + $reader["dirname"] + "/" + $reader["leafname"]
add-content "InfoPathFilesToUpDate.txt" $line
$i++
}

$reader.CLose()
$conn.CLose()
}

Once you have the list of the files you need to update, the rest is easy….relatively speaking [:D].  So what's the next step?  You gotta loop through all these files and determine if they have the old URLs in them.  For an xml file, that is pretty easy.  It's just a text file, but most ODC files are not valid XML files…YUK.  But that's beside the point, its easy to replace the ODC file connection info, just look for the <odc:Connection> element.  Inside of it will be a connection string with the url you are looking for.

Now on to the InfoPath files.  How does one update an InfoPath file without opening it in InfoPath and changing the Data Connection?  You could download it and use the "extract.exe" tool to output the file contents, then rebuild it using "makecab.exe", but wow, that is just too much work.  And yes, I tried it, and it sucked.  Basically Windows still has the ability to have a folder to be "tagged".  Did you ever have that problem with hackers hitting your FTP server and "tagging" it?  You couldn't delete their tag without wipping the entire drive.  Guess what…when extracting certain InfoPath files, you can "tag" your directory.  When making a scrip with a "delete all" files in a folder command errors out more RED than you ever care to see.  So how do you successfully update an InfoPath CAB file?

  • Use the awesome Reflector tool to get the CabinetExtractor class out of the SharePoint dlls for the CommandLet.

This is an entire set of classes specifcally designed to update CAB files in memory. PERFECT!

Once I had the code working, I simply passed in the list of files, checked the manifest.xml file in the InfoPath CAB package for any offending URLs and if they were found, updated the file and then streamed it back into the CAB file.  Then I just upload to the library.  Everything worked like a charm!

NOTE:  Site and List templates are CAB files…internally, they have a version called "3" in them.  Magically, if you change this value to "4", a majority of them will work in 2010…HINT HINT…

Enjoy,
Chris

See next blog post in this series here