I have blogged recently, here and here, about some errors that I've been experiencing with TFS 2012 and local workspaces. I'm not going to go into the details of the issues; suffice it to say that they exhibited the following:
- Occurred when doing actions on a large codebase (long-running actions)
- Occurred when I had multiple instances of Visual Studio open connected to the same workspace
- Errors that popped up were:
  - *TF400017:* The workspace properties table for the local workspace XXX;[email protected] could not be opened
  - *TF400030:* The local data store is currently in use by another operation. Please wait and then try your operation again. If this error persists, restart the application.
I have been in contact with Microsoft support about this and we have finally come to a solution. They have confirmed the issue as a bug and a fix will be coming out in the next quarterly update for **Visual Studio**. The release cadence information for Visual Studio can be found [here](http://blogs.msdn.com/b/bharry/archive/2012/08/28/tfs-shipping-cadence.aspx) on Brian Harry's blog. It should be noted that if you do have a workspace of over 50,000 files, then you should be using **Server Workspaces**.
I have decided to include some technical details in case people are interested.
Firstly, we need to understand how **Local Workspaces** work.
- Each local workspace has a system-wide lock. A thread needs to obtain the lock in order to perform any action, e.g.:
  - Load a new folder in the **Source Control Explorer**
  - Refresh the list of Pending Changes
  - Pend changes in a workspace
- When a thread wants to obtain the lock, but the lock is unavailable, it will wait (no parallelism).
- When workspace operations complete quickly, this turn-based approach works fine.
- But if another thread's workspace operations take too long, then the thread might not get the lock before the timeout expires.
- An exception is thrown at this point.
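
To make the turn-based behaviour a bit more concrete, here is a minimal Python sketch of the idea. It is only a toy model and not how TFS is implemented: the real lock is system-wide across processes, whereas this uses threads in one process, and the class names, the exception and the timeout values are all my own assumptions for illustration.

```python
import threading
import time


class WorkspaceLockTimeout(Exception):
    """Hypothetical stand-in for an error such as TF400030."""


class LocalWorkspace:
    """Toy model of a local workspace guarded by a single lock."""

    def __init__(self, lock_timeout_seconds):
        self._lock = threading.Lock()
        self._lock_timeout_seconds = lock_timeout_seconds

    def run_operation(self, name, duration_seconds):
        # Wait for our turn; give up once the (assumed) timeout expires.
        if not self._lock.acquire(timeout=self._lock_timeout_seconds):
            raise WorkspaceLockTimeout(f"{name}: data store in use by another operation")
        try:
            time.sleep(duration_seconds)  # simulate the work done while holding the lock
            return f"{name}: completed"
        finally:
            self._lock.release()


def simulate(refresh_seconds, lock_timeout_seconds):
    """Hold the lock for a 'refresh' in one thread, then attempt a 'branch'."""
    workspace = LocalWorkspace(lock_timeout_seconds)
    refresh_started = threading.Event()

    def refresh():
        with workspace._lock:
            refresh_started.set()
            time.sleep(refresh_seconds)  # the slow operation in the other instance

    worker = threading.Thread(target=refresh)
    worker.start()
    refresh_started.wait()  # make sure the refresh holds the lock first
    try:
        return workspace.run_operation("branch", 0.0)
    except WorkspaceLockTimeout as error:
        return f"failed: {error}"
    finally:
        worker.join()


if __name__ == "__main__":
    print(simulate(refresh_seconds=0.05, lock_timeout_seconds=1.0))  # quick refresh
    print(simulate(refresh_seconds=0.5, lock_timeout_seconds=0.05))  # slow refresh
```

When the refresh is quick, the branch simply waits its turn and completes. When the refresh outlasts the timeout, the waiting operation gives up and raises, which is the shape of failure described in the list above.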
Microsoft have said that **Local Workspaces** are not advised for more than 50,000 files, and you might expect that to be just them playing it safe. If you were just using tf.exe for all your commands then you would be right: your commands would simply take longer to run, effectively scaling linearly. However, the concurrent nature of **Visual Studio** means that the user experience with **Local Workspaces** doesn't really scale linearly. When you have 2 or more instances of **Visual Studio**, and maybe **Shell Extensions** (TFS Power Tools), all competing for the same lock, threads will start being starved and throw exceptions. The data model scales linearly, but the user experience doesn't, because of the concurrent nature of **Visual Studio**.
**What actually happens to cause the error? (this requires a large codebase, 3000 files +)**
Say you have two instances of **Visual Studio** open, each with **Source Control Explorer** open, and you start a branch operation from one of them. After the pending changes are created, VS starts performing a Get to download the new files. The other instance of Visual Studio will trigger a refresh at some point. That refresh takes a very long time, which causes the branch operation to fail because it cannot re-acquire the lock after yielding it to Source Control Explorer.
Microsoft Support found out that the **Source Control Explorer** was in the middle of a web service call called "ReconcileLocalWorkspace", which is used to sync up the server's view of the workspace after the client creates or undoes pending changes locally. For example, if you start off with a local workspace that has no pending changes in it, but then you “tf add” some items, that’s a local operation and the server doesn’t know about them. When you go to check in, the client calls “ReconcileLocalWorkspace” before “CheckIn” to ensure the server knows about these new pending adds.
Calls to Reconcile are incremental from the point of view of “items in your workspace”, but not for pending changes. The entire set is sent each time. This is a performance concern, especially on Azure, because typically there is less bandwidth between client and server in that environment. Sending up the data for 78,000 pending changes requires sending almost 50MB of data over the wire to the TFS Preview service, and this takes a long time.
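
As a rough back-of-the-envelope check on those numbers, the short Python sketch below works out the average payload per pending change and how long the upload would take. The 50MB and 78,000 figures come from above; the 1 Mbit/s uplink is purely an assumed value for illustration, not something measured, and the function names are my own.

```python
def bytes_per_pending_change(total_bytes, pending_changes):
    """Average number of bytes sent per pending change."""
    return total_bytes / pending_changes


def upload_seconds(total_bytes, uplink_bits_per_second):
    """Time to send the payload, ignoring latency and protocol overhead."""
    return total_bytes * 8 / uplink_bits_per_second


PAYLOAD_BYTES = 50_000_000      # "almost 50MB", so treat this as an upper bound
PENDING_CHANGES = 78_000        # figure quoted above
ASSUMED_UPLINK_BPS = 1_000_000  # hypothetical 1 Mbit/s uplink, an assumption

if __name__ == "__main__":
    per_change = bytes_per_pending_change(PAYLOAD_BYTES, PENDING_CHANGES)
    seconds = upload_seconds(PAYLOAD_BYTES, ASSUMED_UPLINK_BPS)
    print(f"~{per_change:.0f} bytes per pending change")
    print(f"~{seconds:.0f} seconds to upload at the assumed uplink speed")
```

Under those assumptions that is several hundred bytes per pending change and several minutes of upload, all of it spent while the workspace lock is unavailable to other operations.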
The problem here is that Branch is a server-side operation, so there should be no need for the refreshing Source Control Explorer to perform a call to “ReconcileLocalWorkspace”. Everything is already in sync, and the call is just wasted time.
With the fix coming out in the next quarterly update, this call to "ReconcileLocalWorkspace" will no longer occur, which fixes the issue.
A lot of the technical details have come directly from Microsoft Support, so thanks to them for being so open about that.
I hope some people find this interesting.