I’ve written this post to document a dumbass thing that I did and the error that it caused. Hopefully it might help someone googling in desperation sometime…
The popularity of the “Tribute to the humble leave form” series over at SharePoint Magazine, has meant that we actually get a few gigs implementing – surprise, surprise – InfoPath leave forms with SharePoint Designer workflows! This was actually quite unexpected, as I wrote that series as a joke and for end-user training purposes.
Now, I fully realise that many people don’t like tools like SharePoint Designer in the hands of mere mortals, but I personally think it is a terrific means to encourage buy-in and user evangelism and I encourage its sustainable use. Although “real developers” may disagree with me here, SPD workflows enable quick results without the embarrassing aftermath of premature code compilation (which always leads to frustration, right girls? 😀 ).
Anyhow enough sexual innuendo – here is my error with my lessons learned
Recently, a happily running leave form workflow ground to a halt with an error as shown below. What the screenshot below shows is that the workflow died right after a “Pause for duration” workflow action. (Note the “pausing for 2 minutes” line, just before the actual workflow error).
Now, the interesting thing here is that the pause action never happened. The workflow bombed out straight away and there was no pause at all. Yet, the fact that the pause action logged to the history list meant that it attempted to do so. It seemed that the pause action ran for long enough to log that it was about to pause, but died when actually trying to.
The pause action occurred in step 2 of my workflow, but it is worthwhile showing you the first workflow step first.
Below is a screenshot of step 1 of the workflow. This first step is called “Modify Item Permissions” and we are using the codeplex custom workflow actions to modify permissions of the submitted leave form, ensuring that only the requestor of the leave (and their boss) can see or modify the form.
The step in the workflow, called “Manager Approval”, shows where we pause for 2 minutes. The reason we pause is to get around another workflow error that I will explain at the end of the post. After the pause action was completed and the workflow recommenced, it performed a “Collect data from user” task as shown below. This created a task for the requestors manager to approve the leave request.
We know that the “Collect data from user” action never ran, because the “Employee Leave Approval” task never actually got created in the task list. Thus, we know the error occurred at the pause action. Or, did it?
This error occurred consistently whether the workflow was manually started or auto-triggered. Checked all the usual suspects – timber jobs ok, permissions unchanged and working well, etc. Scanning the ULS logs at this time showed an interesting error with a much less interesting stack trace (that I have pasted at the end of the post for readability).
“Unexpected ERROR: request not found in the TrackedRequests. We might be creating and closing webs on different threads”
Hmm, what is TrackedRequests and why is a request not found? Googling that error message wasn’t much help at all. Although it is quite common, nothing matched my particular context. But then I received a lucky break. When I cancelled the running workflow that was in this error state, I received a new error that made it quite easy to work out what was going on.
WinWF Internal Error, terminating workflow Id# 02334a17-3211-4d30-902f-bf34e20354c6
Unexpected System.Workflow.Runtime.Hosting.PersistenceException: Object reference not set to an instance of an object. —> System.NullReferenceException: Object reference not set to an instance of an object. at DP.Sharepoint.Workflow.Common.RemoveListItemPermissionEntry(SPListItem item, String principalName, Boolean breakRoleInheritance)
It was this error that gave it away. For a start, the method being called was DP.Sharepoint.Workflow.Common.RemoveListItemPermissionEntry and the exception was a null reference. RemoveListItemPermissionEntry sounded suspiciously like the permission based workflow actions of the “Modify Item Permissions” step as highlighted below.
But wait… the workflow never failed at this line at all. It actually failed at the first action of the next step (“Manager Approval”) with the “request not found in the TrackedRequests” error. Interesting, eh?
Let’s put aside the issue of what step and what action the workflow failed on for a moment, and examine the second error message more closely. If you examine the “Modify Item Permissions” action highlighted above, can you think of any reason that I would get a null reference exception?
Ah! What happens if the previous action called “Set Variable”, set a blank or null value? I suspect that the “Grant permissions on an item” workflow action will attempt to change permissions of the item to a blank or null user. The result? That workflow action will die with a null reference exception – precisely what the error states.
When I checked the source of the variable “LineManagerAccount”, it was looking up a contact list for the manager of the requestor. Sure enough, it did turn out that that column was blank for some users. Thus, a simple exercise in input validation combined with a small correction to that list item and the problem was solved.
Further questions and further information
I could stop this post there I suppose and perhaps someone, sometime might find this post and think “Wohoo!”. But this problem raised a couple of issues, as well as made it clear to me that I had been stupid in how I designed this one. Let’s tackle them one by one.
1. When did it die again?
Clearly the error that caused all of this, occurred at the last action of the “Modify Item Permissions” step. We know this because the step before was where the lookup to the contact list happened, and this is where the null value came from that caused the exception. But then why did the workflow not stop at this point? Why did it continue onward and attempt to move to the “Manager Approval” steps and die on the “Pause for duration” action?
Whether this is default behaviour of SPD workflows or a logic fault in the exception handling code in the codeplex-based “Grant Permission on an Item” action, this makes life hard to troubleshoot because the error message never made sense. I only found out the true root cause when I terminated the workflow manually and got lucky because the offending exception suddenly appeared.
So, the conclusion to draw? If your SPD workflow dies with “ERROR: request not found in the TrackedRequests” in the logs, it may be that the real cause of the problem is a previous workflow step and not the step where the workflow actually stopped.
2. Paul is a dumbass
Okay, so let’s talk about lessons learnt. Some people reading this may wonder why I never used Active Directory or the user profile store to store manager details for a user and to do the lookup from there. The answer is that this contact list approach simply made sense for this client and this is actually not the dumb thing that I did. The dumb thing that I did was to forget that all of the necessary business logic and data validation steps could have been performed in the InfoPath form itself.
Remember that InfoPath forms can have data connections to all sorts of sources such as SharePoint lists, web services and relational databases. (I am not going into detail on how to do that here because I have already covered it in quite a lot in part 5 of the “Leave Form Tribute” series). We can have a published InfoPath form connect to any SharePoint list or library across the entire farm to look up business logic details, such as an approver. This is something that SharePoint Designer workflows cannot do out of the box – it can only lookup from the site where the workflow has been created.
But, more important than that, we can easily use InfoPath rules and data validation functionality to ensure that the values are not null before submitting.
What this all means is that your SPD workflows end up being simpler and easier to maintain because InfoPath is providing all of the data to the workflow as well as the validation of the data. Therefore, the workflow now doesn’t have to do the lookup work and have unnecessary steps and actions.
For these reasons I think that, if you can, grab all of the data you need for a business process using the InfoPath form using hidden fields. Then publish the form and ensure these hidden fields are published as site columns.
For this particular purpose, I think that InfoPath is a lesser evil than SharePoint Designer workflows.
3. What are the pause actions there for anyway?
Finally, I promised I’d explain why I had a pause action in the workflow. This is to get around a race condition with workflows that produces an intermittent error message:
“This task is currently locked by a running workflow and cannot be edited”
I’ve seen this occur in several situations and it pretty much boils down to some workflow actions being synchronous and subsequent actions starting before the previous action properly completed. Consider this example.
One workflow creates or updates an item in a list, say a task list. But the task list also has a workflow attached to it. This means that the first workflow creates the task item and then moves onto the next step. Meanwhile, a new workflow has been triggered for that new task list item. This secondary workflow refers to information stored in the main item which causes the race condition. If the information in the main item is not committed before this workflow attempts to use it, you’ll get null values when the main item is a new item. When working with SPD workflows, these null values will show up as ????.
To resolve the race condition, I had to ensure that the main item was committed before the secondary workflow was started by pausing the workflow. Why does pausing the workflow cause the changes to be committed? According to the MSDN article “How Windows SharePoint Services Processes Workflow Activities” http://msdn.microsoft.com/en-us/library/ms442249.aspx
Windows SharePoint Services runs the workflow until it reaches a point where it cannot proceed because it is waiting for some event to occur: for example, a user must designate a task as completed. Only at this "commit point" does Windows SharePoint Services commit the changes made in the previous Windows SharePoint Services-specific workflow activities…
The not-so-clever workaround is to pause the workflow in between actions. When you add a pause, the workflow “cannot proceed because it is waiting for some event to occur” and therefore makes it a commit point. Not pretty – but works.
I hope that you find this article of use in your SharePoint Designer workflow troubleshooting endeavours. Below, I have pasted the complete stack traces (for the search engines).
Thanks for reading
06/25/2009 12:25:20.46 w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected ERROR: request not found in the TrackedRequests. We might be creating and closing webs on different threads. ThreadId = 1, Free call stack = at Microsoft.SharePoint.SPRequestManager.Release(SPRequest request) at Microsoft.SharePoint.SPWeb.Invalidate() at Microsoft.SharePoint.SPWeb.Close() at Microsoft.SharePoint.SPSite.Close() at Microsoft.SharePoint.SPSite.Dispose() at Microsoft.SharePoint.Workflow.SPWorkflowManager.<>c__DisplayClass1.<StartWorkflow>b__0() at Microsoft.SharePoint.SPSecurity.CodeToRunElevatedWrapper(Object state) at Microsoft.SharePoint.SPSecurity.<>c__DisplayClass4.<RunWithElevatedPrivileges>b__2() at Microsoft.SharePoint.Utilities.SecurityContext.RunAsProcess(CodeToRunElevated secureCode) at Microsoft.SharePoint.SPSecurity.RunWithEl…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …evatedPrivileges(WaitCallback secureCode, Object param) at Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(CodeToRunElevated secureCode) at Microsoft.SharePoint.Workflow.SPWorkflowManager.StartWorkflow(SPListItem item, SPWorkflowAssociation association, SPWorkflowEvent startEvent, Boolean bAutoStart, Boolean bCreateOnly) at Microsoft.SharePoint.Workflow.SPWorkflowManager.StartWorkflow(SPListItem item, SPWorkflowAssociation association, String eventData, Boolean isAutoStart) at Microsoft.SharePoint.Workflow.SPWorkflowManager.StartWorkflow(SPListItem item, SPWorkflowAssociation association, String eventData) at Microsoft.SharePoint.WebControls.SPWorkflowDataSourceView.Insert(IDictionary values) at Microsoft.SharePoint.WebControls.SPWorkflowDataSourceView.Ins…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …ert(IDictionary values, DataSourceViewOperationCallback callback) at Microsoft.SharePoint.WebPartPages.DataFormWebPart.FlatCommit() at Microsoft.SharePoint.WebPartPages.DataFormWebPart.PerformCommit() at Microsoft.SharePoint.WebPartPages.DataFormWebPart.HandleOnSave(Object sender, EventArgs e) at Microsoft.SharePoint.WebPartPages.DataFormWebPart.RaisePostBackEvent(String eventArgument) at System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument) at System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData) at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includ…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …eStagesAfterAsyncPoint) at System.Web.UI.Page.ProcessRequest() at System.Web.UI.Page.ProcessRequestWithNoAssert(HttpContext context) at System.Web.UI.Page.ProcessRequest(HttpContext context) at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) at System.Web.HttpApplication.ApplicationStepManager.ResumeSteps(Exception error) at System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(HttpContext context, AsyncCallback cb, Object extraData) at System.Web.HttpRuntime.ProcessRequestInternal(HttpWorkerRequest wr) at System.Web.HttpRuntime.ProcessRequestNoDemand(HttpWorkerRequest wr) at Sys…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …tem.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr ecb, Int32 iWRType) , Allocation call stack (if present) at Microsoft.SharePoint.Library.SPRequest..ctor() at Microsoft.SharePoint.SPGlobal.CreateSPRequestAndSetIdentity(Boolean bNotGlobalAdminCode, String strUrl, Boolean bNotAddToContext, Byte UserToken, String userName, Boolean bIgnoreTokenTimeout, Boolean bAsAnonymous) at Microsoft.SharePoint.SPWeb.InitializeSPRequest() at Microsoft.SharePoint.SPWeb.EnsureSPRequest() at Microsoft.SharePoint.SPWeb.get_Request() at Microsoft.SharePoint.SPWeb.InitWebPublic() at Microsoft.SharePoint.SPWeb.get_ServerRelativeUrl() at Microsoft.SharePoint.SPWeb.get_Url() at Microsoft.SharePoint.SPUser.InitMember() at Microsoft.SharePoint.SPUser..ctor(SPWeb web, ISecura…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …bleObject scope, String strIdentifier, Object[,] arrUsersData, UInt32 index, Int32 iByParamId, String strByParamSID, String strByParamEmail, SPUserCollectionType userCollectionType, Boolean isSiteAuditor) at Microsoft.SharePoint.SPUser..ctor(SPWeb web, ISecurableObject scope, String strIdentifier, Object[,] arrUsersData, UInt32 index, Int32 iByParamId, String strByParamSID, String strByParamEmail, SPUserCollectionType userCollectionType) at Microsoft.SharePoint.SPUserCollection.GetByIDNoThrow(Int32 id) at Microsoft.SharePoint.SPSite.get_SystemAccount() at Microsoft.SharePoint.WorkflowActions.Helper.ResolveUserField(WorkflowContext context, Object fvalue) at Microsoft.SharePoint.WorkflowActions.Helper.LookupUser(WorkflowContext context, Guid listId, Int32 listItem, Strin…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …g fieldName) at Microsoft.SharePoint.WorkflowActions.Helper.LookupUser(WorkflowContext context, String listIdOrName, Int32 listItem, String fieldName) at Microsoft.SharePoint.WorkflowActions.LookupActivity.Execute(ActivityExecutionContext provider) at System.Workflow.ComponentModel.ActivityExecutor`1.Execute(T activity, ActivityExecutionContext executionContext) at System.Workflow.ComponentModel.ActivityExecutor`1.Execute(Activity activity, ActivityExecutionContext executionContext) at System.Workflow.ComponentModel.ActivityExecutorOperation.Run(IWorkflowCoreRuntime workflowCoreRuntime) at System.Workflow.Runtime.Scheduler.Run() at System.Workflow.Runtime.WorkflowExecutor.RunScheduler() at System.Workflow.Runtime.WorkflowExecutor.RunSome(Object ignored) …
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …at System.Workflow.Runtime.Hosting.DefaultWorkflowSchedulerService.WorkItem.Invoke(WorkflowSchedulerService service) at System.Workflow.Runtime.Hosting.DefaultWorkflowSchedulerService.QueueWorkerProcess(Object state) at System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(Object state) at System.Threading.ExecutionContext.runTryCode(Object userData) at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading._ThreadPoolW…
06/25/2009 12:25:20.46* w3wp.exe (0x0BA8) 0x16F8 Windows SharePoint Services General 0 Unexpected …aitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack) at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)
06/25/2009 12:27:16.49 w3wp.exe (0x0BA8) 0x1CB4 Windows SharePoint Services Workflow Infrastructure 88xr Unexpected WinWF Internal Error, terminating workflow Id# 02334a17-3211-4d30-902f-bf34e20354c6
06/25/2009 12:27:16.49 w3wp.exe (0x0BA8) 0x1CB4 Windows SharePoint Services Workflow Infrastructure 98d4 Unexpected System.Workflow.Runtime.Hosting.PersistenceException: Object reference not set to an instance of an object. —> System.NullReferenceException: Object reference not set to an instance of an object. at DP.Sharepoint.Workflow.Common.RemoveListItemPermissionEntry(SPListItem item, String principalName, Boolean breakRoleInheritance) at DP.Sharepoint.Workflow.PermissionsService.<>c__DisplayClass1.<ProcessGrantRequest>b__0() at Microsoft.SharePoint.SPSecurity.CodeToRunElevatedWrapper(Object state) at Microsoft.SharePoint.SPSecurity.<>c__DisplayClass4.<RunWithElevatedPrivileges>b__2() at Microsoft.SharePoint.Utilities.SecurityContext.RunAsProcess(CodeToRunElevated secureCode) at Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(WaitCallback secureCode, Object param)…
06/25/2009 12:27:16.49* w3wp.exe (0x0BA8) 0x1CB4 Windows SharePoint Services Workflow Infrastructure 98d4 Unexpected … at Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(CodeToRunElevated secureCode) at DP.Sharepoint.Workflow.PermissionsService.ProcessGrantRequest(PermissionRequest pr) at DP.Sharepoint.Workflow.PermissionsService.Commit(Transaction transaction, ICollection items) at System.Workflow.Runtime.WorkBatch.PendingWorkCollection.Commit(Transaction transaction) at System.Workflow.Runtime.WorkBatch.Commit(Transaction transaction) at System.Workflow.Runtime.VolatileResourceManager.Commit() at System.Workflow.Runtime.WorkflowExecutor.DoResourceManagerCommit() at System.Workflow.Runtime.Hosting.WorkflowCommitWorkBatchService.CommitWorkBatch(CommitWorkBatchCallback commitWorkBatchCallback) at System.Workflow.Runtime.Hosting.DefaultWorkflowCommitWorkBatchSer…
06/25/2009 12:27:16.49* w3wp.exe (0x0BA8) 0x1CB4 Windows SharePoint Services Workflow Infrastructure 98d4 Unexpected …vice.CommitWorkBatch(CommitWorkBatchCallback commitWorkBatchCallback) at System.Workflow.Runtime.WorkflowExecutor.CommitTransaction(Activity activityContext) at System.Workflow.Runtime.WorkflowExecutor.Persist(Activity dynamicActivity, Boolean unlock, Boolean needsCompensation) — End of inner exception stack trace — at System.Workflow.Runtime.WorkflowExecutor.Persist(Activity dynamicActivity, Boolean unlock, Boolean needsCompensation) at System.Workflow.Runtime.WorkflowExecutor.ProtectedPersist(Boolean unlock)