Glyma is now open source!


Hi all

If you are not aware, my colleagues and I have spent a large chunk of the last few years developing a software tool for SharePoint called Glyma (pronounced “Glimmer”). Glyma is a very powerful knowledge management solution for SharePoint 2010/2013 that deals with knowledge that is highly valuable yet difficult to capture in writing – all that hard-earned knowledge that tends to walk out the door of organisations.

Glyma was born from Seven Sigma’s Dialogue Mapping skills. It represents a lot of what we do as an organisation, and the culmination of many years of experience in the world of complex problem facilitation. We have been using Glyma as a consultancy value-add for some time, and our clients have gained a lot of benefit from it. Clients have also deployed it in their environments for purposes such as knowledge capture, lessons learnt, strategic planning and corporate governance, as well as business analysis, critical thinking and other knowledge visualisation/knowledge exchange scenarios.


I am very pleased to let people know that we have now decided to release Glyma under an open source license (Apache 2). This means you are free to download the source and use it in any manner you see fit.

You can download the source code from Chris Tomich’s GitHub site, or you can contact me or Chris for the binaries. The install/user and admin manuals can be found on the Glyma web site, which also has a really nice help system, tutorial videos and advice on how to build good Glyma maps.

This is not just some sample code we have uploaded. This is a feature-rich, well-architected and robust product with some really nice SharePoint integration. For my colleague Chris Tomich in particular, this represents a massive achievement as a developer/product architect. He has created a highly flexible graph database with some real innovation behind it. Technically, Glyma is a hypergraph database that sits on SQL/SharePoint. Very few databases of this type exist outside of academia/maths nerds, and very few people could pull off what he has done.


For those of you who use or have tried Compendium software, Glyma extends the ideas of Compendium (and can import Compendium maps), while bringing them into the world of enterprise information management via SharePoint.

Below I have embedded a video to give you an idea of what Glyma is capable of. More videos exist on YouTube as well as the Glyma site, so be sure to dig deeper.


I look forward to hearing how organisations make use of it. Of course, feel free to contact me for training/mentoring and any other value-add services. :-)



Paul Culmsee


Trials or tribulation? Inside SharePoint 2013 workflows–Part 5

This entry is part 5 of 13 in the series Workflow

Hi and welcome to part 5 of my series of articles that take a peek under the hood of SharePoint 2013 workflows. These articles are pitched at a wide audience, and my hope is that they give you some insights into what SharePoint Designer 2013 workflows can (and cannot) do. This is a long tale of initially dashed hopes and then redemption, based around what I think is a fairly common business scenario. To that end, the scenario we are using for this series is a basic document approval workflow for a fictitious diversified multinational company called Megacorp. It consists of a Documents library and a Process Owners list, and a managed metadata based site column called Organisation has been added to each of them. In the second post we created a very basic workflow using the task approval action. In parts 3 and 4 we tried to get around various issues encountered, and at the end of the last post we learnt that managed metadata columns cannot be filtered via the REST calls used by the built-in SharePoint Designer workflow actions.

…or can they?

In this post, we are going to take a look at two particular capabilities of the new SharePoint 2013 workflow regime and see if we can use them to get out of the pickle we are in. Once again, a reminder that this article is pitched at a wide audience, some of whom are non-developers. As a result, I am taking things slow…

Capability 1: Dictionaries

SharePoint workflows have always been able to store data in variables, which allows for temporary storage of data within the workflow. When creating a variable, you have to specify what format the data is in, such as string, integer or date. In SharePoint 2013, there is a new variable type called a Dictionary. A dictionary can be used to store quite complex data, because it is, in effect, a collection of other variables. Why does this matter? Well, consider the small snippet of XML data below. I could store all of this data in a single dictionary variable called CD, rather than in multiple stand-alone variables.

  <CD>
      <ARTIST>Bob Dylan</ARTIST>
      …
  </CD>

Now storing complex data in a single variable is all well and good, but what about manipulating it once you have it? As it happens, three new workflow actions have been specifically designed to work with dictionary data, namely:

  • Build Dictionary
  • Get an Item from a Dictionary
  • Count Items in a Dictionary

The diagram below illustrates these actions (this figure came from an excellent MSDN article on the dictionary capability). You will be using the “Build Dictionary” and “Get an item from a dictionary” actions quite a bit before we are done with this series.


There is one additional thing worth noting about dictionaries. Something subtle but super important. A dictionary can contain any type of variable available in the SharePoint 2013 workflow platform, including other dictionary variables! If this messes with your head, let’s extend the XML example above of the Bob Dylan album. Let’s say you have an entire catalog of CDs. For example:

  <CATALOG>
      <CD>
          <ARTIST>Bob Dylan</ARTIST>
          …
      </CD>
      <CD>
          <ARTIST>Keith Urban</ARTIST>
          …
      </CD>
  </CATALOG>

Using a dictionary, we can create a single variable to store details about all of the CDs in the catalog. We could make a dictionary variable called Catalog, which contains a dictionary variable called CD. The CD variable contains the string and date/time details for each individual CD. This structure enables the Catalog dictionary to store details of many CDs. Below is a representation of what three CDs would look like in this model…


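If you are more comfortable with code than diagrams, the same structure can be sketched as plain nested dictionaries. This is purely conceptual (SharePoint Designer dictionaries are not Python objects, and the field names and dates here are illustrative):

```python
# Conceptual sketch only: a "Catalog" dictionary whose entries are
# themselves "CD" dictionaries. Field names and values are illustrative.
catalog = {
    "CD1": {"artist": "Bob Dylan", "release_date": "1985-06-10"},
    "CD2": {"artist": "Keith Urban", "release_date": "2002-10-08"},
    "CD3": {"artist": "Some Artist", "release_date": "1999-01-01"},
}

# Drilling into a dictionary within a dictionary, one level at a time...
first_cd = catalog["CD1"]
print(first_cd["artist"])   # Bob Dylan
print(len(catalog))         # 3 CDs in the catalog
```

The point is simply that one variable (Catalog) can carry the whole structure, rather than needing separate workflow variables for every artist and date.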
Okay, so after explaining this idea of a dictionary, you might be thinking “What has all of this got to do with our workflow?” To answer that, have another look at the JSON output from part 4 in this series. If you recall, this is the output from talking to SharePoint via REST and asking for all documents in the document library. What do you notice about the information that has come back? To give you a hint, I will remind you what I said in the last post…

Now let’s take a closer look at the Organisation entry in the JSON data. What you might notice is that some of the other data entries have data values specified, such as EditorID = 1, AuthorID = 1 and Modified = 2013-11-10. But not so with Organisation. Rather than having a data value, it has sub-entries. Expand the Organisation section and you can see that we have more data elements within it.


In case it is still not clear, essentially we are looking at a data structure that is perfectly suited to being stored in a dictionary variable. In the case of the Organisation column, it is a “dictionary within a dictionary” scenario like my CD catalog example.

Okay, I hear you say – “I get that a dictionary can store complex data structures. Why is this important?”

The answer to that, my friends, is that there is a new, powerful workflow action that makes extensive use of dictionaries. You will come to love this particular workflow action for its versatility.

Capability 2: The one workflow action to rule them all…


In part 3 and part 4 of this series, I showed examples of talking to SharePoint via REST web services and what the returning JSON data looks like. This was quite deliberate, because in SharePoint 2013, Microsoft has included a workflow action called Call HTTP Web Service to do exactly the same thing. This is a huge advance on previous versions of SharePoint, because it means the actions that workflows can take are limited only by the web services that they talk to. This goes way beyond SharePoint too, as many other platforms expose data via a REST API, such as YouTube, eBay and Twitter, as well as enterprise systems like MySQL. Already, various examples exist online where people have wired up SharePoint workflows to other systems. Fabian Williams in particular has done a brilliant job in this regard.

The workflow action can be seen below. Take a moment to examine all of the bits you need to fill in, as there are several parameters you might use. The first parameter (the hyperlink labelled “this”) is the URL of the web service that you wish to access. The request parameter is a dictionary variable that is sometimes used when making a request to the web service. The response and responseheaders variables are also dictionaries, and they store the response received from the web service. The responseCode parameter represents the HTTP response code that came back, which enables us to check whether there was an error (like an HTTP 400 Bad Request).


Dictionaries and web services – a simple example…

The best way to understand what we can do with this workflow action (and the dictionary variables that it requires) is via example. So let’s leave our document approval workflow for the time being and quickly make a site workflow that calls a public web service, grabs some data and displays it in SharePoint. The public web service we will use is called Feedzilla. Feedzilla is a news aggregator service that lets you search for articles of interest and bring them back as a data feed. For example, the following Feedzilla URL will display any top news (category 26) that has SharePoint in the content, returning the information in JSON format:

Here is a Fiddler trace of the above URL, showing the JSON output. Note the structure of articles: below the JSON label at the top we have articles –> {} and then the properties of author, publish_date, source, source_url, summary, title and url.


Therefore the string articles(0)/title should return us the string “Microsoft Certifications for High School Students in Australia (Slashdot)”, as it is the title of the first article. The string articles(1)/title should bring back “Microsoft to deliver Office 2013 SP1 in early ’14 (InfoWorld)”, as it is the second article. So with this in mind, let’s see if we can get SharePoint to extract the title of the first article in the feed.
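To see the mechanics behind a lookup like articles(0)/title, here is a small Python sketch that resolves that style of path against parsed JSON. The path parser is my own approximation of what the “Get an Item from a Dictionary” action does internally, and the sample feed data is made up for illustration:

```python
import json
import re

def get_item(data, path):
    """Resolve a SharePoint-style path like 'articles(0)/title' against
    parsed JSON. An approximation of the 'Get an Item from a Dictionary'
    workflow action, not its real implementation."""
    for segment in path.split("/"):
        match = re.fullmatch(r"(\w+)\((\d+)\)", segment)
        if match:
            # e.g. "articles(0)": look up the key, then index into the list
            data = data[match.group(1)][int(match.group(2))]
        else:
            # plain key, e.g. "title"
            data = data[segment]
    return data

# Illustrative stand-in for the Feedzilla JSON response
feed = json.loads("""{
  "articles": [
    {"title": "First article title", "author": "someone"},
    {"title": "Second article title", "author": "someone else"}
  ]
}""")

print(get_item(feed, "articles(0)/title"))  # First article title
print(get_item(feed, "articles(1)/title"))  # Second article title
```

The key idea is that each slash-separated segment drills one level deeper into the nested dictionary, and the (0) suffix selects an item from a collection.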

Testing it out…

So let’s make a new site workflow.

Step 1:

From the ribbon, choose Site Workflow. Call the site workflow “Feedzilla test” as shown below…


Now we will add our Call HTTP Web Service action. This time, we will add the action a different way.

Step 2:

Click the flashing cursor underneath the workflow stage and type in “Call”. As you type each letter, SharePoint Designer will suggest matching actions. By the time you have typed “call”, there is only the Call HTTP Web Service action to choose from. Pressing Enter will add it to the workflow.



Step 3:

Now click on the “this” hyperlink and paste in the Feedzilla URL from earlier, then click OK.


Next, we need to create a dictionary variable to store the JSON data that is going to come back from Feedzilla.

Step 4:

Click on the “response” hyperlink next to the “ResponseContent to” label and choose to Create a new variable…


Step 5:

Call the variable JSONResponse and confirm that its type is Dictionary. We are now done with the Call HTTP Web Service action.


Step 6:

The next step in the workflow is to extract just the article title from the JSON data returned by the web service call. For this, we need to use the Get an Item from a Dictionary action. We will use this action to extract the title property from the very first article in the feed. Type in the word “get” and press Enter – the action we want will be added…


Step 7:

In the “item by name or path” hyperlink next to the Get label, type in this exactly: articles(0)/title as shown below. Then click on the “dictionary” hyperlink next to the from label and choose the JSONResponse variable that was created earlier. Finally, we need to save the extracted article title to a string variable. Click on the “this” hyperlink next to the Output to label and choose Create a new variable… In the edit variable screen, name the variable ArticleTitle and set its Type to String.



Step 8:

The next step is to log the contents of the variable ArticleTitle to the workflow history list. The action is called Log to History List as shown below.  Click the “message” hyperlink for this action and click the fx button. Choose Workflow Variables and Parameters from the Data Source dropdown and in the Field from Source dropdown, choose the variable called ArticleTitle. Click OK.



Step 9:

Finally, add a Go to End of Workflow action in the Transition to Stage section. The workflow is now complete and ready for testing.


Testing the workflow…

To run a site workflow, navigate to site contents in SharePoint, and click on the SITE WORKFLOWS link to the right of the “Lists, Libraries and other Apps” label. Your newly minted workflow should be listed under the Start a New Workflow link. Click your workflow to run it.


The workflow will be fired off in the background, and you will be redirected back to the workflow status screen. Click to refresh the page and you should see your workflow listed as completed as shown below…


Click on the workflow link in the “My Completed Workflows” section and examine the detailed workflow output. Look to the bottom where the workflow history is stored. Wohoo! There is our article name! It worked!



It was nice to have a post that was more tribulation than trial eh?

By now you should be more familiar with the idea of calling HTTP web services within workflows and parsing dictionary variables for the output. This functionality is really important because it opens up possibilities for SharePoint workflows that simply did not exist in previous versions. For citizen developers, the implication (at least in a SharePoint context) is that understanding how to call a web service and parse the result is a must. Therefore, all that REST/OData stuff you skipped in part 4 will need to be understood to progress through this series.

Speaking of progressing forward, in the next post, we are going to revisit our approval workflow and see if we can use this newfound knowledge to move forward. First up, we need to find out if there is a web service available that can help us look up the process owner for an organisation. To achieve this, we are going to need to learn more about the Fiddler web debugging tool, as well as delve deeper into web services than you ever thought possible. Along the way, SharePoint will continue to put some roadblocks up but fear not, we are turning the corner…

Until then, thanks for reading and I hope these articles have been helpful to you.

Paul Culmsee



Rethinking SharePoint Maturity Part 3: Who moved my cheese?

This entry is part 3 of 5 in the series Maturity

Hi all

Welcome to part 3 in this series about rethinking what SharePoint “maturity” looks like. In the first post, I introduced the work of JR Hackman and his notion of trying to create enabling conditions, rather than attribute cause and effect. Hackman, in his examination of leadership and the performance of teams, listed six conditions that he felt led to better results if they were in place. Those conditions were:

  1. A real team: Interdependence among members, clear boundaries distinguishing members from non-members, and moderate stability of membership over time
  2. A compelling purpose: A purpose that is clear, challenging and consequential. It energises team members and fully engages their talents
  3. Right people: People who have task expertise, are self-organised, and are skilled in working collaboratively with others
  4. Clear norms of conduct: The team understands clearly what behaviours are, and are not, acceptable
  5. A supportive organisational context: The team has the resources it needs, and the reward system provides recognition and positive consequences for excellent team performance
  6. Appropriate coaching: The right sort of coaching for the team, provided at the right time

I then got interested in how applicable these conditions were to SharePoint projects. The first question I asked myself was “I wonder if Hackman’s conditions apply to collaboration itself, as opposed to teams.” To find out, I utilised some really interesting work done by the Wilder Research Group, which produced a book called “Collaboration: What Makes It Work.” This book distilled the wisdom from 281 research studies of collaborative case studies and their success or failure, boiling things down to six focus areas (they ended up with the same number as Hackman). Their six were:

  1. Membership characteristics: (Skills, attributes and opinions of individuals as a collaborative group, as well as culture and capacity of orgs that form collaborative groups)
  2. Purpose: (The reasons for the collaborative effort, the result or vision being sought)
  3. Process and structure: (Management, decision making and operational systems of a collaborative context)
  4. Communication: (The channels used by partners to exchange information, keep each other informed and convey opinions to influence)
  5. Environment: (Geo-location and social context where a collaborative group exists. While they can influence, they cannot control)
  6. Resources: (The financial and human input necessary to develop and sustain a collaborative group)

If you want the fuller detail of Hackman and Wilder, check the first and second posts respectively. But it should be clear from even a cursory look at the above lists that there is a lot of overlap and common themes between these two research efforts, and that we can learn from them in our SharePoint work. I strongly believe that this sort of material constitutes a critical gap in a lot of the material out there on what it takes to have a successful SharePoint deployment, and it offers some excellent ideas for further developing ideas around SharePoint maturity. I started to develop a fairly comprehensive Dialogue Map of both of these research efforts so I could synthesise them into my own set of “conditions” in the way Hackman describes. While I was doing this, I met a fellow via LinkedIn who opened my mind to further possibilities. Everybody, meet Stephen Duffield.

Duffield’s SYLLK model for lessons learnt

I met Steve because we both shared a common interest in organisational knowledge management. In fact, Steve is working on his PhD in this area, focussing on addressing the pitiful record of organisations in utilising lessons learnt practices on projects and embedding them into organisational culture and practices. If you have ever filled out a lessons learnt form, knowing full well that it will disappear into a filing cabinet never to be seen again, Steve shares your frustration. For his PhD, he is tackling two research questions:

  1. What are the significant factors that negatively influence the capture, dissemination and application of lessons learned from completed projects within project-based organisations?
  2. Can a systemic knowledge model positively influence the capture, dissemination and application of project management lessons learned between project teams within the organisation?

Now if you think it was impressive that Wilder researched 281 studies on collaboration, Steve topped them by miles. His PhD literature review covered over 500 papers on the topics of project lessons learned, knowledge management, risk management and the like. 500! Man, that’s crazy – all I can say is I am sure as hell glad he did it and I didn’t have to!

So what was the result of Duffield’s work? In a nutshell, he has developed a model called “Systemic Lessons Learned Knowledge” (SYLLK), which was influenced by the Swiss Cheese model for risk management, originally proposed by Dante Orlandella and James T. Reason.

Why SYLLK is important for SharePoint

Before I explain Duffield’s SYLLK model, it is important that I briefly explain the Swiss Cheese model for risk management that inspired him. The Swiss Cheese model (see the image to the left) is commonly used in aviation and healthcare safety. It is based on the notion that systems have potential for failure in many areas, and these are analogous to a stack of slices of Swiss cheese, where the holes in each slice are opportunities for a process to fail. Each of the slices is a “defensive layer”, and while an error may pass through a hole in one layer, in the next layer the holes are in different places, allowing the problem to be caught before its impact becomes severe.

The key to the Swiss Cheese model is that it assumes no single defence layer is sufficient to mitigate risk. It also implies that if risk mitigation strategies exist yet all of the holes line up, the system is inherently flawed. Why? Because it would allow a problem to progress through all controls and adversely affect the organisation. Its use therefore encourages a more balanced view of how risks are identified and managed.

So think about that for a second… SharePoint projects to this day remain difficult to get right. If you are on your third attempt at SharePoint, then by definition you’ve had previous failed SharePoint projects. The inference when applying the Swiss cheese model is that your delivery approach is inherently flawed and you have not sufficiently learnt from it. In other words, you were – and maybe still are – missing some important slices of cheese from your arsenal. From a SharePoint maturity perspective, we need to know what those missing slices are if we wish to raise the bar.

So the challenge I have for you is this: If you have had a failed or semi-failed SharePoint project or two under your belt, did you or others on your team ever say to yourself “We’ll get it right this time” and then find that the results never met expectations? If you did, then Duffield’s (and my) contention is you might have failed to truly understand the factors that caused the failure.

Back to Duffield…

This is where Duffield’s work gets super interesting. He realised that the original Swiss cheese “slices”, which revolved around safety, were inappropriate for a typical organisation managing its projects. Like the Wilder work on collaboration, Steve reviewed tons of literature and synthesised from it what he thinks are the key slices of cheese required not only to mitigate project risks, but also to focus people on the critical areas that need to be examined to capture the full gamut of lessons learnt on projects.

So how many slices of cheese do you think Steve came up with? If you read the previous two posts then you can already guess at the answer. Six!

There really seems to be something special about the number six! We have Hackman coming up with 6 conditions for high-performing teams, Wilder’s 6 factors that make a difference to successful collaboration, and Duffield’s 6 areas that are critical to organisational learning from projects! For the record, here are Duffield’s six areas (the first three are labelled as people factors and the second three as system factors):

  1. Learning: Whether individuals on the team are skilled, have the right skills for their role, and whether they are kept up-skilled
  2. Culture: What participants do, what role they fulfil, and how an atmosphere of trust is developed in which people are encouraged, even rewarded, for truth telling – but in which they are also clear about where the line must be drawn between acceptable and unacceptable behaviour
  3. Social: How people relate to each other, their interdependence and how they operate as a team
  4. Technology: Ensuring that technology and data support outcomes and do not get in the way
  5. Process: Ensuring that appropriate protocols drive people’s behaviour and inform what they do (gates, checklists, etc.)
  6. Infrastructure: The environment (in terms of structure and facilities) that enables project outcomes

Duffield has a diagram that illustrates the SYLLK model, showing how his six identified organisational elements of learning, culture, social, technology, process and infrastructure align as Swiss cheese slices. I have pasted it (with permission), below (click to enlarge).

Duffield states that the SYLLK model represents “the various organisational systems that collectively form the overall behaviour of the organisation. The various modes of social and cultural learning, along with the organisational processes, infrastructure and technology that support them.” Notice in the above diagram how the holes in each slice are not lined up as the project arrow moves right to left. This makes sense, because the whole point of the model is the idea of “defence in depth.” The holes are aligned, however, when moving from left to right. This is because each slice of cheese needs to be aligned to enable the feedback loop – the effective dissemination and application of the identified lessons.


The notion of the Swiss cheese model for mitigating risk makes a heck of a lot of sense for SharePoint projects, given that

  • a) there is a myriad of technical and non-technical factors that have to be aligned for sustained SharePoint success, and
  • b) SharePoint success remains persistently elusive for many organisations.

What Duffield has done with the SYLLK model is to take the Swiss Cheese model out of the cloistered confines of safety management and into organisational learning through projects. This is huge in my opinion, and creates a platform for lots of innovative approaches around the capture and use of organisational learning, all the while framing it around the key project management task of identifying and mitigating risk. From a SharePoint maturity perspective, it gives us a very powerful approach to see various aspects of SharePoint project delivery in a whole new light, giving focus to aspects that are often not given due consideration.

Like the Wilder model, I love the fact that Duffield has done such a systematic and rigorous review of literature and I also love the fact that his area of research is quite distinct from Hackman (conditions that enable team efficacy) and the Wilder team (factors influencing successful collaboration). When you think about it, each of the three research efforts focuses on distinct areas of the life-cycle of a project. Hackman looks at the enabling conditions required before you commence a project and what needs to be maintained. Wilder appears to focus more on what is happening during a project, by examining what successful collaboration looks like. Duffield then looks at the result of a project in terms of the lessons learnt and how this can shape future projects (which brings us back to Hackman’s enabling conditions).

While all that is interesting and valuable, the honest truth is that I liked the fact that all three of these efforts ended up with six “things”. It seemed preordained for me to “munge” them together to see what they collectively tell us about SharePoint maturity.

… and that’s precisely what I did. In the next post we will examine the results.


Thanks for reading


Paul Culmsee


A tribute to the humble “leave form” – Part 5


[Note: It appears that SharePoint magazine has bitten the dust and with it went my old series on the “tribute to the humble leave form”. I am still getting requests to a) finish it and b) republish it. So I am reposting it to here on cleverworkarounds. If you have not seen this before, bear in mind it was first published in 2008.]

Welcome again, students, to part 5 of the CleverWorkArounds Institute Body of Knowledge (CIBOK). As you know, the highly prestigious industry certification, the CleverWorkarounds Certified Leave Form Developer (CCLFD), requires candidates to demonstrate proficiency in all aspects of vacation/leave forms. Parts 1-4 of this series covered the introductory material, and now we move into the more advanced topic areas.

Wondering what the hell I am talking about? Perhaps you’d better read part 4.

This post represents a change of tack. The first four articles were written from the point of view of demonstrating InfoPath in a pre-sales capacity. Your “business development managers”, which is a politically correct term for “steak-knife salesmen”, have promised the client that InfoPath is so good it can make your coffee too. Essentially, anything to make the sale and earn their commission. Of course, they don’t have to stick around to actually implement it. They have moved on to their next victim, and you are left to satisfy the lofty expectations.

Now, we switch into “implementation engineer” mode where a proof of concept trial has been agreed to. At this point we are dealing with three client stakeholders. Monty Burns (project sponsor), Waylon Smithers (process owner) and Homer Simpson (user reference group).

To remind you of parts one to four: we introduced the leave form requirements and demonstrated how quick and easy it is to create a web-based InfoPath form and publish it to SharePoint with no programming whatsoever. Was it a realistic demo? Of course not! But we have sold the client the dream, and now we have to turn that dream into reality!

Here are the original requirements.

  • Automatic identification of requestor
  • Reduce data entry
  • Validation of dates, specifically the automatic calculation of days absent (excluding weekends)
  • Mr Burns is not the only approver, we will need the workflow to route the leave approval to the right supervisor
  • We have a central leave calendar to update

So in this article, we will deal with the first requirement.
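As an aside, the third requirement (automatic calculation of days absent, excluding weekends) boils down to a simple date calculation. Here is a hedged sketch of that logic in Python, purely to illustrate the rule; it ignores public holidays (which the requirement does not mention either), and in InfoPath itself this would be expressed with form rules and formulas rather than code:

```python
from datetime import date, timedelta

def days_absent(start, end):
    """Count the days from start to end inclusive, skipping
    Saturdays and Sundays (weekday() returns 5 and 6 for those)."""
    days = 0
    current = start
    while current <= end:
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            days += 1
        current += timedelta(days=1)
    return days

print(days_absent(date(2008, 6, 2), date(2008, 6, 6)))  # 5 (Mon to Fri)
```

A full working week therefore counts as five days, and a range that spans a weekend simply skips the Saturday and Sunday.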

Automatic Identification of Homer

Now, some of the staff of the Springfield Nuclear Power Plant aren’t the most computer literate. Technical support staff report that one employee in particular, Homer Simpson, is particularly bad. In between sleeping on the job and chronic body odour, he has been known to hunt in vain for the “any” key on the keyboard. So he was chosen as the user acceptance tester, to ensure that the entire process is completely idiot-proof.

The way we will do this is to automatically fill in the full name of the current user, so Homer will not have to type his name in. It will just automagically appear as shown below.


So our process owner, Waylon Smithers, is excited and expectantly asks you, “This is easy to do, right?”

“Sure”, you answer confidently. “It’s built into InfoPath and I do it all the time”.

You then proceed into the properties of the above “Employee Name” textbox and locate the “Default Value” textbox. You then click the magical “fx” button that lets you pick from a bunch of built-in functions as shown below.


Now, anybody who has used Excel will be familiar with the notion of using built-in functions to create dynamically calculated values. InfoPath provides similar cleverness. There are formulas for mathematical equations, string manipulation, date and time functions, as well as a bunch of others.

While I won’t be discussing every built-in function in this article, I encourage you to check out what is available.

Now our intrepid consultant already knows what formula they want to use. The “Insert Function” button is clicked and from the list of functions available (conveniently categorised by type), we choose the function userName as shown below.



So we have now set the default value for the “Employee Name” field to be whatever data the userName() function decides to give us. So let’s see what our favourite employee Homer Simpson now sees.


“Voila”, you think to yourself. “Homer, please test it.”


Homer: “Herrrrrr Simmmmmpson. Mmmmm. Who is hersimpson”?

Consultant: “Who”?

Homer: “Hersimpson. Is that… Lenny?”

Consultant: “…”

Homer: “Oh, wait! I know! I know! It’s that new sprinkled donut … mmmm donuts….”

Making it Homer Proof

So, it seems we have a problem at the user acceptance testing phase. Apart from drifting off into a donut-induced daydream, it is clear that showing the username is not a good idea, as Homer has not realised that his userid (hsimpson) represents himself. Clearly it would be more “Homer friendly” if his full name was displayed instead. At this point, we hit our first InfoPath challenge. There is no built-in function called “fullName()”, so how do we get the full name?

Hmm, this is a little tougher than everything we have done so far. Perhaps we should get in some additional help from Professor Frink.

“Well it’s quite simple, really. We need to create a secondary data source to the SharePoint web service UserProfileService.asmx and call the GetUserProfileByName method, passing it a blank string parameter called AccountName, and then interrogate name/value pairs in the response to grab the first and last name and concatenate them into a full name. Mm-hai.”

Does anybody want the non-nerd (English) version of the above sentence? :-)

Given that lots of interesting “stuff” lies buried in various applications, databases, files and other “systems”, the designers of InfoPath were well aware that they had to make InfoPath capable of accessing data in these systems. In doing so, electronic forms can reduce duplication and data entry, and leverage already-entered (and hopefully, sanitised and verified) data.

One of the many different methods of accessing “stuff” is via “web services”. The easiest way to think about web services is to think about Google. When you place a search on Google for, say, “teenyboppers”, your browser is making a request to Google’s servers to return “stuff” that relates to your input. In this case, people who actually like Britney Spears’ songs.

Google is in effect providing a service to you and the protocol that drives the world wide web (HTTP) is the transport mechanism that both parties rely on.

So, a real-life web service is really just a more sophisticated version of this basic idea. One program can chat to another program by “talking” to its web services over HTTP. In this way, two systems can be on opposite sides of the world, yet be able to communicate with each other and provide each other with data and, well … services!

SharePoint is no exception, and happens to have a bunch of web services that allow programs to “talk” to it in many different ways. I am not going to list them all here, but it just so happens that one of those web services gives us just what we need – the full details of the currently logged-in user. So we are going to get InfoPath to “call” this particular web service and return to us the data we need to make Homer happy.
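For the terminally curious, when InfoPath “calls” a web service it is really just posting a small XML (SOAP) message over HTTP. The Python sketch below is purely illustrative – you never have to write this yourself, and the namespace string is my assumption rather than something the InfoPath wizard shows you:

```python
# Illustrative sketch of the SOAP message InfoPath posts to
# UserProfileService.asmx. The xmlns value is an assumption based on
# typical SharePoint service namespaces.

def build_soap_envelope(account_name: str = "") -> str:
    """Build a SOAP request for GetUserProfileByName. A blank AccountName
    tells the service to use the currently logged-in user."""
    return f"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUserProfileByName xmlns="http://microsoft.com/webservices/SharePointPortalServer/UserProfileService">
      <AccountName>{account_name}</AccountName>
    </GetUserProfileByName>
  </soap:Body>
</soap:Envelope>"""

envelope = build_soap_envelope()  # blank: the service resolves the caller
print("GetUserProfileByName" in envelope)
```

The blank AccountName is the trick the whole article hinges on – leave it empty, and the server works out who is asking.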

Okay in theory but…

Now, hopefully the basic idea of web services makes sense. Actually making the leap to using them does take some learning. Since this is all over HTTP, a web service is simply a website URL.

Real programmers – don’t start getting all anal with me about definitions here. This explanation is for people who speak human!

The URL of the SharePoint web service that we want is this:

http://<address of sharepoint site>/_vti_bin/UserProfileService.asmx



If you point your browser to this URL, you will get a response back, listing all of the various “methods” that this web service provides (click to enlarge). The screenshot below is not exhaustive, but the point is that one web service can actually provide many functions (generally known as methods) to perform all sorts of tasks.


Now, InfoPath likes to make things nice and wizard-based, and all of the above methods operate differently. Some take parameters (e.g. to create a new user account via a web service you would have to supply details like name, password and the like). Some do not need parameters but return multiple values, while some return a single value. Others still return different values based on parameters that you send to the method.

How can InfoPath possibly know in advance what it needs to send to/receive from our selected method? Fortunately, the geeks thought of this and invented an extremely boring language called Web Services Description Language (or WSDL for short). All you need to know about WSDL is that, apart from being a fantastic cure for insomnia if you ever read it, it provides a way for an application like InfoPath to find out what it needs to do to interact with a particular method.

So, using the above web service again, we will actually ask the web service to tell InfoPath all about itself using WSDL, by appending ?WSDL to the URL: http://<address of sharepoint site>/_vti_bin/UserProfileService.asmx?WSDL


Now, if you try that in a browser you see all sorts of XML crap :-)


But don’t worry, you will never have to actually look at it again – InfoPath digs it – and that’s all you need to know.

InfoPath Data Sources and Data Connections

Now, it is time to actually get InfoPath to talk to this “UserProfileService” web service as described above.

As I said, web services are one of several sources of data that InfoPath can access. First up, we need to make a connection to a data source (imaginatively called a data connection).

From the tools menu, choose “Data Connections” and a dialog box asks you for some information about the sort of data connection that we want to use. We wish to receive Homer’s details from the web service, so we choose to receive data and choose “Web Service” as the source of our data.


We are next asked for the URL of the web service. We pass it the URL of http://radioactivenet/_vti_bin/UserProfileService.asmx?WSDL


Note: This really should be the exact sub-site where the InfoPath form was published to. So, if say, this form was to be published into a sub site called HR, then the URL would be http://radioactivenet/HR/_vti_bin/UserProfileService.asmx?WSDL.

Click Next and you are presented with the various methods that this web service provides. The method that we are going to use is “GetUserProfileByName”. We choose it and click “Next”.


At this point, we really start diving deep into talking to web services. InfoPath examines the details of this method by looking at the WSDL information. It determines that this particular method (GetUserProfileByName) expects a parameter to be passed to it called “AccountName”. We are prompted to supply a value for “AccountName”. Fortunately for us, if we do not supply a parameter here, then the web service will use the account name of the currently logged in user! This is exactly what we want. This method will automatically use the username of the current person without us having to manually specify it.

Thus, we can leave this screen as is and click NEXT. The final message asks if we wish to connect to the web service now to retrieve data, but in this case we do not.


We have now finished configuration. We are asked to save this connection and give it a name. (As you can see below, the name defaults to the method name, so we will leave it unchanged. We also tick the box to connect to the data source as soon as the form is opened.)


Using the data connection

We have an InfoPath data connection to this web service. Now what?

Well, let’s go back to our employee name text box and change the default value to something returned by the web service method GetUserProfileByName. As we did earlier, we click the function button, but this time we are not going to add a function. We will instead use the “Insert Field or Group” button.


By default, InfoPath has a “main” data source, which is the form itself. We now have to tell InfoPath that the default value for the employee name is going to come from a different data source. From the data source drop-down, choose the secondary data source called “GetUserProfileByName” as shown below.


Notice that the fields available to choose from for the GetUserProfileByName data source look very different to the main data source. This is because the web service returns data in a particular format, and it is now up to us to interrogate that format to get the information we want.

The GetUserProfileByName method returns a whole bunch of user profile values, way more than the stuff we specifically want. It returns these details in a name/value format as shown below:

Name        Value
FirstName   Homer
LastName    Simpson
Office      Sector 7G
Title       Safety Inspector

… and another 42 more name/value pairs like the above!

So we will have to tell InfoPath to examine the full list of “Values”, and find the specific value for “FirstName”.

Note that FirstName and LastName are separate items. There is no property called “FullName”. We will deal with this a little later.

Now although telling InfoPath to filter the user profile information is achieved using a wizard, it is not the most intuitive process. Once you have done it a few times, it becomes second nature. But be warned – you may be about to suffer death by screenshot…

Recall in the last screenshot we were looking at the data returned by the web service method called GetUserProfileByName. We need to filter the 46+ possible profile values to the specific one we need. Now you know what that “Filter Data…” button does ;-)

From the list of fields returned by this method, choose “Value” and click “Filter Data”.


We have now told InfoPath that we want one of the “Value” fields, but we need to filter it to the specific value that we need. That value is “FirstName”. So in effect we are saying to InfoPath: “give me the Value for the property whose Name is FirstName”. The sequence of screenshots below shows how I tell InfoPath to use the Name field as the filter field.
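If the wizard feels a bit opaque, the filter we are building amounts to nothing more than this (a Python sketch over hypothetical name/value pairs, just to show the idea):

```python
# Hypothetical name/value pairs, mimicking what GetUserProfileByName returns.
profile = [
    {"Name": "FirstName", "Value": "Homer"},
    {"Name": "LastName",  "Value": "Simpson"},
    {"Name": "Office",    "Value": "Sector 7G"},
    {"Name": "Title",     "Value": "Safety Inspector"},
]

def get_profile_value(pairs, name):
    """Return the Value of the pair whose Name matches - the same logic
    the InfoPath filter wizard builds for us."""
    return next(p["Value"] for p in pairs if p["Name"] == name)

print(get_profile_value(profile, "FirstName"))  # Homer
```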


Now we have to set Name to equal the property name “FirstName”. The next two screenshots show this.



Click OK and you now have your formula as shown below.


Click a zillion other “OK” buttons and you will be back at your form in design mode. Click the “Preview” button and check the Employee Name field. Wohoo! It says “Homer”!


But I said “full name” not “first name”…

So, that’s great. Although connecting to a web service and telling it to retrieve the correct data is tricky and requires some training, no custom programming code was written to do this. But alas, poor old Homer still gets a little confused, and we wish to see the full name of the person filling in the form.

Fortunately, this is actually quite easy. We just need to grab the FirstName and the LastName values from the web service and then join them together.

I demonstrate this below (with an adjusted formula – for the sake of article length I’ve not added the screenshots used to create this formula. I am hoping that I gave you enough to figure it out for yourself).

Application developers or Excel people will quickly see that I have retrieved the “LastName” value using the exact same method described in the last section for the “FirstName” value. I then used the built-in function concat (concatenate) to join FirstName and LastName together (with a space in between).
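In spirit, the final formula boils down to something like this (a Python stand-in for InfoPath’s concat, using hypothetical values):

```python
# A stand-in for the InfoPath formula: concat(FirstName, " ", LastName).
# The values are hypothetical; in the real form they come from the
# GetUserProfileByName data connection.
first_name = "Homer"
last_name = "Simpson"

full_name = " ".join([first_name, last_name])  # join with a space between
print(full_name)  # Homer Simpson
```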


Let’s now preview the form… Wohoo!


So, just to be completely sure, we republish the form, following the steps in part 4 and examine the form in the browser to make sure that it works there as well.


Conclusion (and further notes)

Amazed at your own ingenuity, you have managed to get InfoPath to retrieve the details it required so that the user did not have to fill in their name manually. There was certainly a lot more to it than using a built-in InfoPath function, and (for the first time anyway) it probably took you a little while. But the main thing is, you have satisfied the first requirement and Waylon Smithers is now happy. He is a little bewildered by all the low-level web service stuff, and concerned about how easy it would be for some of his staff to create forms, but regardless, everything was performed via the InfoPath graphical user interface.

So in the next post, we will look at two ways we can deal with the automatic retrieval of the employee number.

Thanks for reading

Paul Culmsee

P.S: Do not attempt to explain the low level details of pulling values from a web service in a presales demo :-) .


Demystifying SharePoint Performance Management Part 6 – The unholy trinity of Latency, IOPS and MBPS

This entry is part 6 of 11 in the series Perf

Hi all

Welcome to part 6 of my series on making SharePoint performance management that little bit more digestible. To recap where we have been: I introduced the series by comparing lead versus lag indicators, before launching into an examination of Requests Per Second (RPS) as a performance indicator. I spent three posts on RPS, and then in the last post we turned our attention to the notion of latency. We watched a Wiggles video and then looked at all of the interacting components that work together just to load a SharePoint home page. I spent some time explaining that some forms of latency cannot be reduced because of the laws of physics, but other forms of latency are man-made. This is when any one of the interacting components is sub-optimally configured and therefore introduces unnecessary latency into the picture. I then asserted that disk latency is one of the most common areas ripe for sub-optimal configuration. I finished that post by looking at how a rotational disk works, and the strategies employed to mitigate latency (cache, RAID, SANs, etc.)

Now, on the note of cache, RAID and SANs: Robert Bogue, who I mentioned in part 1, has also just published an article on this topic area called Computer Hard Disk Performance – From the Ground Up. You should consider Robert’s article part 5.5 of this series of posts, because it expands on what I introduced in the last post and also spans a couple of the things I want to talk about in this one (and goes beyond it too). It is an excellent coverage of many aspects of disk latency and I highly recommend you check it out.

Right! In this post, we will look more closely at latency and understand its relationship with two other commonly cited disk performance measures: IOPS and MBPS. To do so, let’s go shopping!

Why groceries help to explain disk performance


Most people dislike having to wait in line for a check-out at a supermarket, and supermarkets know this. So they always try to balance the number of open check-out counters so that they can scale when things are busy, but not pay operators to stand around when it’s quiet. Accordingly, it is common to walk into a store when it’s quiet and find only one or two check-out counters open, even if the supermarket has a dozen or more of them.

The trend in Australian supermarkets nowadays is to have some modified check-out counters that are labelled as “express”. You can only use these check-outs if you are buying 15 items or fewer. While the notion of express check-outs has been around forever, the more recent trend is to modify the design of express check-out counters to have very limited counter space and no moving roller that pushes your goods toward the operator. This discourages people with a fully loaded trolley/cart from using the express lane, because there is simply not enough room to unload the goods, have them scanned and put them back in the trolley. Therefore, many more shoppers can go through express counters than regular counters, because they all have smaller loads.

This in turn frees up the “regular” check-out counters for shoppers with a large amount of goods. Not only do they have a nice long conveyor belt with plenty of room for shoppers to unload all of their goods onto, which rolls them toward the operator, but often there will be another operator who puts the goods into bags for you as well. Essentially, this counter is optimised for people who have a lot of goods.

Now if you were to measure the “performance” of express lanes versus regular lanes, I bet you would see two trends.

  • Express lanes would have more shoppers go through them per hour, but fewer goods overall
  • Regular lanes would have more goods go through them per hour, but fewer shoppers overall

With that in mind, let’s now delve back into the world of disk IO and see if the trend holds true there as well.

Disk latency and IOPS

In the last post, I specifically focused on disk latency by pointing out that most of the latency in a rotational hard drive comes from rotation time and seek time. Rotation time is the time taken for the drive to rotate the disk platter to the data being requested, and seek time is how long it takes for the hard drive’s read/write head to then be positioned over that data. Depending on how far the platter and head have to move, latency can vary. Closely related to disk latency is the notion of IO operations per second, or “IOPS”. IOPS refers to the maximum number of reads and writes that can be performed on a disk in any given second. In terms of our supermarket metaphor, IOPS is equivalent to the number of shoppers that go through a check-out.

The math behind IOPS and how latency affects it is relatively straightforward. Let’s assume a fixed latency for each IO operation for a moment. If for example, your disk has a large latency… say 25 milliseconds between each IO operation, then you would roughly have 40 IOPS. This is because 1 second = 1000 milliseconds. Divide 1000 by 25 and you get 40. Conversely, if you have 5 milliseconds latency, you would get 200 IOPS (1000 / 5 = 200).
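If you want to check those numbers yourself, the relationship is trivial to sketch (Python, purely illustrative):

```python
def iops_from_latency(latency_ms: float) -> float:
    """Theoretical IOPS ceiling given a fixed per-operation latency (ms)."""
    return 1000.0 / latency_ms  # 1 second = 1000 milliseconds

print(iops_from_latency(25))  # 40.0
print(iops_from_latency(5))   # 200.0
```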

Now if you want to see a more detailed examination of IOPS/latency and the maths behind it, take a look at an excellent post by Ian Atkin. Below I have listed the disk latency and IOPS figures he posted for different speed disks. Note that a 15k RPM disk came in at around 175–210 IOPS, which suggests a typical average latency of between 4.7 and 5.7 milliseconds (1000/175 = 5.7 and 1000/210 = 4.7). Note: Ian’s article explains in depth the maths behind the average calculation in this section of his post.


The big trolley theory of IOPS…

While that math is convenient, the real world is always different to the theoretical picture I painted above. In the world of shopping, imagine if someone with one or two trolleys full of goods, like the picture below, decided to use the express check-out. It would mean that all of the other shoppers have to get annoyed and wait around for this shopper’s goods to be scanned, bagged and put back into the trolley. The net result is a reduced number of shoppers going through the check-out, too.


While the inefficiencies of a supermarket are something that is easy for most people to visualise, disk infrastructure is less so. So while the size of our trolley has an impact on how many people come through a check-out, in the disk world, the size of the IO request has precisely the same effect. To demonstrate, I ran a basic test using a utility called SQLIO (which I will properly introduce you to in part 7) on one of my virtual machines, writing data randomly to a 500GB disk. In the first test we wrote to the disk using 64KB writes, and in the second test we used 4KB writes. The results are below:

Size of Write   IOPS Result
64KB            279
4KB             572

Clearly, writing 4KB of data over time resulted in a much higher IOPS than when using 64KB of data. But just because there is a higher IOPS for the 4KB write, do you think that is better performance?

Disk latency and MBPS

So far the discussion has been very IOPS focussed. It is now time to rectify this. In terms of the SQLIO test I performed above, there was one other performance result I omitted to show you – the Megabytes per second (MBPS) of each test. I will now add it to the table below:

Size of Write   IOPS Result   MBPS Result
64KB            279           17.5
4KB             572           2.25

Interesting, eh? This additional performance metric paints a completely different picture. In terms of actual data transferred, the 4KB option managed only 2.25 megabytes per second, whereas the 64KB option transferred almost 8 times that amount! Thus, if you were judging performance based on how much data has been transferred, then the 4KB option has been an epic fail. Imagine the response of 500 SharePoint users loading the latest 30 megabyte annual report from a document library if SharePoint used 4KB reads … ouch!
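The relationship between the two measures is simply that throughput equals IOPS multiplied by the IO size. A quick sanity check (Python sketch; the small differences from the measured figures are just rounding in the SQLIO output) shows the numbers line up:

```python
def mbps(iops: float, io_size_kb: float) -> float:
    """Throughput (MB/s) is just IOPS multiplied by the IO size."""
    return iops * io_size_kb / 1024.0  # 1024 KB per MB

print(round(mbps(279, 64), 1))  # close to the measured 17.5 MB/s
print(round(mbps(572, 4), 2))   # close to the measured 2.25 MB/s
```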

So the obvious question is why did a high IOPS equate to a low MBPS?

The answer is latency again (yup – it always comes back to latency). From the time the disk is given the request to the time it completes, writing 4KB simply doesn’t take as long as writing 64KB does. Therefore, more IO operations take place with smaller writes. Add to that the latency from disk rotation and seek time per IO operation, and you start to see why there is such a difference. Eric Slack at Storage Switzerland explains with this simple example:

As an illustration, let’s look at two ways a storage system can handle 7.5GB of data. The first is an application that requires reading ten 750MB files, which may take 100 seconds, meaning the transfer rate is 75MB/s and consumes 10 IOPS. The second application requires reading ten thousand 750KB byte files, the same amount of data, but consumes 10,000 IOPS. Given the fact that a typical disk drive provides less than 200 IOPS, the reads from the second application probably won’t get done in the same 100 seconds that the first application did. This is an example of how different ‘workloads’ can require significantly different performance, while using the same capacity of storage.

Now at this point if I haven’t completely lost you, it should become clear that each of the unholy trinity of latency, IOPS and MBPS should not be judged alone. For example, reporting on IOPS without having some idea of the nature of the IO could seriously mislead. To show you just how much, consider the next example…

Sequential vs. Random IO

Now while we are talking about the IO characteristics of applications, two really important points that I have neglected to mention so far are the range of latency values and the impact of sequential IO.

The latency math I did above was deliberately simplified. Seek and rotation times actually fall across a range of values, because sometimes the disk does not have to rotate the platter or move the head far. The result is a much reduced seek latency and, accordingly, increased IOPS and MBPS. Nevertheless, the IO is still considered random.

Taking that one step further, often we are dealing with large sections of contiguous space on the hard disk. Latency is therefore reduced further, because there is virtually no seek time involved. This is known as sequential access. Just to show you how much of a difference sequential access makes, I re-ran the two tests above, but this time writing to sequential areas of the disk rather than random ones. With the reduced seek and rotation time, the difference in IOPS and MBPS is significant.

Size of Write   IOPS Result   MBPS Result
64KB            2095          131
4KB             4152          16

The IOPS and subsequent MBPS have improved significantly from the previous test, to the tune of a 750% improvement. Nevertheless, the relationship between the size of the request and IOPS and MBPS still holds true. The smaller the IO request being read or written, the more IOPS can be sustained, but the less MBPS throughput can be achieved. The reverse holds true with larger IO requests.

One conclusion that we can draw from this is that specifying IOPS or MBPS alone has the potential to really distort reality if one does not understand the nature of the IO request in terms of its characteristics. For example: let’s say that you are told your disk infrastructure has to support 5000 IOPS. If you assumed a 4KB IO size accessed sequentially, then far fewer disks would be required to achieve the result compared to a 64KB IO accessed randomly. In the 64KB case, you would need many disks in an array configuration.
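As a back-of-the-envelope sketch (Python; the per-disk figures below are hypothetical round numbers in line with the 15k RPM discussion earlier, so treat the output as indicative only):

```python
import math

def disks_needed(target_iops: float, per_disk_iops: float) -> int:
    """Naive disk count to hit an IOPS target. Ignores RAID write penalties
    and controller/cache effects, so treat the result as a lower bound."""
    return math.ceil(target_iops / per_disk_iops)

# Hypothetical 5000 IOPS target:
print(disks_needed(5000, 180))   # ~28 disks at ~180 random IOPS per disk
print(disks_needed(5000, 2000))  # 3 disks if the IO is small and sequential
```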

SQL IO Characteristics

So now we get to the million dollar question. What sort of IO characteristics do SQL Server and SharePoint have?

I will answer this by again quoting from Ian Atkin’s brilliant “Getting the Hang of IOPS” article. Ian makes a really important point that is relevant to SQL and SharePoint in his article which I quote below:

The problem with databases is that database I/O is unlikely to be sequential in nature. One query could ask for some data at the top of a table, and the next query could request data from 100,000 rows down. In fact, consecutive queries might even be for different databases. If we were to look at the disk level whilst such queries are in action, what we’d see is the head zipping back and forth like mad – apparently moving at random as it tries to read and write data in response to the incoming I/O requests.

In the database scenario, the time it takes for each small I/O request to be serviced is dominated by the time it takes the disk heads to travel to the target location and pick up the data. That is to say, the disk’s response time will now dominate our performance.

Okay, so we know that SQL IO is likely to be random in nature. But what about the typical IO size?

Part of the answer to this question can be found in an appropriately titled article called Understanding Pages and Extents. It is appropriate because, as far as SQL Server database files and indexes are concerned, the fundamental unit of data storage in SQL Server is an 8KB page. The important point for our discussion is that many disk I/O read and write operations are performed at the page level. Thus, one might assume that 8KB should be the size used in IOPS calculations, because it is possible for SQL Server to write 8KB to disk at a time.

Unfortunately though, this is not quite correct, for a number of reasons. Firstly, eight contiguous 8KB pages are grouped into something called an extent. Given that an extent is a set of 8 pages, the size of an extent is 64KB. SQL Server generally allocates space in a database on a per-extent basis and performs many reads across extents (64KB). Secondly, SQL Server also has a read-ahead algorithm, meaning SQL Server will try to proactively retrieve data pages that are going to be used in the immediate future. A read-ahead is typically from 1 to 128 pages for most editions, which translates to between 8KB and 1024KB. (For the record, there is a huge amount of conflicting information online about SQL Server IO characteristics. Bob Dorr’s highly regarded SQL Server 2000 I/O Basics article is the place to go for more gory detail if you find this stuff interesting.)
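The arithmetic behind those page, extent and read-ahead figures is easy to verify:

```python
PAGE_KB = 8           # SQL Server's fundamental unit of storage
PAGES_PER_EXTENT = 8  # eight contiguous pages make up one extent

extent_kb = PAGE_KB * PAGES_PER_EXTENT
print(extent_kb)  # 64 (KB per extent)

# Read-ahead fetches anywhere from 1 to 128 pages in one go:
print(1 * PAGE_KB, "to", 128 * PAGE_KB, "KB per read-ahead")
```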

A read-ahead interlude…

Before we get into SharePoint disk characteristics, it is worthwhile mentioning a great article by Linchi Shea called Performance Impact: Some Data Points on Read-Ahead. Linchi did an experiment by disabling read-ahead behaviour in SQL Server and measured the performance of a query on 2 million rows. With read-ahead enabled, it took 80 seconds to complete. Without read-ahead, it took 210 seconds. The key difference was the size of the IO requests. Without read-ahead, the reads were all 8KB, as per the page size. With read-ahead, it was over 350KB per read. Linchi makes this conclusion:

Clearly, with read-ahead, SQL Server was able to take advantage of large sized I/Os (e.g. ~350KB per read). Large-sized I/Os are generally much more efficient than smaller-sized I/Os, especially when you actually need all the data read from the storage as was the case with the test query. From the table above, it’s evident that the read throughput was significantly higher when read-ahead was enabled than it was when read-ahead was disabled. In other words, without read-ahead, SQL Server was not pushing the storage I/O subsystem hard enough, contributing to a significantly longer query elapsed time.

So for our purposes, let’s accept that there will be a range of IO sizes for reads/writes to databases, between 8KB and 1024KB. For disk IO performance testing purposes, let’s assume that much of this falls on the extent boundary of 64KB. Based on our discussion of latency and MBPS – where the larger the IO being worked with, the lower the IOPS – we can now get a better sense of just how much disk might need to be put into an array to achieve a particular IOPS target. As we saw with the examples earlier in this post, 64KB IO sizes result in more latency and lower IOPS. Therefore, SharePoint components requiring a lot of IOPS may need some pretty serious disk infrastructure.

SharePoint IO Characteristics

This brings us to our final point for this post. We need to understand which SharePoint components are IO intensive. The best place to start is page 29 of Microsoft’s capacity planning guide, as it supplies a table listing the general performance requirements of SharePoint components. A similar table exists on page 217 of the Planning guide for server farms and environments for Microsoft SharePoint Server 2010. We will finish this post with a modified table that shows all the SharePoint components listed with medium to high IOPS requirements from the capacity planning guide, along with some of the comments from the server farm planning guide. This gives us some direction as to the SharePoint components that should be given particular focus in any sort of planning. Unfortunately, IOPS requirements are inconsistently written about in both documents.

Service Application: SharePoint Foundation Service
Description: The core SharePoint service for content collaboration.
IOPS notes: Almost all of the IOPS occur in SharePoint content databases. IOPS requirements for content databases vary significantly based on how your environment is being used, how much disk space you have and how many servers you have. Microsoft recommends that you compare the predicted workload in your environment to one of the solutions that they have tested. I will be covering this in part 8.

Service Application: Logging Service
Description: The service that records usage and health indicators for monitoring purposes.
IOPS notes: The Usage database can grow very quickly and require significant IOPS. Use one of the following formulas to estimate the amount of IOPS required:
  • 115 × page hits/second
  • 5 × HTTP requests

Service Application: SharePoint Search Service
Description: The shared service application that provides indexing and querying capabilities. There is a dedicated document that, among other things, covers IOPS requirements.
IOPS notes:
  • For the Crawl database, search requires from 3,500 to 7,000 IOPS.
  • For the Property database, search requires 2,000 IOPS.

Service Application: User Profile Service
Description: The service that powers the social scenarios in SharePoint Server 2010 and enables My Sites, tagging, notes, profile synchronisation with directories and other social capabilities.
IOPS notes: No mention of IOPS is made in either planning guide.

Service Application: Web Analytics Service
Description: The service that aggregates and stores statistics on the usage characteristics of the farm.
IOPS notes: The planning guide suggests readers consult a dedicated planning guide for web analytics, but unfortunately no mention of IOPS is made there, let alone a recommendation.

Service Application: Project Server Service
Description: The service that enables all the Microsoft Project Server 2010 planning and tracking capabilities in addition to SharePoint Server 2010.
IOPS notes: No mention of IOPS is made in either planning guide.

Service Application: PowerPivot Service
Description: The service to display PowerPivot-enabled Excel worksheets directly from the browser.
IOPS notes: No mention of IOPS is made in either planning guide.

(In the original Microsoft table, XX indicates a medium IOPS cost on the resource and XXX indicates a high IOPS cost.)
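To make the Logging Service formulas concrete, here is a minimal worked example. The traffic figures are hypothetical, purely for illustration; the two functions just encode the rules of thumb quoted from the capacity planning guide.

```python
# Worked example of the two Logging Service rules of thumb for sizing the
# Usage database. The traffic figures are hypothetical.

def usage_db_iops_from_page_hits(page_hits_per_second):
    # Guide formula: 115 x page hits/second
    return 115 * page_hits_per_second

def usage_db_iops_from_http_requests(http_requests_per_second):
    # Guide formula: 5 x HTTP requests
    return 5 * http_requests_per_second

# A farm serving 10 page hits/second, where each page load fans out into
# roughly 30 HTTP requests (300 requests/second overall):
print(usage_db_iops_from_page_hits(10))       # 1150
print(usage_db_iops_from_http_requests(300))  # 1500
```

Note that the two formulas give different answers for the same hypothetical workload, which is itself a hint at how rough these estimates are.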

Conclusion (and coming up next)

Whew! I have to say, that was a fairly big post, but I think we have broken the back of latency, IOPS and MBPS. In the next post, we will put all of this theory to the test by looking at the performance counters that allow us to measure it all, as well as play with a couple of very useful utilities that allow us to simulate different scenarios. Subsequent to that, we will look at these measures from a lead indicator perspective and then examine some of Microsoft’s results from their testing.

Until then, thanks very much for reading. As always, comments are greatly appreciated.

Paul Culmsee


Demystifying SharePoint Performance Management Part 2 – So what is RPS anyway?

This entry is part 2 of 11 in the series Perf

Hi all

I never mentioned it in the first post that the reason I am blogging again is I finally completed most of the game Skyrim. Man – that game is dangerous if you value your time!

Anyway, in the first post, I introduced this series by covering the difference between lead and lag performance indicators. To recap from part 1, a lead indicator is something that can help predict an outcome by measuring an action, whereas a lag indicator measures the result or outcome achieved from taking an action. This distinction is important to understand, because otherwise it is easy to use performance measurements inappropriately or get very confused. Lead indicators in particular sometimes feel wishy washy because it is hard to have a direct correlation to what you are seeing.

In this post, we are going to examine one of the most commonly cited (and abused) lead indicators of performance. Good old Requests Per Second (RPS). Let’s attempt to make things clearer…

Microsoft defines RPS as:

The number of requests received by a farm or server in one second. This is a common measurement of server and farm load. The number of requests processed by a farm is greater than the number of page loads and end-user interactions. This is because each page contains several components, each of which creates one or more requests when the page is loaded. Some requests are lighter than other requests with regard to transaction costs. In our lab tests and case study documents, we remove 401 requests and responses (authentication handshakes) from the requests that were used to calculate RPS because they have insignificant impact on farm resources

So according to this definition, RPS counts the interactions between browsers (or any other device or service making web requests) and the SharePoint web server, excluding authentication traffic. The logic of measuring requests per second is that it provides insight into how much load your SharePoint box can take because, at the end of the day, SharePoint is servicing requests from users.

RPS by example

Before we start picking apart RPS and its issues, let’s look at an example. Assuming you are viewing this page in Internet Explorer version 8 or 9, press F12 right now. You should see something like the screen below. If you have not seen it before, it is called the internet explorer developer tools and is bloody handy. Now click on the “Network” link, highlighted below and then click the “Start capturing” button.


Now refresh this page and watch the result. You should see a bunch of activity logged, looking something like the picture below.


What you are looking at is all of the requests that your browser made to load this very page. While the detail is not overly important for the purpose of this post, the key point is that many requests were made to load this page. In fact, if you look in the bottom-left corner of the above screenshot, a total of 130 individual requests are listed.

So, first pop-quiz for the day: Were all 130 requests made to my cleverworkarounds blog to refresh this page? The answer my friends is no. In actual fact, only 2 items were loaded from my blog!

So why the discrepancy? What happened to the other 128 requests? Two main reasons.

1. Browser cache: First up, many of the items listed above were already cached by my browser. I’ve been to this site before, and so a lot of the page components (CSS style sheets, logos and the like) did not have to be retrieved again. It just happens that the Internet Explorer developer tools show requests that were handled by locally cached data as well as actual requests made to the server. If you look closely at the “Result” column in the above screenshot, you will see that some entries are grey in colour while others are black. All of the grey entries are cached requests; they never left the confines of the browser. This alone accounts for 95 of the 130 requests.

Now this is worth consideration because if a browser has never accessed this site before, there will be no content in the browser cache. Therefore, on first access, the browser would indeed have made 95 additional requests to load the page. This scenario is most likely on day one of a production SharePoint rollout, where a large chunk of the workforce might load the homepage for the first time.

2. Content from other sites: The second reason for the discrepancy is that some content doesn’t even come from the cleverworkarounds site. Anytime you visit a blog and it has a snazzy widget like Amazon books or Facebook “like” buttons, that content is very likely being retrieved from Amazon or Facebook. In the case of this very article you are reading, 33 requests were made to other sites like Facebook, amazon, feedburner, sharepointads and whoever else happens to grace a widget on the right hand side. In these cases, my server is not handling this traffic at all. This accounts for 33 of the 130 requests.

95 + 33 = 128 of the 130 requests made.
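The bookkeeping above can be expressed as a short tally: only requests that are neither cached nor bound for a third-party site actually hit the server. A minimal sketch, using hypothetical request records shaped like the developer-tools capture:

```python
# Tally a captured page load into browser-cached requests, requests to
# third-party sites, and requests that actually hit the blog's own server.
# The records below are hypothetical, shaped like the capture described above.

def classify(requests, own_host):
    cached = sum(1 for r in requests if r["from_cache"])
    external = sum(
        1 for r in requests if not r["from_cache"] and r["host"] != own_host
    )
    own = len(requests) - cached - external  # what the server really serviced
    return cached, external, own

# 130 requests in total: 95 served from cache, 33 to third-party widgets,
# and only 2 real hits on the blog itself.
capture = (
    [{"host": "cleverworkarounds.com", "from_cache": True}] * 95
    + [{"host": "facebook.com", "from_cache": False}] * 33
    + [{"host": "cleverworkarounds.com", "from_cache": False}] * 2
)

print(classify(capture, "cleverworkarounds.com"))  # (95, 33, 2)
```

The takeaway: raw request counts in a browser trace say very little about server load until you separate out the cached and third-party traffic.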

So hopefully now you get what is meant by RPS. Let’s now look at its utility in measuring performance.

Dangers of RPS reliance…

Consider two fairly typical SharePoint transactions: the first example is loading the SharePoint home page and the second is a user loading a document from a SharePoint document library. Below I have compared the two transactions using an Office 365 site of mine and capturing the requests made by each one. (For what it’s worth, I used a utility called Fiddler rather than the developer toolbar because it has some snazzier features.)

In example #1, we have loaded the homepage of an Office 365 site (assuming for the first time). In all, 36 requests were made to the server. If we add up the amount of data returned by the server (summing the “Body” column below), we have a total of 245,322 bytes received.


In example #2, we are looking at the trace of me opening a 7 megabyte document from a document library. Notice that this time, 17 requests were made. But compared to the first example, significantly more data was returned from the server: 7,245,876 bytes in fact. If you drill down further by examining the “Body” column, you will notice that of those 17 requests, 3 accounted for the bulk of the data transferred, with 3,149,348, 3,148,008 and 891,069 bytes respectively.


So here is my point: some requests are more significant than others! In the latter example, 3 of the 17 requests transferred over 98% of the data. The second transaction also took much longer than the first, and the data was retrieved from the SQL Server database, which means this interaction likely placed more load on the SharePoint back-end than the first example. When loading the home page, the data may well have been served from one of the many SharePoint caches, barely touching the back-end SQL box.

Now with that in mind, consider this: the typical rationale you see around the interweb for utilising RPS as a performance tool is to estimate future scalability requirements. Statements like “This SharePoint farm needs to be capable of 125 RPS” are fairly common. Traditionally, the figure was derived from a methodology that looked something like:

  1. Work out the peak times of the day for SharePoint site usage (for example between 10:45am-2:45pm each day)
  2. Estimate the number of concurrent users accessing your SharePoint site during this time
  3. Classify the users via their usage profile (wussy, light, heavy, psycho, etc)
  4. Estimate how many transactions each of these user types might make in the peak hour (a transaction being an operation like browse home page, edit document, and so on)
  5. Multiply concurrent users by the number of expected transactions to derive the total number of transactions for the period
  6. Divide the total by the number of seconds in the period to work out how many transactions per second.
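The six steps above boil down to simple arithmetic. As a minimal sketch, here is the calculation with hypothetical workload figures (the profiles, user counts and transaction rates are all made up for illustration):

```python
# Naive RPS estimation following the six steps above. Every figure here is
# hypothetical; the point is the arithmetic, not the numbers.

# Steps 2-3: concurrent users during the peak window, split by usage profile.
users_by_profile = {"light": 400, "heavy": 100}

# Step 4: expected transactions per user over the whole peak window.
transactions_per_user = {"light": 20, "heavy": 60}

# Step 5: total transactions for the period.
total_transactions = sum(
    users_by_profile[p] * transactions_per_user[p] for p in users_by_profile
)  # 400*20 + 100*60 = 14,000

# Step 6: divide by the seconds in the 4-hour peak window (10:45am-2:45pm).
peak_seconds = 4 * 60 * 60  # 14,400
transactions_per_second = total_transactions / peak_seconds
print(round(transactions_per_second, 2))  # ~0.97
```

Note that the output is transactions per second, not requests per second, which is precisely the first issue with this methodology.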

There are lots of issues with this methodology, but here are 4 obvious ones.

  1. The first is that it confuses transactions with requests. While browsing the SharePoint home page might be considered one “transaction”, it will likely consist of more than one request (particularly if the content being served is designed to be fairly dynamic and not rely on cache data). Essentially this methodology may underestimate the number of requests because it assumes a 1:1 relationship between a transaction and a request. My two examples above demonstrate that this is not the case.
  2. The classification of usage profile of users (light, medium, heavy) is crude and overlooks the aforementioned variation in usage patterns. A “heavy user” might continually update a SharePoint calendar, while a “light” user might load 20 megabyte documents or run sophisticated reports. In both cases, the real load on the infrastructure – and the resulting response time – may be quite varied.
  3. It fails to take into consideration the fact that SharePoint 2010 in particular has many new features in the form of Service Applications. These also make requests behind the scenes that have load implications. The most obvious example is the search crawling SharePoint sites.
  4. It also overlooks the fact that SharePoint content is often accessed indirectly, via non-browser client tools such as SharePoint Workspace, OneNote, the Outlook Social Connector and the like. If Colligo Contributor is deployed to all desktops, does that make all users “heavy”?

So hopefully by now, you can understand the folly of saying to someone “This system should be capable of handling 150 RPS.” There are simply far too many variables that contribute to this, and each request can be wildly different in terms of real load on the back-end servers. Now you know why Robert Bogue likened this issue to Drake’s equation in part 1. An RPS target arrived at via this sort of methodology is likely to be fairly inaccurate and of questionable value.

So what is RPS good for and how do I get it?

So am I anti RPS? Definitely not!

The one thing RPS has going for it, which makes it incredibly useful, is that it is likely to be the one performance metric that any organisation can tap into straight away (assuming you have an existing deployment). This is because every request is recorded in the web server (IIS) logs over time: each request made to the server is logged with a date and timestamp. For most places, this is the only high-fidelity performance data you have access to, because many organisations do not collect and store other stats like CPU and disk IO performance over time. While it’s unlikely you would be able to see CPU for a server 6 months ago on Tuesday at 9:53am, chances are you can work out the RPS at that time if you have an existing intranet or portal. The reason for this is that IIS logs are typically not cleared, so you have the opportunity to go back in time and see how a SharePoint site has been utilised.

The benefit is that we have the means to understand past performance patterns of an organisation’s use of their intranet or portal. We can work out stuff like:

  • peak times of the day for usage of the portal based on previous history
  • the maximum number of requests that the server has ever had to process
  • the rate of increase/decrease of RPS over time (ie “What was peak RPS 6 months ago? What was it 3 months ago?”)
  • the patterns/distribution of requests over a typical day (peaks and troughs – we can see the “shape” of SharePoint usage over a given period)

As an added bonus, the data in web server logs allow for some other fringe benefits including stuff like:

  • the percentage or pattern of requests that were “non-interactive” (such as the % of requests that are search crawls or SharePoint Workspace syncs)
  • identifying usage patterns of certain users (eg the top 10 users and their usage patterns)

Finally, if you monitor CPU and disk performance, you can compare the RPS peaks against those other performance counters and then interpolate how things might have been in the past (although this has some caveats too).
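To give a flavour of how these measures fall out of the raw IIS logs (ahead of the definitive LogParser treatment in the next post), here is a minimal sketch that counts requests per second from W3C-format log lines and finds the peak. The sample log lines are made up for illustration:

```python
from collections import Counter

# Count requests per second from IIS W3C-format log lines, where the first
# two whitespace-separated fields are date and time.
def requests_per_second(log_lines):
    counts = Counter()
    for line in log_lines:
        if line.startswith("#"):        # skip W3C header directives
            continue
        date, time = line.split()[:2]   # e.g. "2012-05-01", "09:53:01"
        counts[f"{date} {time}"] += 1
    return counts

# Hypothetical log excerpt: three requests in one second, one in the next.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2012-05-01 09:53:01 GET /default.aspx 200",
    "2012-05-01 09:53:01 GET /style.css 200",
    "2012-05-01 09:53:01 GET /logo.png 200",
    "2012-05-01 09:53:02 GET /shared%20documents/report.docx 200",
]

rps = requests_per_second(sample)
peak_second, peak_rps = max(rps.items(), key=lambda kv: kv[1])
print(peak_second, peak_rps)  # 2012-05-01 09:53:01 3
```

Run over months of real logs, the same per-second tally gives you the peaks, troughs and growth trends listed above.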

Coming up next…

Okay, so now you are convinced that RPS does not suck – and you want to get your hands on all this RPS goodness. The good news is that it’s fairly easy to do, and Microsoft’s Mike Wise has documented the definitive way to do it. The bad news is that you have to download and learn yet another utility. Fear not though, as the utility (called LogParser) is brilliant and needs to be in your arsenal anyway (especially the business-oriented SharePoint readers of this blog – this is not one just for the techies). Put simply, LogParser provides the ability to run SQL-like queries over your log files. You can have it open a log file (or series of files), process them via a SQL-style language, and then output the results of your query into different formats for reporting.

But, just as I have whetted your appetite, I am going to stop. This post is already getting large and I still have a bit to get through in relation to using LogParser, so I will focus on that in the next post.

Hopefully though, at this point you don’t totally hate RPS, and you have a much better idea of what RPS is and some of the issues with its use.

Thanks for reading

Paul Culmsee


Demystifying SharePoint Performance Management Part 1 – How to stop developers blaming the infrastructure

This entry is part 1 of 11 in the series Perf

Hi all

It seems to me that many SharePoint consultancies think their job is done when recommending a topology based on:

  • Looking up Microsoft’s server recommendations for CPU and RAM and then doubling them for safety
  • Giving the SQL Database Administrators heart palpitations by ‘proactively’ warning them about how big SharePoint databases can get.
  • Recommending putting database files and logs files on different disks with appropriate RAID levels.

Then, satisfied that they have done the required due diligence, they deploy a SharePoint farm chock-full of dodgy code and poor configuration.

Now if you are more serious about SharePoint performance, then chances are you had a crack at reading all 307 pages of Microsoft’s “Planning guide for server farms and environments for Microsoft SharePoint Server 2010.” If you indeed read this document, then it is even more likely that you worked your way through the 367 pages of Microsoft whitepaper goodness known as “Capacity Planning for Microsoft SharePoint Server 2010”. If you really searched around you might have also taken a look through the older but very excellent 23 pages of “Analysing Microsoft SharePoint Products and Technologies Usage” whitepaper.

Now let me state from the outset that these documents are hugely valuable for anybody interested in building a high performing SharePoint farm. They have some terrific stuff buried in there – especially the insights from Microsoft’s performance measurement of their own very large SharePoint topologies. But nevertheless, 697 pages is 697 pages (and you thought that my blog posts are wordy!). It is a lot of material to cover.

Having read and digested them recently, as well as chatting to SharePoint luminary Robert Bogue on all things related to performance, I was inspired to write a couple of blog posts on the topic of SharePoint performance management with the aim of making the entire topic a little more accessible. As such, all manner of SharePoint people should benefit from these posts because performance is a misunderstood area by geek and business user alike.

Here is what I am planning to cover in these posts.

  • Highlight some common misconceptions and traps for younger players in this area
  • Understand the way to think about measuring SharePoint performance
  • Understand the most common performance indicators and easy ways to measure them
  • Outline a lightweight, but rigorous method for estimating SharePoint performance requirements

In this introductory post, we will start proceedings by clearing up one of the biggest misconceptions about measuring SharePoint performance – and for that matter, many other performance management efforts. As an added bonus, understanding this issue will help you to put a permanent stop to developers who blame the infrastructure when things slow down. Furthermore you will also prevent rampant over-engineering of infrastructure.

Lead vs. lag indicators

Let’s say for a moment that you are the person responsible for road safety in your city. What is your ultimate indicator of success? I bet many readers will answer something like “reduced number of traffic fatalities per year” or something similar. While that is a definitive metric, it is also pretty macabre. It also suffers from the problem of being measured after something undesirable has happened. (Despite millions of dollars in research, death is still relatively permanent at the time of writing.)

Of course, you want to prevent road fatalities, so you might create road safety education campaigns, add more traffic lights, improve signage on the roads and so forth. None of these initiatives is guaranteed to make any difference to road fatalities, but they are very likely to make a difference nonetheless! Thus, we should also measure these sorts of things, because anything that contributes to reducing road fatalities is worth tracking.

So where am I going with this?

In short, the amount of road signage is a lead indicator, while the number of road fatalities is a lag indicator. A lead indicator is something that can help predict an outcome. A lag indicator is something that can only be tracked after a result has been achieved (or not). Therefore lag indicators don’t predict anything; rather, they show the results of an outcome that has already occurred.

Now Robert Bogue made a great point when we were talking about this topic. He said that SharePoint performance and capacity planning is like trying to come up with Drake’s equation. For those of you not aware, Drake’s equation attempts to estimate how much intelligent life might exist in the galaxy. But it is criticised because there are so many variables and assumptions made in it. If any of them are wrong, the entire estimate is called into question. Consider this criticism of the equation by Michael Crichton:

The only way to work the equation is to fill in with guesses. As a result, the Drake equation can have any value from "billions and billions" to zero. An expression that can mean anything means nothing. Speaking precisely, the Drake equation is literally meaningless…

Back to SharePoint land…

Robert’s point was that a platform like SharePoint can run many different types of applications with different patterns of performance. An obvious example is that saving a 10 megabyte document to SharePoint has a very different performance pattern than rendering a SharePoint page with a lot of interactive web parts on it. Add to that all of the underlying components that an application might use (for example, PowerPivot, workflows, information management policies, BCS and Search) and it becomes very difficult to predict future SharePoint performance. Accordingly, it is reasonable to conclude that the only way to truly measure SharePoint performance is by measuring SharePoint response times under some load. At least that performance indicator is reasonably definitive: response time correlates fairly strongly to user experience.

So now that I have explained lead vs. lag indicators, guess which type of indicator response time is? Yup – you guessed it – a lag indicator. In terms of lag indicator thinking, it is completely true that page response time measures the outcome of all your SharePoint topology and design decisions.

But what if we haven’t determined our SharePoint topology yet? What if your manager wants to know what specification of server and storage will be required? What if your response time is terrible and users are screaming at you? How will response time help you to determine what to do? How can we predict the sort of performance that we will need?

Enter the lead indicator. These provide assurance that the underlying infrastructure is sound and will scale appropriately. But by themselves, they are no guarantee of SharePoint performance (especially when there are developers and excessive use of foreach loops involved!). What they do ensure is that you have a baseline of performance that can be compared with any future custom work. It is the difference between that baseline and whatever the current reality is that is the interesting bit.

So what lead indicators matter?

The three Microsoft documents I referred to above list many performance monitor counters (particularly at the SQL Server level) that are useful to monitor. Truth be told, I was sorely tempted to go through them in this series of posts, but instead I opted to pitch these articles at a wider audience. So rather than rehash what is in those documents, let’s look at the obvious ones that are likely to come up in any sort of conversation around SharePoint performance. In terms of lead indicators, there are several important metrics:

  • Requests per second (RPS)
  • Disk I/O per second (IOPS)
  • Disk Megabytes transferred per second (MBPS)
  • Disk I/O latency

In the next couple of posts, I will give some more details on each of these indicators (their strengths and weaknesses) and how to go about collecting them.

A final Kaizen addendum

Kaizen? What the?

I mentioned at the start of this post that performance management is not done well in many other industries. Some of you may have experienced the pain of working for a company that chases short-term profit (a lag indicator) at the expense of long-term sustainability (measured by lead indicators). To that end, I recently read an interesting book on the Japanese management philosophy of Kaizen by Masaaki Imai. Imai highlighted the difference between Western and Japanese attitudes to management in terms of “process-oriented management vs. result-oriented management”. The contention in the book was that Western attitudes to management are all about results, whereas Japanese approaches are all about the processes used to deliver the result.

In the United States, generally speaking, no matter how hard a person works, lack of results will result in a poor personal rating and lower income or status. The individual’s contribution is valued only for its concrete results. Only results count in a result-oriented society.

So as an example, a result-oriented society would look at the revenue from sales made over a given timeframe – the short-term, profit-focused lag indicator. But according to the Kaizen philosophy, process-oriented management would consider factors like:

  • Time spent calling new customers
  • Time spent on outside customer calls versus time devoted to clerical work

What sort of indicators are these? To me they are clearly lead indicators as they do not guarantee a sale in themselves.

It’s food for thought when we think about how to measure performance across the board. Lead and lag indicators are two sides of the same coin. You need both of them.

Thanks for reading

Paul Culmsee


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 4


Hi and welcome to my final post on the linkedin discussion on why users cannot find what they are looking for on intranets. This time the emphasis is on synthesis… so let’s get the last few comments done shall we?

Michael Rosager • @ Simon. I agree.
Findability and search can never be better than the content available on the intranet.
Therefore, non-existing content should always be number 1
Some content may not be published with the terminology or language used by the users (especially on a multilingual intranet). The content may lack the appropriate meta tags. – Or maybe you need to adjust your search engine or information structure. And there can be several other causes…
But the first thing that must always be checked is whether they sought information / data is posted on the intranet or indexed by the search engine.

Rasmus Carlsen • in short:
1: Too much content (that nobody really owns)
2: Too many local editors (with less knowledge of online-stuff)
3: Too much “hard-drive-thinking” (the intranet is like a shared drive – just with a lot of colors = a place you keep things just to say that you have done your job)

Nick Morris • There are many valid points being made here and all are worth considering.
To add a slightly different one I think too often we arrange information in a way that is logical to us. In large companies this isn’t necessarily the same for every group of workers and so people create their own ‘one stop shop’ and chaos.
Tools and processes are great but somewhere I believe you need to analyse what information is needed\valued and by whom and create a flexible design to suit. That is really difficult and begins to touch on how organisations are structured and the roles and functions of employees.

Taino Cribb • Hi everyone
What a great discussion! I have to agree to any and all of the above comments. Enabling users to find info can definately be a complicated undertaking that involves many facets. To add a few more considerations to this discussion:
Preference to have higher expectations of intranet search and therefore “blame” it, whereas Google is King – I hear this too many times, when users enter a random (sometimes misspelled) keyword and don’t get the result they wish in the first 5 results, therefore the “search is crap, we should have Google”. I’ve seen users go through 5 pages of Google results, but not even scroll down the search results page on the intranet.
Known VS Learned topics – metadata and user-tagging is fantastic to organise content we and our users know about, but what about new concepts where everyone is learning for the first time? It is very difficult to be proactive and predict this content value, therefore we often have to do so afterwards, which may very well miss our ‘window of opportunity’ if the content is time-specific (ie only high value for a month or so).
Lack of co-ordination with business communications/ training etc (before the fact). Quite often business owners will manage their communications, but may not consider the search implications too. A major comms plan will only go so far if users cannot search the keywords contained in that message and get the info they need. Again, we miss our window if the high content value is valid for only a short time.
I very much believe in metadata, but it can be difficult to manage in SP2007. Its good to see the IM changes in SP2010 are much improved.

Of the next four comments, most covered old ground (a sure sign the conversation is now fairly well saturated). Nick says he is making a “slightly different” point, but I think issues of structure not suiting a particular audience have been covered previously. I thought Taino’s reply was interesting because she focused on the issue of not accounting for known vs. learned topics and the notion of a “window of opportunity” in relation to appropriate tagging. Perhaps this reply was inspired by what Nick was getting at? In any event, adding it was a line call between governance and information architecture, and for now I chose the latter (and I have a habit of changing my mind with this stuff :-).


I also liked Taino’s point about user expectations around the “Google experience” and her examples. I also loved Rasmus’s earlier point about “hard-drive thinking” (I’m nicking that one for my own clients, Rasmus!). Both of these issues are clearly people aspects, so I added them as examples around that particular theme.


Finally, I added Taino’s “lack of co-ordination” comments as another example of inadequate governance.


Anne-Marie Low • The one other thing I think missing from here (other than lack of metadata, and often the search tool itself) is too much content, particularly out of date information. I think this is key to ensuring good search results, making sure all the items are up to date and relevant.

Andrew Wright • Great discussion. My top 3 reasons why people can’t find content are:
* Lack of meta data and it’s use in enabling a range of navigation paths to content (for example, being able to locate content by popularity, ownership, audience, date, subject, etc.) See articles on faceted classification:
Contextual integration
* Too much out-of-date, irrelevant and redundant information
See slide 11 from the following presentation (based on research of over 80 intranets)
* Important information is buried too far down in the hierarchy
Bonus 2 reasons 🙂
* Web analytics and measures not being used to continuously improve how information is structured
* Over reliance on Search instead of Browsing – see the following article for a good discussion about this
Browse Versus Search: Stumbling into the Unknown

Both Anne-Marie and Andrew make good points, and Andrew supplies some excellent links too, but all of these issues have been covered in the map, so nothing more has been added from this part of the discussion.

Juan Alchourron • 1) that particular, very important content, is not yet on the intranet, because “the” director don’t understand what the intranet stands for.
2) we’re asuming the user will know WHERE that particular content will be placed on the intranet : section, folder and subfolder.
3) bad search engines or not fully configured or not enough SEO applied to the intranet

John Anslow • Nowt new from me
1. Search ineffective
2. Navigation unintuitive
3. Useability issues
Too often companies organise data/sites/navigation along operational lines rather than along more practical means, team A is part of team X therefore team A should be a sub section of team X etc. this works very well for head office where people tend to have a good grip of what team reports where but for average users can cause headaches.
The obvious and mostly overlooked method of sorting out web sites is Multi Variant Testing (MVT) and with the advent of some pretty powerful tools this is no longer the headache that it once was, why not let the users decide how they want to navigate, see data, what colour works best, what text encourages them to follow what links, in fact how it works altogether?
Divorcing design, usability, navigation and layout from owners is a tough step to take, especially convincing the owners but once taken the results speak for themselves.

Most of these points are already well discussed, but I realised I had never made reference to John's point about organisational structures versus task-based structures for intranets. I had previously captured rationale around inappropriate structures, so I added this as another example to that argument within information architecture…


Edwin van de Bospoort • I think one of the main reasons for not finding the content is not poor search engines or so, but simply because there’s too much irrelevant information disclosed in the first place.
It’s not difficult to start with a smaller intranet, just focussing on filling out users needs. Which usually are: how do I do… (service-orientated), who should I ask for… (corporate facebok), and only 3rd will be ‘news’.
So intranets should be task-focussed instead if information-focussed…
My 2cnts 😉

Steven Kent • Agree with Suzanne’s suggestion “Old content is not deleted and therefore too many results/documents returned” – there can be more than one reason why this happens, but it’s a quick way to user frustration.

Maish Nichani • It is interesting to see how many of us think metadata and structure are key to finding information on the intranet. I agree too. But come to think of it, staff aren’t experts in information management. It’s all very alien to them. Not too long ago, they had their desktops and folders and they could find their information when they wanted. All this while it was about “me and my content”. Now we have this intranet and shared folders and all of a sudden they’re supposed to be thinking about how “others” would like to find and use the information. They’ve never done this before. They’ve never created or organized information for “others”. Metadata and structure are just “techie” stuff that they have to do as part of their publishing, but they don’t know why they’re doing it or for what reason. They real problem, in my opinion, is lack of empathy.

Barry Bassnett • * in establishing a corporate taxonomy.1. Lack of relevance to the user; search produces too many documents.3. Not training people in the concept that all documents are not created by the individual for the same individual but as a document that is meant to be shared. e.g. does anybody right click PDFs to add metadata to its properties? Emails with a subject line stat describe what is in it.

Luc de Ruijter • @Maish. Good point about information management.
Q: Who’d be responsible to oversee the management of information?
Shouldn’t intranet managers/governors have that responsibility?
I can go along with (lack of) empathy as an underlying reason why content isn’t put away properly. This is a media management legacy reason: In media management content producers never had to have empathy with participating users, for there were only passive audiences.
If empathy is an issue. Then it proves to me that communication strategies are still slow to pick up on the changes in communication behaviour and shift in mediapower, in the digital age.
So if we step back from technological reasons for not finding stuff (search, meta, office automation systems etc.) another big reason looks around the corner of intranet management: those responsible for intranet policies and strategy.

Most of this discussion covers stuff already represented in the map, although I can see that in this part of the conversation there is a preoccupation with content and its relevance. Maish also makes a couple of good points. First up he makes the point that staff are not experts in information management and don’t tend to think about how someone else might wish to find the information later. He also concludes by stating the real problem is a lack of empathy. I liked this and felt that this was a nice supporting argument to the whole conjecture that “people issues” is a major theme in this discussion, so I added it as a pro.



Now we come to an interesting bit in the conversation (for me anyway). Terry throws a curveball question. (Side note: curveball questions are usually asked with genuine intent, but tend to have a negative effect on live meetings. Dialogue Mapping loves curveball questions, as it is often able to deflect their negative impact).

Terry Golding • Can I play devils advocate and ask WHY you feel meta data is so vital? Dont misunderstand me I am not saying that it is not important, but I cant help feeling that just saying meta data as a reason for not finding things is rather a simplification. Let me ask it another way, what is GOOD meta data, can you give examples please ?

Luc de Ruijter • @Terry. Good questions which can have many answers (see all comments above where you’ll find several answers already). Why do library books have labels on their covers? Those labels are in fact metadata (avant la lettre) which help library people ordering their collection, and clients to find titles. How do you create tag clouds which offer a more intuitive and user centered way to navigate a website/blog? By tagging all content with (structured) meta tags. Look around a bit and you’ll see that metadata are everywhere and that they serve you in browsing and retrieving content. That’s why metadata are vital these days. I think there are no strict right and good meta structures. Structures depend on organisational contexts. Some metastructures are very complex and formal (see comments about taxonomies above), others are quite simple. Metadata can enable users to browse information blocks. By comparisson navigation schemes can only offer rigid sender driven structures to navigate to pages.

Andrew Wright • @Terry. Meta data enables content to be found in a number of different ways – not just one as is typical of paper based content (and many intranets as well unfortunately).
For instance, if you advertise a house for sale you may have meta data about the house such as location, number of rooms and price. This then allows people to locate the house using this meta data (eg. search by number of bedrooms, price range, location). Compare this with how houses are advertised in newspapers (ie. by location only) and you can see the benefits of meta data.
For a good article about the benefits of meta data, read Card Sorting Doesn’t Cut the Custard:
To read a more detailed example about how meta data can be applied to intranets, read:
Contextual integration: how it can transform your intranet

Terry questions the notion of metadata. I framed it as a con against the previous metadata arguments. Both Luc and Andrew answer, and I think the line that most succinctly captures the essence of that answer is Andrew’s “Meta data enables content to be found in a number of different ways”. So I reframed that slightly as a pro supporting the notion that lack of metadata is one of the reasons why users can’t find stuff on the intranet.


Next is yours truly…

Paul Culmsee • Hi all
Terry, a devil’s advocate flippant answer to your devil’s advocate question comes from Cory Doctorow with his dated, but still hilarious essay on the seven insurmountable obstacles to meta-utopia 🙂 Have a read and let me know what you think.
Further to your question (and I *think* I sense the undertone behind your question)…I think that the discussion around metadata can get a little … rational and as such, rational metadata metaphors are used when they are perhaps not necessarily appropriate. Yes metadata is all around us – humans are natural sensemakers and we love to classify things. BUT usually the person doing the information architecture has a vested interest in making the information easy for you. That vested interest drives the energy to maintain the metadata.
In user land in most organisations, there is not that vested interest unless its on a persons job description and their success is measured on it. For the rest of us, the energy required to maintain metadata tends to dissipate over time. This is essentially entropy (something I wrote about in my SharePoint Fatigue Syndrome post)

Bob Meier • Paul, I think you (and that metacrap post) hit the nail on the head describing the conflict between rational, unambiguous IA vs. the personal motivations and backgrounds of the people tagging and consuming content. I suspect it’s near impossible to develop a system where anyone can consistently and uniquely tag every type of information.
For me, it’s easy to get paralyzed thinking about metadata or IA abstractly for an entire business or organization. It becomes much easier for me when I think about a very specific problem – like the library book example, medical reports, or finance documents.

Taino Cribb • @Terry, brilliant question – and one which is quite challenging to us that think ‘metadata is king’. Good on you @Paul for submitting that article – I wouldn’t dare start to argue that. Metadata certainly has its place, in the absence of content that is filed according to an agreed taxonomy, correctly titled, the most recent version (at any point in time), written for the audience/purpose, valued and ranked comparitively to all other content, old and new. In the absence of this technical writer’s utopia, the closest we can come to sorting the wheat from the chaff is classifcation. It’s not a perfect workaround by any means, though it is a workaround.
Have you considered that the inability to find useful information is a natural by-product of the times? Remember when there was a central pool to type and file everything? It was the utopia and it worked, though it had its perceived drawbacks. Fast forward, and now the role of knowledge worker is disseminated to the population – people with different backgrounds, language, education and bias’ all creating content.
It is no wonder there is content chaos – it is the price we pay for progress. The best we as information professionals can do is ride the wave and hold on the best we can!

Now, my reply to Terry essentially spoke to the previously discussed issue of users lacking the motivation to make their information easy to use. I added a pro to that existing idea to capture my point that users who are not measured on accurate metadata have little incentive to put in the extra effort. Taino then refers to the pace of change more broadly with her “natural by-product of the times” comment. This made me realise my meta-theme of “people aspects” was not encompassing enough. I retitled it “people and change aspects” and added two of Taino’s points as supporting arguments for it.


At this point I stopped, as enough had been captured and the conversation had definitely reached saturation point. It was time to look at what we had…

For those interested, the final map had 139 nodes.

The second refactor

At this point it was time to sit back and look at the map with a view to seeing whether my emergent themes were correct, and to consolidate any conversational chaff. Almost immediately, the notion of “content” started to bubble to the surface of my thinking. I had noticed that a lot of the conversation and re-iteration by various people related to the content being searched in the first place. I currently had some of that captured under Information Architecture and, in light of the final map, I felt that this wasn’t right. The evidence for this is that Information Architecture topics dominated the map: there were 55 nodes for information architecture, compared to 34 for people and change and 31 for governance.

Accordingly, I took all of the captured rationale related to content and made it its own meta-theme as shown below…


Within the “Issues with the content being searched” map are the following nodes…


I also did another bit of fine tuning too here and there and overall, I was pretty happy with the map in its current form.

The root causes

If you have followed my synthesis of what the dialogue from the discussion told me, it boiled down to 5 key recurring themes.

  1. Poor Information Architecture
  2. Issues with the content itself
  3. People and change aspects
  4. Inadequate governance
  5. Lack of user-centred design

I took the completed maps, exported the content to Word and then pared things back further. This allowed me to create the summary below:

Poor Information Architecture

* Vocabulary and labelling issues
  · Inconsistent vocabulary and acronyms
  · Not using the vocabulary of users
  · Documents have no naming convention
* Poor navigation
* Lack of metadata
  · Tagging does not come naturally to employees
* Poor structure of data
  · Organisation structure focus instead of user task focus
  · The intranet’s lazy over-reliance on search

Issues with content

* Old content not deleted
* Too much information of little value
* Duplicate or “near duplicate” content
* Information does not exist or is in an unrecognisable form

People and change aspects

* People with different backgrounds, language, education and biases all creating content
* Too much “hard drive” thinking
* People not knowing what they want
* Lack of motivation for contributors to make information easier to use
* Google-inspired inflated expectations of intranet search functionality
* Adopting social media from a hype-driven motivation

Inadequate governance

* Lack of governance/training around metadata and tagging
* Not regularly reviewing search analytics
* Poor and/or low-cost search engine is deployed
* Search engine is not set up properly or used to full potential
* Lack of “before the fact” coordination with business communications and training
* Comms and intranet don’t listen and learn from all levels of the business
* Ambiguous, under-resourced or misplaced intranet ownership
* The wrong content is being managed
* There are easier alternatives available

Lack of user-centred design

* Content is structured according to the view of the owners rather than the audience
* Not accounting for two types of visitors: task-driven and browse-based
* No social aspects to search
* Not making the search box available enough
* A failure to offer an entry-level view
* Not accounting for people who do not know what they are looking for versus those who do
* Not soliciting feedback from a user on a failed search about what was being looked for

The final maps

The final map can be found here (for those who truly like to see full context, I included an “un-chunked” map which would look terrific printed on a large-format plotter). Below, however, is a summary as best I can do in a blog post format (click to enlarge). For a decent view of proceedings, visit this site.

Poor Information Architecture


Issues with the content itself


People and change aspects


Inadequate governance


Lack of user-centred design


Thanks for reading. As an epilogue, I will post a summary with links to all maps and discussion.

Paul Culmsee


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 1



There was an interesting discussion on the Intranet Professionals group on LinkedIn recently where Luc De Ruijter asked the question:

What are the main three reasons users cannot find the content they were looking for on intranet?

As you can imagine there were a lot of responses, and a lot more than three answers. As I read through them, I thought it might be a good exercise to use IBIS (the language behind issue mapping) to map the discussion and see what the collective wisdom of the group had to say. So in these posts, I will illustrate the utility of IBIS and Issue Mapping for this work, and make some comments about the way the conversation progressed.

So what is IBIS and Issue/Dialogue Mapping?

Issue Mapping captures the rationale behind a conversation or dialogue—the emergent ideas and solutions that naturally arise from robust debate. This rationale is graphically represented using a simple, but powerful, visual structure called IBIS (Issue Based Information System). This allows all elements and rationale of a conversation, and subsequent decisions, to be captured in a manner that can be easily reflected upon.

The elements of the IBIS grammar are below. Questions give rise to ideas, or potential answers. Ideas have pros or cons arguing for or against those ideas.
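For the programmatically minded, the grammar is simple enough to sketch as a tiny tree structure. This Python sketch is purely illustrative; the node kinds and helper names are my own invention, not part of any IBIS tooling:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node in an IBIS map: a question, an idea, or a pro/con argument."""
    kind: str                      # "question" | "idea" | "pro" | "con"
    text: str
    children: list = field(default_factory=list)

    def add(self, kind, text):
        """Attach a child node and return it, so maps can be built fluently."""
        child = Node(kind, text)
        self.children.append(child)
        return child

# A fragment of the map from this discussion: a question gives rise to
# an idea, and the idea attracts pros and cons.
root = Node("question", "Why can't users find stuff on the intranet?")
idea = root.add("idea", "Lack of metadata")
idea.add("pro", "Metadata enables content to be found in many different ways")
idea.add("con", "Users aren't measured on metadata, so the effort dissipates")

def render(node, depth=0):
    """Print the map as an indented outline."""
    print("  " * depth + f"[{node.kind}] {node.text}")
    for child in node.children:
        render(child, depth + 1)

render(root)
```

The point of the structure is exactly what the grammar says: questions give rise to ideas, and ideas accumulate pros and cons, so nothing said in the conversation is lost.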


Dialogue Mapping is essentially Issue Mapping a conversation live, where the mapper is also a facilitator. When it is done live it is powerful stuff. As participants discuss a problem, they watch the IBIS map unfold on the screen. This allows participants to build shared context, identify patterns in the dialogue and move from analysis to synthesis in complex situations. What makes this form of mapping compelling is that everything is captured. No idea, pro or con is ignored. In a group scenario, this is an extremely efficient way of meeting what social psychologist Hugh Mackay says is the first of the ten human desires that drive us: the desire to be taken seriously. Once an idea is mapped, the idea and the person who put it forward are taken seriously. This process significantly reduces “wheel spinning” in meetings where groups get caught up in a frustrating tangled mess of going over the same old ground. It also allows the dialogue to move more effectively to decision points (commitments) around a shared understanding.

In this case though, the discussion was a long one on a LinkedIn group, so we do not get the benefit of being able to map live. Instead, I will create a map to represent the conversation as it progresses and make some comments here and there…

So let’s kick off with the first reply from Bob Meier.

Bob Meier • Don’t know if these are top 3, but they’re pretty common find-ability issues:
1. Lack of metadata. If there are 2000 documents called “agenda and minutes” then a search engine, fancy intranet, or integrated social tool won’t help.
2. Inconsistent vocabulary and acronyms. If you’ve branded the expense report system with some unintuitive name (e.g. a vendor name like Concur) then I’ll scan right past a link looking for “expense reports” or some variation.
3. Easier alternatives. If it’s easier for me to use phone/email/etc. to find what I want, then I won’t take the time to learn how to use the intranet tools. Do grade schools still teach library search skills? I don’t think many companies do…

In IBIS this was fairly straightforward. Bob listed his three answers with some supporting arguments. I reworded his supporting argument of point 2, but otherwise it pretty much reflects what was said…


Nigel Williams (LION) • I agree with Bob but I’d add to point two not speaking our user base’s language. How many companies offer a failure to find for example (i.e.if you fail to find something in a search you submit a brief form which pops up automatically stating what you were looking for and where you expected to find it? Lots of comms and intranet teams are great at telling people and assuming we help them to learn but don’t listen and learn from all levels of the business.
If I make that number 1 I’ll also add:
2) Adopting social media because everyone else is, not because our business or users need it. This then ostracises the technophobics and concerns some of our less confident regular users. They then form clans of anti-intranetters and revert to tried and tested methods pre-intranet (instant messaging, shared drives, email etc.)
3) Not making the search box available enough. I’m amazed how many users in user testing say they’ve never noticed search hidden in the top right of the banner – “ebay has their’s in the middle of the screen, so does Google. Where’s ours?” is a typical response. If you have a user group at your mercy ask them to search for an item on on Google, then eBay, then Amazon, then finally your intranet. Note whether they search in the first three and then use navigation (left hand side or top menu) when in your intranet.

Nigel starts out by supporting Bob’s answer, and I therefore add his points as pros in the map. Having done this though, I can already see some future conversational patterns. Bob’s two supporting arguments for “not using the vocabulary of users” are actually two related issues: one is about user experience and the other is about user engagement/governance. Nevertheless, I have mapped it as he stated it at this point and we will see what happens.


Luc de Ruijter • @Bob. I recognise your first 2 points. The third however might be a symptom or result, not a cause. Or is it information skills you are refering to?
How come metadata are not used? Clearly there is a rationale to put some effort in this?
@Nigel. Is the situation in which Comm. depts don’t really listen to users a reason for not finding stuff? Or would it be a lack of rapport with users before and while building intranets? Is the cause concepetual, rather than editorial for instance?
(I’m really looking for root causes, the symptoms we all know from daily experience).
Adding more media is something we’ve seen for years indeed. Media tend to create silo’s.
Is your third point about search or about usability/design?

In following sections I will not reproduce the entire map in the blog post – just relevant sections.

In this part of the conversation, Luc doesn’t add any new answers to the root question, but queries three that have been put forward thus far. Also note that at this point I believe one of Luc’s answers is to a different question. Bob’s “easier alternatives” point was never about metadata, but Luc asks “how come metadata is not used?”. I have added it to the map here, changing the framing from a rhetorical question to an action. Having said that, if I were facilitating this conversation, I would have clarified that point before committing it to the map.


Luc also indicates that the issue around communications and intranet teams not listening might be due to a lack of rapport.


Finally, he adds an additional argument why social media may not be the utopia it is made out to be, by arguing that adding more media channels creates more information silos. He also argues against the entire notion on the grounds that this is a usability issue, rather than a search issue.


Nigel Williams (LION) • Hi Luc, I think regarding Comms not listening that it is two way. If people are expecting to find something with a certain title or keyword and comms aren’t recognising this (or not providing adequate navigation to find it) then the item is unlikely to be found.
Similarly my third point is again both, it is an issue of usability but if that stops users conducting searches then it would impact daily search patterns and usage.

I interpret this reply as Nigel arguing against Luc’s assertion around lack of rapport being the reason behind intranet and comms teams not listening and learning from all levels of the user base.


Nigel finishes by arguing that even if social media issues are usability issues, they might still impede search and the idea is therefore valid.


Bob Meier • I really like Nigel’s point about the importance of feedback loops on Intranets, and without those it’s hard to build a system that’s continually improving. I don’t have any data on it, but I suspect most companies don’t regularly review their search analytics even if they have them enabled. Browse-type searching is harder to measure/quantify, but I’d argue that periodic usability testing can be used in place of path analysis.
I also agree with Luc – my comment on users gravitating from the Intranet to easier alternatives could be a symptom rather than a cause. However, I think it’s a self-reinforcing symptom. When you eliminate other options for finding information, then the business is forced to improve the preferred system, and in some cases that can mean user training. Not seeing a search box is a great example of something that could be fixed with a 5-minute Intranet orientation.
If I were to replace my third reason, I’d point at ambiguous or mis-placed Intranet ownership . Luc mentions Communications departments, but in my experience many of those are staffed for distributing executive announcements rather than facilitating collective publishing and consumption. I’ve seen many companies where IT or HR own the Intranet, and I think the “right” department varies by company. Communications could be the right place depending on how their role is defined.

Bob makes quite a number of points in this answer, right across various elements of the unfolding discussion. Firstly, he makes a point about analytics and the fact that a lack of feedback loops makes it hard to build a system that continually improves.


In terms of the discussion around easier alternatives, Bob offers some strategies to mitigate the issue. He notes that there are training implications when eliminating the easier alternatives.


Finally, Bob identifies issues around the ownership of the intranet as another answer to the original question of people not being able to find stuff on the intranet. He also lists a couple of common examples.


Karen Glynn • I think the third one listed by Bob is an effect not a cause.
Another cause could be data being structured in ways that employees don’t understand – that might be when it is structured by departments, so that users need to know who does what before they can find it, or when it is structured by processes that employees don’t know about or understand. Don’t forget intranet navigations trends are the opposite to the web – 80% of people will try and navigate first rather than searching the intranet.

In this answer, Karen starts by agreeing with the point Luc made about “easier alternatives” being a symptom rather than a cause, so there is no need to add it to the map as it is already there. However, she provides a new answer to the original question: the structure of information (this, by the way, is called top-down information architecture, and it was bound to come out of this discussion eventually). She also makes a claim that 80% of people will navigate prior to searching on the intranet. I wonder if you can tell what will happen next? 🙂


Luc de Ruijter • @Nigel Are (customer) keywords the real cause for not finding stuff? In my opinion this limits the chalenge (of building effective intranet/websites) to building understandable navigation patters. But is navigation the complete story? Where do navigation paths lead users to?
@Bob Doesn’t an investiment in training in order to have colleagues use the search function sound a bit like attacking the symptom? Why is search not easy to locate in the first place? I’d argue you’re looking at a (functional) design flaw (cause) for which the (where is the search?) training is a mere remedy, but not a solution.
@Karen You mention data. How does data relate to what we conventionally call content, when we need to bring structure in it?
Where did you read the 80% intranet-users navigate before searching?

Okay, so this is the first time thus far that I do a little bit of map restructuring. In the discussion so far, we had two ideas offered around the common notion of vocabulary. In this reply, Luc asks “Are (customer) keywords the real cause for not finding stuff?” I wasn’t sure which vocabulary issue he was referring to, so this prompted me to create a “meta idea” called “Vocabulary and labelling issues”, of which there are two examples cited thus far. This allowed me to capture the essence of Luc’s comment as a con against the core idea of issues around vocabulary and labelling.


Luc then calls into question Bob’s suggestion of training and eliminating the easier alternatives. Prior to Luc’s counter arguments, I had structured Bob’s argument like this:


To capture Luc’s argument effectively, I restructured the original argument and made a consolidated idea to “eliminate other options and provide training”. This allowed me to capture Luc’s counter argument as shown below.


Finally, Luc asked Karen for the source of her contention that 80% of users navigate intranets, rather than use the search engine first up.


In this final bit of banter for now, the next three conversations did not add too many nodes to the map, so I have grouped them below…

Karen Glynn • Luc, the info came from the Neilsen group.

Helen Bowers • @Karen Do you know if the Neilsen info is available for anyone to look at?

Karen Glynn • I don’t know to be honest – it was in one of the ‘paid for’ reports if I remember correctly.

Luc de Ruijter • @Karen. OK in that case, could you provide us with the title and page reference of the source? Than it can become usable as a footnote (in a policy for instance). Thanks
Reasons so far for not finding stuff:
1. Lack of metadata (lack of content structure).
2. Inconsistent vocabulary and acronyms (customer care words).
3. Adopting social media from a hype-driven motivation (lack of coherence)
4. Bad functional design (having to search for the search box)
5. Lack of measuring and feedback on (quality, performance of) the intranet
6. Silo’s. Site structures suiting senders instead of users

So for all that banter, here is what I added to what had already been captured.


Where are we at?

At this point, let’s take a breath and summarise what has been discussed so far. Below is the summary map with core answers to the question so far. I have deliberately tucked away the detail into sub maps so you can see what is emerging. Please note I have not synthesised this map yet (well … not too much anyway). I’ll do that in the next post.


If you want to take a look at the entire map as it currently stands, take a look at the final image at the very bottom of this post (click to enlarge). I have also exported the entire map so far for you to view things in more context. Please note that the map will change significantly as we continue to capture and synthesise the rationale, so expect it to change quite a bit as we unpack the discussion.

Thanks for reading

Paul Culmsee




It’s email integration captain, but not as we know it (problems with incoming email handling on SharePoint 2010)


Hi everyone. This is my last post for 2010, and I am going out on a troubleshooting note. See you all next year with lots of new content and cool stuff!

I had some interesting experiences recently with SharePoint 2010, specifically the Content Organiser feature and leveraging it with incoming email. I thought they might be worth sharing, but first I need to set some context via the use-case where this started.

Where would we be without the photocopier?


Most organisations, large or small, have one of those multifunction photocopier/scanner/fax/coffee maker gizmos (okay, so maybe not the coffee maker). You know the type – they are large, noisy, the paper feeders frequently jam, and the tech guy who comes to fix them is on site so often that he’s considered a staff member. They usually have a document feeder, can scan to PDF and email it straight through to you. If you have a really fancy-schmancy one, it might even OCR the content for you so the resultant PDF is not an image but text based.

While all that is good, the real benefit of these devices is more subtle. Employees like to congregate nearby for a little bit of office gossip, and to quietly bitch to each other about how much their boss or co-workers annoy them. The conversations around the photocopier are often some of the most insightful and valuable conversations you can have.

So I believe that anything we can do to encourage photocopier conversations is a good thing. For some organisations, this is about as close as it gets to cross-departmental collaboration! 😛 If we also leverage the fact that these devices offer this “scan to PDF and email” function at the press of a button, then SharePoint has a nice story to tell here – especially with SharePoint 2010 and the Content Organiser feature.

The premise: Content Organiser coolness

I will spend a few moments to introduce the Content Organiser feature for readers who have not seen much of SharePoint 2010. If you know all about this feature, skip to the next section.

For those of you who may not be aware, SharePoint 2010 has an interesting new feature called the Content Organiser. It is quite a powerful document routing solution that makes it easier to store documents consistently, according to administrator-defined rules that can copy or move a document from one place in SharePoint to another. I will get to the rules in a minute, but the Content Organiser feature is important for several reasons:

  • It makes the saving of documents easier because users do not necessarily have to worry about knowing the destination when uploading new content.
  • It is a highly flexible method for routing documents between sites and site collections around the SharePoint farm.
  • It underpins a solid compliance and records management file plan capability.

So before anything else can be done, we need to turn it on. The Content Organiser feature has to be activated on each site for the functionality to be enabled. In other words, it is a site scoped feature. Below is an illustration of the feature to activate.


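If you prefer scripting to clicking, the activation can also be done from the SharePoint 2010 Management Shell. This is just a sketch – the site URL is made up, and you should verify the feature’s internal name (commonly DocumentRouting) against the output of Get-SPFeature in your own farm:

```powershell
# Activate the site-scoped Content Organiser feature on a site.
# "DocumentRouting" is the commonly documented internal name for the
# Content Organiser feature - confirm it with Get-SPFeature first.
Enable-SPFeature -Identity "DocumentRouting" -Url "http://intranet/sites/finance"

# Verify it is now active on that web
Get-SPFeature -Web "http://intranet/sites/finance" |
    Where-Object { $_.DisplayName -eq "DocumentRouting" }
```
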
Once the Content Organiser feature is activated, SharePoint 2010 makes several changes to the site configuration.

  • It creates a new document library called the Drop-off Library.
  • It creates a new custom list called Content Organizer Rules.
  • It adds two new Site Administration links in the Site Settings page to manage the Content Organiser for the site.


The description for the feature says “Create metadata based rules…” and it is these Content Organiser rules that allow you to automatically route documents from the newly created Drop-off Library to some other location. It is important to know that the Drop-off Library is fixed – it is the first port of call for files that need to be moved or copied somewhere. Think of the Drop-off Library as a hotel bellboy: you give him your luggage and he will ensure it gets to the correct location (and unlike a bellboy, you don’t need to tip).

So if the Drop-off Library is the starting point, where can documents be routed to? The destination can be:

  • A document library and/or a folder within a document library on the site.
  • The Drop-off library of another site, which allows inter and intra site collection routing.

The Content Organiser rules are managed from Site Administration, which is accessed via Site Settings. New rules are added in the same manner as adding new items to any SharePoint list. In the following example, we will create a rule where a content type of Invoice will be routed to a document library called Finance.


The conditions section of the rule allows multiple conditions to be defined to determine matching content, and then where that content should be routed. The properties available are any columns assigned to the content type being routed. In the example below, we have added two conditions that both have to be satisfied before the rule will fire.


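Rules can also be created through the object model rather than the list UI. The sketch below assumes the Invoice content type and Finance library from the example above; the site URL, field identifiers and ConditionsString XML are illustrative only, so check them against your own columns before use (the EcmDocumentRouterRule class lives in Microsoft.Office.RecordsManagement.dll):

```powershell
# Sketch: create a Content Organiser rule programmatically.
# URLs, field identifiers and values below are illustrative only.
$web = Get-SPWeb "http://intranet/sites/finance"
$rule = New-Object Microsoft.Office.RecordsManagement.RecordsRepository.EcmDocumentRouterRule($web)
$rule.Name = "Route invoices to Finance"
$rule.ContentTypeString = "Invoice"
# The Column attribute takes the form "FieldId|InternalName|DisplayName"
$rule.ConditionsString = "<Conditions><Condition Column='<field-guid>|Department|Department' Operator='IsEqual' Value='Accounts' /></Conditions>"
$rule.TargetPath = "/sites/finance/Finance"
$rule.RouteToExternalLocation = $false
$rule.Enabled = $true
$rule.Priority = "5"
$rule.Update()
$web.Dispose()
```
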
Let’s take a closer look at this beast known as the Drop-off Library. This special document library is added to the site upon feature activation. As stated earlier, it is really a temporary staging area for items that do not have all the required metadata to satisfy any routing rules.

The sequence of events for the Content Organiser is:

  1. Documents with the correct content type, metadata, and matching rules are automatically routed to the final library and folder.
  2. Documents that lack the amount of metadata required to match a rule or that are missing required metadata are retained in the "Drop-Off Library" so that the user can enter metadata to satisfy a rule.
  3. After a user has edited a staging document with the appropriate metadata required to match a rule, the document is automatically routed to the target library and folder.

As an example, if we assume that a Content Organiser rule will route any document with “finance” in its name to a document library called Finance, the behaviour will be as follows:

  • If the file uploaded has the word “finance” in its name, SharePoint indicates that the document has been successfully routed.
  • If the file uploaded does not have the word “finance” in its name, SharePoint will indicate to the user that the Content Organiser has placed it into the Drop-off Library.

Now, before you rush off and start to mess with the Content Organiser, I’d better tell you about a couple of caveats:

  1. The Content Organizer will only work on content types that are, or derive from, the Document content type. So it does not work for automatically organizing large lists.
  2. When uploading documents via the Windows Explorer view, Content Organiser rules are ignored and the document will not be redirected to the Drop-off Library. (Through the browser, if a document is uploaded directly to a destination library, SharePoint will move it to the Drop-off Library for classification.)
  3. There is a limit of six conditions per rule. After six conditions are added, the "Add new condition" link disappears.
  4. If you wish to route the document to another site, the Content Organizer feature has to be installed on that site for the Drop-off library to be created as that is the destination. Additionally, you need to add the configuration information in Central Administration by adding the destination to the list of send to connections for the web application (that is beyond the scope of this article, but easy enough to do).
  5. Ruven Gotz tells me there are also some potential risks around the fact that you can route files to any destination even if you do not have permission, and potentially overwrite content. As Scott says, the Content Organizer will move content to the new location whether or not the contributing user has access to the destination location.

Content Organiser and email integration

So let’s go back to where we started, with our fancy photocopier. SharePoint has offered email integration on document libraries since the 2007 version. In effect, we give each document library or list an email address, set up a few parameters, and any attachments are uploaded to the document library.

On a semi-related note, my company actually developed a version of the content organiser for SharePoint 2007 that allowed the routing of documents based on business rules. Contact us if you want to know more.

Given the routing capabilities of the Content Organiser in SharePoint 2010, one would think that by email-enabling the Drop-off Library created when you activate the feature, we could have all scanned correspondence end up in the Drop-off Library, ready for classification by an administrator and routing in accordance with the specified rules.

Sounds logical enough – so logical in fact that I gave it a try.

Problem 1: Race condition?

Email-enabling a document library is pretty easy, provided you have set up incoming email in SharePoint Central Administration first. In this case, I set up incoming email on the Drop-off Library with the settings below. Note that I specified not to overwrite attachments with the same name.


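For reference, the same setting can be applied from PowerShell instead of the library settings page. A minimal sketch, assuming a site URL and alias of my own invention, and assuming incoming email is already configured in Central Administration:

```powershell
# Email-enable the Drop Off Library (incoming email must already be
# configured for the farm in Central Administration).
$web = Get-SPWeb "http://intranet/sites/finance"
$list = $web.Lists["Drop Off Library"]
$list.EmailAlias = "scans"   # becomes scans@<your incoming email domain>
$list.Update()
$web.Dispose()
```
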
I then programmed the photocopier to use this email address as a profile. That way, a user would scan incoming correspondence and choose this profile as the destination. The photocopier would scan to PDF and then email those PDFs to the Drop-off Library. The problem was – not all of the scans arrived.

We noticed that individual scans (ie one document at a time) would work fine, but for some reason, bulk scans would not. Typically, if a user scanned, say, 10 items, only 6 of them would make it to the document library. A trawl through the diagnostic logs was therefore required. Luckily, SharePoint 2010 has been built upon PowerShell, and there is a PowerShell command to get at the diagnostic logs. I have become a huge fan of PowerShell just from this one command, as it has eliminated the need for me to install additional tools to view logs on SP2010 boxes. Taking a punt, I assumed that if there was an error message, it would have the word “E-mail” in it. So I issued the following PowerShell command:

Get-SPLogEvent | Where-Object { $_.message -like "*E-Mail*" } | Out-GridView

This returns any matching log entries in a graphical format (the grid view), as shown below. Immediately I saw warning messages telling me that an error occurred while attempting to create an attachment for an item sent via email.


This was clearly related to my issue, so I adjusted the PowerShell script to be a little more specific so I could see the full message:

Get-SPLogEvent | Where-Object { $_.message -like "*create an att*" } | Select-Object -Property message | Out-GridView

The email was sent to the list “Drop Off Library”, and the error was: the file DropOffLibrary/<filename> has been modified by SHAREPOINT\SYSTEM on <date>


Problem 2: A poorly named feature?

This time, I saw that it claimed the attachment was modified by SHAREPOINT\system. Hmm – that sort of error message is very similar to the race conditions seen with SharePoint Designer workflows. Thus, my first thought was that email-enabling the Drop-off Library was possibly unsupported. I figured that the Drop-off Library likely used event receivers, workflow or scheduled tasks to do the document routing, which might interfere with the processing of incoming email attachments.

My suspicion was given further weight when I recalled that there was another content organiser related feature that I could activate: Email Integration with Content Organiser. According to the description, it enables a site’s content organiser to accept and organise email messages. Cool – this seemed logical enough, so I activated it.


Upon activating this feature, a small change is made to the Content Organiser Settings page, found under Site Settings. An e-mail address was assigned, and a link was provided to configure the organiser’s incoming email settings, as shown below:


This is where things started to get interesting. As expected, I was taken to the incoming email settings screen, but instead of it being for the Drop-off Library, it was for a hidden list called Submitted E-mail Records. At the time, this seemed to confirm my initial conclusion that incoming email on the Drop-off Library was unsupported. After all, why else would incoming email be redirected to another list instead of the Drop-off Library? I couldn’t think of any other logical explanation.


At the time, I searched the web to see if anybody else had mentioned the Submitted E-mail Records hidden list and problems with incoming emails. I was then surprised to learn that this hidden list was there in SharePoint 2007, but I’d never seen it before because Records Management in 2007 is crappy and I never used it much.

Anyway, try as I might, I could never get items emailed to the Submitted E-mail Records list to route to the Drop-off Library. The photocopier would happily mail items to the Submitted E-mail Records list, but they would stay there and never get processed. Grrr!

After turning up some diagnostic logging categories to verbose, specifically:

  • Document Management Server: Content Organizer
  • SharePoint Foundation: E-Mail

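Cranking those categories up can be scripted too. A sketch using the Set-SPLogLevel cmdlet – the category identities are written as “Area:Category”, and should be checked against the output of Get-SPLogLevel on your farm:

```powershell
# Raise the two relevant diagnostic categories to Verbose
Set-SPLogLevel -TraceSeverity Verbose -Identity "Document Management Server:Content Organizer"
Set-SPLogLevel -TraceSeverity Verbose -Identity "SharePoint Foundation:E-Mail"

# ...reproduce the problem, then inspect the logs with Get-SPLogEvent...

# Reset all categories back to their defaults when done
Clear-SPLogLevel
```
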
We saw the following error message: “Cannot resolve mailbox (null) to a valid user”. Additionally, my colleague Peter Chow used Reflector to examine the underlying code behind this feature. From what we could tell, a method called GetOfficialFilePropsFromBody in the class Microsoft.Office.RecordsManagement.RecordsRepository.EmailRecordsHandler was attempting to extract a mailbox value and getting a null. Unfortunately for us, the code eventually led to obfuscated classes, so we were not able to dig any deeper.

At this point, with nothing on the net to guide us and Twitter going quiet, I logged a support call with Microsoft. Pretty quickly, our issue was reproduced and escalated to the local team and then to the US. A few weeks later, we got the following answer:

The E-mail Integration with Content Organizer feature is a there for legacy Exchange 2007 deployments that can journal mail to SharePoint.  Such functionality is not available in Exchange 2010 and we do not (and never have) supported directly e-mailing a content organizer as the steps below show.  It just won’t work – that mail submitted to that list needs to be in a special format that Exchange 2007 can create (or some other system who has implemented that documented protocol). The users should consider leaving the mail in Exchange and using Exchange’s record management and compliance functionality or buying a third party add-in if they need to get mail into a SharePoint content library. has a list of partners.

Workaround: For this scenario we found a workaround to enable incoming email on the “Drop Off Library”, which in turn routed the emails/attachments to the destination library as per the rule. The envelope data or other metadata will NOT be carried along with the item to its final location with this workaround. So the customer should expect to lose sender, sent, received, et cetera.

If the customer wants to check on how to get this working using Exchange 2007:

So the format is generated by the backend transport layer via a policy on a Managed Folder.  So even in 2007, you can’t email a content organizer.  But you can set up a journaling rule on a managed folder in Exchange to journal every item dropped into that folder to SharePoint.

We should not encourage users to email Content organizer since we have that has a legacy feature to support Exchange 2007. If the users wants email to be sent to SharePoint then use Email enabled Doc Lib.

So there you have it. Do not activate the Email Integration with Content Organiser feature! It is a legacy feature designed specifically for Exchange 2007, yet the description for the feature in SharePoint makes no mention of this! If this were explicitly mentioned in the description, two weeks’ worth of needless troubleshooting and a long support call would have been avoided. Even my local (and excellent) Microsoft escalation team were not aware of this. Oh well, you live and learn.

Back to square one (shut up and apply the latest cumulative update)

So now that we had confirmed the Email Integration with Content Organiser feature was a giant red herring and never going to fly, we returned our focus to why some attachments were not being correctly processed by the Drop-off Library. As it happened, the August cumulative update for SharePoint 2010 had a fix in it. This TechNet thread describes the issue in more detail. There appeared to be a bug where even if you set the incoming email option “Overwrite files with the same name” to No, it was ignored and an exception was logged when attachments with the same file name arrived. It turns out the photocopier file names were not 100% unique! It used a timestamp as part of the filename that was unique down to the minute, not the second.

So after a journey that included a needless support call, we applied the CU and what do you know, problem solved!

To conclude this rather long and rambling post, I feel kind of bad that I got so sidetracked on the Email Integration with Content Organiser feature. That’s part of troubleshooting life, I guess. The only consolation I can take is that it fooled Microsoft engineers too. I guess this issue is yet one more of the many caveats that we all have to learn about the hard way.


Thanks for reading

Paul Culmsee
