Rewriting the knowledge management rulebook… The story of “Glyma” for SharePoint


“If Jeff ever leaves…”

I’m sure you have experienced that “Oh crap” feeling when you have a problem and Jeff is on vacation or otherwise unavailable. Jeff happens to be one of those people who has worked at your organisation for years and has developed such a deep working knowledge of things that he seems to have a sixth sense about everything that goes on. As a result, Jeff is one of the informal organisational “go to guys” – the calming influence amongst all the chaos. An oft-cited refrain among staff is “If Jeff ever leaves, we are in trouble.”

In Microsoft’s case, this scenario is quite close to home. Jeff Teper, who has been an instrumental part of SharePoint’s evolution, is moving to another area of Microsoft, leaving SharePoint behind. The implications of this are significant enough that I can literally hear Bjorn Furuknap’s howls of protest all the way from here in Perth.

So, what is Microsoft to do?

Enter the discipline of knowledge management to save the day. We have SharePoint, and with all of that metadata and search, we can ask Jeff to write down his knowledge “to get it out of his head.” After all, if we can capture this knowledge, we can then churn out an entire legion of Jeffs and Microsoft’s continued SharePoint success is assured, right?

Right???

There is only one slight problem with this incredibly common scenario that often underpins a SharePoint business case… the entire premise of “getting it out of your head” is seriously flawed, which is why knowledge management initiatives have never really lived up to expectations. While I will save a detailed explanation of why this is so for another post, let me just say that Nonaka’s SECI model has a lot to answer for, as it is based on a misinterpretation of what tacit knowledge is all about.

Tacit knowledge is expert knowledge that is often associated with intuition and cannot be transferred to others by writing it down. It is the “spider senses” that experts often seem to have when they look at a problem and see things that others do not – little patterns, subtleties or anomalies that are invisible to the untrained eye. Yet it is precisely this form of knowledge that is of the most value in organisations, while being the hardest to codify and the most vulnerable to knowledge drain. If tacit knowledge could truly be captured and codified in writing, then every project manager who has ever studied the PMBOK would run flawless projects, because that body of knowledge is supposed to be the codified wisdom of many project managers and the projects they have delivered. There would also be no need for Agile coaches, Microsoft’s SharePoint documentation would result in flawless SharePoint projects, and reading Wictor’s blog would make you a SAML claims guru.

The truth of tacit knowledge is this: you cannot transfer it; you can only acquire it. This is otherwise known as the journey of learning!

Accountants are presently scratching their heads trying to figure out how to measure tacit knowledge. They call it intellectual capital, and the reason it is important to them is that most of the value of organisations today is classified on the books as “intangibles”. According to the book Balanced Scorecard, a company’s physical assets accounted for 62% of its market value in 1982, 38% of its market value in 1992 and only 21% in 2003. This is in part a result of the global shift toward knowledge economies and the resulting rise in the value of intellectual capital. Intellectual capital is the sum total of the skills, knowledge and experience of staff and is critical to sustaining competitiveness, performance and ultimately shareholder value. Organisations must therefore not only protect, but extract maximum value from their intellectual capital.

image

Now consider this. We are in an era where baby boomers are retiring, taking all of their hard-earned knowledge with them. This is often referred to as “the knowledge tsunami”, “the organisational brain drain” and the more nerdy “human capital flight”. The issue of human capital flight is a major risk area for organisations. Not only is the exodus of baby boomers an issue, but there are challenges around recruitment and retention of a younger, technologically savvy and mobile workforce with a different set of values and expectations. One of the most pressing management problems of the coming years is the question of how organisations can transfer the critical expertise and experience of their employees before that knowledge walks out the door.

The failed solutions…

After the knowledge management fad of the late 1990s, a lot of organisations did come to realise that asking experts to “write it down” only worked in limited situations. As broadband came along, enabling the rise of rich media services like YouTube, a digital storytelling movement arose in the early 2000s. Digital storytelling is the process by which people are captured on video sharing their stories and reflections.

Unfortunately though, digital storytelling had its own issues. Users were not prepared to sit through hours of footage of an expert explaining their craft or reflecting on a project. To address this, the material was commonly edited down into much smaller mini-documentaries lasting a few minutes – often by media production companies, so the background music was always nice and inoffensive. But this approach also commonly failed. One reason for the failure was well put by David Snowden when he said “Insight cannot be compressed”. While there was value in the edited videos, much of the rich value within them was lost. After all, how can one judge ahead of time what someone else will find insightful? The other problem with this approach was that people tended not to use the videos. There was little means for users to find out they existed, let alone watch them.

Our Aha moment

In 2007, my colleagues and I started using a sensemaking approach called Dialogue Mapping in Perth. Since that time, we have performed dialogue mapping across a wide range of public and private sector organisations in areas such as urban planning, strategic planning, process reengineering, organisational redesign and team alignment. If you have read my blog, you would be familiar with dialogue mapping, but just in case you are not, it looks like this…

Dialogue Mapping has proven to be very popular with clients because of its ability to make knowledge more explicit to participants. This increases the chances of collective breakthroughs in understanding. During one dialogue mapping session a few years back, a soon-to-be retiring, long serving employee relived a project from thirty years prior that he realised was relevant to the problem being discussed. This same employee was spending a considerable amount of time writing procedure manuals to capture his knowledge. No mention of this old project was made in the manuals he spent so much time writing, because there was no context to it when he was writing it down. In fact, if he had not been in the room at the time, the relevance of this obscure project would never have been known to other participants.

My immediate thought at the time when mapping this participant was “There is no way that he has written down what he just said”. My next thought was “Someone ought to give him a beer and film him talking. I can then map the video…”

This idea stuck with me and I told this story to my colleagues later that day. We concluded that asking our retiring expert to write his “memoirs” was not the best use of his limited time. The dialogue mapping session illustrated plainly that much valuable knowledge was not being captured in the manuals. As a result, we seriously started to consider the value of filming this employee discussing his reflections on all of the projects he had worked on, as per the digital storytelling approach. However, rather than create ‘mini documentaries’, we would retain the entire footage and visually map the rationale using Dialogue Mapping techniques. In this scenario, the map serves as a navigation mechanism and the full video content is retained. By clicking on a particular node in the map, the video is played from the time that particular point was made. We drew a mock-up of the idea, which looked like the picture below.

image

While we thought the idea would be original and cool to do, we also saw several strategic advantages to this approach…

  • It allows the user to quickly find the key points in the conversation that are of value to them, while presenting the entire rationale of the discussion at a glance.
  • It significantly reduces the codification burden on the person or group with the knowledge. They are not forced to put their thoughts into writing, which enables more effective use of their time.
  • The map and video content can be linked to the in-built search and content aggregation features of SharePoint.
    • Users can enter a search from their intranet home page and retrieve not only traditional content such as documents, but also stories, reflections and anecdotes from past and present experts.
  • The dialogue mapping notation, when stored in a database, also lends itself to more advanced forms of queries. Consider the following examples:
    • “I would like any ideas from lessons learnt discussions in the Calgary area”
    • “What pros or cons have been raised about this particular building material?”
  • The applicability of the approach is wide.
    • Any knowledge-related industry could take advantage of it easily because it fits into existing information systems like SharePoint, rather than creating an additional information silo.

This was the moment the vision for Glyma (pronounced “glimmer”) was born…

Enter Glyma…

Glyma is a software platform for ‘thought leaders’, knowledge workers, organisations, and other ‘knowledge economy participants’ to capture and trade their knowledge in a way that reduces effort but preserves rich context. It achieves this by providing a new way for users to visually capture and link their ideas with rich media such as video, documents and web sites. As Glyma is a very visually oriented environment, it’s easier to show it than to talk about it.

Ted

image

What you’re looking at in the first image above are the concepts and knowledge captured from a TED talk on education, augmented with additional information from Wikipedia. The second is a map that brings together the rationale from a number of SPC14 Vegas videos on the topic of hybrid SharePoint deployments.

Glyma brings together different types of media, like geographical maps, video, audio, documents etc. and then “glues” them together by visualising the common concepts they exemplify. The idea is to reduce the burden on the expert for codifying their knowledge, while at the same time improving the opportunity for insight for those who are learning. Glyma is all about understanding context, gaining a deeper understanding of issues, and asking the right questions.

We see that depending on your focus area, Glyma offers multiple benefits.

For individuals…

As knowledge workers, our task is to gather and learn information, sift through it all, and connect the dots between the relevant pieces. We create our knowledge by weaving all this information together. This takes place through reading articles, explaining on napkins, diagramming on whiteboards and so on. But no one observes us reading, napkins get thrown away, and whiteboards are wiped clean for re-use. Our journey is too “disposable”; people only care about the “output” – that is, until someone needs to understand our “quilt of information”.

Glyma provides end users with an environment to catalogue this journey. The techniques it incorporates help knowledge workers with learning and “connecting the dots” – or, as we know it, synthesising. Not only does it help us with these two critical tasks, it then provides a way for us to get recognition for that work.

For teams…

We have all been on the giving or receiving end of the scenario I started this post with: the call to Jeff, who has gone on holiday for a month prior to starting his promotion, because you need to know the background to an issue that has arisen on your watch. Whether you were the person under pressure at the office thinking, “Jeff has left me nothing of use!”, or you are Jeff trying to enjoy your new promotion thinking, “Why do they keep calling me?”, it’s an uncomfortable situation for all involved.

Because Glyma provides a medium and techniques that aid and enhance the learning journey, it can then act as the project memory long after the project has been completed and the team members have moved on to their next challenge. The context and the lessons it captures can then be searched and used both as a historical record of what has happened and, more importantly, as a tool for improving future projects.

For organisations…

As I said earlier, intangible assets now dominate the balance sheets of many organisations. Where in the past we might have valued companies based on how many widgets they sold and how much they held in inventory, nowadays intellectual capital is the key driver of value. Like any asset, organisations need to extract maximum value from intellectual capital and, in doing so, avoid repeat mistakes, foster innovation and continue to grow. Charles G. Sieloff summed this up well in the title of his paper, “if only HP knew what HP knows”.

As Glyma aids, enhances, and captures an individual’s learning journey, that journey can now be shared with others. With Glyma, learning is no longer siloed; it becomes a shared journey. Not only does it do this for individuals, it extends to group work so that the dynamics of a group’s learning are also captured. Continuous improvement of organisational processes and procedures then becomes possible with this captured knowledge. With Glyma, your knowledge assets are now tangible.

Lemme see it!

If you have read this far, I assume that you would like to take a look. Well, as luck would have it, we put up a public Glyma site the other day that contains some of my own personal maps. The maps on the SP2013 apps model and hybrid SP2013 deployments in particular represent my own learning journey, so they should help you if you want a synthesis of the pros and cons of these issues. Be sure to check the videos in the getting started area of the site, and check out the sample maps! 🙂

glymasite

I hope you like what you see. I have a ton of maps to add to this site, and very soon we will be inviting others to curate their own maps. We are also running a closed beta, so if you want to see this in your organisation, go to the site and then register your interest.

All in all, I am super proud of my colleagues at Seven Sigma for being able to deliver on this vision. I hope that this becomes a valuable knowledge resource for the SharePoint community and that you all like it. I look forward to seeing how history judges this… we think Glyma is innovative, but we are biased! 🙂

 

Thanks for reading…

Paul Culmsee

www.glyma.co

www.hereticsguidebooks.com


A lesser known way to fine-tune SharePoint search precision…


Hi all

While I’d like to claim credit for the wisdom in this post, alas I cannot. One of Seven Sigma’s consultants (Daniel Wale) worked this one out and I thought that it was blog-worthy. Before I get into the issue and Daniel’s resolution, let me give you a bit of search engine theory 101 with a concept that I find is useful to help understand search optimisation.

Precision vs. recall

Each time a person searches for information, there is an underlying goal or intended outcome. While there has been considerable study of information seeking behaviours in academia and beyond, these behaviours boil down to three archetypal scenarios.

  1. “I know exactly what I am looking for” – The user has a particular place in mind, either because they visited it in the past or because they assume it exists. This is known as known item seeking, but is also referred to as navigational seeking or refinding.
  2. “I’m not sure what I am looking for but I’ll know it when I find it” – This is known as exploratory seeking and the purpose is to find information assumed to be available. This is characterised by:
    • Looking for more than one answer
    • No expectation of a “right” answer
    • Open ended searching
    • Not necessarily knowing much about what is being looked for
    • Not being able to articulate what is being looked for
  3. “Gimme gimme gimme!” – A detailed research type of search known as exhaustive seeking, leaving no stone unturned in exploring the topic. This is characterised by:
    • Performing multiple searches
    • Expressing what is being looked for in many ways

Now among other things, each of these scenarios would require different search results to meet the information seeking need. For example: If you know what you are looking for, then you would likely prefer a small, highly accurate set of search results that has the desired result at the top of the list. Conversely if you are performing an exploratory or exhaustive search, you would likely prefer a greater number of results since any of them are potentially relevant to you.

In information retrieval, the terms precision and recall are used to measure search efficiency. Google’s Tim Bray put it well when he said “recall measures how well a search system finds what you want and precision measures how well it weeds out what you do not want”. Sometimes recall is just what the doctor ordered, whereas other times, precision is preferred.
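To make those terms a little more concrete, the standard information retrieval definitions (nothing SharePoint-specific here, just the textbook formulas) are:

Precision = (relevant results returned) ÷ (total results returned)

Recall = (relevant results returned) ÷ (total relevant items that exist in the index)

So a search that returns every document in the index has perfect recall but terrible precision, while a search that returns one relevant document out of the fifty that exist has perfect precision but poor recall.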

The scenario and the issue…

With that theory out of the way: Seven Sigma recently worked on a knowledgebase project for a large customer contact centre. The vast majority of the users of the system are contact centre operators who deal directly with all customer enquiries and have worked there for a long time. Thus most of the search behaviours are in the known item seeking category, as the operators know the content pretty well – it is just that there is a lot of it. Additionally, picture yourself as one of those operators and imagine the frustration of a failed or time-consuming search, with an equally frustrated customer on the end of the phone and a growing queue of frustrated callers waiting their turn. In this scenario, search results need to be as precise as possible.

Thus, we invested a lot of time in the search and navigation experience on this project, and that investment paid off: the users were very happy with the new system and particularly happy with the search experience. Additionally, we created a mega menu solution for the navigation that dynamically builds links from knowledgebase article metadata and a managed metadata term set. This was done via the Data View Web Part, XSLT, JavaScript and Marc’s brilliant SPServices. We were very happy with it because there was no server side code at all, yet it was very easy to administer.

So what was the search related issue? In a nutshell, we forgot that the search crawler doesn’t differentiate between your page’s content and items in your custom navigation. As a result, we had an issue where searches did not have adequate precision.

To explain the problem, and the resolution, I’ll take a step back and let Daniel continue the story… Take it away Dan…

The knowledgebase that Paul described above contained thousands of articles, and when the search crawler accessed each article page, it also saw the titles of many other articles in the dynamic menu code embedded in the page. As a result, this menu content also got indexed. When you think about it, the search crawler can’t tell whether something is real page content or a dynamic menu that drops down or slides out when you hover over the menu entry point. The result was that when users searched for any term that appeared in the mega menu, they would get back thousands of results (a match for every page), even when the “actual content” of the page didn’t contain any references to the searched term.

There is a simple solution, however, for controlling what the SharePoint search crawler indexes and what it ignores. SharePoint knows to exclude content that sits inside <div> HTML tags that have the class noindex added to them. For example:

<div class="menu noindex">
  <ul>
    <li>Article 1</li>
    <li>Article 2</li>
  </ul>
</div>

There is one really important thing to note however. If your <div class="noindex"> contains a nested <div> tag that doesn’t contain the noindex class, everything inside of this inner <div> tag will be included by the crawler. For example:

<div class="menu noindex">
  <ul>
    <li>Article 1</li>

      <div class="submenu">
        <ul>
          <li>Article 1.1</li>
          <li>Article 1.2</li>
        </ul>
      </div>

    <li>Article 2</li>
  </ul>
</div>

In the code above, the nested <div> surrounding the submenu items does not contain the noindex class, so the text “Article 1.1” and “Article 1.2” will be crawled, while the “Article 1” and “Article 2” text in the parent <div> will still be excluded.

Obviously the example above is greatly simplified and, like our solution, your menu is probably built by a Data View Web Part with an XSL transform. It’s inside your XSL that you’ll need to include the <div> with the noindex class, because the web part will generate its own <div> tags that encapsulate your menu. (If you aren’t familiar with the generated code, use the browser developer tools and inspect what gets inserted – you’ll find at least one <div> element nested inside any <div class="noindex"> you put around your web part thinking it would stop the custom menu being crawled.)
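To make that concrete, here is a rough sketch of what the rendered page can end up looking like when you wrap the web part from the outside. The markup is illustrative only – the exact elements and class names the web part emits will vary, so inspect your own page with the developer tools:

<div class="noindex">            <!-- the wrapper you added around the web part -->
  <div>                          <!-- a div emitted by the web part itself: it has no noindex class... -->
    <div class="menu">           <!-- ...so everything from this point down is crawled anyway -->
      <ul>
        <li>Article 1</li>
        <li>Article 2</li>
      </ul>
    </div>
  </div>
</div>

Emitting the <div class="menu noindex"> from within the XSL itself, so that it directly wraps the menu markup with no intervening <div>, avoids the problem.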

When I initially looked around for why our search results were littered with so many seemingly irrelevant results, I found this method of excluding the custom menu rather easily. I also found a lot of forum posts from people with the same issue who reported that their use of <div> tags with the noindex class was not working. Where people had included snippets of their code, they had nested <div> tags every time and were baffled as to why their code wasn’t working. I figure most people hit this problem because they simply don’t read the detail in the solutions about nesting, or don’t realise that the web part will generate its own HTML into the page and quite likely insert a <div> that surrounds the content they are trying to hide.

As any SharePoint developer quickly finds out, a lot of SharePoint knowledge doesn’t come from the well laid out documentation library full of code examples that developers are used to in other environments. You need to read blogs (like this one), read forums, talk to colleagues and build up your own experience until these kinds of gotchas are just known to you. Even the best SharePoint developer can overlook simple things like this, and by figuring them out they get that little bit better each time.

Being a SharePoint developer is really about being a master of self-learning, a master of using a search engine to find the knowledge you need and, most importantly, a master of knowing which information you’re reading is actually going to be helpful and which is going to lead you down the garden path. The MSDN blog post by Mark Arend (http://blogs.msdn.com/b/markarend/archive/2010/06/07/control-search-indexing-crawling-within-a-page-with-noindex.aspx) gives a clear description of the problem and the solution. He also states that it is by design that nested <div> tags are re-evaluated for the noindex class, and mentions that the product team was considering changing this… Did that create the confusion for people, or was it that they read the first part of the solution and didn’t read the note about nested <div> tags? In any case, it’s a vital part of the solution that a lot of people still seem to overlook.

In case you are wondering, the built-in SharePoint navigation menus already have the correct <div> tags with the noindex class surrounding them, so they aren’t a concern. This problem only exists if you have inserted your own dynamic menu system.

Other Search Provider Considerations

It is more common than you might think for sites to use more than just SharePoint search. The <div class="noindex"> is a SharePoint-specific filter for excluding content within a page – what if you have a Google Search Appliance crawling your site as well? (Yep… we did on this project.)

You’re in luck: Google documents how to exclude content within a page from its search appliance. There are a few different options, but the equivalent of the blanket exclusion provided by the <div class="noindex"> tags is to encapsulate the section between the following two comments:

<!--googleoff: all-->

and

<!--googleon: all-->

If you want to know more about the GSA googleoff/googleon tags and the various options you have, here is the documentation: http://code.google.com/apis/searchappliance/documentation/46/admin_crawl/Preparing.html#pagepart
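Putting the two together, a menu that should be ignored by both SharePoint search and the GSA might be wrapped like this (a sketch only – place the markers wherever your menu markup is actually emitted):

<!--googleoff: all-->
<div class="menu noindex">
  <ul>
    <li>Article 1</li>
    <li>Article 2</li>
  </ul>
</div>
<!--googleon: all-->

The noindex class keeps the menu out of the SharePoint index, while the googleoff/googleon comments do the same job for the Google Search Appliance.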

Conclusion

(… and Paul returns to the conversation).

I think Dan has highlighted an easy to overlook implication of custom designing not only navigational content, but really any type of dynamically generated content on a page. While additional contextual content can make a page more intuitive and relevant, consider the implications for the search experience. Since the contextual content will be crawled along with the actual content, you can end up inadvertently sacrificing the precision of search results without realising it.

Hope this helps and thanks for reading (and thanks Dan for writing this up)

 

Paul Culmsee

www.sevensigma.com.au

h2bp2013


How not to troubleshoot SharePoint


Most SharePoint blogs tend to tell you cool stuff that the author did. Sometimes telling the dumb stuff is worthwhile too. I am in touch with my inner Homer Simpson, so I will tell you a quick story about one of my recent stupider moments…

This is a story about anchoring bias – an issue that many of us get tripped up by. In case you are not aware, anchoring is the tendency to be over-reliant on one piece of information (the “anchor”) when making subsequent decisions. Once an anchor is set in place, subsequent judgments are made by interpreting other information around the anchor.

So I had just used content deployment, in combination with some PowerShell, to push a SharePoint environment from development to test, and it had all gone well. I ran through test cases and was satisfied that all was cool. Then another team member brought to my attention that search was not returning the same results in test as in development. I took a look and sure enough, one of the search scopes was reporting far fewer results than I was expecting. The issue was confined to one pages library in particular, so I accessed the library and confirmed that the pages had successfully migrated and were rendering fine.

Now, I had used a PowerShell script to export the exclusions, crawled/managed properties and best bets of the development farm’s search application and subsequently import them into test. Given that the reported issue was with search results, the anchor was well and truly set. The issue had to be search, right? Maybe the script had a fault?

So, as one would, I checked the crawl logs and confirmed that some items in the affected library were being crawled OK. I then double checked the web app policy for the search crawl account and made sure it had the appropriate permissions. It was good. I removed the crawl exclusions just in case they were excluding more than what they reported to be, and I also removed any proxy configuration from the search crawl account, as I have seen proxy issues with crawling before.

I re-crawled and the problem persisted… hmm

I logged into the affected site as the crawl account itself and examined the problematic library. I immediately noticed that I could not see a particular folder where significant content resided. This accounted for the search discrepancy, but checking permissions confirmed that this was not the issue – the library inherited its permissions. So I created another view on the library that was set to not show folders, and when I checked that view, I could see all the affected files and their state was set to “Approved”. Damn! This really threw me. Why the hell would the search account not see a folder, but see the files within it when I changed the view not to include folders?

Still well and truly affected by my anchoring bias towards search, I started to consider possibilities that defied rational logic in hindsight. I wondered if there was some weird issue with the crawl account, so I had another search crawl account created and retested the issue and still the problem persisted. Then I temporarily granted the search account site owner permission and was finally able to view the missing folder content when browsing to it, but I then attempted a full crawl and the results stubbornly refused to appear. I even reset the index in desperation.

Finally, I showed the behaviour of the library to a colleague, and he said “the folder is not approved”. (Massive clunk as the penny dropped for me.) Shit – how could I be so stupid?

For whatever reason, the folder in question was not approved, but the files were. The crawler was dutifully doing precisely what it was configured to do for an account that has read permission to the site. When I turned on the “no folder” view, of course I saw the files inside the folder because they were approved. Argh! So bloody obvious when you think about it. Approving the folder and running a crawl immediately made the problem go away.

What really bruised my tech guy ego even more was that I have previously sorted out this exact issue for others – many times in fact! Everybody knows that when content is visible to one party and not others, it’s usually approvals or publishing. So the fact that I got duped by the same issue I have frequently advised on was a bit deflating… except that this all happened on a Friday, and as all geeks know, solving a problem on a Friday always trumps tech guy ego. 🙂

Thanks for reading

Paul Culmsee


Why can’t people find stuff on the intranet?–Final summary


Hi

Those of you who get an RSS feed of this blog might have noticed it was busy over the last week. This is because I pushed out four blog posts showing my analysis, using IBIS, of a detailed linear discussion on LinkedIn. To save people getting lost in the analysis, I thought I’d quickly post a bit of an executive summary of the exercise.

To set context, Issue Mapping is a technique for visually capturing rationale. It is graphically represented using a simple but powerful visual structure called IBIS (Issue Based Information System). IBIS allows all elements and rationale of a conversation to be captured in a manner that can be easily reflected upon. Unlike linear prose, a visual representation of argument structure helps people form a better mental model of the nature of a problem or issue. Even better, rationale captured this way makes it significantly easier to identify emergent themes or key aspects of an issue.
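As a minimal illustration of the notation (my own simplified example rather than an excerpt from the actual maps), an IBIS structure for this discussion might start like this:

Question: Why can’t users find content on the intranet?
  Idea: Poor information architecture
    Pro: Several participants cited inconsistent vocabulary and labelling
  Idea: Issues with the content itself
    Pro: Old content is never deleted, so searches return too many results
  Idea: Inadequate governance
    Con: (an argument challenging the idea would sit here)

Questions are answered by ideas, and ideas are supported or challenged by pros and cons; any node can spawn further questions, which is how the final map grew to 139 nodes.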

You can find out all about IBIS and Dialogue Mapping in my new book, at the Cognexus site or the other articles on my blog.

The challenge…

On the Intranet Professionals group on LinkedIn recently, the following question was asked:

What are the main three reasons users cannot find the content they were looking for on intranet?

In all, there were more than 60 responses from various people, with some really valuable input. I decided that it might be an interesting experiment to capture this discussion using the IBIS notation to see if it makes it easier for people to understand the depth of the issue/discussion and reach a synthesis of root causes.

I wrote 4 posts, each building on the last, until I had covered the full conversation. For each post, I supplied an analysis of how I created the IBIS map and then exported the maps themselves. You can follow those below:

Part 1 analysis: http://www.cleverworkarounds.com/2012/01/15/why-cant-users-find-stuff-on-the-intranet-in-ibis-synthesispart-1/
Part 2 analysis: http://www.cleverworkarounds.com/2012/01/15/why-cant-users-find-stuff-on-the-intranet-an-ibis-synthesispart-2/
Part 3 analysis: http://www.cleverworkarounds.com/2012/01/16/why-cant-users-find-stuff-on-the-intranet-an-ibis-synthesispart-3/
Part 4 analysis: http://www.cleverworkarounds.com/2012/01/16/why-cant-users-find-stuff-on-the-intranet-an-ibis-synthesispart-4/

Final map: http://www.cleverworkarounds.com/maps/findstuffpart4/Linkedin_Discussion__192168031326631637693.html

For what it’s worth, the summary of themes from the discussion was that there were five main reasons for users not finding what they are looking for on the intranet.

  1. Poor information architecture
  2. Issues with the content itself
  3. People and change aspects
  4. Inadequate governance
  5. Lack of user-centred design

Within these areas or “meta-themes” there were various sub-issues. These are captured below, grouped by theme.

Poor information architecture

  • Vocabulary and labelling issues
    • Inconsistent vocabulary and acronyms
    • Not using the vocabulary of users
    • Documents have no naming convention
  • Poor navigation
  • Lack of metadata
    • Tagging does not come naturally to employees
  • Poor structure of data
    • Organisation structure focus instead of user task focus
    • The intranet’s lazy over-reliance on search

Issues with content

  • Old content not deleted
  • Too much information of little value
  • Duplicate or “near duplicate” content
  • Information does not exist, or exists in an unrecognisable form

People and change aspects

  • People with different backgrounds, languages, education and biases all creating content
  • Too much “hard drive” thinking
  • People not knowing what they want
  • Lack of motivation for contributors to make information easier to use
  • Google-inspired inflated expectations of intranet search functionality
  • Adopting social media from a hype-driven motivation

Inadequate governance

  • Lack of governance/training around metadata and tagging
  • Not regularly reviewing search analytics
  • Poor and/or low-cost search engine is deployed
  • Search engine is not set up properly or used to its full potential
  • Lack of “before the fact” coordination with business communications and training
  • Comms and intranet teams don’t listen and learn from all levels of the business
  • Ambiguous, under-resourced or misplaced intranet ownership
  • The wrong content is being managed
  • There are easier alternatives available

Lack of user-centred design

  • Content is structured according to the view of the owners rather than the audience
  • Not accounting for two types of visitors: task-driven and browse-based
  • No social aspects to search
  • Not making the search box available enough
  • A failure to offer an entry-level view
  • Not accounting for people who do not know what they are looking for versus those who do
  • Not soliciting feedback from a user on a failed search about what was being looked for

So now you have seen the final output, be sure to visit the maps and analysis and read about the journey of how this summary emerged. One thing is for sure: it took me a hell of a lot longer to write about it than to actually do it!

Thanks for reading

Paul Culmsee

www.sevensigma.com.au

www.hereticsguidebooks.com


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 4


Hi and welcome to my final post on the LinkedIn discussion about why users cannot find what they are looking for on intranets. This time the emphasis is on synthesis… so let’s get the last few comments done, shall we?

Michael Rosager • @ Simon. I agree.
Findability and search can never be better than the content available on the intranet.
Therefore, non-existing content should always be number 1
Some content may not be published with the terminology or language used by the users (especially on a multilingual intranet). The content may lack the appropriate meta tags. – Or maybe you need to adjust your search engine or information structure. And there can be several other causes…
But the first thing that must always be checked is whether they sought information / data is posted on the intranet or indexed by the search engine.

Rasmus Carlsen • in short:
1: Too much content (that nobody really owns)
2: Too many local editors (with less knowledge of online-stuff)
3: Too much “hard-drive-thinking” (the intranet is like a shared drive – just with a lot of colors = a place you keep things just to say that you have done your job)

Nick Morris • There are many valid points being made here and all are worth considering.
To add a slightly different one I think too often we arrange information in a way that is logical to us. In large companies this isn’t necessarily the same for every group of workers and so people create their own ‘one stop shop’ and chaos.
Tools and processes are great but somewhere I believe you need to analyse what information is needed\valued and by whom and create a flexible design to suit. That is really difficult and begins to touch on how organisations are structured and the roles and functions of employees.

Taino Cribb • Hi everyone
What a great discussion! I have to agree to any and all of the above comments. Enabling users to find info can definately be a complicated undertaking that involves many facets. To add a few more considerations to this discussion:
Preference to have higher expectations of intranet search and therefore “blame” it, whereas Google is King – I hear this too many times, when users enter a random (sometimes misspelled) keyword and don’t get the result they wish in the first 5 results, therefore the “search is crap, we should have Google”. I’ve seen users go through 5 pages of Google results, but not even scroll down the search results page on the intranet.
Known VS Learned topics – metadata and user-tagging is fantastic to organise content we and our users know about, but what about new concepts where everyone is learning for the first time? It is very difficult to be proactive and predict this content value, therefore we often have to do so afterwards, which may very well miss our ‘window of opportunity’ if the content is time-specific (ie only high value for a month or so).
Lack of co-ordination with business communications/ training etc (before the fact). Quite often business owners will manage their communications, but may not consider the search implications too. A major comms plan will only go so far if users cannot search the keywords contained in that message and get the info they need. Again, we miss our window if the high content value is valid for only a short time.
I very much believe in metadata, but it can be difficult to manage in SP2007. Its good to see the IM changes in SP2010 are much improved.

Of the next four comments, most covered old ground (a sure sign the conversation is now fairly well saturated). Nick says he is making a “slightly different” point, but I think the issue of structure not suiting a particular audience has been covered previously. I thought Taino’s reply was interesting because she focused on the issue of not accounting for known vs. learned topics and the notion of a “window of opportunity” in relation to appropriate tagging. Perhaps this reply was inspired by what Nick was getting at? In any event, adding it was a line call between governance and information architecture and for now, I chose the latter (and I have a habit of changing my mind with this stuff :-).

image_thumb[12]

I also liked Taino’s point about user expectations around the “Google experience” and her examples, and I loved Rasmus’s earlier point about “hard-drive thinking” (I’m nicking that one for my own clients, Rasmus 🙂). Both of these issues are clearly people aspects, so I added them as examples around that particular theme.

image_thumb[14]

Finally, I added Taino’s “lack of co-ordination” comments as another example of inadequate governance.

image_thumb[18]

Anne-Marie Low • The one other thing I think missing from here (other than lack of metadata, and often the search tool itself) is too much content, particularly out of date information. I think this is key to ensuring good search results, making sure all the items are up to date and relevant.

Andrew Wright • Great discussion. My top 3 reasons why people can’t find content are:
* Lack of meta data and it’s use in enabling a range of navigation paths to content (for example, being able to locate content by popularity, ownership, audience, date, subject, etc.) See articles on faceted classification:
http://en.wikipedia.org/wiki/Faceted_classification
and
Contextual integration
http://cibasolutions.typepad.com/wic/2011/03/contextual-integration-how-it-can-transform-your-intranet.html#tp
* Too much out-of-date, irrelevant and redundant information
See slide 11 from the following presentation (based on research of over 80 intranets)
http://www.slideshare.net/roowright/intranets2011-intranet-features-that-staff-love
* Important information is buried too far down in the hierarchy
Bonus 2 reasons 🙂
* Web analytics and measures not being used to continuously improve how information is structured
* Over reliance on Search instead of Browsing – see the following article for a good discussion about this
Browse Versus Search: Stumbling into the Unknown
http://idratherbewriting.com/2010/05/26/browse-versus-search-organizing-content-9/

Both Anne-Marie and Andrew make good points, and Andrew supplies some excellent links too, but all of these issues have already been covered in the map, so nothing more has been added from this part of the discussion.

Juan Alchourron • 1) that particular, very important content, is not yet on the intranet, because “the” director don’t understand what the intranet stands for.
2) we’re asuming the user will know WHERE that particular content will be placed on the intranet : section, folder and subfolder.
3) bad search engines or not fully configured or not enough SEO applied to the intranet

John Anslow • Nowt new from me
1. Search ineffective
2. Navigation unintuitive
3. Useability issues
Too often companies organise data/sites/navigation along operational lines rather than along more practical means, team A is part of team X therefore team A should be a sub section of team X etc. this works very well for head office where people tend to have a good grip of what team reports where but for average users can cause headaches.
The obvious and mostly overlooked method of sorting out web sites is Multi Variant Testing (MVT) and with the advent of some pretty powerful tools this is no longer the headache that it once was, why not let the users decide how they want to navigate, see data, what colour works best, what text encourages them to follow what links, in fact how it works altogether?
Divorcing design, usability, navigation and layout from owners is a tough step to take, especially convincing the owners but once taken the results speak for themselves.

Most of these points had already been well discussed, but I realised I had never made a reference to John’s point about organisational structures versus task-based structures for intranets. I had previously captured rationale around the fact that structures were inappropriate, so I added this as another example to that argument within information architecture…

image

Edwin van de Bospoort • I think one of the main reasons for not finding the content is not poor search engines or so, but simply because there’s too much irrelevant information disclosed in the first place.
It’s not difficult to start with a smaller intranet, just focussing on filling out users needs. Which usually are: how do I do… (service-orientated), who should I ask for… (corporate facebok), and only 3rd will be ‘news’.
So intranets should be task-focussed instead if information-focussed…
My 2cnts 😉

Steven Kent • Agree with Suzanne’s suggestion “Old content is not deleted and therefore too many results/documents returned” – there can be more than one reason why this happens, but it’s a quick way to user frustration.

Maish Nichani • It is interesting to see how many of us think metadata and structure are key to finding information on the intranet. I agree too. But come to think of it, staff aren’t experts in information management. It’s all very alien to them. Not too long ago, they had their desktops and folders and they could find their information when they wanted. All this while it was about “me and my content”. Now we have this intranet and shared folders and all of a sudden they’re supposed to be thinking about how “others” would like to find and use the information. They’ve never done this before. They’ve never created or organized information for “others”. Metadata and structure are just “techie” stuff that they have to do as part of their publishing, but they don’t know why they’re doing it or for what reason. They real problem, in my opinion, is lack of empathy.

Barry Bassnett • * in establishing a corporate taxonomy.1. Lack of relevance to the user; search produces too many documents.3. Not training people in the concept that all documents are not created by the individual for the same individual but as a document that is meant to be shared. e.g. does anybody right click PDFs to add metadata to its properties? Emails with a subject line stat describe what is in it.

Luc de Ruijter • @Maish. Good point about information management.
Q: Who’d be responsible to oversee the management of information?
Shouldn’t intranet managers/governors have that responsibility?
I can go along with (lack of) empathy as an underlying reason why content isn’t put away properly. This is a media management legacy reason: In media management content producers never had to have empathy with participating users, for there were only passive audiences.
If empathy is an issue. Then it proves to me that communication strategies are still slow to pick up on the changes in communication behaviour and shift in mediapower, in the digital age.
So if we step back from technological reasons for not finding stuff (search, meta, office automation systems etc.) another big reason looks around the corner of intranet management: those responsible for intranet policies and strategy.

Most of this discussion covers material already represented in the map, although I can see that in this part of the conversation there is a preoccupation with content and its relevance. Maish also makes a couple of good points. First up, he makes the point that staff are not experts in information management and don’t tend to think about how someone else might wish to find the information later. He concludes by stating that the real problem is a lack of empathy. I liked this and felt it was a nice supporting argument for the whole conjecture that “people issues” is a major theme in this discussion, so I added it as a pro.

image

 

Now we come to an interesting bit in the conversation (for me anyway). Terry throws a curveball question. (Side note: curveball questions are usually asked with genuine intent, but tend to have a negative effect on live meetings. Dialogue Mapping loves curveball questions as it is often able to deflect their negative impacts.)

Terry Golding • Can I play devils advocate and ask WHY you feel meta data is so vital? Dont misunderstand me I am not saying that it is not important, but I cant help feeling that just saying meta data as a reason for not finding things is rather a simplification. Let me ask it another way, what is GOOD meta data, can you give examples please ?

Luc de Ruijter • @Terry. Good questions which can have many answers (see all comments above where you’ll find several answers already). Why do library books have labels on their covers? Those labels are in fact metadata (avant la lettre) which help library people ordering their collection, and clients to find titles. How do you create tag clouds which offer a more intuitive and user centered way to navigate a website/blog? By tagging all content with (structured) meta tags. Look around a bit and you’ll see that metadata are everywhere and that they serve you in browsing and retrieving content. That’s why metadata are vital these days. I think there are no strict right and good meta structures. Structures depend on organisational contexts. Some metastructures are very complex and formal (see comments about taxonomies above), others are quite simple. Metadata can enable users to browse information blocks. By comparisson navigation schemes can only offer rigid sender driven structures to navigate to pages.

Andrew Wright • @Terry. Meta data enables content to be found in a number of different ways – not just one as is typical of paper based content (and many intranets as well unfortunately).
For instance, if you advertise a house for sale you may have meta data about the house such as location, number of rooms and price. This then allows people to locate the house using this meta data (eg. search by number of bedrooms, price range, location). Compare this with how houses are advertised in newspapers (ie. by location only) and you can see the benefits of meta data.
For a good article about the benefits of meta data, read Card Sorting Doesn’t Cut the Custard:
http://www.zefamedia.com/websites/card-sorting-doesnt-cut-the-custard/
To read a more detailed example about how meta data can be applied to intranets, read:
Contextual integration: how it can transform your intranet
http://cibasolutions.typepad.com/wic/2011/03/contextual-integration-how-it-can-transform-your-intranet.html

Terry questions the notion of metadata. I framed it as a con against the previous metadata arguments. Both Luc and Andrew answer, and I think the line that most succinctly captures the essence of the answer is Andrew’s “Meta data enables content to be found in a number of different ways”. So I reframed that slightly as a pro supporting the notion that lack of metadata is one of the reasons why users can’t find stuff on the intranet.

image

Next is yours truly…

Paul Culmsee • Hi all
Terry a devils advocate flippant answer to your devils advocate question comes from Corey Doctrow with his dated, but still hilarious essay on the seven insurmountable obstacles to meta-utopia 🙂 Have a read and let me know what you think.
http://www.well.com/~doctorow/metacrap.htm
Further to your question (and I *think* I sense the undertone behind your question)…I think that the discussion around metadata can get a little … rational and as such, rational metadata metaphors are used when they are perhaps not necessarily appropriate. Yes metadata is all around us – humans are natural sensemakers and we love to classify things. BUT usually the person doing the information architecture has a vested interest in making the information easy for you. That vested interest drives the energy to maintain the metadata.
In user land in most organisations, there is not that vested interest unless its on a persons job description and their success is measured on it. For the rest of us, the energy required to maintain metadata tends to dissipate over time. This is essentially entropy (something I wrote about in my SharePoint Fatigue Syndrome post)
http://www.cleverworkarounds.com/2011/10/12/sharepoint-fatigue-syndrome/

Bob Meier • Paul, I think you (and that metacrap post) hit the nail on the head describing the conflict between rational, unambiguous IA vs. the personal motivations and backgrounds of the people tagging and consuming content. I suspect it’s near impossible to develop a system where anyone can consistently and uniquely tag every type of information.
For me, it’s easy to get paralyzed thinking about metadata or IA abstractly for an entire business or organization. It becomes much easier for me when I think about a very specific problem – like the library book example, medical reports, or finance documents.

Taino Cribb • @Terry, brilliant question – and one which is quite challenging to us that think ‘metadata is king’. Good on you @Paul for submitting that article – I wouldn’t dare start to argue that. Metadata certainly has its place, in the absence of content that is filed according to an agreed taxonomy, correctly titled, the most recent version (at any point in time), written for the audience/purpose, valued and ranked comparitively to all other content, old and new. In the absence of this technical writer’s utopia, the closest we can come to sorting the wheat from the chaff is classifcation. It’s not a perfect workaround by any means, though it is a workaround.
Have you considered that the inability to find useful information is a natural by-product of the times? Remember when there was a central pool to type and file everything? It was the utopia and it worked, though it had its perceived drawbacks. Fast forward, and now the role of knowledge worker is disseminated to the population – people with different backgrounds, language, education and bias’ all creating content.
It is no wonder there is content chaos – it is the price we pay for progress. The best we as information professionals can do is ride the wave and hold on the best we can!

Now, my reply to Terry essentially spoke to the previously raised issue around lack of motivation on the part of users to make their information easy to use. I added a pro to that existing idea to capture my point that users who are not measured on accurate metadata have little incentive to put in the extra effort. Taino then refers to the pace of change more broadly with her “natural by-product of the times” comment. This made me realise my meta-theme of “people aspects” was not encompassing enough. I retitled it “people and change aspects” and added two of Taino’s points as supporting arguments for it.

image

At this point I stopped, as enough had been captured and the conversation had definitely reached saturation point. It was time to look at what we had…

For those interested, the final map had 139 nodes.

The second refactor

At this point it was time to sit back and look at the map with a view to checking whether my emergent themes were correct, and to consolidate any conversational chaff. Almost immediately, the notion of “content” started to bubble to the surface of my thinking. I had noticed that a lot of the conversation and re-iteration by various people related to the content being searched in the first place. I currently had some of that captured under Information Architecture and, in light of the final map, I felt that this wasn’t correct. The evidence for this is that Information Architecture topics dominated the maps: there were 55 nodes for information architecture, compared to 34 for people and change and 31 for governance.

Accordingly, I took all of the captured rationale related to content and made it its own meta-theme as shown below…

image

Within the “Issues with the content being searched” map are the following nodes…

image

I also did a bit of fine-tuning here and there, and overall I was pretty happy with the map in its current form.

The root causes

If you have followed my synthesis, the dialogue from the discussion boiled down to five key recurring themes.

  1. Poor Information Architecture
  2. Issues with the content itself
  3. People and change aspects
  4. Inadequate governance
  5. Lack of user-centred design

I took the completed maps, exported the content to Word and then pared things back further. This allowed me to create the summary below, grouped by theme:

Poor Information Architecture

  • Vocabulary and labelling issues
    • Inconsistent vocabulary and acronyms
    • Not using the vocabulary of users
    • Documents have no naming convention
  • Poor navigation
  • Lack of metadata
    • Tagging does not come naturally to employees
  • Poor structure of data
    • Organisation structure focus instead of user task focus
    • The intranet’s lazy over-reliance on search

Issues with content

  • Old content not deleted
  • Too much information of little value
  • Duplicate or “near duplicate” content
  • Information does not exist, or exists in an unrecognisable form

People and change aspects

  • People with different backgrounds, languages, education and biases all creating content
  • Too much “hard drive” thinking
  • People not knowing what they want
  • Lack of motivation for contributors to make information easier to use
  • Google-inspired inflated expectations of intranet search functionality
  • Adopting social media from a hype-driven motivation

Inadequate governance

  • Lack of governance/training around metadata and tagging
  • Not regularly reviewing search analytics
  • Poor and/or low-cost search engine is deployed
  • Search engine is not set up properly or used to its full potential
  • Lack of “before the fact” coordination with business communications and training
  • Comms and intranet teams don’t listen and learn from all levels of the business
  • Ambiguous, under-resourced or misplaced intranet ownership
  • The wrong content is being managed
  • There are easier alternatives available

Lack of user-centred design

  • Content is structured according to the view of the owners rather than the audience
  • Not accounting for two types of visitors: task-driven and browse-based
  • No social aspects to search
  • Not making the search box available enough
  • A failure to offer an entry-level view
  • Not accounting for people who do not know what they are looking for versus those who do
  • Not soliciting feedback from a user on a failed search about what was being looked for

The final maps

The final map can be found here (for those who truly like to see full context, I included an “un-chunked” map which would look terrific printed on a large plotter). Below, however, is a summary as best I can do in a blog post format (click to enlarge). For a decent view of proceedings, visit this site.

Poor Information Architecture

part4map1

Issues with the content itself

part4map2

People and change aspects

part4map3

Inadequate governance

part4map4

Lack of user-centred design

part4map5

Thanks for reading… as an epilogue I will post a summary with links to all of the maps and discussion.

Paul Culmsee

www.sevensigma.com.au


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 3


Hi all

This is the third post in a quick series that attempts to use IBIS to analyse an online discussion. The map is getting big now, but luckily we are halfway through the discussion and will have most of the rationale captured by the end of this post. We finished part 2 with a summary map that grouped the identified reasons why it is hard to find information on intranets into core themes. So far, four themes have emerged. In this post we will see whether any more emerge, and fully flesh out the existing ones. Below is our starting point for part 3.

part3map1_thumb5

Our next two responses garnered more nodes in the map than most others. I think this is a testament to the quality of their input to the discussion. First up Dan…

Dan Benatan • Having researched this issue across many diffferent company and departmental intranets, my most frequent findings are:
1. A complete lack of user-centred design. Content that many members of the organization need to access is structured according to the view of the content owners rather than the audience. This should come as no surprise, it remains the biggest challenge in public websites.
2. A failure to offer an entry level view. Much of the content held on departmental intranets is at a level of operational detail that is meaningless to those outside the team. The information required is there, but it is buried so deep in the documents that people outside the team can’t find it.
3. The intranet’s lazy over-reliance on search. Although many of us have become accustomed to using Google as our primary entry point to find content across the web, we may do this because we know we have no hope of finding the content through traditional navigation. The web is simply far too vast. We do not, however, rely purely on search once we are in the website we’ve chosen. We expect to be able to navigate to the desired content. Navigation offers context and enables us to build an understanding of the knowledge area as we approach the destination. In my research I found that most employees (>70%) try navigation first because they feel they understand the company well enough to know where to look.
4. Here I agree with many of the comments posted above. Once the user does try search, it still fails. The search engine returns too many results with no clear indication of their relative validity. There is a wealth of duplicate content on most intranets and , even worse, there is a wealth of ‘near duplicate’ content; some of which is accurate and up-to-date and much that is neither. The user has no easy way to know which content to trust. This is where good intranet management and good metadata can help.

My initial impression was that this was an excellent reply and that Dan’s experience shone through it; I thought it was one of the best contributions to the discussion thus far. Let’s see what I added, shall we?

First up, Dan returned to the user experience issue, which was one of the themes that had emerged. I liked his wording, so I changed the theme node of “Inadequate user experience design” to Dan’s framing of “Lack of user-centred design”, which I thought was better put. I then added his point about content being structured according to the world view of the owner rather than the audience. His second point about an “entry level view” relates to the first, in the sense that both are user-centred design issues, so I added the entry level view point as an example…

image_thumb14

I added Dan’s point about the intranet’s lazy over-reliance on search to the information architecture theme. I did this because he was discussing the relationship between navigation and search, and navigation had already come up as an information architecture issue.

image_thumb23

Dan’s final point about too many results returned was already covered previously, but he added a lot of valuable arguments around it. I restructured that section of the map somewhat and incorporated his input.

image_thumb6

Next we have Rob, who also made a great contribution (although not as concise as Dan)…

Rob Faulkner • Wow… a lot of input, and a lot of good ideas. In my experience there can be major liabilities with all of these more “global” concepts, however.
No secret… Meta data is key for both getting your site found to begin with, as well as aiding in on-site search. The weak link in this is the “people aspect” of the exercise, as has been alluded to. I’ve worked on interactive vehicles with ungodly numbers of pages and documents that rely on meta data for visibility and / or “findability” (yes, I did pay attention in English class once in a while… forgive me), and the problem — more often than not — stems from content managers either being lazy and doing a half ass job of tagging, if at all, or inconsistency of how things are tagged by those that are gung-ho about it. And, as an interactive property gets bigger, so too does the complexity tagging required to make it all work. Which circles back to freaking out providers into being lazy on the one hand, or making it difficult for anyone to get it “right” on the other. Vicious circle. Figure that one out and you win… from my perspective.
Another major issue that was also alluded to is organization. For an enterprise-class site, thorough taxonomy / IA exercises must be hammered out by site strategists and THEN tested for relevance to target audiences. And I don’t mean asking targets what THEY want… because 9 times out of 10 you’re either going to get hair-brained ideas at best, or blank stares at worst. You’ve just got to look at the competitive landscape to figure out where the bar has been set, what your targets are doing with your product (practical application, OEMing, vertical-specific use, etc… then Test the result of your “informed” taxonomy and IA to ensure that it does, in fact, resonate with your targets once you’ve gotten a handle on it.
Stemming from the above, and again alluded to, be cautious about how content is organized in order to reflect how your targets see it, not how internal departments handle it. Most of the time they are not one in the same. Further, you’ve got to assume that you’re going to have at least two types of visitors… task-driven and browse-based. Strict organization by product or service type may be in order for someone that knows what they’re looking for, but may not mean squat to those that don’t. Hence, a second axis of navigation that organizes your solutions / products by industry, pain point, what keeps you up at night, or whatever… will enable those that are browsing, or researching, a back door into the same ultimate content. Having a slick dynamic back-end sure helps pull this off
Finally, I think a big mistake made across all verticals is what the content consists of to begin with. What you may think is the holy grail, and the most important data or interactive gadget in the world may not mean a hill-of-beans to the user. I’ve conducted enough focus groups, worldwide, to know that this is all typically out of alignment. I never cease to be amazed at exactly what it is that most influences decision makers.
I know a lot of this was touched upon by many of you. Sorry about that… damn thread is just getting too long to go back and figure out exactly who said what!
Cheers…

Now, Rob was the first to explicitly mention “people aspects”, and I immediately realised this was the real theme that “Lack of motivation on the part of contributors…” was getting at. So I restructured the map so that “people aspects” became the key theme and the previous point about lack of motivation became an example. I then added Rob’s other examples.

image_thumb27

After making his points around people aspects, Rob covers some areas already well covered (metadata, content organisation), so I did not add any more nodes. But at the end, he added a point about browse-oriented vs. search-oriented users, which I did add to the user-centred design discussion.

image_thumb33

Rob also made a point about users who know what they want when searching for information vs. those who do not. (In Information Architecture terms, this is called “Known item seeking” vs “exploratory seeking”). That had not been covered previously, so I added it to the Information Architecture discussion.

image_thumb31

Finally, I captured Rob’s point about the wrong content being managed in the first place. This is a governance issue, since the best information architecture or user experience design won’t matter a hoot if the right content isn’t being made available.

image_thumb32

Hans Leijström • Great posts! I would also like to add lack of quality measurements (e.g. number of likes, useful or not) and the fact that the intranets of today are not social at all…

Caleb Lamz • I think everyone has provided some great reasons for users not being able to find what they are looking for. I lean toward one of the reasons Bob mentions above – many intranets are simply not actively managed, or the department managing it is not equipped to do so.
Every intranet needs a true owner (no matter where it falls in the org chart) that acts a champion of the user. Call it the intranet manager, information architect, collaboration manager, or whatever you want, but their main job needs to be make life easier for users. Responsibilities include doing many of the things mentioned above like refining search, tweaking navigation, setting up a metadata structure, developing social tools (with a purpose), conducting usability tests, etc.
Unfortunately, with the proliferation of platforms like SharePoint, many IT departments roll out a system with no true ownership, so you end up with content chaos.

There is no need to add anything from Hans, as he was reiterating a previous comment about analytics which was captured already. Caleb makes a good point about ownership of the content/intranet, which is a governance issue in my book, so I added his contribution there…

image

Dena Gazin • @Suzanne. Yes, yes, yes – a big problem is old content. Spinning up new sites (SharePoint) and not using, or migrating sites and not deleting old or duplicative content. Huge problem! I’m surprised more people didn’t mention this. Here’s my three:
1. Metadata woes (@Erin – lack of robust metadata does sound better as improvements can be remedied on multiple levels)
2. Old or duplicate content (Data or SharePoint Governance)
3. Poorly configured search engine
Bonus reason: Overly complicated UIs. There’s a reason people like Google. Why do folks keep trying to mess up a good thing? Keep it as simple as you possibly can. Create views for those who need more. 80/20 rule!

Dena’s points are a reiteration of previous ones, but I did like her “there is a reason people like Google” point, which I considered a nice supporting argument for the entire user-centred design theme.

image

Next up we have another group of discussions. What is interesting here is that there is some disagreement – and a lot of prose – but not a lot of information was added to the map from it.

Luc de Ruijter • @Rob. Getting information and metastructures in place requires interactions with the owners of information. I doubt whether they are lazy or blank staring people – I have different experiences with engaging users in preparing digital working environments. People may stare back at you when you offer complete solutions they can say “yea” or “nay” to. And this is still common practice amogst Communication specialists (who like creating stuff themselves first and then communicate about them to others later). And if colleagues stare blank at your proposal, they obviously are resisting change and in need of some compelling communciation campaign…
Communication media legacy models are a root cause for failing intranets.
Tagging is indeed a complex excercise. And we come from a media-age in which fun predominated and we were all journalists and happy bunnies writing post after post, page after page, untill the whole cluttered intranet environment was ready again for a redesign.
Enterprise content is not media content, but enterprise content. Think about it (again please 🙂 ). If you integrate the storage process of enterprise content into the “saving as” routine, you’ll have no problems anymore with keeping your content clean and consistent. All wil be channeled through consistent routines. This doesn’t kill adding free personal meta though, it just puts the content in a enterprise structure. Think enterprise instead of media and tagging solutions are for grabs.
I agree that working on taxonomies can become a drag. Leadership and vison can speed up the process. And mandate of course.
I believe that the whole target rationale behind websites is part of the Communication media legacy we need to loose in order to move forward in better communication services to eployees. Target-thinking hampers the construction of effectve user centered websites, for it focusses on targets, persona, audiences, scenario’s and the whole extra paper media works.
While users only need flexibility, content in context, filters and sorting options. Filtering and sorting are much more effective than adding one navigation tree ater another. And they require a 180° turn in conventional communciation thinking.
@Caleb. Who manages an intranet?
Is that a dedicated team of intranet managers, previously known as content managers, previously known as communciation advisors, previously known as mere journalists? Or is intranet a community affair in which the community IS the manager of content? Surely you want a metamodel to be managed by a specialist. And make general management as much a non-silo activity as possible. Collaboration isn’t confined to silo’s so intranet shouldn’t be either.
A lot of intranets are run by a small group of ‘experts’ whose basic reasoning is that intranet is a medium like an employee magazine. If you want content management issues, start making such groups responsible for intranet.
In my experience intranets work once you integrate them in primary processes. Itranet works for you if you make intranet part of your work. De-medialise the intranet and you have more chance for sustainable success.
Rolling out Sharepoints is a bit like rolling back time. We’ll end up somewhere where we already were in 2001, when digital IC was IT policy. The fact that we are turning back to that situation is a good and worrying illustration of the fact that strategy on digital communications is lacking in the Communications department – otherwise they wouldn’t loose out to IT.
@Dena. I think your bonus reason is a specific Sharepoint reason. Buy Sharepoint and get a big bonus bag of bad design stuff with it – for free! An offer you can’t refuse 🙂

Luc de Ruijter • @Dena. My last weeks tweet about search: Finding #intranet content doesn’t start with #search #SEO. It starts with putting information in a content #structure which is #searchable. Instead of configuring your search engine, think about configuring your content first.

Once again Luc is playing the devil’s advocate role with some broader musings. I might have been able to add some of this to the map, but it was mostly going over old ground or musings that were not directly related to the question being asked. This time around, Rob takes issue with some of his points and Caleb agrees…

Rob Faulkner • @Luc, Thanks for your thoughtful response, but I have to respectfully disagree with you on a few points. While my delivery may have been a bit casual, the substance of my post is based on experience.
First of all, my characterizations of users being 1) lazy or 2) blank staring we’re not related to the same topic. Lazy: in reference to tagging content. Blank Staring: related to looking to end users for organizational direction.
Lazy, while not the most diplomatic means of description, I maintain, does occur. I’ve experienced it, first hand. A client I’m thinking of is a major technology, Fortune 100 player with well over 100K tech-focused, internet savvy (for the most part) employees. And while they are great people and dedicated to their respective vocation, they don’t always tag documents and / or content-chunks correctly. It happens. And, it IS why a lot of content isn’t being located by targets — internally or externally. This is especially the case when knowledge or content management grows in complexity as result of content being repurposed for delivery via different vehicles. It’s not as simple as a “save as” fix. This is why I find many large sites that provide for search via pre-packed variables, — i.e. drop-downs, check-boxes, radio-buttons, etc — somewhat suspect, because if you elect to also engage in keyword index search you will, many times, come up with a different set of results. In other words, garbage in, garbage out. That being said, you asked “why,” not “what to do about it” and they are two completely different topics. I maintain that this IS definitely a potential “why.”
As far as my “blank stare” remark, it had nothing to do with the above, which you tied it to… but I am more than fluent in engaging and empowering content owners in the how’s and why’s of content tagging without confusing them or eliciting blank stares. While the client mentioned above is bleeding-edge, I also have vast experience with less tech-sophisticated entities — i.e. 13th-century country house hotels — and, hence, understand the need to communicate with contributors appropriate to what will resonate with them. This is Marketing 101.
In regard to the real aim of my “blank stare” comment, it is very germane to the content organization conversation in that it WILL be one of your results if you endeavour to ask end-users for direction. It is, after all, what we as experts should be bringing to the table… albeit being qualified by user sounding boards.
Regarding my thoughts on taxonomy exercises… I don’t believe I suggested it was a drag, at all. The fact is, I find this component of interactive strategy very engaging… and a means to create a defensible, differentiated marketing advantage if handled with any degree of innovation.
In any event, I could go on and on about this post and some of the assumptions, or misinterpretations, you’ve made, but why bother? When I saw your post initially, it occurred to me you were looking for input and perhaps insight into what could be causing a problem you’re encountering… hence the “why does this happen” tone. Upon reviewing the thread again, it appears you’re far more interested in establishing a platform to pontificate. If you want to open a discussion forum you may want to couch your topic in more of a “what are your thoughts about x, y, z?”… rather than “what could be causing x, y, z?” As professionals, if we know the causes we’re on track to address the problem.

Caleb Lamz • I agree with Rob, that this thread has gone from “looking for input” to “a platform to pontificate”. You’re better off making this a blog post rather than asking for input and then making long and sometimes off the cuff remarks on what everyone else has graciously shared. It’s unproductive to everyone when you jump to conclusions based on the little information that other users can provide in a forum post.

Luc de Ruijter • The list:
Adopting social media from a hype-driven motivation (lack of coherence)
big problem with people just PDFing EVERYTHING instead of posting HTML pages
Comms teams don’t listen and learn from all levels of the business
Content is not where someone thought it would be or should be or its not called what they thought it was called or should be called.
content is titled poorly
content managers either being lazy and doing a half ass job of tagging
content they are trying to find is out of date, cannot be trusted or isn’t even available on the intranet.
Documents have no naming convention
failure to offer an entry level view
inconsistency of how things are tagged
Inconsistent vocabulary and acronyms
info is organised by departmental function rather than focussed on end to end business process.
information being searched does not actually exist or exists only in an unrecognisable form and therefore cannot be found!
intranet’s lazy over-reliance on search
intranets are simply not actively managed, or the department managing it is not equipped to do so.
intranets of today are not social at all
just too much stuff
Lack of content governance, meta-data and inconsistent taxonomy, resulting in poor search capability.
Lack of measuring and feedback on (quality, performance of) the intranet
Lack of metadata
lack of motivation on the part of contributors to make their information easy to use
lack of quality measurements (e.g. number of likes, useful or not
lack of robust metadata
lack of robust metadata, resulting in poor search results;
lack of user-centred design
main navigation is poor
not fitting the fact that there are at least two types of visitors… task-driven and browse-based
Not making the search box available enough
Old content is not deleted and therefore too many results/documents returned
Old or duplicate content (Data or SharePoint Governance)
Overly complicated UIs
Poor navigation, information architecture and content sign-posting
Poorly configured search engine
proliferation of platforms like SharePoint
relevance of content (what’s hot for one is not for another)
Search can’t find it due to poor meta data
Search engine is not set up correctly
search engine returns too many results with no clear indication of their relative validity
structure is not tailored to the way the user thinks

Luc de Ruijter • This discussion has produced a qualitative and limited list of root causes for not finding stuff. I think we can all work with this.
@Rob & @Caleb My following question is always what to do after digesting and analysing information. I’m after solutions, that;s why I asked about root causes (and not symptoms). Reading all the comments triggers me in sharing some points of view. Sometimes that’s good to fuel the conversation sometimes. For if there is only agreement, there is no problem. And if there is no problem, what will we do in our jobs? If I came across clerical, blame it on Xmas.
Asking the “what to do with this input?” is pehaps a question for another time.

The only thing I added to the map from this entire exchange is Rob’s point about the lack of social aspects to search. I thought this was interesting because of an earlier assertion that applying social principles to an intranet caused more silos. It seems Luc and Rob have differing opinions on this point.

image

Where are we at now?

At this point, we are almost at the end of the discussion. In this post, I added 25 nodes against 10 comments. Nevertheless, we are not done yet. In part 4 I will conclude the synthesis of the discussion and produce a final map. I’ll also export the map to Microsoft Word, summarising the discussion as it happened. As with the previous posts, you can click here to see the maps exported in more detail.

Four major themes have emerged so far: Information Architecture, People aspects, Inadequate governance and Lack of user-centred design. The summary maps for each of these areas are below (click to enlarge):

Information Architecture

part3map2[5]

People aspects

part3map3[5]

Inadequate Governance

part3map4[5]

Lack of user-centred design

part3map5[5]

Thanks for sticking with me thus far – almost done now…

Paul Culmsee

CoverProof29

www.sevensigma.com.au


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 2


Hi all

This is the second post in a quick series that attempts to use IBIS to analyse an online discussion. Strange as it may sound, I believe that issue mapping and IBIS are one of the purest forms of information architecture you can do. This is because, as a mapper, you are creating a navigable mental model of speech as it is uttered live. This post is semi-representative of that: I am creating an IBIS-based issue map, but I am not interacting live with participants. Nevertheless, imagine, if you will, sitting in a room with a group of stakeholders answering the question of why users cannot find what they are looking for on the intranet. Can you see its utility in creating shared understanding of a multifaceted issue?

Where we left off…

We finished the previous discussion with a summary map that identified several reasons why it is hard to find information on intranets. In this post we will continue our examination of the topic. What you will notice is that the number of nodes I capture is significantly lower than in part 1. This is because some topics start to become saturated and people’s contributions repeat what has already been captured. In part 1, I captured 55 nodes from the first 11 replies to the question. In this post, I capture an additional 33 nodes from the next 15 replies.

image_thumb72

So without further ado, let’s get into it!

Suzanne Thornley • Just another few to add (sorry 5 not 3 :-):
1. Search engine is not set up correctly or used to full potential
2. Old content is not deleted and therefore too many results/documents returned
3. Documents have no naming convention and therefore it is impossible to clearly identify what they are and if they are current.
4. Not just a lack of metadata but also a lack of governance/training around metadata/meta tagging so that less relevant content may surface because the tagging and metadata is better.
5. Poor and/or low cost search engine is deployed in the mistaken belief that users will be happy/capable of finding content by navigating through a complex intranet structure.

Suzanne offered five additional ideas to the original map from where we last left off. She was also straight to the point, which always makes a mapper’s job of expressing things in IBIS easier. You might notice that I reversed “Old content is not deleted and therefore too many results/documents returned” in the resulting map. This is because I felt that old content not being deleted was one of a number of arguments supporting why too many results are returned.

image_thumb[26]

My first map refactor

With the addition of Suzanne’s contributions, I felt it was a good time to take stock and adjust the map. First up, I felt that a lot of topics were starting to revolve around the notions of information architecture, governance and user experience design. So I grouped the themes of vocabulary, lack of metadata, excessive results and issues around the structure of data as part of a meta-theme of “information architecture”. I similarly grouped a bunch of answers into “governance” and “user experience design”. These, for me, seemed to be the three meta-themes emerging so far…

For the trainspotters, Suzanne’s comment about document naming conventions was added to the “Vocabulary and labelling issues” sub-map. You can’t see it here because I collapsed the detail so you can see the full picture of the themes as they stand at this point.

part2map1

Patrik Bergman • Several of you mention the importance of adding good metadata. Since this doesn’t come natural to all employees, and the wording they use can differ – how do you establish a baseline for all regarding how to use metadata consistently? I have seen this in a KM product from Layer 2 for example, but it can of course be managed without this too, but maybe to a higher cost, or?

Patrik’s comment was a little hard to map. I captured his point that metadata does not come naturally to employees as a pro, supporting the idea that lack of metadata is an example of poor information architecture. The other points I opted to leave off, because they were not really related to the core question of why people can’t find stuff on the intranet.

image_thumb[7]

Luc de Ruijter • @Patrik. Metadata are crucial. I’ve been using them since 2005 (Tridion at that time).You can build a lot of functionality with it. And it requires standardisation. If everyone adds his own meta, this will not enable you to create solutions. You can standardize anything in any CMS. So use your CMS to include metadata. If you have a DMS the same applies. (DMS are a more logical tool for intranets, as most enterprise content exists as documenst. Software such as LiveLink can facilitate adding meta in the save as process. You just have to tick some fields before you can save a document on to the intranet.)
@Suzanne. There’s been a lot of buzz about governance. You don’t need governance over meta, you just need a sound metastructure (and a dept of function to manage it – such as library of information management). Basically a lot of ‘governance’ can be automated instead of being discussed all the time :-).

As with Patrik’s comment, much of what Luc said here wasn’t really related to the question at hand or had been captured already. But I did acknowledge his contribution to the governance debate, and he specifically argued against Suzanne’s point about the lack of governance around metadata tagging.

image_thumb[11]

Next we have a series of answers, but you will notice that most of the points reiterate points that have already been made.

Patrik Bergman • Thanks Luc. It seems SharePoint gives us some basic metadata handling, but perhaps we need something strong in addition to SharePoint later.

Simon Evans • My top three?
1) The information being searched does not actually exist or exists only in an unrecognisable form and therefore cannot be found!
2) As Karen says above, info is organised by departmental function rather than focussed on end to end business process.
3) Lack of metadata as above

Mahmood Ahmad • @Simon evan. I want to also add Poor Information Structure in the list. Therefore Information Management should be an important factor.

Luc de Ruijter • @Patrik. Sharepoint 2010 is the first version that does something with it. Ms is a bit slow in pushing the possibilities with it.
@Simon @Mahmood Let’s say that information structure is the foundation for an intranet (or any website), and that a lack of metadata is only a symptom of a bad foundation?

Patrik Bergman • Good thing we use the 2010 version then 😀 I will see how good it handles it, and see if we need additional software.

Erin Dammen • I believe 1) lack of robust metadata, resulting in poor search results; 2) structure is not tailored to the way the user thinks; 3) lack of motivation on the part of contributors to make their information easy to use (we have a big problem with people just PDFing EVERYTHING instead of posting HTML pages.) I like that in SP 2010, users have the power to add their own keywords and flag pages as "I like it." Let your community do some of the legwork, I think it helps!

Simon’s first point, that the information being searched for may not exist or may not be in the right format, was new, so that was captured under governance. (After all, it’s hard to architect information when it’s not there!)

image_thumb[16]

I also added Erin’s third point about lack of motivation on the part of contributors. I mulled over this and decided it was a new theme, so I added it to the root question, rather than trying to make it fit into information architecture, governance or user experience design. I also captured her point on letting the community do the legwork through user tagging (known as folksonomy).

image_thumb[19]

Luc de Ruijter • @all. The list of root causes remains small. This is not surprising (it would be really worrying if the list of causes would be a long list). And it is good to learn that we encounter the same (few but not so easy to solve) issues.
Still, in our line of work these root causes lack overall attention. What could be the reason for that? 🙂
@Erin Motivation is not the issue, I think; and facilitation is. If it is easier to PDF everything, than everyone will do so. And apparently everyone has the tools to do so. (If you don’t want people to PDF stuff, don’t offer them the quick fix.)
If another method of sharing documents is easier, then people will migrate. How easy is it to find PDF’s through search? How easy is it to add metadata to PDF’s? And are colleagues explained why consistent(!) meta is so relevant? Can employees add their own meta keywords? How do you maintain the quality and integrity of your keywords?
Of course it depends on your professional usergroup whether professionals will use "I like" buttons. Its a bit on the Facebook consumer edge if you’d ask me. Very en vogue perhaps, but in my view not so business ‘like’.

Luc, who is playing the devil’s advocate role as this discussion progresses, provides three counter-arguments to Erin’s argument around user motivation. They are all captured as cons.

image_thumb[21]

Steven Osborne • 1) Its not there and never was
2) Its there but inactive so can no longer be accessed
3) Its not where someone thought it would be or should be or its not called what they thought it was called or should be called.

Marcus Hamilton-Mills • 1) The main navigation is poor
2) The content is titled poorly (e.g internal branding, uncommon wording, not easy to differentiate from other content etc.)
3) Search can’t find it due to poor meta data

patrick c walsh • 1) Navigation breaks down because there’s too much stuff
2) There’s too much crap content hidden away because there’s just too much stuff
and
3) er…there’s just too much stuff

Mark Smith • 1. Poor navigation, information architecture and content sign-posting
2. Lack of content governance, meta-data and inconsistent taxonomy, resulting in poor search capability.
3. The content they are trying to find is out of date, cannot be trusted or isn’t even available on the intranet

Luc de Ruijter • @Steven Had a bit of a laugh there
@all Am I right in making the connection between
– the huge amount of content is an issue
– that internal branding causes confusion (in labeling and titles).
and
the fact that – in most cases – these causes can be back tracked to the owners of intranet, the comms department? They produce most content clutter.
Or am I too quick in drawing that conclusion?

Now the conversation is really starting to saturate. Most of the contributions above are captured already in the map as it is, so I only added two nodes: Patrick’s point about navigation (an information architecture issue) and too much information.

image_thumb[25]

Where are we at now?

We will end part 2 with the summary below. As with the first post, you can click here to see the maps exported in more detail. In part 3, the conversation gets richer again, so the maps will change once more.

Until then, thanks for reading

Paul Culmsee

CoverProof29

www.sevensigma.com.au

part2map2


Why can’t users find stuff on the intranet? An IBIS synthesis–Part 1


Hi

There was an interesting discussion on the Intranet Professionals group on LinkedIn recently where Luc De Ruijter asked the question:

What are the main three reasons users cannot find the content they were looking for on intranet?

As you can imagine there were a lot of responses, and a lot more than three answers. As I read through them, I thought it might be a good exercise to use IBIS (the language behind issue mapping) to map the discussion and see what the collective wisdom of the group has to say. So in these posts, I will illustrate the utility of IBIS and Issue mapping for this work, and make some comments about the way the conversation progressed.

So what is IBIS and Issue/Dialogue Mapping?

Issue Mapping captures the rationale behind a conversation or dialogue—the emergent ideas and solutions that naturally arise from robust debate. This rationale is graphically represented using a simple, but powerful, visual structure called IBIS (Issue Based Information System). This allows all elements and rationale of a conversation, and subsequent decisions, to be captured in a manner that can be easily reflected upon.

The elements of the IBIS grammar are below. Questions give rise to ideas, or potential answers. Ideas have pros or cons arguing for or against those ideas.

image_thumb81
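
If you find it easier to think of the grammar as a data structure, here is a minimal sketch in Python. It is purely illustrative (it is not taken from any mapping tool), and the example node wording is abridged from the discussion mapped in this series.

    # Purely illustrative sketch of the IBIS grammar: a question has ideas
    # (candidate answers), and each idea can carry pros and cons.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Idea:
        text: str
        pros: List[str] = field(default_factory=list)
        cons: List[str] = field(default_factory=list)

    @dataclass
    class Question:
        text: str
        ideas: List[Idea] = field(default_factory=list)

    # A tiny, abridged fragment of the kind of map built in this series
    q = Question("Why can't users find content on the intranet?")
    q.ideas.append(Idea("Lack of metadata",
                        pros=["2000 documents called 'agenda and minutes' defeat any search engine"]))
    q.ideas.append(Idea("Easier alternatives exist",
                        cons=["May be a symptom rather than a root cause"]))

    for idea in q.ideas:
        print(f"{idea.text}: {len(idea.pros)} pro(s), {len(idea.cons)} con(s)")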

Dialogue Mapping is essentially Issue Mapping a conversation live, where the mapper is also a facilitator. When it is done live it is powerful stuff. As participants discuss a problem, they watch the IBIS map unfold on the screen. This allows participants to build shared context, identify patterns in the dialogue and move from analysis to synthesis in complex situations. What makes this form of mapping compelling is that everything is captured. No idea, pro or con is ignored. In a group scenario, this is an extremely efficient way of meeting what social psychologist Hugh Mackay says is the first of the ten human desires that drive us: the desire to be taken seriously. Once an idea is mapped, the idea and the person who put it forward are taken seriously. This process significantly reduces “wheel spinning” in meetings, where groups get caught up in a frustrating tangled mess of going over the same old ground. It also allows the dialogue to move more effectively to decision points (commitments) around a shared understanding.

In this case, though, the source was a long discussion on a LinkedIn group, so we do not get the benefit of being able to map live. Instead, I will create a map to represent the conversation as it progresses and make some comments here and there…

So let’s kick off with the first reply from Bob Meier.

Bob Meier • Don’t know if these are top 3, but they’re pretty common find-ability issues:
1. Lack of metadata. If there are 2000 documents called “agenda and minutes” then a search engine, fancy intranet, or integrated social tool won’t help.
2. Inconsistent vocabulary and acronyms. If you’ve branded the expense report system with some unintuitive name (e.g. a vendor name like Concur) then I’ll scan right past a link looking for “expense reports” or some variation.
3. Easier alternatives. If it’s easier for me to use phone/email/etc. to find what I want, then I won’t take the time to learn how to use the intranet tools. Do grade schools still teach library search skills? I don’t think many companies do…

In IBIS this was fairly straightforward. Bob listed his three answers with some supporting arguments. I reworded his supporting argument of point 2, but otherwise it pretty much reflects what was said…

image_thumb2

Nigel Williams (LION) • I agree with Bob but I’d add to point two not speaking our user base’s language. How many companies offer a failure to find for example (i.e.if you fail to find something in a search you submit a brief form which pops up automatically stating what you were looking for and where you expected to find it? Lots of comms and intranet teams are great at telling people and assuming we help them to learn but don’t listen and learn from all levels of the business.
If I make that number 1 I’ll also add:
2) Adopting social media because everyone else is, not because our business or users need it. This then ostracises the technophobics and concerns some of our less confident regular users. They then form clans of anti-intranetters and revert to tried and tested methods pre-intranet (instant messaging, shared drives, email etc.)
3) Not making the search box available enough. I’m amazed how many users in user testing say they’ve never noticed search hidden in the top right of the banner – “ebay has their’s in the middle of the screen, so does Google. Where’s ours?” is a typical response. If you have a user group at your mercy ask them to search for an item on on Google, then eBay, then Amazon, then finally your intranet. Note whether they search in the first three and then use navigation (left hand side or top menu) when in your intranet.

Nigel starts out by supporting Bob’s answer, so I add his points as pros in the map. Having done this though, I can already see some future conversational patterns. The two supporting arguments for “not using the vocabulary of users” are actually two related issues: one is about user experience and the other is about user engagement/governance. Nevertheless, I have mapped it as stated at this point and we will see what happens.

image_thumb3

Luc de Ruijter • @Bob. I recognise your first 2 points. The third however might be a symptom or result, not a cause. Or is it information skills you are refering to?
How come metadata are not used? Clearly there is a rationale to put some effort in this?
@Nigel. Is the situation in which Comm. depts don’t really listen to users a reason for not finding stuff? Or would it be a lack of rapport with users before and while building intranets? Is the cause concepetual, rather than editorial for instance?
(I’m really looking for root causes, the symptoms we all know from daily experience).
Adding more media is something we’ve seen for years indeed. Media tend to create silo’s.
Is your third point about search or about usability/design?

In the following sections I will not reproduce the entire map in the blog post – just the relevant parts.

In this part of the conversation, Luc doesn’t add any new answers to the root question, but queries three that have been put forward thus far. Also note that at this point I believe one of Luc’s answers is aimed at a different question: Bob’s “easier alternatives” point was never about metadata, yet Luc asks “how come metadata is not used?”. I have added it to the map here, changing the framing from a rhetorical question to an action. Having said that, if I were facilitating this conversation, I would have clarified that point before committing it to the map.

image_thumb27

Luc also indicates that the issue around communications and intranet teams not listening might be due to a lack of rapport.

image_thumb25

Finally, he adds a further argument for why social media may not be the utopia it is made out to be, arguing that adding more media channels creates more information silos. He also argues against the entire notion on the grounds that this is a usability issue rather than a search issue.

image_thumb80

Nigel Williams (LION) • Hi Luc, I think regarding Comms not listening that it is two way. If people are expecting to find something with a certain title or keyword and comms aren’t recognising this (or not providing adequate navigation to find it) then the item is unlikely to be found.
Similarly my third point is again both, it is an issue of usability but if that stops users conducting searches then it would impact daily search patterns and usage.

I interpret this reply as Nigel arguing against Luc’s assertion around lack of rapport being the reason behind intranet and comms teams not listening and learning from all levels of the user base.

image_thumb34

Nigel finishes by arguing that even if social media issues are usability issues, they might still impede search and the idea is therefore valid.

image_thumb84

Bob Meier • I really like Nigel’s point about the importance of feedback loops on Intranets, and without those it’s hard to build a system that’s continually improving. I don’t have any data on it, but I suspect most companies don’t regularly review their search analytics even if they have them enabled. Browse-type searching is harder to measure/quantify, but I’d argue that periodic usability testing can be used in place of path analysis.
I also agree with Luc – my comment on users gravitating from the Intranet to easier alternatives could be a symptom rather than a cause. However, I think it’s a self-reinforcing symptom. When you eliminate other options for finding information, then the business is forced to improve the preferred system, and in some cases that can mean user training. Not seeing a search box is a great example of something that could be fixed with a 5-minute Intranet orientation.
If I were to replace my third reason, I’d point at ambiguous or mis-placed Intranet ownership . Luc mentions Communications departments, but in my experience many of those are staffed for distributing executive announcements rather than facilitating collective publishing and consumption. I’ve seen many companies where IT or HR own the Intranet, and I think the “right” department varies by company. Communications could be the right place depending on how their role is defined.

Bob makes quite a number of points in this answer, right across various elements of the unfolding discussion. Firstly, he makes a point about analytics and the fact that a lack of feedback loops makes it hard to build a system that continually improves.

image_thumb45

In terms of the discussion around easier alternatives, Bob offers some strategies to mitigate the issue. He notes that there are training implications when eliminating the easier alternatives.

image_thumb41

Finally, Bob identifies issues around the ownership of the intranet as another answer to the original question of people not being able to find stuff on the intranet. He also lists a couple of common examples.

image_thumb69

Karen Glynn • I think the third one listed by Bob is an effect not a cause.
Another cause could be data being structured in ways that employees don’t understand – that might be when it is structured by departments, so that users need to know who does what before they can find it, or when it is structured by processes that employees don’t know about or understand. Don’t forget intranet navigations trends are the opposite to the web – 80% of people will try and navigate first rather than searching the intranet.

In this answer, Karen starts by agreeing with the point Luc made about “easier alternatives” being a symptom rather than a cause, so there is no need to add it to the map as it is already there. However, she provides a new answer to the original question: the structure of information (this, by the way, is called top-down information architecture – and it was bound to come out of this discussion eventually). She also makes the claim that 80% of people will navigate prior to searching on the intranet. I wonder if you can tell what will happen next?

image_thumb49

Luc de Ruijter • @Nigel Are (customer) keywords the real cause for not finding stuff? In my opinion this limits the chalenge (of building effective intranet/websites) to building understandable navigation patters. But is navigation the complete story? Where do navigation paths lead users to?
@Bob Doesn’t an investiment in training in order to have colleagues use the search function sound a bit like attacking the symptom? Why is search not easy to locate in the first place? I’d argue you’re looking at a (functional) design flaw (cause) for which the (where is the search?) training is a mere remedy, but not a solution.
@Karen You mention data. How does data relate to what we conventionally call content, when we need to bring structure in it?
Where did you read the 80% intranet-users navigate before searching?

Okay, so this is the first time thus far that I do a little bit of map restructuring. In the discussion so far, we have had two ideas offered around the common notion of vocabulary. In this reply, Luc asks “Are (customer) keywords the real cause for not finding stuff?” I wasn’t sure which vocabulary issue he was referring to, so this prompted me to create a “meta idea” called “Vocabulary and labelling issues”, of which there are two examples cited thus far. This allowed me to capture the essence of Luc’s comment as a con against the core idea of issues around vocabulary and labelling.

image_thumb52

Luc then calls into question Bob’s suggestion of training and eliminating the easier alternatives. Prior to Luc’s counter-arguments, I had structured Bob’s argument like this:

image_thumb58

To capture Luc’s argument effectively, I restructured the original argument and made a consolidated idea to “eliminate other options and provide training”. This allowed me to capture Luc’s counter argument as shown below.

image_thumb55

Finally, Luc asked Karen for the source of her contention that 80% of users navigate intranets, rather than use the search engine first up.

image_thumb61

In this final bit of banter for now, the next three conversations did not add too many nodes to the map, so I have grouped them below…

Karen Glynn • Luc, the info came from the Neilsen group.

Helen Bowers • @Karen Do you know if the Neilsen info is available for anyone to look at?

Karen Glynn • I don’t know to be honest – it was in one of the ‘paid for’ reports if I remember correctly.

Luc de Ruijter • @Karen. OK in that case, could you provide us with the title and page reference of the source? Than it can become usable as a footnote (in a policy for instance).Thanks
Reasons so far for not finding stuff:
1. Lack of metadata (lack of content structure).
2. Inconsistent vocabulary and acronyms (customer care words).
3. Adopting social media from a hype-driven motivation (lack of coherence)
4. Bad functional design (having to search for the search box)
5. Lack of measuring and feedback on (quality, performance of) the intranet
6. Silo’s. Site structures suiting senders instead of users

So for all that banter, here is what I added to what had already been captured.

image_thumb65

Where are we at?

At this point, let’s take a breath and summarise what has been discussed so far. Below is the summary map with the core answers to the question. I have deliberately tucked the detail away into sub-maps so you can see what is emerging. Please note that I have not synthesised this map yet (well… not too much anyway). I’ll do that in the next post.

image_thumb72

If you want to see the entire map as it currently stands, take a look at the final image at the very bottom of this post (click to enlarge). I have also exported the entire map so far for you to view things in more context. Please note that the map will change significantly as we continue to capture and synthesise the rationale, so expect it to evolve quite a bit as we unpack the rest of the discussion…

Thanks for reading

Paul Culmsee

CoverProof29

www.sevensigma.com.au

Map25


Troubleshooting SharePoint (People) Search 101


I’ve been nerding it up lately SharePoint-wise, doing the geeky things that geeks like to do, like ADFS and claims authentication. So in between trying to get my book fully edited and ready for publishing, I might squeeze out the odd technical SharePoint post. Today I had to troubleshoot a broken SharePoint people search for the first time in a while. I thought it was worth explaining the crawl process a little and talking about the most likely ways in which it will break for you, in order of likelihood as I see it. There are articles out there on this topic, but none that I found are particularly comprehensive.

Background stuff

If you consider yourself a legendary IT pro or SharePoint god, feel free to skip this bit. If you prefer a more gentle stroll through SharePoint search land, then read on…

When you provision a Search Service Application as part of a SharePoint installation, you are asked for (among other things) a Windows account to use for the search service. The screenshots below show the point in the GUI-based configuration where this is done. First up we choose to create a Search Service Application, and then we choose the account to use for the “Search Service Account”. By default this is the account that will do the crawling of content sources.

image    image

Now, the search service account is described as follows: “.. the Windows Service account for the SharePoint Server Search Service. This setting affects all Search Service Applications in the farm. You can change this account from the Service Accounts page under Security section in Central Administration.”

Reading this suggests that the Windows service (“SharePoint Server Search 14”) would run under this account. The reality is that the SharePoint Server Search 14 service account is the farm account. You can see the pre- and post-provisioning status below. First up, I show below where SharePoint has been installed and the SharePoint Server Search 14 service is disabled, with service credentials of “Local Service”.

image

The next set of pictures show the Search Service Application provisioned according to the following configuration:

  • Search service account: SEVENSIGMA\searchservice
  • Search admin web service account: SEVENSIGMA\searchadminws
  • Search query and site settings account: SEVENSIGMA\searchqueryss

You can see this in the screenshots below.

image

imageimage

Once the service has been successfully provisioned, we can clearly see the “Default content access account” is based on the “Search service account” as described in the configuration above (the first of the three accounts).

image

Finally, as you can see below, once provisioned, it is the SharePoint farm account that is running the search windows service.

image

Once you have provisioned the Search Service Application, the default content access account (in my case SEVENSIGMA\searchservice) is granted read access to all web applications via Web Application User Policies, as shown below. This way, no matter how draconian the permissions of site collections are, the crawler account will have the access it needs to crawl the content, as well as the permissions of that content. You can verify this by looking at any web application in Central Administration (except the Central Administration web application itself) and choosing “User Policy” from the ribbon. You will see in the policy screen that the “Search Crawler” account has “Full Read” access.

image

image

In case you are wondering why the search service needs to crawl the permissions of content as well as the content itself, it is because it uses these permissions to trim search results for users who do not have access to that content. After all, you don’t want to expose sensitive corporate data via search, do you?
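
To make the trimming idea concrete, here is a conceptual sketch in Python (this is not how SharePoint implements it, just an illustration of the principle): the index stores the access control list captured at crawl time alongside each item, and hits are filtered against the groups of the person searching.

    # Conceptual illustration of security trimming: each indexed item carries the
    # ACL captured at crawl time, and hits are filtered per user at query time.
    index = [
        {"title": "Public holiday calendar", "acl": {"All Staff"}},
        {"title": "Executive salary review", "acl": {"HR Managers"}},
    ]

    def trimmed_results(query, user_groups):
        hits = [item for item in index if query.lower() in item["title"].lower()]
        return [hit["title"] for hit in hits if hit["acl"] & user_groups]

    print(trimmed_results("review", {"All Staff"}))     # [] - nothing leaks to ordinary staff
    print(trimmed_results("review", {"HR Managers"}))   # ['Executive salary review']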

There is another, more subtle configuration change performed by the search service. Once the evilness known as the User Profile Service has been provisioned, the Search Service Application will grant the search service account a specific permission to the User Profile Service. SharePoint is smart enough to do this whether the User Profile Service Application is installed before or after the Search Service Application. In other words, if you install the Search Service Application first and the User Profile Service Application afterwards, the permission will be granted regardless.

The specific permission, by the way, is “Retrieve People Data for Search Crawlers”, as shown below:

image    image

Getting back to the title of this post, this is a critical permission, because without it the search server will not be able to talk to the User Profile Service to enumerate user profile information. The effect of this is empty People Search results.

How people search works (a little more advanced)

Right! Now that the cool kids who skipped the first section have joined us, let’s take a closer look at SharePoint People Search in particular. This section delves a little deeper, but fear not, I will try to keep things relatively easy to grasp.

Once the Search Service Application has been provisioned, a default content source called – originally enough – “Local SharePoint Sites” is created. Any web applications that exist (and any that are created from here on in) will be listed here. A freshly minted SharePoint server with a single web application shows the following configuration in the Search Service Application:

image

Now hopefully http://web makes sense. Clearly this is the URL of the web application on this server. But you might be wondering what sps3://web is. I will bet that you have never visited an sps3:// site using a browser either – for good reason too, as it wouldn’t work.

This is a SharePointy thing – or more specifically, a Search Server thing. That funny protocol part of what looks like a URL refers to a connector. A connector allows the search server to crawl other data sources that don’t necessarily use HTTP, like some native, binary data source. People can develop their own connectors if they feel so inclined, and a classic example is the Lotus Notes connector that Microsoft supplies with SharePoint. If you configure SharePoint to use its Lotus Notes connector (and by the way – it’s really tricky to do), you would see a URL in the form of:

notes://mylotusnotesbox

Make sense? The protocol part of the URL allows the search server to figure out which connector to use to crawl the content. (For what it’s worth, there are many others out of the box. If you want to see all of the connectors then check the list here.)

But the one we are interested in for this discussion is SPS3:, which accesses SharePoint user profiles and supports the people search functionality. The way this particular connector works is that when the crawler accesses the SPS3 connector, it in turn calls a special web service at the host specified. The web service is called spscrawl.asmx and in my example configuration above, it would be http://web/_vti_bin/spscrawl.asmx

The basic breakdown of what happens next is this:

  1. Information about the web site that will be crawled is retrieved (the GetSite method is called, passing in the site from the URL – i.e. the “web” of sps3://web)
  2. Once the site details are validated, the service enumerates all of the user profiles
  3. For each profile, the GetItem method is called, which retrieves all of the user profile properties for that user. These are added to the index and tagged with a contentclass of “urn:content-class:SPSPeople” (I will get to this in a moment)

Now admittedly this is the simple version of events. If you really want to be scared (or get to sleep tonight) you can read the actual SPS3 protocol specification PDF.
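If you ever need to convince yourself that this web service is reachable by the crawl account, a rough diagnostic sketch follows. It is not what the crawler itself does internally – it simply makes an authenticated HTTP request to the endpoint – and it assumes the third-party requests and requests_ntlm Python packages; the account name and password are placeholders.

```python
# Rough diagnostic sketch (not the crawler itself): check that spscrawl.asmx
# answers when called with the crawl account's credentials. Assumes the
# third-party "requests" and "requests_ntlm" packages; the credentials and
# URL below are examples only.
import requests
from requests_ntlm import HttpNtlmAuth

url = "http://web/_vti_bin/spscrawl.asmx"
auth = HttpNtlmAuth("SEVENSIGMA\\searchservice", "password-goes-here")

response = requests.get(url, auth=auth, timeout=30)
print(response.status_code)                      # 200 means the endpoint answered and auth worked
print(response.headers.get("Content-Type"))
```

A 401 here is roughly what the crawler itself would hit if authentication were being blocked (for example by the loopback check described a little later).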

Right! Now let’s finish this discussion with this notion of contentclass. The SharePoint search crawler tags all crawled content according to its class. The name of this “tag” – or in correct terminology, “managed property” – is contentclass. By default SharePoint has a People Search scope, which essentially limits the search to only returning content tagged with the “People” contentclass.

image

Now to make it easier for you, Dan Attis listed all of the content classes that he knew of back in SharePoint 2007 days. I’ll list a few here, but for the full list visit his site.

  • “STS_Web” – Site
  • “STS_List_850” – Page Library
  • “STS_List_DocumentLibrary” – Document Library
  • “STS_ListItem_DocumentLibrary” – Document Library Items
  • “STS_ListItem_Tasks” – Tasks List Item
  • “STS_ListItem_Contacts” – Contacts List Item
  • “urn:content-class:SPSPeople” – People

(why some properties follow the uniform resource name format I don’t know *sigh* – geeks huh?)
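To see contentclass in action, you can issue a keyword query with a contentclass restriction. The sketch below is a hedged example that uses the out-of-the-box search RSS feed (/_layouts/srchrss.aspx); if your farm does not expose it, typing the same contentclass: query into a search box demonstrates the same thing. The site URL and credentials are placeholders, and requests/requests_ntlm are third-party packages.

```python
# Hedged sketch: poke the search index with a contentclass restriction via the
# out-of-the-box search RSS feed. Site URL and credentials are made up.
import urllib.parse
import requests
from requests_ntlm import HttpNtlmAuth

site = "http://web"
query = 'contentclass:"urn:content-class:SPSPeople"'   # people results only
feed_url = f"{site}/_layouts/srchrss.aspx?k={urllib.parse.quote(query)}"

response = requests.get(feed_url, auth=HttpNtlmAuth("SEVENSIGMA\\paul", "password"), timeout=30)
print(response.status_code)
print(response.text[:500])   # quick peek at the first few results
```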

So that was easy, Paul! What can go wrong?

So now we know that although the protocol handler is SPS3, it is still ultimately using HTTP as the underlying communication mechanism to call a web service. With that in mind, we can start to think of all the ways it can break on us. Let’s take a look at the common problem areas, starting with the most common:

1. The Loopback issue.

This has been done to death elsewhere and most people know it. What people don’t know so well is that the loopback check was introduced to prevent an extremely nasty security vulnerability, known as a replay attack, that came out a few years ago. Essentially, if you make an HTTP connection to your server, from that server, using a name that does not match the name of the server, then the request will be blocked with a 401 error. In terms of SharePoint people search, the sps3:// handler is created when you create your first web application. If that web application happens to use a name that doesn’t match the server name, then the HTTP request to the spscrawl.asmx web service will be blocked by this check.

As a result your search crawl will not work and you will see an error in the logs along the lines of:

  • Access is denied: Check that the Default Content Access Account has access to the content or add a crawl rule to crawl the content (0x80041205)
  • The server is unavailable and could not be accessed. The server is probably disconnected from the network.   (0x80040d32)
  • ***** Couldn’t retrieve server http://web.sevensigma.com policy, hr = 80041205 – File:d:\office\source\search\search\gather\protocols\sts3\sts3util.cxx Line:548

There are two ways to fix this: the quick way (DisableLoopbackCheck) and the right way (BackConnectionHostNames). Both involve a registry change and a reboot, but one of them leaves you much more open to exploitation. Spence Harbar wrote about the differences between the two some time ago and I recommend you follow his advice.
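For completeness, here is a minimal sketch of the “right way” applied on the SharePoint server itself. It assumes the host names “web” and “web.sevensigma.com” from the examples in this post – substitute the aliases your web applications actually answer on, run it elevated, and reboot afterwards as Spence describes.

```python
# Minimal sketch of the BackConnectionHostNames fix. Run elevated on the
# SharePoint server; a reboot is still required afterwards. The host names
# below are examples only, not a recommendation.
import winreg

HOSTS = ["web", "web.sevensigma.com"]   # example aliases

key_path = r"SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0"
with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0,
                    winreg.KEY_READ | winreg.KEY_SET_VALUE) as key:
    try:
        existing, _ = winreg.QueryValueEx(key, "BackConnectionHostNames")
    except FileNotFoundError:
        existing = []                    # value does not exist yet
    merged = sorted(set(existing) | set(HOSTS))
    winreg.SetValueEx(key, "BackConnectionHostNames", 0, winreg.REG_MULTI_SZ, merged)
    print("BackConnectionHostNames is now:", merged)
```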

(As a slightly related side note, I hit an issue with the User Profile Service a while back where it gave an error: “Exception occurred while connecting to WCF endpoint: System.ServiceModel.Security.MessageSecurityException: The HTTP request was forbidden with client authentication scheme ‘Anonymous’. —> System.Net.WebException: The remote server returned an error: (403) Forbidden”. In this case I needed to disable the loopback check even though I was using the server name with no alternative aliases or fully qualified domain names. I asked Spence about this one and it seems that the DisableLoopBack registry key addresses more than the SMB replay vulnerability.)

2. SSL

If you add a certificate to your site and make the site HTTPS (by using SSL), things change. In the example below, I installed a certificate on the site http://web, removed the binding to HTTP (port 80) and then updated SharePoint’s alternate access mappings to make things an HTTPS world.

Note that the reference to SPS3://WEB is unchanged, that there is still a reference to HTTP://WEB, and that there is an automatically added reference to HTTPS://WEB.

image

So if we were to run a crawl now, what do you think will happen? Certainly we know that HTTP://WEB will fail, but what about SPS3://WEB? Let’s run a full crawl and find out, shall we?

Checking the logs, we have the unsurprising error “the item could not be crawled because the crawler could not contact the repository”. So clearly, SPS3 isn’t smart enough to work out that the web service call to spscrawl.asmx needs to be done over SSL.

image

Fortunately, the solution is fairly easy. There is another connector, identical in function to SPS3 except that it is designed to handle secure sites: “SPS3s”. We simply change the configuration to use this connector (and while we are there, remove the reference to HTTP://WEB).

image

Now we retry a full crawl and check for errors… Woohoo – all good!

image

It is also worth noting that there is another SSL-related issue with search. The search crawler is a little fussy with certificates. Most people have visited secure web sites that warn about a problem with the certificate, like the image below:

image

Now when you think about it, a search crawler doesn’t have the luxury of asking a user if the certificate is okay. Instead it errs on the side of security and, by default, will not crawl a site if the certificate is invalid in some way. The crawler is also fussier than a regular browser. For example, it doesn’t much like wildcard certificates, even if the certificate is trusted and valid (although all modern browsers accept them).
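Before reaching for the “ignore warnings” setting described below, it can be worth looking at exactly what certificate the crawler is being offered. The following is a quick sketch, using the host name from this post as a placeholder:

```python
# Quick sketch: look at the certificate the crawler will be offered and see
# whether it verifies against the host name, before deciding to tick
# "Ignore SSL certificate name warnings". Host name is an example.
import socket
import ssl

host, port = "web", 443
context = ssl.create_default_context()

try:
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            print("Subject:", cert.get("subject"))
            print("Issuer: ", cert.get("issuer"))
            print("Expires:", cert.get("notAfter"))
except ssl.SSLCertVerificationError as e:
    # The crawler is likely to trip over the same thing.
    print("Certificate problem:", e)
```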

To alleviate this issue, you can make the following change in the Search Service Application settings: go to Farm Search Administration -> Ignore SSL warnings and tick “Ignore SSL certificate name warnings”.

image  image

image

The implication of this change is that the crawler will now accept any old certificate that encrypts website communications.

3. Account changes and legacy permissions

Let’s assume that we made a configuration mistake when we provisioned the Search Service Application. The search service account (which is the default content access account) is incorrect and we need to change it to something else. Let’s see what happens.

In the Search Service Application management screen, click on the default content access account to change its credentials. In my example I have changed the account from SEVENSIGMA\searchservice to SEVENSIGMA\svcspsearch.

image

Having made this change, let’s review the effect on the Web Application User Policy and the User Profile Service Application permissions. Note that the user policy for the old search crawl account remains, but the new account has had an entry automatically created. (Now you know why you end up with multiple accounts with the display name of “Search Crawling Account”.)

image

Now let’s check the User Profile Service Application. Here things are different! The search service account shown below is still the *old* account, SEVENSIGMA\searchservice – the new account has not been granted the required “Retrieve People Data for Search Crawlers” permission!

image

 

image

If you traipsed through the ULS logs, you would see this:

Leaving Monitored Scope (Request (GET:https://web/_vti_bin/spscrawl.asmx)). Execution Time=7.2370958438429 c2a3d1fa-9efd-406a-8e44-6c9613231974
mssdmn.exe (0x23E4) 0x2B70 SharePoint Server Search FilterDaemon e4ye High FLTRDMN: Errorinfo is "HttpStatusCode Unauthorized The request failed with HTTP status 401: Unauthorized." [fltrsink.cxx:553] d:\office\source\search\native\mssdmn\fltrsink.cxx
mssearch.exe (0x02E8) 0x3B30 SharePoint Server Search Gatherer cd11 Warning The start address sps3s://web cannot be crawled. Context: Application ‘Search_Service_Application’, Catalog ‘Portal_Content’ Details: Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has "Full Read" permissions on the SharePoint Web Application being crawled. (0x80041205)
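Traipsing through ULS logs by hand is no fun, so here is a rough helper that scans the log folder for exactly this kind of crawl failure. It assumes the default SharePoint 2010 log location in the 14 hive and a couple of tell-tale strings from the entries above – adjust both for your own farm.

```python
# Rough helper for "traipsing": scan recent SharePoint 2010 ULS logs for the
# tell-tale crawl failures shown above. Assumes the default log location in
# the 14 hive; adjust LOG_DIR and NEEDLES for your farm.
import glob
import os

LOG_DIR = r"C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS"
NEEDLES = ("0x80041205", "cannot be crawled", "HTTP status 401: Unauthorized")

def read_text(path):
    """ULS log encoding can vary, so try a couple of decodings."""
    raw = open(path, "rb").read()
    for encoding in ("utf-8", "utf-16"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1", errors="ignore")

# Only look at the most recent handful of log files.
for path in sorted(glob.glob(os.path.join(LOG_DIR, "*.log")))[-5:]:
    for line in read_text(path).splitlines():
        if any(needle in line for needle in NEEDLES):
            print(os.path.basename(path), "|", line.strip()[:200])
```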

To correct this issue, manually grant the crawler account the “Retrieve People Data for Search Crawlers” permission in the User Profile Service. As a reminder, this is done via the Administrators icon in the “Manage Service Applications” ribbon.

image

Once this is done, run a full crawl and verify the result in the logs.

4. Missing root site collection

A less common issue that I once encountered is when the web application being crawled is missing a root site collection. In other words, while there are site collections defined using a managed path, such as http://WEB/SITES/SITE, there is no site collection defined at HTTP://WEB.

The crawler does not like this at all, and you get two different errors depending on whether the SPS or HTTP connector is used.

  • SPS:// – Error in PortalCrawl Web Service (0x80042617)
  • HTTP:// – The item could not be accessed on the remote server because its address has an invalid syntax (0x80041208)

image

The fix for this should be fairly obvious. Go and create a root site collection for the web application and re-run the crawl.
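If you want to confirm this is what you are dealing with before creating the site collection, a small hedged check is to compare what answers at the web application root versus a known managed-path site collection. The URLs and credentials below are placeholders, and requests/requests_ntlm are third-party packages.

```python
# Tiny sketch: see whether anything answers at the web application root versus
# a known managed-path site collection. URLs and credentials are examples.
import requests
from requests_ntlm import HttpNtlmAuth

auth = HttpNtlmAuth("SEVENSIGMA\\searchservice", "password")
for url in ("http://web/", "http://web/sites/site/"):
    r = requests.get(url, auth=auth, timeout=30)
    print(url, "->", r.status_code)

# A 404 at the root, while the managed-path site answers, usually means there
# is no root site collection for the crawler to start from.
```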

5. Alternate Access Mappings and Contextual Scopes

SharePoint guru (and my squash nemesis) Nick Hadlee recently posted about a problem where there are no search results on contextual search scopes. If you are wondering what they are, Nick explains:

Contextual scopes are a really useful way of performing searches that are restricted to a specific site or list. The “This Site: [Site Name]”, “This List: [List Name]” are the dead giveaways for a contextual scope. What’s better is contextual scopes are auto-magically created and managed by SharePoint for you so you should pretty much just use them in my opinion.

The issue is that when the alternate access mapping (AAM) settings for the default zone on a web application do not match your search content source, the contextual scopes return no results.

I came across this problem a couple of times recently and the fix is really pretty simple – check your alternate access mapping (AAM) settings and make sure the host header that is specified in your default zone is the same url you have used in your search content source. Normally SharePoint kindly creates the entry in the content source whenever you create a web application but if you have changed around any AAM settings and these two things don’t match then your contextual results will be empty. Case Closed!

Thanks Nick

6. Active Directory Policies, Proxies and Stateful Inspection

A particularly insidious way to have problems with search (and not just people search) is via Active Directory policies. For those of you who don’t know what AD policies are, they basically allow geeks to go on a power trip with users’ desktop settings. Consider the image below. Essentially an administrator can enforce a massive array of settings for all PCs on the network. Such is the extent of what can be controlled that I can’t fit it into a single screenshot. What is listed below is but a small portion of what an anal-retentive administrator has at their disposal (mwahahaha!)

image

Common uses of policies include restricting certain desktop settings to maintain consistency, as well as enforcing Internet Explorer security settings, such as the proxy server and security settings like the trusted sites list. One of the common issues with a policy-defined proxy server in particular is that the search service account will have its profile modified to use the proxy server.

The result of this is that now the proxy sits between the search crawler and the content source to be crawled as shown below:

Crawler —–> Proxy Server —–> Content Source

Now even though the crawler does not use Internet Explorer per se, proxy settings aren’t actually specific to Internet Explorer. Internet Explorer, like the search crawler, uses wininet.dll. WinINET is a module that contains Internet-related functions used by Windows applications, and it is this component that applies the proxy settings.
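Because the crawler inherits the WinINET proxy settings, you can inspect those settings directly rather than guessing. The sketch below reads them from the registry; since they live under HKEY_CURRENT_USER, run it while logged in interactively as the crawl account (or adapt it to that account’s loaded hive):

```python
# Hedged sketch: inspect the WinINET (Internet Explorer) proxy settings for the
# account you are currently logged in as. To see what the crawl account gets,
# run this while logged in as that account (settings live under HKCU).
import winreg

key_path = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings"

def read_value(key, name, default=None):
    try:
        value, _ = winreg.QueryValueEx(key, name)
        return value
    except FileNotFoundError:
        return default

with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path) as key:
    print("ProxyEnable:  ", read_value(key, "ProxyEnable", 0))      # 1 = proxy in use
    print("ProxyServer:  ", read_value(key, "ProxyServer", "<none>"))
    print("ProxyOverride:", read_value(key, "ProxyOverride", "<none>"))
```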

Sometimes people will troubleshoot this issue by using telnet to connect to the HTTP port (e.g. “telnet web 80”). But telnet does not use the WinINET component, so it is not actually a valid method for testing. Telnet will happily report that the web server is listening on port 80 or 443, but it matters not when the crawler tries to access that port via the proxy. Furthermore, even if the crawler and the content source are on the same server, the result is the same. As soon as the crawler attempts to index a content source, the request will be routed to the proxy server. Depending on the vendor and configuration of the proxy server, various things can happen, including:

  • The proxy server cannot handle the NTLM authentication and passes back a 400 error code to the crawler
  • The proxy server has funky stateful inspection that restricts the allowed HTTP verbs in the communications and interferes with the crawl

For what it’s worth, it is not just proxy settings that can interfere with the HTTP communications between the crawler and the crawled. I have also seen security software get in the way by monitoring HTTP communications and pre-emptively terminating connections or modifying the content of the HTTP request. The effect is that the results passed back to the crawler are not what it expects, and the crawler naturally reports that it could not access the data source, with suitably weird error messages.

Now the very thing that makes this scenario hard to troubleshoot is also the tell-tale sign for it: nothing will be logged in the ULS logs, nor in the IIS logs for the search service. This is because the errors will be logged on the proxy server or by the overly enthusiastic stateful security software.

If you suspect the problem is a proxy server issue, but do not have access to the proxy server to check its logs, the best way to troubleshoot is to temporarily grant the search crawler account enough access to log into the server interactively. Open Internet Explorer and manually check the proxy settings. If you confirm a policy-based proxy setting, you might be able to temporarily disable it and retry a crawl (until the next AD policy refresh reapplies the settings). The ideal way to cure this problem is to ask your friendly Active Directory administrator to do one of the following:

  • Remove the proxy altogether from the SharePoint server (watch for certificate revocation slowness as a result)
  • Configure an exclusion in the AD policy’s proxy settings so that the content sources for crawling are not proxied
  • Create a new AD policy specifically for the SharePoint box so that the default settings apply to the rest of the domain member computers.

If you suspect the issue might be overly zealous stateful inspection, temporarily disable all security-type software on the server and retry a crawl. Just remember that if you have no logs on the server being crawled, chances are it’s not being crawled and you have to look elsewhere.

7. Pre-Windows 2000 Compatible Access Group

In an earlier post of mine, I hit an issue where search would yield no results for a regular user, but a domain administrator could happily search SP2010 and get results. Another symptom associated with this particular problem is certain recurring errors in the event log – Event IDs 28005 and 4625.

  • ID 28005 shows the message “An exception occurred while enqueueing a message in the target queue. Error: 15404, State: 19. Could not obtain information about Windows NT group/user ‘DOMAIN\someuser’, error code 0x5”.
  • The 4625 error would complain “An account failed to log on. Unknown user name or bad password status 0xc000006d, sub status 0xc0000064” or else “An Error occured during Logon, Status: 0xc000005e, Sub Status: 0x0”

If you turn up the debug logs inside SharePoint Central Administration for the “Query” and “Query Processor” functions of “SharePoint Server Search”, you will get the error “AuthzInitializeContextFromSid failed with ERROR_ACCESS_DENIED. This error indicates that the account under which this process is executing may not have read access to the tokenGroupsGlobalAndUniversal attribute on the querying user’s Active Directory object. Query results which require non-Claims Windows authorization will not be returned to this querying user.”

image

The fix is to add your search service account to the “Pre-Windows 2000 Compatible Access” group. The issue is that SharePoint 2010 re-introduced something that was in SP2003 – an API call to a function called AuthzInitializeContextFromSid. Apparently it was not used in SP2007, but it’s back for SP2010. This particular function requires a certain permission in Active Directory, and the “Pre-Windows 2000 Compatible Access” group happens to have the right required to read the “tokenGroupsGlobalAndUniversal” Active Directory attribute described in the debug error above.

8. Bloody developers!

Finally, Patrick Lamber blogs about another cause of crawler issues. In his case, someone had developed a custom web part that threw an exception when the site was crawled. For whatever reason, this exception did not get thrown when the site was viewed normally via a browser. As a result, no pages or content on the site could be crawled, because all the crawler would see, no matter what it requested, was the dreaded “An unexpected error has occurred”. When you think about it, any custom code that takes action based on browser parameters such as locale or language might cause an exception like this – and therefore cause the crawler some grief.

In Patrick’s case there was a second issue as well. His team had developed a custom HttpModule that did some URL rewriting. As Patrick states, “The indexer seemed to hate our redirections with the Response.Redirect command. I simply removed the automatic redirection on the indexing server. Afterwards, everything worked fine”.

In this case Patrick was using a multi-server farm with a dedicated index server, allowing him to remove the HTTP module for that one server. In smaller deployments you may not have this luxury. So apart from the obvious opportunity to bag programmers :-), this example nicely shows that it is easy for third-party applications or code to break search. What is important for developers to realise is that client web browsers are not the only thing that loads SharePoint pages.

If you are not aware, the User Agent string identifies the type of client accessing a resource. It is the means by which sites figure out what browser you are using. A quick look at the User Agent sent by SharePoint Server 2010 search reveals that it identifies itself as “Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)”. At the very least, test any custom user interface code such as web parts against this string, and check the crawl logs when the indexer crawls any custom-developed components.
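A cheap way for developers to test for this class of problem is to request their pages the same way the indexer identifies itself. The sketch below is an illustration only – the page URL and credentials are placeholders, and requests/requests_ntlm are third-party packages:

```python
# Simple sketch: request a page the way the crawler identifies itself, so any
# user-agent-sensitive custom code fails for you before it fails for the indexer.
# URL and credentials are examples only.
import requests
from requests_ntlm import HttpNtlmAuth

CRAWLER_UA = "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)"

response = requests.get(
    "http://web/SitePages/Home.aspx",
    headers={"User-Agent": CRAWLER_UA},
    auth=HttpNtlmAuth("SEVENSIGMA\\searchservice", "password"),
    timeout=30,
)
print(response.status_code)
# True here means the crawler would choke on this page too.
print("An unexpected error has occurred" in response.text)
```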

Conclusion

Well, that’s pretty much my list of gotchas. No doubt there are lots more, but hopefully this slightly more detailed exploration of them might help some people.

 

Thanks for reading

Paul Culmsee

www.sevensigma.com.au

www.spgovia.com


