Tuesday, August 30, 2016

Value of Taxonomy in IBM Watson Explorer

WAND Taxonomies are the only third-party taxonomies explicitly supported for use in IBM Watson Explorer.  In today's blog post, we want to take a deeper dive into the value of taxonomy in IBM Watson Explorer.  These features are already well described in the IBM Watson Explorer product documentation, so this post can be considered a supplementary resource.

Taxonomy is an important ingredient in giving users the best possible experience with the IBM Watson Explorer search engine.

Taxonomy adds value in IBM Watson Explorer in three key ways.

1) Synonym Query Expansion.  Users want to get consistent results.  A user who searches for "HR" should get the same results as somebody who searches for "Human Resources" because these are the same concept.  If a search for "HR" returns only 15 results, a search for "Human Resources" returns 24 results, and only 5 results appear in both searches, then there is a major completeness and consistency problem: the two searches cover 34 distinct documents between them, yet neither user sees more than 24.  Which version of the concept a user searches for affects what the user finds.  It also won't be obvious to the user that they are missing a lot of possibly relevant information.  From a search perspective, taxonomy helps to expand the "recall" of the user query.

2) Related, Broader, and Narrower Query Expansion.  Synonym query expansion adds other ways of expressing the same concept to your query.  Related, Broader, and Narrower query expansion adds other concepts that may be relevant.  For example, using Narrower query expansion with the WAND Finance and Investment Taxonomy, a query for "Currencies" could be expanded to also search for Dollar, USD, Peso, Euro, Yen, and more.  Users can control the expansion by selecting or de-selecting specific terms from the taxonomy.

3) Auto-Classification and Metadata Refinement.  While query expansion focuses on the "recall" of search, auto-classification and metadata refinement give the user significantly more control over the precision of search results.  Taxonomy can be imported into the IBM Watson Explorer auto-classification module.  This module crawls all of your documents and, based upon rules, classifies each document into the relevant taxonomy categories.  Then, when users perform a query, the result set can be refined by clicking on one of the taxonomy terms (or branches) to narrow it down to only those documents which were automatically classified to that term.  So, in one click, a user may go from 2500 results to a subset of 15 results that have been classified to a specific topic of interest.  This is very similar to the way that a user on an e-commerce site can narrow down the product result set by things like color or brand.
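To make the query expansion in points 1 and 2 concrete, here is a minimal sketch in Python.  It is not how IBM Watson Explorer implements expansion.  The Term class, the function names, and the tiny sample taxonomy are all assumptions made up for illustration, and the terms are not the actual WAND taxonomy content.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """A taxonomy concept: preferred label, synonyms, and narrower terms."""
    label: str
    synonyms: List[str] = field(default_factory=list)
    narrower: List["Term"] = field(default_factory=list)


def find_term(root: Term, text: str) -> Optional[Term]:
    """Find the term whose label or synonym equals the text, ignoring case."""
    wanted = text.casefold()
    if any(name.casefold() == wanted for name in [root.label, *root.synonyms]):
        return root
    for child in root.narrower:
        match = find_term(child, text)
        if match is not None:
            return match
    return None


def names_of(term: Term, include_narrower: bool) -> List[str]:
    """List a term's label and synonyms, optionally followed by its whole branch."""
    names = [term.label, *term.synonyms]
    if include_narrower:
        for child in term.narrower:
            names.extend(names_of(child, True))
    return names


def expand_query(root: Term, text: str, include_narrower: bool = False) -> List[str]:
    """Expand a query with synonyms and, if requested, narrower terms."""
    term = find_term(root, text)
    if term is None:
        return [text]  # not in the taxonomy: leave the query as typed
    return names_of(term, include_narrower)


def to_or_query(terms: List[str]) -> str:
    """Join terms into a generic OR query (not a product-specific syntax)."""
    return " OR ".join(f'"{t}"' for t in terms)


# Hypothetical sample taxonomy, for illustration only.
TAXONOMY = Term("Business", narrower=[
    Term("Human Resources", synonyms=["HR"]),
    Term("Currencies", narrower=[
        Term("Dollar", synonyms=["USD"]),
        Term("Peso"),
        Term("Euro"),
        Term("Yen"),
    ]),
])

print(to_or_query(expand_query(TAXONOMY, "HR")))
print(to_or_query(expand_query(TAXONOMY, "Currencies", include_narrower=True)))
```

In this sketch a search for "HR" and a search for "Human Resources" expand to the same list of terms, which is what makes the two result sets consistent.  Letting users select or de-select terms would amount to filtering the expanded list before it is joined into a query.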

IBM Watson Explorer does have algorithmic clustering, which provides refiners based on what the engine deems to be statistically interesting within your document set.  This can reveal interesting concepts but can also be subject to a lot of noise, as it is not curated.  A term may be statistically significant but not interesting to a human.  Taxonomy and Auto-Classification provide refiners based on a curated set of terms which a library scientist (or your corporate taxonomist!) has deemed to be important.

Both approaches add value and they should be considered complementary.  The algorithmic clustering may identify some concepts or synonyms which could be added to the official taxonomy.
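The rule-based classification and one-click refinement described in point 3 can be sketched roughly as follows.  This is a simplified illustration, not the IBM Watson Explorer auto-classification module.  The rules, the sample documents, and the function names are all hypothetical, and real classification rules are typically richer than whole-word phrase matching.

```python
import re
from collections import defaultdict

# Hypothetical rules: taxonomy term -> phrases that indicate the term.
RULES = {
    "Human Resources": ["human resources", "hr", "payroll"],
    "Currencies": ["currency", "currencies", "usd", "euro", "yen"],
}

# Hypothetical documents standing in for crawled content.
DOCS = {
    "doc1": "HR policy on payroll changes",
    "doc2": "Euro and Yen exchange rates",
    "doc3": "Payroll paid in USD for overseas staff",
    "doc4": "Office party schedule",
}


def classify(text, rules=RULES):
    """Return the sorted taxonomy terms whose phrases occur as whole words."""
    lowered = text.lower()
    matched = set()
    for term, phrases in rules.items():
        for phrase in phrases:
            if re.search(rf"\b{re.escape(phrase)}\b", lowered):
                matched.add(term)
                break  # one matching phrase is enough for this term
    return sorted(matched)


def build_index(documents, rules=RULES):
    """Map each taxonomy term to the ids of the documents classified to it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in classify(text, rules):
            index[term].add(doc_id)
    return index


def refine(result_ids, index, term):
    """Narrow a result set to the documents classified to the clicked term."""
    return sorted(set(result_ids) & index.get(term, set()))


INDEX = build_index(DOCS)
all_results = list(DOCS)  # pretend every document matched the user's query
print(refine(all_results, INDEX, "Currencies"))
```

Because the refiners come from a fixed list of taxonomy terms, every refiner a user sees is a term somebody chose to include, which is the curated behavior contrasted with clustering above.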

How to accelerate taxonomy for IBM Watson Explorer

In summary, taxonomy in Watson Explorer provides the users with a number of valuable tools that provide greater relevancy, recall and control over the search experience.

WAND Taxonomies cover nearly every vertical industry segment and business functional area so that relevant taxonomy content is easily available for all clients.  Users of IBM Watson Explorer can jump-start taxonomy creation with access to the WAND Taxonomy Library Portal, where they can download taxonomies in a format that imports directly into IBM Watson Explorer.

Tuesday, August 23, 2016

How to use synonyms to improve search in SharePoint 2013 and SharePoint 2016

A common misconception with the SharePoint Managed Metadata Service, where taxonomies are deployed, is that the "Other Labels" or synonyms that you add to the SharePoint Term Store will have an impact on SharePoint search.

In reality, the "Other Labels" in the SharePoint term store are only useful when users are tagging content with managed metadata terms.  For example, if "HR" is an other label for "Human Resources" in the term store and a user begins to type "HR" in the type-ahead tagging capability, SharePoint will direct the user to the correct tag, "Human Resources".  This is helpful and makes sure that users are tagging to the preferred form of the concept that has been defined in the term store.  It reduces frustration when tagging by pointing synonyms to the correct place in the taxonomy.

However, it does not help with search.  Don't worry!  There is a way to use synonyms to improve search.

The value of synonyms in SharePoint search is in query expansion.  That is, if "HR" and "Human Resources" are synonyms, a user searching for either term will automatically have the query expanded to search for both variants.  This is an effective way to help users get complete search result sets.  It's an important detail to address to ensure you are delivering the best search experience possible for your users.

Microsoft TechNet provides an article with detailed instructions on how to create and maintain a thesaurus in SharePoint Server 2013.  There is also a TechNet blog post with similar content in case you want an article with more images.  The content is relevant for SharePoint Server 2016 as well.

Unfortunately, you cannot import a thesaurus in SharePoint Online.

Taxonomies downloaded for SharePoint from the WAND Taxonomy Library Portal come with a file listing the synonyms for the taxonomy.  This file can be used as a thesaurus file for the SharePoint 2013 or SharePoint 2016 search engine.  Remember, however, that only a single thesaurus file can be loaded, so if you download multiple taxonomies from WAND, you will need to combine the supplementary synonym files into one to create a master thesaurus file.
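As a rough sketch of that combining step, the Python below merges several synonym files into one master thesaurus and drops duplicate rows.  It assumes each file is a CSV in the Key,Synonym,Language layout used for SharePoint Server thesaurus files, with the Language column optional.  The exact layout of the WAND synonym files is an assumption here, so check a downloaded file and the TechNet instructions before relying on it.  The sample rows and function names are made up for illustration.

```python
import csv
import io

HEADER = ["Key", "Synonym", "Language"]


def merge_thesaurus_rows(sources):
    """Merge rows from several thesaurus CSV texts, dropping duplicates."""
    seen = set()
    merged = []
    for text in sources:
        for row in csv.reader(io.StringIO(text)):
            cells = [cell.strip() for cell in row]
            if len(cells) < 2 or not cells[0] or not cells[1]:
                continue  # skip blank or incomplete lines
            if [c.lower() for c in cells[:2]] == ["key", "synonym"]:
                continue  # skip each file's own header row
            key, synonym = cells[0], cells[1]
            language = cells[2] if len(cells) > 2 else ""
            identity = (key.casefold(), synonym.casefold(), language.casefold())
            if identity not in seen:
                seen.add(identity)
                # Only write the language column when a language is given.
                merged.append([key, synonym, language] if language else [key, synonym])
    return merged


def write_master(rows):
    """Serialize merged rows as CSV text with a single header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical contents of two downloaded synonym files.
hr_file = "Key,Synonym,Language\nHR,Human Resources,en\nHuman Resources,HR,en\n"
finance_file = "Key,Synonym,Language\nUSD,Dollar,en\nHR,Human Resources,en\n"

master = write_master(merge_thesaurus_rows([hr_file, finance_file]))
print(master)
```

In practice each source string would be read from a downloaded file, for example with pathlib's Path.read_text(encoding="utf-8"), and the master text written back out as a single file for import.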

This blog post is relevant for SharePoint 2013 and SharePoint 2016.