Tuesday, August 30, 2016

Value of Taxonomy in IBM Watson Explorer

WAND Taxonomies are the only third-party taxonomies explicitly supported for use in IBM Watson Explorer. In today's blog post, we want to take a deeper dive into the value of taxonomy in IBM Watson Explorer. These features are already well described in the IBM Watson Explorer product documentation, so this post can be considered a supplementary resource.

Taxonomy is an important ingredient in giving users the best possible experience with the IBM Watson Explorer search engine.

There are three key value adds for taxonomy in IBM Watson Explorer.

1) Synonym Query Expansion. Users want consistent results. A user who searches for "HR" should get the same results as somebody who searches for "Human Resources" because these are the same concept. If a search for "HR" returns only 15 results, a search for "Human Resources" returns 24 results, and only 5 results appear in both searches, then there is a major completeness and consistency problem. Which version of the same concept a user happens to search for will affect what that user finds. It also won't be obvious to the user that they are missing out on a lot of possibly relevant information. From a search perspective, taxonomy helps to expand the "recall" of the user query.

2) Related, Broader, and Narrower Query Expansion. Synonym query expansion adds other ways of expressing the same concept to your query. Related, Broader, and Narrower query expansion add other concepts that may be relevant to the user query. For example, with a Narrower query expansion using the WAND Finance and Investment Taxonomy, a query for "Currencies" could be expanded to also search for Dollar, USD, Peso, Euro, Yen, and more. Users would be able to control the expansion by selecting or de-selecting the specific terms that are in the taxonomy.

3) Auto-Classification and Metadata Refinement. While query expansion focuses on the "recall" of search, auto-classification and metadata refinement give the user significantly improved control over the precision of search results. Taxonomy can be imported into the IBM Watson Explorer auto-classification module. This module will crawl all of your documents and, based upon rules, classify each document into the relevant taxonomy categories. Then, when users perform a query, the result set can be refined by clicking on one of the taxonomy terms (or branches) to narrow it down to only those documents that were automatically classified to that term. So, in one click, a user may go from 2500 results to a subset of 15 results that have been classified to a specific topic of interest. This is very similar to the way that a user on an e-commerce site can narrow down the product result set by things like color or brand.
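To make the two kinds of query expansion in items 1 and 2 concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the IBM Watson Explorer API or the WAND import format: the TAXONOMY dictionary, find_concept, and expand_query are hypothetical names invented for this example, and the terms are the ones mentioned above.

```python
# Hypothetical in-memory taxonomy (an assumption for illustration only;
# this is not the Watson Explorer API or the WAND file format).
TAXONOMY = {
    "human resources": {"synonyms": ["hr"], "narrower": []},
    "currencies": {
        "synonyms": [],
        "narrower": ["dollar", "usd", "peso", "euro", "yen"],
    },
}


def find_concept(term):
    """Return the preferred label for a term, matching labels or synonyms."""
    needle = term.strip().lower()
    for label, entry in TAXONOMY.items():
        if needle == label or needle in entry["synonyms"]:
            return label
    return None


def expand_query(term, include_narrower=False, deselected=()):
    """Expand a query term with synonyms and, optionally, narrower terms.

    Terms listed in `deselected` are dropped, mimicking a user
    de-selecting specific taxonomy terms.
    """
    concept = find_concept(term)
    if concept is None:
        # Term is not in the taxonomy: search for it unchanged.
        return [term.strip().lower()]
    entry = TAXONOMY[concept]
    terms = [concept] + entry["synonyms"]
    if include_narrower:
        terms += entry["narrower"]
    skip = {t.lower() for t in deselected}
    return [t for t in terms if t not in skip]


print(expand_query("HR"))
print(expand_query("Human Resources"))
print(expand_query("Currencies", include_narrower=True, deselected=["peso"]))
```

In this sketch, "HR" and "Human Resources" expand to the same list of terms, which is what makes the two searches return the same results. The "Currencies" query picks up its narrower terms, minus the one the user de-selected.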

See a screenshot here of taxonomy refiners in an IBM Watson Explorer search result page

IBM Watson Explorer does have algorithmic clustering, which provides refiners based on what the engine deems to be statistically interesting within your document set. This can reveal interesting concepts but can also be subject to a lot of noise, as it is not curated. A term may be statistically significant but not interesting to a human. Taxonomy and Auto-Classification provide refiners based on a curated set of terms which a library scientist (or your corporate taxonomist!) has deemed to be important.

Both approaches add value and they should be considered complementary.  The algorithmic clustering may identify some concepts or synonyms which could be added to the official taxonomy.
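The curated-refiner flow from item 3 (classify documents by rule, then narrow a result set with one click) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the IBM Watson Explorer auto-classification module is implemented: the RULES table, the trigger words, the sample documents, and the function names are all hypothetical.

```python
import re

# Hypothetical classification rules: taxonomy category -> trigger words.
# Real rules would come from the taxonomy; these are illustrative only.
RULES = {
    "Human Resources": {"hr", "payroll", "recruiting"},
    "Currencies": {"dollar", "usd", "peso", "euro", "yen"},
}


def classify(text):
    """Return every category whose trigger words appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {category for category, triggers in RULES.items() if words & triggers}


def build_index(documents):
    """Classify each document once, as a crawl would, and store the result."""
    return {doc_id: classify(text) for doc_id, text in documents.items()}


def refine(result_ids, index, category):
    """Keep only the results that were classified to the chosen category."""
    return [doc_id for doc_id in result_ids if category in index[doc_id]]


# Made-up sample documents for the example.
DOCUMENTS = {
    "doc1": "Quarterly payroll report from HR",
    "doc2": "Euro and yen exchange rates",
    "doc3": "Office move schedule",
    "doc4": "Payroll paid in USD and euro",
}

index = build_index(DOCUMENTS)
results = list(DOCUMENTS)  # pretend a query returned every document
print(refine(results, index, "Currencies"))
```

Here, clicking the "Currencies" refiner narrows the four results down to doc2 and doc4. Note that doc4 is classified to both categories, so it would survive either refinement.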

How to accelerate taxonomy for IBM Watson Explorer

In summary, taxonomy in IBM Watson Explorer gives users a number of valuable tools that provide greater relevancy, recall, and control over the search experience.

WAND Taxonomies cover nearly every vertical industry segment and business functional area, so relevant taxonomy content is easily available for all clients. Users of IBM Watson Explorer can jump-start taxonomy creation with access to the WAND Taxonomy Library Portal and download taxonomies in a format that imports directly into IBM Watson Explorer.