Thursday, February 11, 2016

Taxonomy for Component Content Management Systems

In today's post, we'll be looking at the value of taxonomy for component content management systems (CCMS) and then specifically examining the taxonomy capabilities of a few CCMS tools: xDocs, EasyDITA, IXIASOFT DITA CMS, Componize, and Vasont.

DITA content is modular content in which each piece focuses on a single topic. For example, an individual piece of content may explain how to import a taxonomy into an application. Another piece of content would focus on how to add a new term to a taxonomy in that application. A common application for DITA content might be a knowledge base. DITA-based Component Content Management Systems (CCMS) are a little bit different from a conventional content management system because a CCMS needs to manage a larger volume of smaller snippets of content.

Just as with conventional content management systems, taxonomy is a very important ingredient in a successful CCMS implementation. Tagging the individual pieces of content with relevant taxonomy concepts (and other metadata elements) is the best way to make sure that your content is findable and reusable. Taxonomy terms can be exposed to users not only as a set of facets to filter conventional search results but also as a set of terms which users can use to browse through content. Taxonomy can also be used to represent synonyms or alternate labels for concepts so that people can use alternate forms to look for the same concepts. This provides consistency in how information is presented regardless of which words people use for a concept. These are many of the same benefits you see from taxonomy in other applications, and they are equally valid and important for CCMS.
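To make the synonym and facet ideas concrete, here is a minimal Python sketch. It is not tied to any particular CCMS; the terms, alternate labels, and sample documents are all made up for illustration. It resolves whatever label a user typed to one preferred term, then counts documents per preferred term the way a facet list would.

```python
from collections import Counter

# Hypothetical taxonomy: preferred label -> alternate labels (synonyms).
SYNONYMS = {
    "Human Resources": ["HR", "H.R.", "Personnel"],
    "Accounts Payable": ["AP", "Payables"],
}

# Reverse index so any label, in any letter case, resolves to its preferred term.
LABEL_TO_TERM = {
    label.lower(): term
    for term, alts in SYNONYMS.items()
    for label in [term, *alts]
}

def resolve(label):
    """Return the preferred term for a label, or None if it is not in the taxonomy."""
    return LABEL_TO_TERM.get(label.strip().lower())

def facet_counts(documents):
    """Count documents per preferred term, for display as search facets."""
    counts = Counter()
    for doc in documents:
        terms = {resolve(tag) for tag in doc["tags"]}
        terms.discard(None)  # ignore tags that are not in the taxonomy
        counts.update(terms)
    return counts

docs = [
    {"title": "Onboarding checklist", "tags": ["HR"]},
    {"title": "Benefits overview", "tags": ["Human Resources", "personnel"]},
    {"title": "Invoice approval", "tags": ["AP"]},
]
print(facet_counts(docs))
```

Because every tag is resolved before counting, a document tagged both "Human Resources" and "personnel" is counted once under the preferred term.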

We've done a brief review of the taxonomy management capabilities for some popular CCMS tools.  As we would expect, each vendor offers support for taxonomy in its toolset and each has spent some time highlighting the value of taxonomy and metadata for component content management.


xDocs by Bluestream is a CCMS that is designed specifically for DITA content.     From Bluestream's website:

"XDocs Extended Metadata supports thesauri, taxonomies, ontologies, and faceted browse. The pre-configured Thesaurus system can also be used for terminology control and terminology indexing."

XDocs supports a standard ontology modeling format called SKOS. I can't tell 100% if it supports a taxonomy import or not.


EasyDITA is another CCMS focused on DITA content, and its makers are advocates for the value of using taxonomy in CCMS. From EasyDITA's website:

"There are several huge benefits to developing basic taxonomies and a metadata scheme if you haven’t done so already"

EasyDITA has a taxonomy manager within its administrative toolset, but does not appear to support taxonomy import.


IXIASOFT DITA CMS is another CCMS which believes in the value of taxonomy.  From its documentation:

"taxonomy is a hierarchical classification system that contains one or more taxonomy terms that you define. When you apply these terms to documents or elements, you can make information easier to find, both for authors and for end users. You can also use them when you process output to facilitate features like dynamic publishing portals."

The tool has a good taxonomy management capability as well as taxonomy import via a simple TSV format. Taxonomy import is a new feature in its newest version, 4.2.
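As a rough idea of what a TSV-based taxonomy import involves, the Python sketch below reads tab-separated rows into a parent/child hierarchy. The two-column "term, parent" layout is an assumption made purely for illustration. It is not IXIASOFT's actual import format, so check the vendor documentation for the real column layout.

```python
import csv
import io

# Hypothetical TSV layout (not IXIASOFT's actual import format):
# each row is "term<TAB>parent", with an empty parent for top-level terms.
SAMPLE_TSV = "Products\t\nSoftware\tProducts\nHardware\tProducts\nInstallation\tSoftware\n"

def load_taxonomy(tsv_text):
    """Parse term/parent rows into a dict mapping each parent to its children."""
    children = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        term = row[0].strip()
        parent = row[1].strip() if len(row) > 1 else ""
        children.setdefault(parent, []).append(term)
        children.setdefault(term, [])  # make sure leaf terms have an entry too
    return children

def print_tree(children, parent="", depth=0):
    """Print the hierarchy with two spaces of indentation per level."""
    for term in children.get(parent, []):
        print("  " * depth + term)
        print_tree(children, term, depth + 1)

taxonomy = load_taxonomy(SAMPLE_TSV)
print_tree(taxonomy)
```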


Componize CCMS also supports the use of taxonomy for better content tagging and search. From its website:

"Organize your content with predefined categories, or use collaborative tagging or folksonomy to customize your tags. " Componize talks about how you can " and filter your content with faceted searches"

Componize supports taxonomy import in RDF and XML formats, which are more advanced semantic data formats. This speaks highly to the investment Componize has made in its taxonomy feature set.


Vasont also supports taxonomy with its Vasont CMS product.   Vasont calls out taxonomy specifically in its article about creating a strong publishing backbone:

"A taxonomy is a system of categories and subcategories that allow information to be organized to make it easier to find and relate to other information. You’ve already encountered useful taxonomies in your experience. The Dewey Decimal system is a taxonomy for organizing books in a library by subject, title, or author. If you browse internet kiosks, they frequently have a taxonomy to help you to find the product you’re looking for. In information systems, they are often represented by tree structures, so they fit well in the XML world."

Vasont does have a listing for controlled vocabulary (another term for taxonomy) in its feature grid. For now, we are unsure of the exact scope of its taxonomy features or whether taxonomy import is supported.

WAND Taxonomy Library Portal is a great resource for foundation taxonomies covering a wide variety of industry vertical and business operational topics which can jump start an initiative to create a DITA Taxonomy.  Depending on the capabilities of the tool you choose, WAND Taxonomies can be imported directly or they can be used as a reference for manual

Wednesday, February 3, 2016

Taxonomy in Documentum Content Server

We continue our series on applications that can use taxonomies with this feature on Documentum Content Server.  EMC Documentum is a Gartner Magic Quadrant Leader and  Forrester Wave Leader.  It's a formidable player in the enterprise content management space and it has built significant taxonomy related features into its software.

In Documentum, the taxonomy features reside within the Content Intelligence Services (CIS) module. Taxonomies can be imported in CSV or XML formats (the WAND Taxonomy Library Portal supports downloads in the Documentum CIS format).

CIS allows for automatic categorization of documents to terms in the taxonomy.  CIS uses a rules based approach to categorization where keywords and key phrases are associated with each taxonomy category as evidence.  This evidence will be used to match documents to categories in the taxonomy.  Depending on the strength of the match of the words in a document to the evidence associated to a taxonomy term, a document may or may not be placed in each given taxonomy category.
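The evidence-and-threshold idea can be sketched in a few lines of Python. This is only an illustration of rules-based categorization in general, not the CIS engine itself. The categories, evidence phrases, weights, and threshold below are all assumptions invented for the example.

```python
import re

# Hypothetical evidence: each category lists keywords or key phrases with a weight.
EVIDENCE = {
    "Contracts": {"agreement": 2, "terms and conditions": 3, "signature": 1},
    "Invoices": {"invoice": 3, "amount due": 2, "payment": 1},
}

THRESHOLD = 3  # assumed minimum score for a document to be placed in a category

def score(text, evidence):
    """Sum the weights of every evidence phrase that occurs in the text."""
    lowered = text.lower()
    total = 0
    for phrase, weight in evidence.items():
        # Word boundaries keep "payment" from matching inside "payments".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            total += weight
    return total

def categorize(text, threshold=THRESHOLD):
    """Return the categories whose evidence score meets the threshold."""
    scores = {cat: score(text, ev) for cat, ev in EVIDENCE.items()}
    return sorted(cat for cat, s in scores.items() if s >= threshold)

doc = "Please find the invoice attached. The amount due is payable on signature."
print(categorize(doc))
```

The sample document mentions "signature", which is weak evidence for Contracts, but only the Invoices evidence is strong enough to clear the threshold. That is the "may or may not be placed" behavior described above.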

Content can also be manually categorized in Documentum Content Server, without CIS.

Once content has been categorized to a taxonomy, a Documentum administrator can make the taxonomies available to end users. This means that users will be able to see the taxonomies and use them to navigate and search content. From Documentum's Content Server 7.2 Administrative User Guide:

"When you bring it online, the taxonomy, its categories, and categorized documents appear to users under the Categories node in Documentum Administrator and in Webtop."

This is a powerful tool for making content more usable and findable within Documentum Content Server.

Starting at page 377, you can read more about the detailed capabilities of Content Intelligence Services in Documentum Content Server 7.2:

WAND Taxonomies covering every industry and business topic can be imported into Documentum Content Server to jump start a taxonomy initiative in Documentum.

Tuesday, January 26, 2016

Taxonomy in SAS Ontology Management

Taxonomy and ontology are critical in text analytics because they give context and meaning to the words in the text, making them more than just keywords.

WAND provides taxonomies and ontologies for nearly every industry vertical and business functional area.  These models can jump-start a taxonomy initiative for a text analytics application by giving a strong foundation of terminology for a business area. Today we continue to profile applications where WAND Taxonomies can be imported to add value to an enterprise information management initiative.  

SAS is a leading enterprise software vendor focusing specifically on business analytics.  To support its text analytics and document categorization applications, SAS has a standalone application called SAS Ontology Management which gives users the ability to manage taxonomy and ontology models.

Using SAS Ontology Management, ontologies can be managed including adding concepts, managing hierarchies, creating concept attributes and metadata, and more.  Vocabularies in SAS Ontology Management can be published out to consuming applications via export or API.

Specifically, per SAS's website, SAS Ontology Management...
  • "Includes built-in integration with SAS Enterprise Content Categorization to enable automatic document categorization; entity classes from SAS Ontology Management can be uploaded to a SAS Enterprise Content Categorization Studio data repository as extraction or classifier concepts.... Output to metadata repositories, including SAS Metadata Server, SharePoint, FAST, EMC Documentum, Endeca and others, is connected via APIs"  
We'll follow up on how SAS Enterprise Content Categorization takes advantage of taxonomy in an upcoming blog post.

SAS Ontology Management is an important tool because it serves as a centralized repository for organizations that wish to take a semantic approach to managing content.   Any such organization will need taxonomies and ontologies.  The WAND Taxonomy Library Portal is a great place to start. WAND Taxonomies can speed up the time it takes to get value from an investment in SAS Ontology Management because organizations can more quickly develop the vocabularies needed to be published to consuming applications.

SAS Ontology Management Resources:

Thursday, January 21, 2016

Seven steps to get started with taxonomy in SharePoint

Taxonomy is the underpinning of good information architecture in SharePoint.   If you are using SharePoint for enterprise content management, you need to know about Taxonomy.

Microsoft was one of the first major vendors to showcase the importance of taxonomy when it released the Managed Metadata Service and the term store as major features in SharePoint 2010. This feature has persisted and been improved in SharePoint 2013 and SharePoint Online in O365, and it will be part of the upcoming SharePoint 2016.

For anybody considering taxonomy in SharePoint, here are seven steps to get started:

1) Watch Managed Metadata 101: Taxonomy and Tagging in SharePoint.  This is a popular WAND webinar that has been viewed over 30,000 times (across several posted editions) and gives a feature by feature walk through of the SharePoint Term Store.  Don't worry, this is an educational - not a marketing - webinar.

If you want to go a little bit deeper, watch Managed Metadata 201: Advanced Taxonomy in SharePoint which covers topics such as how to set up metadata based search refiners and walks through how to control your SharePoint site navigation using the term store.

2) Download the WAND General Business Taxonomy.  This is a free download that provides a starter taxonomy which you can easily import to the SharePoint term store.  This gives you business relevant content that you can use to begin getting some hands on experience with taxonomy and the term store in your own instance of SharePoint.  You want to become very familiar with how the term store works and the WAND General Business Taxonomy will give you what you need to give it a real test drive.

3) Map out Your SharePoint Information Architecture.  This includes laying out your site navigation, content types, search filters, and metadata strategy.  Begin to determine which bodies of content will be placed where.  You may have a separate site for HR for example.  Begin to sketch out the columns of metadata that you would like to capture across your information architecture.  Once you have this map, you can begin to identify which term sets, or taxonomies, you will need to create to tag various bodies of content.  

Microsoft TechNet has a great article that goes into depth with advice on how to plan your Managed Metadata.

4) Develop your Term Sets.  Begin to create your term sets in the term store. Term sets should include important business processes, documents, and concepts that your users may want to use to tag and then search for content in SharePoint.  Get input from business users, but don't ask those users to create the taxonomies themselves.  A good practice is to spend 60 minutes with 2-3 stakeholders from each of your content areas (HR, Accounting, Product, etc).  Spend a short period of time explaining the value of creating a taxonomy. This Before and After: The impact of WAND Taxonomy on SharePoint Search document can help illustrate the value.

For the remainder of the session, you want to get as much information from the stakeholders as possible about what types of content they generate, what people in their department often look for, and what terms or concepts are important to their department. Ask the stakeholders to bring sample documents that you can test the taxonomy against. Ask to see any existing folder structures in shared drives (or elsewhere) that they may use. While you don't want to copy a folder structure for your term sets, it will give you some great insight into the way that they have thought to organize their content and the terms they have used. Once this session is complete, create a first draft of the taxonomy in a spreadsheet and then share it with the stakeholders so they can provide comments and feedback. This is a great time to make sure you are capturing as many synonyms as possible.

The WAND Taxonomy Library Portal is a great resource for accessing taxonomies covering nearly every industry and business functional area.   This content can be downloaded and customized so that you aren't starting with a blank page when creating your term sets.  If you have pre-built taxonomy, the sessions with your stakeholders can be spent specifically customizing those taxonomies to tune and polish them for your organization instead of building them from scratch.

5) Develop your tagging strategy.  Once your taxonomy has been created, you will need to have a strategy for tagging your content with that taxonomy.  The three major approaches to tagging are manual tagging, default values, and automatic tagging.    Manual Tagging, where users tag documents when checking them in, and default values, generally based on file location, can both be done in SharePoint out of the box.  If you want automatic tagging, you will have to invest in one of several automatic tagging solutions that are available as add-ons for SharePoint.

6) Deploy your term sets. Oftentimes, it's hard to create a taxonomy for the entire organization and deploy it all at once. It may be best to develop an initial taxonomy for one division (HR is a great choice) and roll it out to that group's SharePoint site first. Begin to populate your managed metadata column with tags from the new term set and make sure that you have enabled these values as refiners in SharePoint search so that users will see, first-hand, the value of adding metadata to content.

This first departmental taxonomy will be your showcase for the power of taxonomy and managed metadata across the organization.  An internal showcase in a high profile division is a great way to build momentum for an enterprise wide taxonomy project in SharePoint.

7) Create your taxonomy governance plan. Taxonomy should not be created and then ignored. Taxonomy should be part of your overall SharePoint Governance plan to make sure that the taxonomy continues to grow and evolve as your organization does. Watch  Managed Metadata 301: Term Store Custom Properties and Taxonomy Governance in SharePoint  to learn about some of the fundamentals of taxonomy governance.
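To illustrate the tagging approaches from step 5, here is a small Python sketch of how manual tags and location-based default values can work together. It does not call any SharePoint API. The folder names, terms, and paths are made up, and the rule that manual tags win over defaults is an assumption for the example.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a library or folder name to a default term.
DEFAULT_TERMS = {
    "hr": "Human Resources",
    "accounting": "Accounting",
    "product": "Product",
}

def default_tag(file_path):
    """Return the default term for the first folder in the path that has one."""
    # parts[:-1] drops the file name so only folders are checked.
    for folder in PurePosixPath(file_path).parts[:-1]:
        term = DEFAULT_TERMS.get(folder.lower())
        if term:
            return term
    return None

def tags_for(file_path, manual_tags=None):
    """Manual tags take priority; otherwise fall back to the location default."""
    if manual_tags:
        return list(manual_tags)
    default = default_tag(file_path)
    return [default] if default else []

print(tags_for("/sites/HR/Policies/leave.docx"))
print(tags_for("/sites/HR/Policies/leave.docx", ["Benefits"]))
```

Automatic tagging, the third approach, would replace the fallback with a classifier that reads the document's content rather than its location.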

Tuesday, January 19, 2016

Taxonomy in Oracle Webcenter Content Categorizer

Taxonomy is a foundational component of an enterprise information architecture in content management. Oracle Webcenter is a major content management system, so we want to explore how taxonomy can be used with this tool. Unfortunately, online documentation about how to manage taxonomies within Webcenter is extremely thin. We do see, however, copious references to the fact that Webcenter does use taxonomies.

A key tool inside of Oracle Webcenter is called Content Categorizer. Content Categorizer allows for automatic tagging or classification of documents to taxonomy metadata, either in a batch or in real time as content is checked in. Content Categorizer has its own rules engine for classification, but it also has an open API so that clients can use an external taxonomy manager and classification engine (such as SmartLogic Semaphore). As a note, if you are using an external taxonomy management tool, the answer as to how to edit taxonomies, import taxonomies, etc. is extremely clear: you do it in that tool.

From Oracle's documentation:

"Content Categorizer provides organizations with the capability to use one or more taxonomies within WebCenter Content Server. In addition to its out-of-the-box categorization tools and functionality, Content Categorizer provides an open API for third-party categorization engines. With this open architecture, users can take advantage of the rule sets and taxonomies provided by third-party categorization tools. As a result, organizations can choose the categorization engine that best fits their business needs. For example, organizations can use their existing vertical industry taxonomy to organize their managed content into specific categories and subcategories."

Additional Resources:

More details on Content Categorizer from Oracle

If you are an Oracle Webcenter client, ask your client services rep for more information.   WAND Taxonomies can be used as a starting point for creating your own custom industry or business functional taxonomies for use in Oracle Webcenter.

Thursday, January 14, 2016

Taxonomies in SAP NetWeaver Enterprise Portal Knowledge Management

If you are an SAP NetWeaver Enterprise Portal client, you have access to tools that allow you to leverage taxonomy.   Taking advantage of these taxonomy capabilities can help you get the most out of your NetWeaver Enterprise Portal KM and make your enterprise documents as useful and accessible as possible.

As an overview of NetWeaver Knowledge Management on its website, SAP notes that:

"With the Knowledge Management functional unit, SAP NetWeaver provides a central, role-specific point of entry to unstructured information from various data sources. This unstructured information can exist in different formats such as text documents, presentations, or HTML files. Workers in an organization can access information from different source such as file servers, their intranet, or the World Wide Web. A generic framework integrates these data sources and provides access to the information contained in them through the portal.
The Knowledge Management functional unit supports you in structuring information and making it available to the correct target audience. You can use the different functions on all content of integrated data sources, as long as the technical conditions are met."

So, this provides us a framework for this application.  Now, how is taxonomy used?  In SAP NetWeaver Knowledge Management, taxonomy is used to classify documents. Then, the documents can be browsed via the centralized taxonomy regardless of where the documents are actually stored. 

Again from SAP:

"A taxonomy is a hierarchical structure of categories in which you classify documents according to content, organizational, or other criteria. Documents that are stored in different repositories can be included in the same category. Taxonomies allow portal users to navigate in a uniform structure throughout an organization even if information is stored in heterogeneous storage locations.
After the initial configuration has taken place, the system automatically classifies new and changed documents."

SAP Help Portal goes into depth on how taxonomy is used and the taxonomy features of SAP NetWeaver Knowledge Management:

SAP NetWeaver Taxonomy Based Document Classification

SAP also has a document classification capability which can be done via a Query-based Classification or via an Example-based Classification. 

Query-based classification operates by assigning a search query to each taxonomy category. Then, documents are assigned to the taxonomy categories based upon the query.  This is essentially a rules based approach.

Example-based classification operates by assigning exemplar documents to each taxonomy category.  This is a training set based approach.
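The difference between the two approaches is easier to see side by side. The Python sketch below is a generic illustration and not SAP's implementation. The categories, queries, and exemplar documents are made up, the "query" is simplified to a set of words that must all appear, and cosine similarity over word counts is our stand-in for whatever training method the real engine uses.

```python
import re
from collections import Counter
from math import sqrt

def tokens(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Query-based: a hypothetical query per category, here a set of words that must all appear.
QUERIES = {
    "Travel": {"travel", "expense"},
    "Security": {"password", "policy"},
}

def classify_by_query(text):
    """Return every category whose query words all occur in the document."""
    words = set(tokens(text))
    return sorted(cat for cat, query in QUERIES.items() if query <= words)

# Example-based: hypothetical exemplar documents per category serve as a training set.
EXAMPLES = {
    "Travel": ["Submit travel expense reports within 30 days of the trip."],
    "Security": ["Change your password every 90 days per the security policy."],
}

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify_by_example(text):
    """Return the category whose exemplars are most similar to the document, or None."""
    doc = Counter(tokens(text))
    best_cat, best_score = None, 0.0
    for cat, samples in EXAMPLES.items():
        profile = Counter(w for sample in samples for w in tokens(sample))
        similarity = cosine(doc, profile)
        if similarity > best_score:
            best_cat, best_score = cat, similarity
    return best_cat

print(classify_by_query("Our travel policy covers every expense."))
print(classify_by_example("How do I reset my password?"))
```

The query-based rule is transparent and easy to audit but only matches what the query author anticipated. The example-based classifier needs no hand-written rules, but its results depend entirely on how representative the exemplar documents are.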

More details between the two approaches from SAP are at

How to get started with Taxonomy in SAP NetWeaver

Taxonomies from the WAND Taxonomy Library Portal can easily be imported directly through the SAP NetWeaver Taxonomy import capability.
This is a great way to jump start an SAP NetWeaver taxonomy initiative with highly relevant pre-built industry and business function taxonomies that can be customized specifically for your business. 

Tuesday, January 12, 2016

How to use Taxonomies in Drupal

As the world's leading provider of taxonomies, WAND's role is to evangelize and raise awareness about where taxonomies can actually be used.  Today we'll talk about Drupal. Drupal is one of the most popular open source content management systems, used to power websites, e-commerce sites, and portals.

Drupal has built-in taxonomy support as well as a robust ecosystem of add-on modules that can enhance the experience of using taxonomy. Taxonomy is a Core module in Drupal, which means it is included as part of the Drupal Project Download.

Taxonomy is important in Drupal for three primary reasons:

     1) Taxonomy in Drupal can be used as a framework for overall site navigation.

     2) Taxonomy in Drupal can be used as a set of tags that can be applied to content. Content that has been well-tagged with a curated taxonomy is easier to search and reuse across a website.

     3) Creating synonyms within your taxonomy ensures that content is tagged in a consistent manner regardless of the words people want to use for a concept.  H.R. = Human Resources.

In essence, taxonomy provides the basic foundation for a Drupal site's information architecture.
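To show how a hierarchical vocabulary can drive site navigation, here is a short Python sketch. It is a language-neutral illustration rather than Drupal code, and the vocabulary is hypothetical. Each term stores a link to its parent, which is enough to produce both a breadcrumb trail and an indented menu.

```python
# Hypothetical vocabulary: each term points to its parent (None for top-level terms).
PARENT = {
    "Products": None,
    "Software": "Products",
    "Hardware": "Products",
    "Support": None,
    "Downloads": "Support",
}

def breadcrumb(term):
    """Walk parent links up to the root and return the path from the top down."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path[::-1]

def menu(parent=None, depth=0):
    """Return the navigation menu as indented lines, one per term."""
    lines = []
    for term, term_parent in PARENT.items():
        if term_parent == parent:
            lines.append("  " * depth + term)
            lines.extend(menu(term, depth + 1))
    return lines

print(" > ".join(breadcrumb("Software")))
print("\n".join(menu()))
```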


"Taxonomy can be used in workflow, to customize defined sections of your website with different themes or to display specific content based on taxonomy terms."

If you've decided to deploy taxonomies in your Drupal site, the WAND Taxonomy Library Portal will give you access to pre-defined taxonomies covering nearly every industry and every business topic. Taxonomies can be downloaded in a format that will easily import directly into Drupal.

I've gathered some additional resources that go into greater detail on how to take advantage of taxonomy to make the content on your Drupal site more accessible for your users.

       Documentation for how to use Taxonomies in Drupal:

Organizing Content With Taxonomies:
About Taxonomies:

       Add-on modules to enhance the basic taxonomy capabilities of Drupal: