From Peter Wilkerson, Search & Discovery Practice Area Manager
In May, IBM announced the next version of OmniFind Enterprise Edition. When you read through the announcement you will see a number of enhancements and new features. The feature of particular interest here is “Facets” and how they can be created. This article considers the new combination of functionality and exactly what it means for Enterprise search. If you get complaints that your enterprise search “just doesn’t work,” this article will help you understand why that happens and give you some new solutions to the problem.
Facets can be used in many ways. A common use is to let users filter large result sets based on the attributes of documents in the result set. In the Enterprise, some common facets are document type, department, project code and the like. These are all very useful when searching, but they don’t always help people find what they are looking for.
Let’s think outside the box for a moment. How else might we guide our users to documents that they want and need?
What if…
What if we looked beyond the usual attributes, such as a document’s digital attributes and the business context in which it was created or used (document type, department, author, project code, etc.), and added a different set of facets based on a different kind of characteristic: a customer-interest facet? (Sidebar: I describe all individuals using search as customers, whether they are searching as members of the Enterprise or as B2B/Retail customers.)
Here is how I get people to think about customer-interest facets:
If you knew someone from HR was searching your content, what kind of documents would they most likely want to see? How about people from your Marketing Department? Customer Service? People in each of these departments are likely to have a very different idea of the “ideal search result set” – even if they all entered the same search keywords (which is why it seems that you “just can’t win” when you are tuning search).
What can you do?
You need to identify characteristics in these documents that differentiate the “ideal document result set” for any given customer group from the less useful documents returned. You also need to consider how these differentiators might help you identify the ideal result set for your other user groups. Once you’ve identified these differentiating characteristics, you need a way to “automatically” assign a facet that identifies a document as relevant to a given group. For our HR scenario, let’s say that we assign a facet value of “HR-relevant” to documents in the ideal result set for HR.
How can you implement this?
There are two pieces of this puzzle. First, you need the ability to influence the sequence of documents in a result set based on facet values. Second, you need a way to assign these values to individual documents in a cost-effective way.
In OmniFind EE v9.1, you are able to influence how important any given document is for a given search. Any document can be made to appear higher in a result set than it would otherwise have appeared, based on the values contained in its fields (sometimes called features). In our HR example, if a person is signed in and is identified as an HR worker, then we can “boost” any documents that have a facet (or feature) value of “HR-relevant.”
Next, there is the problem of how to identify which documents out of hundreds of thousands should be marked as “HR-relevant.” It is not cost-effective to have an individual go through all the documents and make that determination. You could assign codes when a document is created, but that approach has its drawbacks as well. What we need is a way to assign facet values (and other document metadata) based on patterns.
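To make the boosting idea concrete, here is a minimal sketch in plain Python. It is not OmniFind code and does not use any OmniFind API; the `Hit` class, the `GROUP_FACET` mapping, the `rerank` function and the boost factor of 2.0 are all hypothetical names and values chosen for illustration. It simply shows how a facet value tied to the searcher's group could change the order of a result set.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """One search result (hypothetical structure, for illustration only)."""
    doc_id: str
    score: float                      # base relevance score from the engine
    facets: set = field(default_factory=set)


# Assumed mapping from a signed-in user's group to the facet value to boost.
GROUP_FACET = {"HR": "HR-relevant", "Marketing": "Marketing-relevant"}


def rerank(hits, user_group, boost=2.0):
    """Sort hits by score, multiplying the score of hits that carry
    the facet value associated with the user's group."""
    wanted = GROUP_FACET.get(user_group)

    def boosted(hit):
        return hit.score * boost if wanted in hit.facets else hit.score

    return sorted(hits, key=boosted, reverse=True)


hits = [
    Hit("press-release", 0.9),
    Hit("benefits-policy", 0.6, {"HR-relevant"}),
    Hit("campaign-brief", 0.7, {"Marketing-relevant"}),
]

print([h.doc_id for h in rerank(hits, "HR")])
```

With the same keywords and the same base scores, an HR user sees the benefits policy first, while a user from a group with no mapping sees the results in their original score order.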
A pattern can be defined in terms of the intersection of words found in the content, author, document type and other attributes. (There are software packages that can help you identify these patterns.) Once a pattern has been identified and defined, you need the ability to assign a new facet value whenever that pattern occurs. This is where one of my favorite tools comes into play – UIMA.
UIMA stands for “Unstructured Information Management Architecture.” It is a way to extract information based on patterns, presence of keywords/synonyms, proximity of words, parts of speech – all based on unstructured data found in documents being indexed. With OmniFind EE v9.1, it is possible to assign a new value to a facet field based on the UIMA pattern.
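As a rough illustration of what pattern-based facet assignment means, here is a small sketch in plain Python. It is not UIMA and is far simpler than a real UIMA annotator; the `FacetRule` class, the `assign_facets` function, the document fields (`text`, `doc_type`) and the sample keywords are all assumptions made up for this example. The rule fires when a keyword appears in the content and the document type matches, which is the kind of "intersection" described above.

```python
import re
from dataclasses import dataclass


@dataclass
class FacetRule:
    """A hypothetical rule: assign `facet` when any keyword appears in the
    text AND the document type is one of `doc_types` (empty = any type)."""
    facet: str
    keywords: tuple
    doc_types: tuple = ()

    def matches(self, doc):
        text_hit = any(
            re.search(rf"\b{re.escape(k)}\b", doc["text"], re.IGNORECASE)
            for k in self.keywords
        )
        type_hit = not self.doc_types or doc["doc_type"] in self.doc_types
        return text_hit and type_hit


# Sample rule set; the keywords and document types are illustrative only.
RULES = [
    FacetRule("HR-relevant", ("benefits", "payroll", "onboarding"),
              ("policy", "form")),
]


def assign_facets(doc, rules=RULES):
    """Return the set of facet values whose rules match the document."""
    return {rule.facet for rule in rules if rule.matches(doc)}


doc = {"text": "Updated payroll schedule", "doc_type": "policy"}
print(assign_facets(doc))
```

Because the rules live outside the documents, refining what counts as “HR-relevant” means editing a rule and re-indexing, not touching the content sources.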
As a result we have a tremendous amount of flexibility when we need to update or refine what documents should be assigned a facet value (say “HR-relevant” for our earlier HR scenario) – all without having to change the content sources themselves.
Bottom Line
Companies that use this approach will find they are able to customize/personalize result sets based on who is searching (probably based on user login and profile information). This capability will lead to higher satisfaction with search in general. An added benefit is that you will be able to update your search “schema” for weighting documents, and so respond to changing needs and new challenges more easily.
Are you interested in learning more about the new version of OmniFind Enterprise Edition? Send me an email to continue the conversation.