Clusterpoint Server provides a mechanism for ranking the textual content of any custom XML data item by search query relevance, in the classic enterprise search sense. It is activated only when the customer defines a Relevancy Weight Interval for a particular XML data item: two values, a minimum (0-100%) and a maximum (0-100%), written for example in range syntax as 20..50 in the Document Policy file. During a search query, the assigned Relevancy Weight Interval is dynamically translated into an actual search relevancy weight between those two values, depending on how well the textual content of that XML data item matches the particular full-text search query.
Please note that in our sample above, the 'Main text' item has an interval assigned, which activates Clusterpoint Server's built-in enterprise search algorithms for textual content. With a single fixed Relevancy Weight value for an XML data item, this functionality is not used.
For the 'Main text' item in the example above, we assigned textual content ranking based on an interval of two relevancy weights: from 20% to 50%.
When a relevancy weight interval rule is configured for a particular XML item, the total number of query term matches and their positions in the textual content of this field ('Main text') are taken into account.
For textual content ranking, the Clusterpoint database engine calculates the resulting search relevancy dynamically. It further orders the search results within this particular XML item, placing higher those results that have multiple query term matches at earlier positions in the text.
For example, a single query term hit will result in 20% relevancy, but repeated hits will raise the result toward 50% relevancy. Hits that are closer to each other, and hits at the beginning of the text content in the 'Main text' field, are also more relevant. In our sample it could look like this:
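To make the interval idea concrete, here is a minimal Python sketch of how a range such as 20..50 might be turned into a final weight. It is not Clusterpoint's actual algorithm, which is not described here. The function names (`parse_interval`, `interval_weight`) and the scoring formula are assumptions chosen only to reproduce the behaviour stated above: a single hit yields the minimum, repeated hits move toward the maximum, and hits that are close together or near the start of the text score higher.

```python
import re


def parse_interval(spec):
    """Parse range syntax such as '20..50' into (20.0, 50.0)."""
    low_text, high_text = spec.split("..")
    low, high = float(low_text), float(high_text)
    if not 0 <= low <= high <= 100:
        raise ValueError("interval must satisfy 0 <= min <= max <= 100")
    return low, high


def interval_weight(text, query_terms, interval="20..50"):
    """Map text match quality onto the interval (hypothetical heuristic)."""
    low, high = parse_interval(interval)
    words = re.findall(r"\w+", text.lower())
    terms = {term.lower() for term in query_terms}
    positions = [i for i, word in enumerate(words) if word in terms]
    if not positions:
        return 0.0  # assumption: no hit means this field adds no weight

    hits = len(positions)
    # 0.0 for a single hit, approaching 1.0 as hits repeat.
    repetition = 1.0 - 1.0 / hits
    # 1.0 when the first hit is the first word of the text.
    earliness = 1.0 - positions[0] / len(words)
    # 1.0 when all hits are adjacent; undefined for one hit, so use 0.0.
    span = positions[-1] - positions[0]
    proximity = (hits - 1) / span if span else 0.0

    quality = repetition * (earliness + proximity) / 2.0
    return low + (high - low) * quality


if __name__ == "__main__":
    print(interval_weight("fast database search", ["search"]))
    print(interval_weight("search search search engine", ["search"]))
```

Under this heuristic a text with one hit stays at the lower bound of 20, while a text whose hits are repeated, adjacent and at the very beginning moves toward 50 without ever exceeding it.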
This is a very powerful option for databases that store unstructured
data and text values which need to be processed by classic enterprise
search methods. It raises a document's relevancy and position in the
search results when its actual textual content better matches the
search query. Any XML database is in essence a document database, and
this extra textual ranking feature will benefit many web applications
that store both structured and unstructured data.
For advanced textual search processing, customers often have to resort
to extra enterprise search tools, which adds to the overall complexity
of their IT systems.
We deliver this textual content match ranking capability out of the
box: you can activate the classic enterprise search relevance
algorithm for those XML data structure items that contain text by
defining a Relevancy Weight Interval in the Document Policy
configuration file.
Please note that textual ranking fits into our overall model of
information ranking, because it produces a single overall relevancy
weight at search time and can therefore be freely combined with the
XML database structure ranking with fixed Relevancy Weights described
in Step 1. In fact, it allows you to rank how well your database's
textual content matches a query even if the other two ranking
mechanisms in the Clusterpoint architecture are fixed or unused (if
you wish, you can skip the Step 1 and Step 3 ranking methods and get
clean, fast full-text search for any of your databases).
Our interval-based relevancy mechanism also provides much greater flexibility in other use cases than the fixed-weights model used by some classic enterprise search engines. For example, to address database exceptions such as boosting a paid search placement for a customer, a customer can create meta XML markup that increases or decreases the relevancy of a particular database object simply by repeating a few strings within a technical XML tag, without changing the overall customized Information Ranking mechanism. Please note that this exception-handling method also does not require changes to the application software. We have customers who have significantly reduced the software development effort that would otherwise go into modifying application software case by case.
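The boosting idea can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: the `<boost>` tag name, the sample documents, and the `boosted_weight` formula are hypothetical and do not reflect Clusterpoint's real markup or scoring. It only shows that repeating a string inside a technical tag raises the hit count, which an interval-based weight can then translate into a higher relevancy.

```python
import re
import xml.etree.ElementTree as ET


def count_hits(xml_text, tag, term):
    """Count occurrences of `term` in the text of every `tag` element."""
    root = ET.fromstring(xml_text)
    total = 0
    for element in root.iter(tag):
        words = re.findall(r"\w+", (element.text or "").lower())
        total += words.count(term.lower())
    return total


def boosted_weight(hits, low=20.0, high=50.0):
    """Hypothetical mapping: one hit gives `low`, more hits approach `high`."""
    if hits == 0:
        return 0.0
    return low + (high - low) * (1.0 - 1.0 / hits)


# Two otherwise identical records; the second repeats the keyword in the
# hypothetical technical tag to mark a paid placement.
regular = "<document><title>Hotel Riga</title><boost>hotel</boost></document>"
promoted = ("<document><title>Hotel Riga</title>"
            "<boost>hotel hotel hotel</boost></document>")

if __name__ == "__main__":
    for name, doc in (("regular", regular), ("promoted", promoted)):
        hits = count_hits(doc, "boost", "hotel")
        print(name, hits, boosted_weight(hits))
```

Because the boost lives in the stored data rather than in the ranking rules, the application code and the ranking configuration stay the same for both records.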
Ultimately, our customers can set up the most advanced data grouping and sorting algorithms of their own choice for search, potentially based on very complex business rules for the exact positioning of search results, mixing and tuning structured, semi-structured and unstructured (textual content) search. They can do so with changes to Clusterpoint's Document Policy configuration file only or, in special cases, by adjusting database content, without any complex application programming.
In our opinion, this separation of XML database structure information ranking from application software logic is a great advantage, as it does not require changing the application software at all. You can adjust your custom-defined Clusterpoint Index rules until the search result relevancy rules defined by your Document Policy give the best user search experience. Then you can reuse the Clusterpoint database and its search functionality across all of your application platforms.