Step 2: Ranking of Text Content

Clusterpoint Server provides a mechanism for ranking the textual content of any custom XML data item, to achieve the best search query relevance from a classic enterprise search point of view. It is activated only when the customer defines a Relevancy Weight Interval for a particular XML data item: two values, a minimum (0-100%) and a maximum (0-100%), written for example in range syntax as 20..50 in the Document Policy file. During a search query, the assigned Relevancy Weight Interval is dynamically translated into an actual search relevancy weight between those two values, depending on how well the textual content of that XML data item matches the particular full-text search query.
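
The distinction between an interval and a single fixed weight can be sketched in a few lines of Python. This is only an illustration of the 20..50 range notation mentioned above: the names `WeightInterval`, `parse_weight` and `uses_textual_ranking` are hypothetical, and the validation rules (minimum not greater than maximum, both within 0-100) are assumptions, not Clusterpoint's actual Document Policy parser.

```python
from typing import NamedTuple


class WeightInterval(NamedTuple):
    """Relevancy weight bounds in percent (hypothetical helper type)."""
    minimum: float
    maximum: float


def parse_weight(spec: str) -> WeightInterval:
    """Parse '20..50' into an interval; a single value such as '40' is a fixed weight."""
    parts = spec.strip().split("..")
    if len(parts) == 1:
        low = high = float(parts[0])
    elif len(parts) == 2:
        low, high = float(parts[0]), float(parts[1])
    else:
        raise ValueError(f"not a weight or weight interval: {spec!r}")
    # Assumed validation: both bounds are percentages and are ordered.
    if not 0 <= low <= high <= 100:
        raise ValueError(f"weights must satisfy 0 <= min <= max <= 100: {spec!r}")
    return WeightInterval(low, high)


def uses_textual_ranking(interval: WeightInterval) -> bool:
    """Only a true interval (min < max) leaves room for dynamic textual ranking."""
    return interval.minimum < interval.maximum
```

With these assumptions, `parse_weight("20..50")` describes an item whose weight is decided at query time, while `parse_weight("40")` describes an item with a fixed weight.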

Please note that in our sample above, the 'Main text' item is configured this way, which activates Clusterpoint Server's built-in enterprise search algorithms for textual content. If an XML data item has a single fixed Relevancy Weight instead, this functionality is not used.

In the example above, the 'Main text' item was assigned textual content ranking based on an interval of two relevancy weights: from 20% to 50%.

When a relevancy weight interval rule is configured for a particular XML item, the total number of query term matches and their positions in the textual content of that field ('Main text') are taken into account.

For textual content ranking, the Clusterpoint database engine calculates the resulting search relevancy dynamically. This further orders the search results for this particular XML item, placing higher those results that have multiple query term matches at earlier positions in the text.

For example, a single query term hit will result in 20% relevancy, while repeated hits will raise the result up to 50% relevancy. Hits that are closer to each other, and hits near the beginning of the text content in the 'Main text' field, are also considered more relevant. In our sample it could look like this:

[Figure: XML database item textual ranking with enterprise search features, proximity search, phrase search]

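The behaviour described for the 20..50 interval can be approximated with a small Python sketch. The engine's real formula is not given here, so everything in this block is an assumption made for illustration: the function name `text_relevancy`, the three signals (hit count, proximity, early position) and the way they are blended. It only reproduces the stated properties: no match gives no weight, a single hit gives the minimum, and more hits that sit closer together and nearer the start move the weight toward the maximum.

```python
import re


def text_relevancy(text: str, query: str, low: float = 20.0, high: float = 50.0) -> float:
    """Map a full-text match onto [low, high]; return 0.0 when nothing matches."""
    words = re.findall(r"\w+", text.lower())
    terms = set(re.findall(r"\w+", query.lower()))
    positions = [i for i, word in enumerate(words) if word in terms]
    if not positions:
        return 0.0

    hits = len(positions)
    # 0.0 for a single hit, approaching 1.0 as hits repeat (assumed curve).
    count_signal = 1.0 - 1.0 / hits
    # 1.0 when the first hit is the first word of the text.
    early_signal = 1.0 - positions[0] / len(words)
    # 1.0 when all hits are adjacent, smaller as they spread out.
    if hits > 1:
        proximity_signal = (hits - 1) / (positions[-1] - positions[0])
    else:
        proximity_signal = 0.0

    # Assumed blend: hit count drives the score, proximity and position refine it.
    quality = count_signal * (0.5 + 0.25 * proximity_signal + 0.25 * early_signal)
    return low + (high - low) * quality


single = text_relevancy("the quick brown fox", "fox")
spread = text_relevancy("fox a b c d fox", "fox")
adjacent = text_relevancy("fox fox a b c d", "fox")
print(single, spread, adjacent)
```

In this sketch the three sample texts rank in the order single hit, spread-out hits, adjacent hits, and every matching text stays inside the configured interval.
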
This is a very powerful option for databases that store unstructured data and text values which need to be processed by classic enterprise search methods, raising a document's relevancy and position in the search results according to how well the search query matches the actual textual content. Any XML database is in essence a document database, and this extra textual ranking feature will benefit many web applications that store both structured and unstructured data.

For advanced textual search processing, customers often have to resort to additional enterprise search tools, which adds to the overall complexity of their IT systems.

We deliver this textual content match ranking capability out of the box: the classic enterprise search relevance algorithm can be activated for those XML data structure items that contain text by defining a Relevancy Weight Interval in the Document Policy configuration file.

Please note that textual ranking fits into our overall model of information ranking, since it produces a single overall relevancy weight at search time and can therefore be freely combined with the XML database structure ranking based on fixed Relevancy Weights described in Step 1. In fact, it lets you rank how well your database's textual content matches a query even if the other two ranking mechanisms are fixed or unused in the Clusterpoint architecture (if you wish, you can skip the Step 1 and Step 3 ranking methods and get a clean, fast full-text search for any of your databases).

Our interval-based relevancy mechanism also offers much greater flexibility in other use cases than the fixed-weight model used by some classic enterprise search engines. For example, to handle database exceptions such as boosting a paid search placement for a customer, the customer can create meta XML markup that raises or lowers the relevancy of a particular database object simply by repeating a few strings within a technical XML tag, without changing the overall customized Information Ranking mechanism. Please note that this exception handling method does not require any change to the application software either. We have customers who have significantly reduced the software development effort that would otherwise go into modifying application software case by case.
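
The boosting idea can be illustrated with a short Python sketch. It is not Clusterpoint's implementation: the `<boost>` tag, the `sponsored` marker string, the 0..30 interval and the function name `boost_weight` are all hypothetical, and the hit-count curve is an assumed stand-in for the engine's textual ranking. The point is only that repeating a string inside a technical tag changes the weight without touching application code.

```python
import xml.etree.ElementTree as ET


def boost_weight(document_xml: str, tag: str, marker: str,
                 low: float, high: float) -> float:
    """Weight contributed by a technical tag, growing with repeated marker strings."""
    root = ET.fromstring(document_xml)
    node = root.find(tag)  # direct child of the document root
    text = node.text or "" if node is not None else ""
    hits = text.lower().split().count(marker.lower())
    if hits == 0:
        return 0.0
    # Assumed curve: one hit yields the minimum, more repeats approach the maximum.
    return low + (high - low) * (1.0 - 1.0 / hits)


plain = "<document><title>Acme</title><boost></boost></document>"
paid = "<document><title>Acme</title><boost>sponsored sponsored sponsored</boost></document>"

print(boost_weight(plain, "boost", "sponsored", 0.0, 30.0))
print(boost_weight(paid, "boost", "sponsored", 0.0, 30.0))
```

Here the only difference between the two documents is the content of the technical tag, which is the kind of data-level adjustment the paragraph above describes.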

Ultimately, our customers can set up the most advanced data grouping and sorting algorithms of their own choice for search, potentially based on very complex business rules for the exact positioning of search results, mixing and tuning structured, semi-structured and unstructured (textual content) search. They can do this through changes to Clusterpoint's Document Policy configuration file alone or, in special cases, by adjusting the database content, without any complex application software programming.

In our opinion, this separation of XML database structure information ranking from application software logic is a major advantage, as it requires no changes to the application software at all. You can keep adjusting your custom Clusterpoint Index rules for the best user search experience until the search result relevancy rules defined in your Document Policy produce the results you consider most relevant. You can then reuse the Clusterpoint database and its search functionality across all of your application platforms.


© Clusterpoint Ltd. 2006-2012. All rights reserved