Index Ranking For Search Relevance

Clusterpoint Server provides a powerful information ranking mechanism built into the core of the Clusterpoint DBMS data indexing and retrieval engine. It works together with the Clusterpoint Indexing and Clusterpoint Search mechanisms to drive the core database operations.

You can rank more than 400 billion items per database storage for exact positioning in search results. Search queries still complete at sub-second speed when serving the typical ad hoc Web search requests of users.

This ranking capacity makes it possible to develop complex database applications that need precise positioning, grouping and ordering of data at search time, while still maintaining fast search response times.

Technically, the Clusterpoint DBMS information ranking mechanism is an openly described customization method that lets our customers rank their database information from the point of view of database owners and application users searching for the most relevant information. Customers can freely design, apply and use their own information ranking rules, which Clusterpoint Server takes into account when building the Clusterpoint Index for that particular customer database and when performing search.

Clusterpoint Index is built and organized so that Clusterpoint Server applies your own ranked index configuration to the full content of your database. The server creates and maintains a customized pre-sorted index, so the database delivers extremely fast information access during search queries, including full text search queries over combined structured and unstructured content. Ranking enables exact positioning, grouping and ordering of search output and makes the database index natively scalable in clusters.

This indexing method does not require the massive, repeated sorting of database information at each search query that is characteristic of SQL platforms. Please see the section Clusterpoint Index for more information about how our database indexing works.
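
The difference between sorting at query time and reading from a pre-sorted index can be illustrated with a minimal Python sketch. It is a toy model only and not Clusterpoint's actual index structure; the class name `PreSortedIndex` and its methods are hypothetical names introduced for this illustration.

```python
import bisect


class PreSortedIndex:
    """Toy index that keeps entries in rank order as they are inserted.

    Hypothetical illustration only, not Clusterpoint's implementation.
    """

    def __init__(self):
        # Entries are (-rank, doc_id) tuples, so the best rank comes first.
        self._entries = []

    def add(self, rank, doc_id):
        # The ordering cost is paid once, at indexing time.
        bisect.insort(self._entries, (-rank, doc_id))

    def top(self, k):
        # A query only reads the first k entries; nothing is sorted here.
        return [doc_id for _, doc_id in self._entries[:k]]


index = PreSortedIndex()
index.add(10, "doc-a")
index.add(30, "doc-b")
index.add(20, "doc-c")
best_two = index.top(2)
```

In this sketch the query path is a plain slice of an already ordered list, which is the property the paragraph above attributes to a pre-sorted index.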

Clusterpoint Index scales out for fast and relevant search in a large cluster and provides out-of-the-box full text search and XML-structure search functionality. Clusterpoint Server builds and maintains it completely automatically for any Clusterpoint XML database storage. It can be customized using the Document Policy configuration file, a small XML file that describes the customer-specific rules for applying information ranking to a particular Clusterpoint storage.

Although the Clusterpoint API lets you store and access data objects in both XML and JSON formats, internally the Clusterpoint database stores all data objects in XML. JSON has data format limitations: every JSON data object can be converted to XML in a straightforward, unambiguous way, but not every XML item can be stored in JSON. That is the reason we stick to the industry-standard XML data format internally. Information ranking is always applied to the resulting internal XML structure, even if the data arrived as JSON and was converted for internal storage.
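
To show why the JSON-to-XML direction is straightforward, here is a minimal Python sketch of one possible conversion. It is not the conversion Clusterpoint Server performs. The function name `json_to_xml`, the root tag `document` and the `item` tag used for array elements are assumptions made for this example, and the sketch further assumes that every JSON key is a valid XML tag name.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, tag):
    """Convert a parsed JSON value into an XML element named `tag`.

    Hypothetical mapping for illustration: objects become nested tags,
    array elements become repeated <item> tags, scalars become text.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(json_to_xml(child, key))
    elif isinstance(value, list):
        for item in value:
            elem.append(json_to_xml(item, "item"))
    elif isinstance(value, bool):
        # bool is checked before the generic case so it renders as true/false.
        elem.text = "true" if value else "false"
    elif value is not None:
        elem.text = str(value)
    return elem


doc = json.loads('{"title": "Ranking", "rate": 1234567, "tags": ["xml", "json"]}')
xml_text = ET.tostring(json_to_xml(doc, "document"), encoding="unicode")
print(xml_text)
```

The reverse direction has no equally direct mapping, because XML features such as attributes and mixed text-and-element content have no native JSON counterpart.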

The Clusterpoint Information Ranking concept is technically implemented as an easy-to-understand system for ranking the three most basic objects of any database information:

  • Ranking of customer XML data structure: you can assign Relevancy Weights (0-100%) to rank items of the customer XML data structure. Relevancy Weights apply as a relative ranking among the items of an XML document structure (for example, a Title tag can be ranked 100% and a Notes tag 10%). The higher the Relevancy Weight of an item, the higher the position in search results of documents where search hits match content within that item. Please read more in subsection Step 1: Ranking of XML Data Structure.
  • Ranking of textual content in XML data items: you can assign Relevancy Weight Intervals (0-100%) to dynamically rank the relevance of matches in the content of any textual XML data item. A Relevancy Weight Interval is a pair of values, a minimum and a maximum (for example, 20%..50%), and is used at search time to calculate a single overall Relevancy Weight dynamically, taking into account how many times the search term is present in the particular XML text item and at which positions in the text it is located. Higher relevancy is calculated for hits where search terms occur multiple times and are located closer to each other or to the beginning of the text. This extra ranking makes full text search more user-friendly in XML databases containing a lot of unstructured and semi-structured data. Please read more in subsection Step 2: Ranking of Textual Content.
  • Ranking of data objects (we call them documents): you can assign Document rate values (0 to 2^32) that rank all database documents among themselves. The Document rate tag is an XML tag holding a numeric integer value; it may be added to your XML database data objects (e.g., <rate>1234567</rate>) or selected from existing XML tags such as timestamps or any other numeric sequences. A higher Document rate value means a higher document rank in the entire database and determines the position of that document in search results within a group of equal Relevancy Weight. Please read more in subsection Step 3: Ranking of Documents.
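
The interval-based ranking of textual content described in the second item can be sketched in Python as follows. The formula is invented purely for illustration and is not Clusterpoint's actual calculation, which this section does not specify. The function name `interval_weight`, the cap of five occurrences and the equal blend of frequency and position are all assumptions.

```python
def interval_weight(text, term, low, high):
    """Map hits for `term` in `text` to a weight inside [low, high].

    Hypothetical scoring, not Clusterpoint's formula: more occurrences
    and an earlier first occurrence both raise the result.
    """
    words = text.lower().split()
    positions = [i for i, word in enumerate(words) if word == term.lower()]
    if not positions:
        return None  # no hit, so this text item contributes no weight
    # Occurrence count, saturating at an assumed cap of 5 hits.
    frequency = min(len(positions), 5) / 5
    # 1.0 when the first hit is the first word, lower for later hits.
    earliness = 1 - positions[0] / len(words)
    score = (frequency + earliness) / 2
    return low + (high - low) * score


early_and_repeated = interval_weight("ranking of ranking rules", "ranking", 20, 50)
late_and_single = interval_weight("rules of ranking", "ranking", 20, 50)
```

With the example interval 20%..50%, both results stay inside the interval, and the text with repeated, earlier hits receives the higher weight.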

Our ranking mechanism can instantly deliver up to 100 groups of ordered search results, and within each group you can additionally and uniquely rank more than 4 billion documents among themselves. Altogether, more than 400 billion data items can be uniquely ranked in the Clusterpoint architecture. Please see Step 4: Calculation of Relevance.
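
The arithmetic behind these figures, and the two-level ordering they imply, can be sketched in Python. The composite key below is a hypothetical way to express "group by Relevancy Weight, then order by Document rate" and is not taken from Clusterpoint's implementation. The names `rank_key`, `RATE_LIMIT` and `WEIGHT_GROUPS` are assumptions, and rates are assumed to be below 2^32.

```python
RATE_LIMIT = 2 ** 32   # size of the Document rate range
WEIGHT_GROUPS = 100    # up to 100 Relevancy Weight groups

# 100 groups times 2^32 rates per group: more than 400 billion positions.
total_positions = WEIGHT_GROUPS * RATE_LIMIT


def rank_key(weight_percent, rate):
    """Hypothetical composite key: weight group first, Document rate second."""
    return weight_percent * RATE_LIMIT + rate


# Each hit is (document id, Relevancy Weight in percent, Document rate).
hits = [("a", 10, 999), ("b", 100, 5), ("c", 100, 1234567), ("d", 10, 1000)]
ordered = sorted(hits, key=lambda h: rank_key(h[1], h[2]), reverse=True)
ordered_ids = [h[0] for h in ordered]
```

Because every rate is smaller than `RATE_LIMIT`, a document in a higher weight group always outranks any document in a lower group, and the Document rate only breaks ties inside a group.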

With these three basic ranking mechanisms, Clusterpoint Server gives you a powerful and flexible facility for customizing information ranking: the first ranks your XML database structure, the second ranks textual content along best-practice enterprise search principles, and the third ranks your database documents among themselves. It lets you design, program and operate web database applications whose search logic is based on your own business rules and search relevancy preferences.

Please read more about Clusterpoint Information Ranking in the subsections that follow, where we cover it step by step with samples. We understand that information ranking is probably the most difficult concept in Clusterpoint DBMS, as many people have not worked with enterprise search systems before. Please do not hesitate to re-read this section. Once you fully understand the concept, you can start building your own scalable and entirely searchable database applications. You are also welcome to contact Clusterpoint Support with questions.


© Clusterpoint Ltd. 2006-2012. All rights reserved