Clusterpoint Server automatically creates and maintains a complete database index, the Clusterpoint Index, which is used for high-performance data access and search across all of your stored documents.

Our indexing model differs radically from the relational SQL world. In relational database systems, indexes have to be specified explicitly for fast search, with primary and secondary keys applied in an application-specific way across a normalized multi-table database structure. Getting the indexing system right requires a lot of effort and painstaking attention to technical detail from the database architects developing SQL systems. Because it is rather complex, relational indexing is prone to many database administration and software development errors. It is often quite expensive to maintain knowledge and application software versioning over the relational database life cycle, and one of the key reasons for this cost is the underlying complexity of the legacy database indexing system.
Relational indexing requires a customized design of all database entities and relationships. Very often it starts as a custom design of the database and its indexing model conceived by one person, and it quickly becomes outdated or overly complicated as the database is developed and goes into real-life production and maintenance. Most legacy relational databases gradually become hard to use and understand for the application developers who follow. We believe that this legacy indexing model is aging: it is based on an SQL database architecture concept developed more than 30 years ago. The hardware capacity and processing power available today also make this model obsolete. It is simply too complex to design, maintain and use efficiently, and it requires complex software and complex knowledge to manage all that complexity.
Unlike relational databases, Clusterpoint Server indexes everything that goes into our XML database: all data items, texts, dates, numbers, strings, relations etc. It also uses a special type of index data storage, which we developed to implement a generic and scalable indexing system for a massively distributed database architecture, potentially running across a large number of hardware servers (thousands of servers, each storing only part of the total database). Clusterpoint Index is a full database content index and takes advantage of modern hardware capacity: abundant disk storage and RAM, cheap CPU power and ubiquitous networking. Today, hardware and storage cost is nearly nothing compared to the cost of software development, integration and maintenance.
To reduce database management complexity, we simplified database indexing as far as possible: index everything that may be searched for. Clusterpoint Index is designed for high-speed search in any database structure. It is independent of the database structure and therefore much simpler to use and manage than a fixed-structure relational SQL index. It eliminates the need for application software to "know" about the index structure and therefore removes a lot of index-specific programming effort from database software developers' agendas. DBAs also have far fewer problems to handle with a single uniform database index than with the often bizarre indexing systems of the SQL world, which they need to support separately for each particular database.
Clusterpoint Index is updated in real time whenever new documents are stored in Clusterpoint storage or existing ones are modified or deleted.
At the database system architecture level, Clusterpoint Index was designed by "marrying" the mathematical concepts of graph database indexing with our custom inverted index, used for full-text search, and with traditional Btrieve-type indexes, used for structural index elements that require traditional sorting methods.
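To make the inverted-index part of this design more concrete, here is a minimal sketch of an inverted index that is updated immediately when a document is stored, modified or deleted. It is an illustration only, not Clusterpoint's implementation: the class name `TinyInvertedIndex`, its methods and the whitespace tokenizer are all assumptions made for this example.

```python
from collections import defaultdict


class TinyInvertedIndex:
    """Toy in-memory inverted index: term -> set of document ids (hypothetical)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.docs = {}                    # doc id -> set of terms last indexed

    def _terms(self, text):
        # Naive tokenizer (an assumption): lowercase and split on whitespace.
        return text.lower().split()

    def store(self, doc_id, text):
        """Insert or modify a document; the index is updated immediately."""
        self.delete(doc_id)               # drop stale postings on modification
        terms = set(self._terms(text))
        self.docs[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        """Remove a document together with all of its postings."""
        for term in self.docs.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return the ids of documents that contain every query term."""
        terms = self._terms(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyInvertedIndex()
index.store(1, "fast xml search")
index.store(2, "fast index")
print(index.search("fast xml"))
```

Because every write goes through `store` or `delete`, a query issued right after a change already sees the new state, which is the behavior the text describes as real-time updating.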

The picture above illustrates the general concept of how the Clusterpoint database indexing system works. Your XML documents are stored in the Document repository in their original XML format. We do not change them. Clusterpoint Server uses the content of your XML database to create and maintain a fast-access, RAM-based Vocabulary and a disk-based Clusterpoint Index. For each database storage on a particular cluster node, the resulting index is completely autonomous and serves only to provide ultra-fast access to the documents stored on that particular cluster node.
This database storage and indexing architecture provides high resilience against service unavailability in massively clustered environments. Even if some hardware fails, the failure affects only a small part of the total database, which may be temporarily unavailable, so your database services are never lost completely. The architecture also allows the database storage on each cluster node to be mirrored into as many identical copies as necessary within a cluster, guaranteeing 100% availability of database services through additional mirroring of cluster storage.
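The effect of per-node autonomy and mirroring can be sketched in a few lines. This is a simplified model under stated assumptions, not Clusterpoint's cluster manager: the function `available_shards`, the layout dictionary and the node names are hypothetical, and a database part is considered available as long as at least one node holding a copy of it is alive.

```python
def available_shards(shard_replicas, failed_nodes):
    """Return the ids of database parts that still have at least one live copy.

    shard_replicas maps a shard id to the nodes holding identical copies of it.
    """
    return {
        shard
        for shard, nodes in shard_replicas.items()
        if any(node not in failed_nodes for node in nodes)
    }


# Hypothetical layout: s1 and s3 are mirrored on two nodes, s2 is not mirrored.
layout = {"s1": ["n1", "n2"], "s2": ["n3"], "s3": ["n4", "n5"]}

# Nodes n1 and n3 fail: s1 survives through its mirror, only s2 is unavailable.
print(sorted(available_shards(layout, {"n1", "n3"})))
```

In this model, a failure never takes down more than the parts stored on the failed nodes, and adding mirrors shrinks the set of failures that can make any part unavailable.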
The internal structure of Clusterpoint Index is engineered using a set of methods and algorithms that produce a unique 'atomic' index from any database content stored on each cluster node. Clusterpoint Server software indexes and stores every smallest item of information present in the documents of the XML database storage managed by that particular cluster node: words, values, emails, strings, numbers, dates, relations, XML tags etc., along with the required relationships and Information Ranking attributes. That is why we call this an 'atomic' indexing model.
This 'atomicity' of the Clusterpoint database index, together with Information Ranking, makes it possible to partition a database of any size into as many parts as necessary in a distributed hardware cluster and still guarantee ultra-fast and relevant search, using just the basic elements of our 'atomic' index as search terms. This 'atomic' index forms the foundation of the Clusterpoint database platform's high-performance indexing and search mechanism.
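As a rough illustration of what splitting a document into 'atoms' could look like, the sketch below walks an XML document and emits every tag and every word-like token together with the tag path it was found under. It is a simplified assumption, not the actual Clusterpoint algorithm: the function `atomize`, the tokenizing regular expression and the sample document are invented for this example, and text that follows a closing child tag is ignored for brevity.

```python
import re
import xml.etree.ElementTree as ET


def atomize(xml_text):
    """Yield (tag_path, atom) pairs for the tags and tokens of an XML document."""
    root = ET.fromstring(xml_text)

    def walk(element, path):
        here = f"{path}/{element.tag}"
        # The tag itself is treated as an atom too.
        yield here, element.tag
        # Keep emails, dates and numbers in one piece; trim stray punctuation.
        for raw in re.findall(r"[\w@.\-]+", element.text or ""):
            token = raw.strip(".-").lower()
            if token:
                yield here, token
        for child in element:
            yield from walk(child, here)

    yield from walk(root, "")


doc = (
    "<document><title>Index basics</title>"
    "<email>info@example.com</email><date>2011-05-01</date></document>"
)
for path, atom in atomize(doc):
    print(path, atom)
```

Keeping the tag path next to each atom is what later allows a ranking rule to treat the same word differently depending on where in the document structure it occurs.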
We have also developed lightning-fast database querying algorithms that work with this 'atomic' index, combined with your XML documents, in a totally different way, one that is orders of magnitude more efficient at relevant information retrieval and search than any relational software.
This efficiency is achieved through the Information Ranking mechanism, which lets you assign relevancy and rank to the information in your total database content, customizing your database search algorithms for the exact data grouping and ordering that your users consider "the most relevant". This customization is crucial for a good user search experience, particularly in large databases, where a search query can match thousands or even millions of results, producing overwhelming result sets and frustrating multi-page browsing for users.
The overall organization of Clusterpoint 'atomic' index elements can be illustrated as a huge graph of interconnected "atoms" (all database index elements), ranked for relevance:

In this model (please forgive us if you did not find physics interesting), our database index organization can be explained by analogy with the grouping and ordering of chemical elements in the Periodic Table of Elements, where each atom has a mass and a chemical energy. Imagine Clusterpoint DBMS as a chemical laboratory that takes matter (all database content), splits it into basic chemical elements, and organizes and groups them in Clusterpoint Index for fast access and search, much as chemical elements are grouped in the Periodic Table of Elements. Clusterpoint DBMS also provides you with a mechanism to assign your own customized atomic mass (XML data structure relevancy weight) and chemical energy (document rate) as ranking attributes to all of your index 'atoms'. And we provide the engine that uses this mechanism to instantly retrieve the most massive and most energetic 'atoms' and everything that is made of them (your XML documents), sorted and grouped for meaningful search.
Technical details on how to apply Clusterpoint information ranking can be found in the section Information Ranking.
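The two ranking attributes from the analogy can be sketched as follows. This is a hedged illustration only: the text does not say how the structure weight and the document rate are combined, so multiplying them is an assumption, and the names `Hit`, `STRUCTURE_WEIGHT` and `rank_documents` as well as the weights themselves are hypothetical and do not reflect the Document Policy format.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    doc_id: str   # document in which the search term was found
    tag: str      # XML element in which it was found
    rate: float   # document rate ("chemical energy")


# Hypothetical structure weights ("atomic mass") per XML tag.
STRUCTURE_WEIGHT = {"title": 3.0, "body": 1.0}


def score(hit):
    # Assumption: relevance is the product of structure weight and document rate.
    return STRUCTURE_WEIGHT.get(hit.tag, 1.0) * hit.rate


def rank_documents(hits):
    """Return document ids ordered by their best-scoring hit, highest first."""
    best = {}
    for hit in hits:
        best[hit.doc_id] = max(best.get(hit.doc_id, 0.0), score(hit))
    return sorted(best, key=lambda doc_id: (-best[doc_id], doc_id))


hits = [Hit("a", "body", 5.0), Hit("b", "title", 2.0), Hit("c", "title", 1.0)]
print(rank_documents(hits))  # ['b', 'a', 'c']
```

Here a match in a title of a modestly rated document outranks a body match in a highly rated one, which shows how changing the weights changes what users see first.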
As a result, a database system based on Clusterpoint Server can instantly find any content in any custom database through a simple and user-friendly query mechanism, with Internet-style ad hoc query terms, and return results sorted by your own customizable relevance.
With Clusterpoint DBMS you can design your own unique ranking system for your valuable business databases. By controlling your databases through your own custom-ranked search, you can provide a high level of customer satisfaction and a great search experience that others would struggle to match. You can even protect the custom Information Ranking rules defined for a Clusterpoint database with patents and as a commercial secret. Our Document Policy file, which describes your ranking rules, is a structured rule set uniquely designed for your particular XML database storage and can therefore be protected as a trade secret.
Our customizable database index ranking system simplifies database search and makes it extremely powerful and fast in any database. It is capable of always bringing the most relevant database search results onto the first web page, sorting and grouping your database information according to your own business needs. This is in stark contrast with some closed and proprietary information ranking systems available on the Internet, where you depend on someone else to organize your information.
To use another analogy, Clusterpoint indexing technology makes it possible to instantly "find any needle in a very large hay stack", where the "hay stack" is the complete content of your database. In fact, it can also find "a needle in thousands of hay stacks", executing a sub-second database search query across a large cluster of servers without the performance penalties characteristic of legacy database architectures. There is even more search power: to continue the "hay stack" analogy, Clusterpoint database technology can find not only "a single needle" but instantly "all needles from all hay stacks (cluster nodes), delivered sorted by weight and length, with the heavier and longer needles first". And we also supply you with a ranking mechanism to assign a custom weight and length to each needle relative to the other needles.
We hope that the two analogies above illustrate how Clusterpoint Index and the related Information Ranking work. For technical details about database ranking and index customization, please see the section Information Ranking.
Technically, the Clusterpoint indexing mechanism creates a fast, pre-sorted disk index for your XML database, which improves the overall efficiency of your IT system by requiring less CPU and less disk access across your IT infrastructure. The Clusterpoint indexing model is very efficient in the write-seldom, read-many-times database usage models that dominate the web-driven IT industry today.
It eliminates the repetitive per-query sorting and grouping of data done by an SQL server, a legacy method that requires loading large index files, or substantial parts of them, into memory for efficient data sorting and processing. Clusterpoint Index does not require constant swapping of large index files from disk to memory to achieve fast search queries. By eliminating the need to read tens or hundreds of megabytes of index data from disk per query, as in SQL databases, Clusterpoint Server reduces the workload on database server disk subsystems radically, by several orders of magnitude. This significantly reduces the energy footprint, since less powerful servers are needed to manage large databases. Installed server capacity can be re-used for other purposes or switched off.
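The idea of paying the sorting cost at write time instead of at query time can be sketched with a posting list that is kept in rank order on every insert. This is an in-memory illustration under assumed names (`PresortedPostings`, `add`, `top`), not the on-disk format of Clusterpoint Index.

```python
import bisect


class PresortedPostings:
    """Posting list for one search term, kept sorted by rank at write time."""

    def __init__(self):
        # Entries are (-rank, doc_id) so that ascending order means best rank first.
        self._entries = []

    def add(self, doc_id, rank):
        # The sorting cost is paid once, when the document is written.
        bisect.insort(self._entries, (-rank, doc_id))

    def top(self, k):
        # A query only reads a prefix of the list; nothing is sorted per query.
        return [doc_id for _, doc_id in self._entries[:k]]


postings = PresortedPostings()
postings.add("a", 1.0)
postings.add("b", 5.0)
postings.add("c", 3.0)
print(postings.top(2))  # ['b', 'c']
```

In a write-seldom, read-many workload this trade is favorable: each of the many reads touches only the first few entries, however long the list has grown.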
In the Clusterpoint 'atomic' index architecture, most index data retrieval is sequential and packetized into small 5K-10K data transfer transactions between cluster nodes. In essence, for each search term Clusterpoint Server performs direct disk access to find the respective 'atom', with a tree of ranked attribute "leaves" pointing to matching documents. Probably only a few sectors of data are read from disk per search term per cluster node. Because all "leaves" are already sorted according to the relevance defined by the Information Ranking rules, only a minimum of disk operations is required per server to return the most relevant data to the cluster node that initiated the query. Most of the time the Clusterpoint Server system takes to answer a particular query is spent waiting on network data transfers, while all 5K-10K packets are received and merged and the final result set is delivered to the requesting web application. Typically it takes around 0.2 seconds to respond to a search query in the Clusterpoint architecture, even with disk access performed on multiple networked cluster nodes.
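The merge step on the node that initiated the query can be sketched as a k-way merge of result lists that each node has already sorted by relevance. This is a minimal illustration, not the Clusterpoint wire protocol: the function `merge_top_k` and the `(score, doc_id)` tuples are assumptions, and the per-node lists stand in for the small packets described above.

```python
import heapq
from itertools import islice


def merge_top_k(node_results, k):
    """Merge per-node hit lists, each sorted by descending score, into a top-k list.

    Each hit is a (score, doc_id) tuple. Because every input is already sorted,
    the merge is lazy and stops as soon as k hits have been produced.
    """
    merged = heapq.merge(*node_results, key=lambda hit: -hit[0])
    return list(islice(merged, k))


# Hypothetical partial results returned by two cluster nodes.
node_a = [(9.0, "a1"), (4.0, "a2")]
node_b = [(7.5, "b1"), (7.0, "b2"), (1.0, "b3")]

print(merge_top_k([node_a, node_b], 3))  # [(9.0, 'a1'), (7.5, 'b1'), (7.0, 'b2')]
```

Since no node has to send more than its own best k hits, the amount of data merged per query stays small no matter how large the total database is.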
For heavily used databases, these savings also translate into substantially fewer disk input/output operations, reduced disk data buffering and caching needs, and big savings in processing power requirements. Add to these savings the fewer database search transactions that result from avoiding unproductive multi-page browsing. With Clusterpoint database software the most relevant data is almost always available on the first web page, so multi-page browsing of results is not necessary. Less browsing activity also reduces traffic volumes on your corporate web servers, application servers and network (traffic that is commonly encrypted with SSL, taking an extra toll on CPUs), and further reduces resource consumption across your core IT systems. Finally, instantly responsive corporate databases cut the time your employees spend waiting for search results and the time that computing resources at their workplaces sit idle. Cutting transaction volume and making every corporate database search fast and relevant with Clusterpoint software contributes to overall business productivity and work efficiency.
Taking into account all of the above about energy efficiency, we have firm grounds to believe that the Clusterpoint database server software platform delivers a much "greener", more power-saving data management technology than any SQL technology could.
The Clusterpoint database indexing model is independent of the cluster network configuration and of the total number of cluster nodes. It enables our customers to add new servers to the cluster incrementally as their database size or usage grows. Customers can flexibly increase their database storage capacity by scaling out the cluster with new servers, or reconfigure the cluster's operational capacity by moving parts of the database to different servers, without negatively affecting the performance of their database operations. This flexible scalability is an ideal fit for modern cloud environments: all clustering setup and operations for a database can be performed by system administrators, without involving application and database software programmers. In fact, there is no need to change application software at all, unlike many cluster systems, which require database partitioning logic to be built into the application software as well.

In a massively clustered environment, many of your time-consuming database tasks, such as indexing or reindexing, can easily be split among the large number of servers available in the Clusterpoint distributed database architecture, yielding productivity gains proportional to the number of servers in the cluster.
The Clusterpoint database architecture is designed to scale efficiently to run on hundreds and even thousands of computers in a single cluster. With hundreds of underlying servers, some database software adjustments may be needed to match the network configuration and avoid network-related bottlenecks, yet our database software platform was designed at the system architecture level with this generic capacity to scale out linearly. We welcome partners who can suggest projects where we can test-drive the software on massive data sets requiring such scalability, and who can provide hardware resources. Please see also Partnerships.
Below are our sample scalable database application projects, for which we can also provide the prototype application software to our partners.
The linear scale-out ability of Clusterpoint DBMS starts to be affected when a large number of extra servers all use the same network switching infrastructure in a linearly connected way. Network transaction times among cluster nodes, although individually very short, start to affect the overall performance of a distributed database when a large number of cluster nodes are linearly connected.
To achieve a much larger scalable database infrastructure and still maintain the same ultra-fast performance, you may need to set up a custom networking infrastructure among hundreds of hardware nodes so that a minimum of network traffic is switched over the same hardware links.
For example, you can set up a hierarchical server farm (connected in a tree topology), minimizing the number of traffic hops from the top-level cluster nodes to the bottom-level cluster nodes. Clusterpoint Server software is architected and engineered so that it is possible to customize it easily to your specific hierarchical network switching infrastructure, building and efficiently operating really massive, instantly searchable databases such as Internet indexes, huge library and document archives, billions of tweets etc.
Otherwise, without taking the effect of the network switching "fabric" into account, it would not be possible to guarantee sub-second search times for terabyte- and petabyte-size databases.
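A back-of-the-envelope calculation shows why a tree topology keeps the hop count low. The sketch below assumes an idealized tree in which every switching level fans out to the same number of children and the servers sit at the leaves; the function `tree_levels` and the fan-out values are illustrative assumptions, not a Clusterpoint sizing tool.

```python
def tree_levels(servers, fan_out):
    """Smallest number of levels such that fan_out ** levels >= servers."""
    if servers < 1 or fan_out < 2:
        raise ValueError("need at least one server and a fan-out of two or more")
    levels, capacity = 0, 1
    while capacity < servers:
        capacity *= fan_out
        levels += 1
    return levels


# Hops from the top of the tree down to a leaf server grow only logarithmically.
for servers in (100, 1000, 10000):
    print(servers, "servers, fan-out 10:", tree_levels(servers, 10), "levels")
```

Under these assumptions, growing the cluster tenfold adds a single level, whereas in a linearly connected chain the worst-case path grows in proportion to the number of nodes.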
On top of our database platform we have built several scalable demonstration applications that prove this scalability and performance of Clusterpoint DBMS in demanding computing environments.
Sample Application No 1: Global Internet Search Platform
For example, Clusterpoint DBMS software, together with our prototype Global Internet Search Platform application, can be used as a cohesive, interoperable and fully scalable database and application software solution for ambitious Web search infrastructure projects that require a massively scalable and entirely searchable Internet search index. Clusterpoint technology can deliver the necessary scalability and performance. We would be glad to provide interested customers with a streamlined and robust all-inclusive Internet search solution that scales: the Global Internet Search Platform application, based on scalable Clusterpoint DBMS data storage.
The key advantage of the Global Internet Search Platform software is simplicity: no integration of architecturally and conceptually different systems is required. Most often, the integration costs of putting disparate systems together are too high, and managing them requires a lot of attention and effort, which very often, and famously, results in spectacular failures.
We invite interested parties to try Clusterpoint database technology for their Internet search projects to see the difference.

Here is how our technology solution, illustrated above, is designed to work for a national or global Internet search project:
Step 1:
The Internet Crawler application (a key part of our Global Internet Search Platform) is launched from all cluster nodes for automatic link spidering, downloading Internet information into a distributed Clusterpoint DBMS database.
Step 2:
All downloaded data is stored in the Clusterpoint cluster storage (database), which spans multiple hard-wired servers interconnected in a tree-like network topology designed with a maximum of 4 or 5 levels of networking 'hops' between servers on any two different levels of the cluster hierarchy.
Step 3:
Step 1 is repeated, re-crawling the Internet. This time, and on all subsequent re-crawls, customer-defined Information Ranking rules are applied to all database objects simply by rewriting them. Initially, two full crawls of the Internet are necessary, because the customer-defined Information Ranking algorithm depends on previously collected full database statistics: the database collected during Step 1 is built from zero, so statistically meaningful and correct data is not available until the full data set has been crawled and collected at least once. All subsequent crawls and incremental index updates further improve Information Ranking, provided it is based on a recurring statistical re-calculation algorithm, for example one that takes into account aggregated totals from previously calculated statistics for each database object of interest. After a minimum of two full Internet crawls, the resulting data set is relevantly indexed from the search application user's point of view; it is then possible to apply an even more finely tuned algorithm, so that index quality improves with each crawling round.
Step 4:
The resulting Internet search index database (with all original content downloaded, cached and stored in the Clusterpoint database) is used to operate a large-scale Internet search service. With appropriate hardware and networking infrastructure, the system can potentially store billions of database objects and make them searchable through simple ad hoc end-user queries, with query response times of a small fraction of a second. With Clusterpoint DBMS, our customers can provide search results based on their own relevancy algorithms, using whatever ranking formulas they consider competitive.
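The reason two full crawls are needed before ranking becomes meaningful can be sketched as a two-pass computation. This is a toy model under loud assumptions: the text does not say which statistics a customer would collect, so the inbound link count used here is only an example, and the functions `collect_statistics` and `apply_ranking` and the tiny three-page "Internet" are hypothetical.

```python
from collections import Counter


def collect_statistics(pages):
    """Pass 1: count inbound links per URL across the complete crawl."""
    inbound = Counter()
    for url, links in pages.items():
        for target in set(links):      # count each linking page once
            if target != url:          # ignore self-links
                inbound[target] += 1
    return inbound


def apply_ranking(pages, inbound):
    """Pass 2: rewrite every page with a rate derived from pass-1 statistics."""
    return {
        url: {"links": links, "rate": 1 + inbound[url]}
        for url, links in pages.items()
    }


# Hypothetical crawl result: URL -> outgoing links found on that page.
pages = {"a": ["b", "c"], "b": ["c"], "c": ["a", "a"]}

statistics = collect_statistics(pages)     # only complete after a full crawl
ranked = apply_ranking(pages, statistics)  # applied while rewriting objects
print(ranked["c"]["rate"])  # 3
```

The rate of a page depends on pages that may be crawled long after it, so it cannot be computed correctly until the first pass has seen the whole data set; later crawls can then refresh the statistics incrementally.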
You can read more about the Global Internet Search Application in the Products / Applications section of our Web site.
Sample Application No 2: Clusterpoint Network Traffic Surveillance System Application
Another sample application that illustrates our database technology is the Clusterpoint Network Traffic Surveillance System (NTSS), a scalable network traffic capture, storage and search database application that runs on top of Clusterpoint DBMS.
Following customer feedback, we have set up a dedicated product Web site for NTSS, as it quickly turned from a prototype scalable database application into a full-fledged commercial software product that solves customer problems in the area of IT and business security and compliance. In essence, the NTSS application makes it possible to create and maintain a fully and instantly searchable database of corporate network traffic.
Please visit the Clusterpoint NTSS product Web site to learn more.