Clusterpoint API messaging is built on a simple, performance-efficient concept that offers two alternative transport methods. You can choose between them depending on your database interoperability and performance needs.
If simplicity and interoperability are your key design considerations, use the REST HTTP/HTTPS protocol for the Clusterpoint API. If data transfer performance is the priority, use the native binary TCP/IP API libraries that we provide for the most popular programming languages, such as Java, PHP and .NET.
Please note that the difference lies only in the transport protocol layer between the client application and Clusterpoint Server. The message content is always the same: the same XML / JSON messages are exchanged regardless of the transport chosen, whether REST over a web service or the faster TCP/IP method used by our API libraries.
Next we describe in more detail how the Clusterpoint API works and how to use it with each transport protocol.
Alternative No.1 - messaging protocol based on REST web services over HTTP/HTTPS
By default, the Clusterpoint API uses open, cross-platform REST principles supported by virtually any modern programming environment (please see Representational_State_Transfer in Wikipedia). REST is widely regarded as one of the simplest client-server messaging concepts for web application development. It is a very simple transport method that works across platforms: just send request-reply XML / JSON messages over HTTP / HTTPS and process them in your application software.
Please see the next section, Clusterpoint API Messaging Basics, which describes the content of Clusterpoint API messages in detail.
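As a rough sketch of the request-reply pattern described above, the Python snippet below prepares an HTTP POST that carries an XML message. The server URL, the storage name in the path and the envelope element names are placeholders invented for this illustration; the real values are defined in the Clusterpoint API Specification. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical endpoint: host and path are assumptions, not documented values.
SERVER_URL = "https://clusterpoint.example.com/my_storage"


def build_http_request(xml_message: str) -> urllib.request.Request:
    """Wrap an XML message in an HTTP POST request object (nothing is sent)."""
    body = xml_message.encode("utf-8")
    return urllib.request.Request(
        SERVER_URL,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Placeholder message; the real envelope format is described in the API Specification.
request = build_http_request("<request><command>status</command></request>")
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would send it and return the reply body for parsing.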
Alternative No.2 - messaging protocol based on raw binary TCP/IP using client API libraries
If you need a higher-performance messaging system for a large number of relatively small database objects, REST is not the best choice, because the HTTP protocol adds overhead to every message.
With small data objects in particular, the performance difference can be substantial. For example, transferring 100 bytes of XML over HTTP can take 5 milliseconds, while the native TCP/IP-based protocol can do the same task in less than 1 millisecond. If you have many small database updates, or would like to do a mass upload of data, we recommend using the client API libraries that work over raw TCP/IP.
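To see what these example figures mean for a bulk load, the short calculation below multiplies them out. The workload of one million updates and the assumption that messages are sent strictly one after another are ours, chosen only for illustration; real throughput depends on network, hardware and concurrency.

```python
# Illustrative per-message latencies taken from the example figures in the text.
HTTP_MS_PER_MESSAGE = 5.0  # about 5 ms for 100 bytes of XML over HTTP
TCP_MS_PER_MESSAGE = 1.0   # upper bound; the text says "below 1 millisecond"


def total_seconds(message_count: int, ms_per_message: float) -> float:
    """Time needed to send messages one after another, in seconds."""
    return message_count * ms_per_message / 1000.0


updates = 1_000_000  # assumed workload size
http_seconds = total_seconds(updates, HTTP_MS_PER_MESSAGE)
tcp_seconds = total_seconds(updates, TCP_MS_PER_MESSAGE)
print(f"HTTP: about {http_seconds:.0f} s, raw TCP/IP: under {tcp_seconds:.0f} s")
```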
We currently supply native client-side API libraries for the following most popular programming languages:
- JavaScript API Library (coming soon)
We are adding native client API libraries for other languages and programming environments based on customer requests. Please send your requests to Clusterpoint Support Email.
Alternative No.3 - combination of both methods and/or using multi-document commands
With Clusterpoint you can freely combine both transport methods, native raw TCP/IP and REST; Clusterpoint Server supports both at the same time. This is convenient when different usage scenarios of the database call for different methods. You also do not need to change software code that was developed for one transport method if you later decide to use the other method for other parts of your application.
TIP: If you want the high interoperability of REST but need to solve the performance problem of small and frequent database uploads / updates, you can package multiple small XML / JSON documents into a single HTTP request.
In many Clusterpoint API commands (such as insert or update), database objects can be packaged in a container XML / JSON data structure, so that a pack of multiple documents is sent to the database in a single HTTP request. For example, you can pack 100 or 1000 database objects (documents) concatenated as one long XML or JSON string. Clusterpoint Server will process each data object separately, applying the requested API command to each document on the server side. This method effectively does away with the HTTP performance overhead described above, although it may be a bit more complicated to program in application software.
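The packaging idea can be sketched as follows. The wrapper element names (request, content) and the fields of each document are assumptions made for this example, not the documented message format; the point is only that 100 documents travel in one message instead of 100 separate HTTP requests.

```python
import xml.etree.ElementTree as ET


def build_batch_insert(documents):
    """Pack many small documents into one request message.

    The <request> and <content> wrappers are hypothetical names for this sketch.
    """
    request = ET.Element("request")
    ET.SubElement(request, "command").text = "insert"
    content = ET.SubElement(request, "content")
    for doc_id, title in documents:
        document = ET.SubElement(content, "document")
        ET.SubElement(document, "id").text = doc_id
        ET.SubElement(document, "title").text = title
    return ET.tostring(request, encoding="unicode")


# One hundred small documents, sent as a single message body.
docs = [(f"doc-{n}", f"Title {n}") for n in range(100)]
message = build_batch_insert(docs)
print(message.count("<document>"), "documents in one message")
```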
There are only two main types of client-server messages in the Clusterpoint API protocol (messages exchanged between customer application software and Clusterpoint Server):
1) Clusterpoint XML / JSON Request
2) Clusterpoint XML / JSON Reply
Clusterpoint XML / JSON Request and Clusterpoint XML / JSON Reply are two simple XML 1.0 or JSON formatted messages sent over the HTTP or HTTPS protocol. Think of each message as an envelope in which your XML / JSON data object is sent and received over the network. That is all you need to access Clusterpoint Server, update your database, retrieve documents, perform searches etc.
It is a very simple, fast and secure transaction system over HTTP or HTTPS protocols:

Alternatively, the TCP/IP transport protocol can be used to pass those XML / JSON messages through the Clusterpoint API Libraries.
Each Clusterpoint Request contains a specific API command to be executed on Clusterpoint Server. To specify the command, include a <command>command</command> tag in the Clusterpoint Request, followed by any optional parameters for that API command (a set of one or more command-specific extra tags).
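A minimal sketch of assembling such a request is shown below. Only the <command> tag comes from the description above; the <request> root element and the <id> parameter tag are hypothetical stand-ins for whatever the API Specification prescribes.

```python
import xml.etree.ElementTree as ET


def build_request(command, parameters=None):
    """Build a request with a <command> tag plus optional command-specific tags."""
    request = ET.Element("request")  # hypothetical root element name
    ET.SubElement(request, "command").text = command
    for name, value in (parameters or {}).items():
        # Each optional parameter becomes its own extra tag.
        ET.SubElement(request, name).text = str(value)
    return ET.tostring(request, encoding="unicode")


# The <id> parameter tag is an assumption used for illustration.
print(build_request("retrieve", {"id": "doc-42"}))
```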
Clusterpoint Server always returns a Clusterpoint Reply message with the results of the command specified in the Clusterpoint Request, including error codes, formatted as XML or JSON (depending on the data format requested).
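Handling a reply might look like the sketch below. The sample reply, its element names (reply, error, code, text) and the error code value are all invented for this example; consult the Clusterpoint API Specification for the real reply layout.

```python
import xml.etree.ElementTree as ET

# Entirely hypothetical reply, used only to exercise the parser below.
SAMPLE_REPLY = """<reply>
  <command>retrieve</command>
  <error>
    <code>E-0001</code>
    <text>example error text</text>
  </error>
</reply>"""


def parse_reply(xml_text):
    """Return the echoed command name and a list of (code, text) error pairs."""
    root = ET.fromstring(xml_text)
    command = root.findtext("command")
    errors = [(e.findtext("code"), e.findtext("text")) for e in root.findall("error")]
    return command, errors


command, errors = parse_reply(SAMPLE_REPLY)
if errors:
    print(f"'{command}' failed:", errors)
```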
This client-server messaging method keeps communication between any customer application and Clusterpoint Server simple and efficient. It is also an entirely platform-independent, open API.
We have defined all API commands to be self-explanatory and easy to understand and remember, such as 'insert', 'update', 'delete', 'retrieve' etc. You can quickly start using this simplified command set, which requires only minimal learning if you already know SQL. Please see in the next section the list of key Clusterpoint API commands, which drive the most important database server operations from the application developer's point of view.
Please see our XML / JSON Request and XML / JSON Reply message formatting details in Developer's Guide / Clusterpoint API Specification.
Clusterpoint Server internally stores every custom user-defined XML / JSON document in XML format (since JSON cannot accommodate all XML use cases, but XML can accommodate all JSON use cases). It automatically indexes the document's internal data and content whenever a Clusterpoint XML / JSON request message with the API command 'insert', 'update' or 'replace' is sent to Clusterpoint Server with the customer's original XML / JSON document included.

To search, a Clusterpoint XML / JSON request message with the API 'search' command is sent to Clusterpoint Server. This time the content part of the Clusterpoint XML envelope contains the user's search query.
To create a Clusterpoint database, the customer chooses a unique data object <id> (e.g., a URL, file name or database object identifier), assigns a <title> tag or a few tags to be listed in search results, and assigns a custom <rate> value for ordering results according to specific business needs.
The naming of tags is the customer's choice: these names are only a sample, and you can apply any XML / JSON tag names for this functionality. In this way you can create any number of databases (storages) that you need for your software applications from your own collections of XML or JSON formatted data objects (see the <document> tag as an example of a document in our pictures). Database design proceeds top-down, starting from simple and understandable key data objects expressed as complete and undivided XML or JSON documents (initially very simple, later probably acquiring a more complex structure).
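For illustration, the sketch below expresses one data object with the sample tags named above (id, title, rate, wrapped in document) in both formats. The field values and the exact JSON shape are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# Sample data object; the values are made up for this sketch.
record = {"id": "http://example.com/page1", "title": "Example page", "rate": 75}


def to_xml(fields):
    """Express the data object as a complete, undivided XML document."""
    document = ET.Element("document")
    for tag, value in fields.items():
        ET.SubElement(document, tag).text = str(value)
    return ET.tostring(document, encoding="unicode")


def to_json(fields):
    """Express the same data object as JSON (wrapper shape is an assumption)."""
    return json.dumps({"document": fields})


print(to_xml(record))
print(to_json(record))
```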
Transaction results are always returned from Clusterpoint Server as Clusterpoint XML / JSON reply messages. Replies are also formatted in XML or JSON for easy parsing by any programming language (Java, PHP, .NET etc.).
Replies to search queries also contain technical parameters for easy construction of multi-page navigation systems for Web database applications.
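The names of those navigation parameters are not given here, so the sketch below assumes three hypothetical values (total number of hits, offset of the first returned hit, and page size) and shows how a page indicator could be derived from them.

```python
import math


def page_info(total_hits, offset, page_size):
    """Return (current_page, total_pages) for a result list; pages start at 1.

    total_hits, offset and page_size are assumed parameter names, not
    documented fields of the search reply.
    """
    total_pages = math.ceil(total_hits / page_size)
    current_page = offset // page_size + 1
    return current_page, total_pages


current, total = page_info(total_hits=95, offset=20, page_size=10)
print(f"Page {current} of {total}")
```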
You can apply your own CSS, XSLT or other styling rules to format XML output as HTML or as otherwise necessary.
Please note that all data items in your XML / JSON documents can be de-normalized, fully human-readable text values in Clusterpoint DBMS. The full text search engine built into the core Clusterpoint Server software handles text values as well as encoded values without performance loss. There is therefore no need to heavily encode database data, which makes application software more complex and the database structure hard for other programmers to understand.
Please see a more detailed description of Clusterpoint DBMS documents in section Developer's Guide / Understanding Clusterpoint Server Document Structure. That section of the Developer's Guide also describes all configuration attributes that you may apply to your database to customize indexing and search rules (called the Document Policy: a small XML configuration file that contains your own custom indexing and search rules defined for each storage). Please read more about Document Policy in section Information Ranking / Document Policy.
Despite the seemingly simple Clusterpoint API and the scalable cluster data storage concept described above, there is a rich feature set available for developers who would like to exploit the advanced programming options and functionality of Clusterpoint Server.
There are more than 160 software developer options and system configuration options in Clusterpoint Server. Those options enable, for example, deep XPath filtering of data and fast full text search options usually available only in expensive enterprise search tools, including queries by simple keywords, phrases, wildcard templates, multi-level AND, OR, NOT Boolean expressions, proximity search and many other possibilities.
The platform also has language support that makes it possible to store and query data in multiple languages in the same storage in UTF-8, so that our customers can use search terms in multiple languages in a single search query without setting up customized database versions for each language.
As the Clusterpoint API is open and based only on web technology and the industry-standard XML (with optional JSON) data format, our customers can use any favorite programming language to access Clusterpoint Server functionality. It is very similar to Web services, yet it does not require the more complex SOAP or Web-services document type definition schemes (DTDs). In fact, all you need to create, send, receive and parse Clusterpoint XML / JSON messages with API commands, and so start developing database applications for Clusterpoint Server, is your own preferred programming language: PHP, Java, C/C++, C#, JavaScript, Ruby on Rails or any other.
Most application development environments and programming languages support XML / JSON formatted data, representing it internally as objects, arrays, vectors or JSON parameterized strings. The majority have built-in functions to convert XML-formatted or JSON-formatted data to one of those object types and vice versa. One can say that Clusterpoint Server 'speaks' the language of your programming environment, without any specific client software or drivers. Client software was commonly used to access the functionality of legacy database servers. It is more complex to set up and maintain client software in working order across all computers than to use a web-only API, which is open and does not require any client software. Clusterpoint supports the latter, less complex web API model.
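As one example of such built-in conversions, Python's standard library can turn both formats into native objects. The sketch below handles only flat documents without attributes or nesting, and the sample documents are invented for illustration.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(xml_text):
    """Convert a flat XML document (no attributes, no nesting) into a dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


xml_doc = "<document><id>42</id><title>Hello</title></document>"
json_doc = '{"id": "42", "title": "Hello"}'

# Both formats end up as the same native dictionary.
from_xml = xml_to_dict(xml_doc)
from_json = json.loads(json_doc)
print(from_xml == from_json)
```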
Below is a list of key Clusterpoint API commands supported by Clusterpoint Server.
| API command | Functionality description |
| insert | add a document with a unique Document ID to the Clusterpoint storage. The Document ID is any XML tag you specify that holds only unique string values per database, such as a URL, a database primary key, an application-specific unique identification code, a user account number, a session id code or any similar string (uniqueness also applies across a distributed cluster database). You specify which XML tag of your custom XML document structure Clusterpoint Server should use as the Document ID through an XML configuration file called the Document Policy. This small configuration file is supplied for each storage and is used to customize your database according to your own field naming schema, indexing preferences and data structure. See more about Document Policy. |
| update | update or add a document in the Clusterpoint storage. Replaces the existing XML document if a document with the specified unique Document ID already exists in the database; otherwise adds a new one. |
| replace | replace (rewrite) the document in the Clusterpoint storage, using a known Document ID as the identifier |
| delete | delete the document from the Clusterpoint storage, using a known Document ID as the identifier |
| search-delete | delete documents from Clusterpoint storage using Clusterpoint search command syntax as filter |
| clear | delete all documents and all index files from a particular named storage. By default this command empties the database and deletes all indexes but preserves the Document Policy and Storage Configuration files. However, you can also explicitly specify in this command that everything for a particular storage should be removed, including configuration files. Once a storage is deleted permanently together with its configuration files, its storage-named directory no longer exists in the file system and all API commands for that storage generate an error code. |
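The difference between the data manipulation commands can be modelled with a small in-memory stand-in, shown below. This is a toy illustration, not the server's implementation. The table states that 'update' adds the document when the ID is unknown; how 'insert' treats an existing ID and how 'replace' treats an unknown ID is not stated, so rejecting both cases is our assumption.

```python
class ToyStorage:
    """In-memory stand-in for a storage, keyed by Document ID (not the real server)."""

    def __init__(self):
        self.documents = {}

    def insert(self, doc_id, document):
        # Assumption: inserting an ID that already exists is rejected.
        if doc_id in self.documents:
            raise KeyError(f"document {doc_id!r} already exists")
        self.documents[doc_id] = document

    def update(self, doc_id, document):
        # Per the table: replaces an existing document, otherwise adds a new one.
        self.documents[doc_id] = document

    def replace(self, doc_id, document):
        # Assumption: replacing an unknown ID is rejected.
        if doc_id not in self.documents:
            raise KeyError(f"document {doc_id!r} does not exist")
        self.documents[doc_id] = document

    def delete(self, doc_id):
        # Remove one document by its known Document ID.
        del self.documents[doc_id]

    def clear(self):
        # Remove all documents from the storage.
        self.documents.clear()


storage = ToyStorage()
storage.insert("a", "<document><id>a</id></document>")
storage.update("b", "<document><id>b</id></document>")  # added, since "b" is new
print(sorted(storage.documents))
```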
| API command | Functionality description |
| search | processes database search queries in Clusterpoint. This is the most powerful and most advanced Clusterpoint API command, with many features and options. It has a special query syntax based on a simple XML-formatted search request, which supports simple Internet-style keyword search (ad hoc search), structural search on XML data fields similar to SQL, and a combination of both. It also provides Boolean logic and multi-level nested query conditions, and enterprise search options such as word template lookups, word stemming, proximity search etc. Please see the detailed description of the Clusterpoint API 'search' command in Developer's Guide / API command - SEARCH. |
| retrieve | returns a document from the Clusterpoint storage by a known Document ID. The XML document is returned exactly as you stored it: we do not change the document's XML structure. This command uses the Document ID as the one and only primary key to identify and retrieve XML documents, which is why you must assign a Document ID tag in your XML data structure using the Document Policy configuration file. Clusterpoint Server searches for the Document ID within the specified storage (which can be a cluster storage spanning multiple server nodes), reads the complete document from disk, and returns it to the application for processing (for example, by a data entry application or a reporting tool). |
| retrieve-first | retrieve the first documents by Document rate tag; to process the first added documents forward |
| retrieve-last | retrieve the last document by Document rate tag; used for time-stamp rated documents, to process the last added documents backward |
| lookup | search for a document in the Clusterpoint storage and return its Document ID if it exists, together with the specified XML fields to be listed. Useful for checking the presence of a document without retrieving its entire XML content, which can often be hundreds of kilobytes or even megabytes in size. It is much faster than the 'retrieve' command. It can also be customized by listing only the XML field names to be returned, which can be just a few data items, unlike the 'retrieve' command, which returns the full original XML document. |
| select | search for a list of document identifiers using identifiers or wildcards. Unlike the 'retrieve', 'lookup' and 'search' commands, this command always returns only Document IDs. A convenient command for application developers and for routine database system tasks such as batch processing, report making, data importing/exporting etc. |
| similar | searches the Clusterpoint storage for documents similar to a given piece of textual information (content). This command is useful for XML documents that contain some text, such as news articles, Web pages etc. It uses statistical algorithms to determine the most similar documents by text content, based on the frequency of indexed words, their relative occurrence in other documents etc. Because the command is based on a probabilistic algorithm, it can provide satisfactory results only for very large databases, where the number of database objects in a particular language is in the range of millions; otherwise the database content could be statistically insignificant to provide useful results. For example, if you create and operate a large-scale Internet index, this command can provide "Similar pages" functionality for indexed Web pages stored in the Clusterpoint database. Hence the name of the command. |
| alternatives | returns spell-checking suggestions for words using accumulated index elements ('fuzzy search' functionality, which is common in enterprise search tools). This is a very powerful command, as it can provide 'Did you mean ...?' functionality by supplying the correct spelling of search query keywords if your end users mistyped them or do not know their precise spelling. It is also very important that this command uses the actual database index instead of linguistic vocabularies with all possible word forms. As a result, you can always suggest search terms that are actually present in your particular database index, without wasting time checking through all possible lexicographic combinations. You can also customize this command for performance needs: as there can sometimes be too many suggestions to present them all, Clusterpoint Server provides tools to limit the options to the most useful ones. |
| list-last | searches for documents most recently added or modified using the 'insert', 'update' or 'replace' commands |
| list-first | searches for documents first added or modified using 'insert', 'update' or 'replace' commands |
| list-paths | returns all XML field names from a storage in XPath notation |
| list-facets | returns all unique facet values for a specified XML tag, if the index policy is facet |
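The idea behind the 'alternatives' command, suggesting only words that actually occur in the database index, can be imitated in a few lines. The sketch below uses the similarity matching from Python's difflib module as a stand-in; it is not the algorithm Clusterpoint Server uses, and the vocabulary is a made-up example of index words.

```python
import difflib


def suggest(word, index_vocabulary, limit=3):
    """Suggest close matches for a word, drawn only from words present in the index."""
    return difflib.get_close_matches(word, index_vocabulary, n=limit, cutoff=0.7)


# Hypothetical set of words accumulated in a database index.
vocabulary = ["database", "cluster", "document", "storage", "search"]

print("Did you mean:", suggest("databse", vocabulary))
```

The limit argument plays the role of restricting suggestions to the most useful ones when there are too many to present.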
| API command | Functionality description |
| reindex | tells Clusterpoint Server to start reindexing the entire database, taking into account the latest Document Policy configuration file changes and applying the new document Information Ranking rules to each of the Clusterpoint Index elements. Please note that the Clusterpoint database index is always fully updated in real time by the API commands 'insert', 'update', 'delete' and 'replace', so during production, when the database is updated through your application transactions, it is normally not necessary to use 'reindex'. During application development and testing, however, this command is quite useful, as it re-applies Information Ranking rules to the whole existing database without reloading all the documents (which can number in the high millions even for test databases). You can conveniently design and develop your own custom Document Policy, working out the best ranking rules for XML data items, such as relative relevancy weights for XML data fields and the XML tags used for information ranking (pre-sorted indexing), and adjusting the corresponding application algorithms to your business needs. When you consider the job finished, run 'reindex' so that the new Document Policy changes take effect. The command rebuilds the full database index, including the full text search index, processing all stored documents one by one. It may also be useful for quickly recovering normal database operations after an unexpected equipment crash, when you suspect that the database index may be damaged. |
| set-policy | assigns an XML configuration file to a particular Clusterpoint Server storage, defining the Document Policy, which instructs Clusterpoint Server how to apply Information Ranking rules to your custom XML data structure. It is used to assign relevancy flexibly to XML data fields within a document, select specific indexing methods (e.g., skip structure indexing and do only full text, or vice versa, or both), hide certain XML tags from indexing, create virtual XML meta tags that are not present in the document for special search or access needs, assign common alias names to different XML tags to enable consolidated search across multiple fields, and perform other customizations for index optimization and performance tuning. In the simplest case you can specify only two items in the Document Policy for a storage: the XML field containing the Document ID values and the default policy rule to index everything (both structure and full text). All other, more advanced indexing and relevance assignment rules are optional in the Document Policy XML configuration file. The 'set-policy' command is also useful if you want to replace the Clusterpoint Manager user interface for editing the Document Policy XML configuration file with your own software tool, such as an XML editor, or to change the Document Policy configuration dynamically from your application software. |
| get-policy | retrieves the Document Policy configuration file from a particular existing Clusterpoint database storage. This command is useful if your application needs to know which indexing policy and which relevancies of XML data object parts within your document apply to a specific database. It is also useful if you want to replace the Clusterpoint Manager user interface for editing the Document Policy XML configuration file with your own software tool, such as an XML editor, or to change the Document Policy dynamically from your application software. |
| start | start a Clusterpoint database server software instance servicing a particular named storage. When started, each Clusterpoint storage (database) has its own copy of the Clusterpoint Server process in RAM on each cluster node, isolated from all other storages. You can start a single database, or a cluster database on all cluster nodes, with a single 'start' command. In a cluster, all database storages with the same name are started and Clusterpoint Server operates the entire database as a single logical database from the application's point of view. |
| stop | stop a Clusterpoint database server software instance servicing a particular named storage. You can stop a single database, or a cluster database on all cluster nodes, with a single 'stop' command. In a cluster, all database storages with the same name are stopped and Clusterpoint Server shuts down the entire database as a single logical database from the application's point of view. When a storage status is "inactive", no server software instances are running in RAM and client-server database transactions are not possible. This command is useful for a controlled shutdown of database operations or for scheduled downtime maintenance, as it writes all data buffers and transaction logs to disk before stopping the database server software. You should always stop the storage if you want to perform a quick offline backup or restore of a particular database by simply copying the content of the storage-named directory at file system level. Otherwise database integrity may be compromised, because update transactions on a running database server may still be changing the storage directory content while it is being copied. For online database backup/restore without stopping database operations, please use the online backup/restore tools built into Clusterpoint Manager, which are slower but safe. |
| set-configuration | changes the Storage Configuration, another XML configuration file that exists for each storage. It is required less frequently, as it specifies default values of the internal technical parameters of a database storage, such as the size of pre-read buffers, the minimum length of common words to process at high speed in ad hoc search, delimiter characters for full text index elements, the number of concurrent requests to process on particular hardware, and many other parameters related to performance or RAM usage. They can be modified and fine-tuned to specific application needs, primarily to find the combination of parameters that provides the highest database server performance. Once specified, those parameters are seldom changed for a particular storage. You can always change them manually using Clusterpoint Manager. However, in some cases it can be useful to change the Storage Configuration through an API command. This command is provided for the convenience of application developers who would like to control everything from their application. |
| get-configuration | retrieves the Storage Configuration XML file through the API, similarly to Document Policy file retrieval. Useful for modifying the storage configuration on the fly, or for processing it in an application-specific way, for example to inform users that certain short and common words will be skipped in search queries on a database with billions of objects for performance reasons, unless they are explicitly specified as mandatory query terms. |
| status | returns status information about each Clusterpoint server instance (storage) in a cluster. The status information includes many parameters useful to system administrators, database administrators and application developers, including uptime since the last storage server activation, the number of documents in the Clusterpoint storage, the number of unique words in the vocabulary, the total number of words in the Clusterpoint storage, the number of API commands executed and the number of errors that have occurred since the last startup of the storage server instance, the software version, the storage indexing status etc. This is one of the most useful and most frequently called API commands. |
| cluster-status | returns information about cluster configuration and status, including data about mirrored storages, the hardware nodes servicing parts of storages in a striped configuration, and other data that may be useful for developing load-sharing application logic, checking that everything is operating correctly in the cluster, tuning application performance in a heavy multi-user environment etc. |
| cluster | clustering control API command that enables application software to join new cluster nodes on the fly, list storages, drop cluster nodes etc. It has a set of sub-command operations that perform cluster-wide configuration changes. All options are described in more detail in section Clustering. |
| backup | starts an online incremental or full backup |
| restore | restore data from incremental or full backup |
| synchronize | synchronize cluster node database storage with mirror storage on another cluster node |
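The offline backup procedure described for the 'stop' command (stop the storage, copy the storage-named directory, start the storage again) can be sketched as follows. The two callables that issue the 'stop' and 'start' API commands are hypothetical and must be supplied by the caller; directory locations are also assumptions, as the actual storage paths depend on your installation.

```python
import shutil
from pathlib import Path


def offline_backup(storage_dir, backup_dir, stop_storage, start_storage):
    """Stop the storage, copy its directory, then start it again.

    stop_storage and start_storage are caller-supplied callables that send
    the 'stop' and 'start' API commands; how they do so is outside this sketch.
    """
    stop_storage()
    try:
        # Copy only while no update transactions can change the directory.
        shutil.copytree(storage_dir, backup_dir)
    finally:
        # Restart even if the copy fails, so the database does not stay inactive.
        start_storage()
    return Path(backup_dir)
```

Wrapping the copy in try/finally is a design choice of this sketch: a failed copy should not leave the storage stopped.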
The list of main Clusterpoint API commands is updated periodically as new functionality is added to Clusterpoint Server.
Please see the Documentation and release notes for the latest Clusterpoint versions to check whether there are new API commands.
You are welcome to suggest new API commands or new parameters for existing API commands for the Clusterpoint database server platform. Please send your suggestions to Clusterpoint Support Email.
