
Clusterpoint API Protocol Options

Clusterpoint API messaging is built on a simple, performance-efficient concept. It offers two alternative transport options, and you can choose between them according to your interoperability and performance needs.

If simplicity and interoperability of design are your key considerations, use the REST protocol over HTTP/HTTPS for the Clusterpoint API. If data transfer performance is your key priority, use the native binary TCP/IP API libraries that we provide for popular programming languages such as Java, PHP and .NET.

Please note that the difference lies only in the transport protocol layer between the client application and Clusterpoint Server. The message content is always the same XML / JSON, regardless of which transport is chosen: REST over a web service, or the faster TCP/IP method provided by our API libraries for the supported programming languages.

Next we describe in more detail how the Clusterpoint API works and how to use it under each of these transport protocol models.

Alternative No.1 - messaging protocol based on REST web services over HTTP/HTTPS

By default, the Clusterpoint API uses open, cross-platform REST principles, which are supported by virtually any modern programming environment (please see Representational_State_Transfer in Wikipedia) and are widely regarded as one of the simplest client-server messaging concepts for web application development. This transport method is simple and works on any platform: send request-reply XML / JSON messages over HTTP / HTTPS and process them in your application software.
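As a minimal sketch of the request-reply exchange described above, the following Python code wraps an XML message in an HTTP POST using only the standard library. The endpoint URL, the `<request>` root element and the content type are assumptions made for illustration; the real values are defined in the Clusterpoint API Specification and depend on your installation.

```python
import urllib.request

# Hypothetical endpoint: the real host, port and path depend on your installation.
ENDPOINT = "http://localhost:8080/example_storage"


def build_http_request(xml_message: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Wrap an XML request message in an HTTP POST without sending it."""
    return urllib.request.Request(
        endpoint,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},  # assumed content type
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the reply body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return reply.read().decode("utf-8")


# The <request> root element is an assumption; <command> is the tag named in this guide.
request = build_http_request("<request><command>status</command></request>")
# reply_text = send(request)  # uncomment once ENDPOINT points at a real server
```

Building the request and sending it are kept separate so that the message can be inspected or logged before it goes over the network.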

Please see the section Clusterpoint API Messaging Basics below, which describes the content of Clusterpoint API messages in detail.

Alternative No.2 - messaging protocol based on raw binary TCP/IP using client API libraries

If you need higher messaging performance for a large number of relatively small database objects, REST is a poor fit, because the HTTP protocol adds overhead to every request.

With small data objects in particular, the performance difference can be substantial. For example, transferring 100 bytes of XML over HTTP can take 5 milliseconds, while the native TCP/IP protocol can do the same task in under 1 millisecond. If you have many small database updates, or want to do a mass upload of data, we recommend the client API libraries that work over raw TCP/IP.
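To put the example figures above in perspective, here is a small back-of-the-envelope calculation in Python. It assumes that requests are sent one after another and that each request costs exactly the quoted time; real numbers depend on your network, hardware and concurrency.

```python
HTTP_MS_PER_REQUEST = 5.0  # example figure quoted above for 100 bytes of XML
TCP_MS_PER_REQUEST = 1.0   # upper bound quoted above ("under 1 millisecond")


def total_seconds(request_count: int, ms_per_request: float) -> float:
    """Total time for sequential requests, in seconds."""
    return request_count * ms_per_request / 1000.0


updates = 100_000  # illustrative batch size, not a measured workload
http_total = total_seconds(updates, HTTP_MS_PER_REQUEST)  # 500.0 seconds
tcp_total = total_seconds(updates, TCP_MS_PER_REQUEST)    # at most 100.0 seconds
```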

We currently supply native client-side API libraries for the following most popular programming languages:

- PHP API Library

- Java API Library

- .NET API Library

- JavaScript API Library (coming soon)

We add native client API libraries for other languages and programming environments based on customer requests. Please send your suggestions to Clusterpoint Support Email.

Alternative No.3 - combination of both methods and/or using multi-document commands

With Clusterpoint you can freely combine both transport methods, native raw TCP/IP and REST, because Clusterpoint Server supports both at the same time. This lets you apply the most appropriate method to each usage scenario of the database. You also do not need to change existing code that was developed for one transport method if you later decide to use the other method for other parts of your application.

TIP: If you want the high interoperability of REST, you can still solve the performance problem of small, frequent database uploads and updates by packaging multiple small XML / JSON documents into a single HTTP request.

In many Clusterpoint API commands (such as insert or update), database objects can be packaged in a container XML / JSON data structure, so that a pack of multiple documents is sent to database storage within a single HTTP request. For example, you can pack 100 or 1000 database objects (documents) concatenated as one long XML or JSON string. Clusterpoint Server then processes each data object separately, applying the requested API command to each document on the server side. This method largely removes the HTTP performance overhead described above, although it may be a bit more complicated to program in application software.
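The packaging idea can be sketched in Python as follows. The `<request>` and `<content>` element names are assumptions chosen for illustration; `<command>` and `<document>` are the names used in this guide, and the exact container structure is defined in the Clusterpoint API Specification.

```python
import xml.etree.ElementTree as ET


def build_batch_request(command: str, documents: list) -> str:
    """Pack many small documents into one XML request message."""
    request = ET.Element("request")               # hypothetical envelope root
    ET.SubElement(request, "command").text = command
    content = ET.SubElement(request, "content")   # hypothetical container element
    for fields in documents:
        document = ET.SubElement(content, "document")
        for tag, value in fields.items():
            ET.SubElement(document, tag).text = str(value)
    return ET.tostring(request, encoding="unicode")


# 100 small documents travel in a single message instead of 100 HTTP requests.
docs = [{"id": f"item-{n}", "title": f"Item {n}"} for n in range(100)]
batch_xml = build_batch_request("insert", docs)
```

The resulting string would be sent as the body of one HTTP request, so the per-request overhead is paid once for the whole pack.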


Clusterpoint API Messaging Basics

There are only two main types of client-server messages in the Clusterpoint API protocol (messages exchanged between customer application software and Clusterpoint Server):

1) Clusterpoint XML / JSON Request
2) Clusterpoint XML / JSON Reply

A Clusterpoint XML / JSON Request and a Clusterpoint XML / JSON Reply are two simple messages, formatted as XML 1.0 or JSON and sent over HTTP or HTTPS. Think of each message as an envelope in which your XML data object is sent and received over the network. That is all you need to access Clusterpoint Server, update your database, retrieve documents, perform searches and so on.

It is a very simple, fast and secure transaction system over HTTP or HTTPS.


Alternatively, the TCP/IP transport protocol can be used to pass the same XML / JSON messages through the Clusterpoint API libraries.

Each Clusterpoint Request names one specific API command to be executed on Clusterpoint Server. To specify the command, include a <command>command</command> tag in the Clusterpoint Request, followed by any optional parameters of that command (a set of one or more command-specific extra tags).

Clusterpoint Server always returns a Clusterpoint Reply message with the results of the command specified in the Clusterpoint Request, including error codes, formatted as XML or JSON depending on the data format required.
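Because every reply can carry error codes, client code typically checks for them before using the results. The Python sketch below shows one way to do this. The reply layout (`<reply>`, `<error>`, `<code>`, `<text>`) and the error code 9999 are made up for illustration; the real reply format is described in the Clusterpoint API Specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply: element names and the error code are placeholders.
SAMPLE_REPLY = """<reply>
  <command>retrieve</command>
  <error><code>9999</code><text>Example error text</text></error>
</reply>"""


class ReplyError(Exception):
    """Raised when a reply message reports an error."""

    def __init__(self, code: str, text: str):
        super().__init__(f"{code}: {text}")
        self.code = code
        self.text = text


def parse_reply(reply_xml: str) -> ET.Element:
    """Return the parsed reply, or raise ReplyError if it contains an error."""
    root = ET.fromstring(reply_xml)
    error = root.find("error")
    if error is not None:
        raise ReplyError(error.findtext("code", default=""),
                         error.findtext("text", default=""))
    return root
```

Turning reply errors into exceptions keeps the calling code free of repeated checks after every command.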

This client-server messaging method keeps communication between any customer application and Clusterpoint Server simple and efficient. It is also an entirely platform-independent, open API.

We have defined all API commands to be self-explanatory and easy to remember, such as 'insert', 'update', 'delete' and 'retrieve'. If you already know SQL, you can start using this simplified command set with minimal learning. The list of key Clusterpoint API commands below covers the most important database server operations from the application developer's point of view.

Please see the formatting details of XML / JSON Request and XML / JSON Reply messages in Developer's Guide / Clusterpoint API Specification.

Clusterpoint Server internally stores any custom user-defined XML / JSON document in XML data format (JSON cannot accommodate all XML use cases, but XML can accommodate all JSON use cases). It automatically indexes the document's internal data and content when a Clusterpoint XML / JSON request message with the API command 'insert', 'update' or 'replace' is sent to the server with the customer's original XML / JSON document included.


To search, a Clusterpoint XML / JSON request message with the API search command is sent to Clusterpoint Server; this time the content part of the Clusterpoint XML envelope contains the user's search query.

To create a Clusterpoint database, the customer chooses a unique data object <id> (e.g., a URL, file name or database object identifier), assigns a <title> tag or a few tags to be listed in search results, and assigns a custom <rate> value that orders results according to specific business needs.

The naming of tags is your choice: the names above are only a sample, and you can use any XML / JSON tag names for this functionality. In this way you can create, from your own collections of XML or JSON formatted data objects (see the <document> tag as an example of a document in our pictures), any number of databases (storages) that you need for your software applications. You start from a top-down database design with simple, understandable key data objects expressed as complete and undivided XML or JSON documents, which are initially very simple and may acquire a more complex structure later.
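As an illustration, the sample document described above can be written out in both formats with a few lines of Python. The tag names `id`, `title` and `rate` follow the sample in this guide; the field values, including the rate of 75, are invented for the example.

```python
import json
import xml.etree.ElementTree as ET


def document_to_xml(fields: dict) -> str:
    """Serialize a flat dictionary as a <document> element."""
    document = ET.Element("document")
    for tag, value in fields.items():
        ET.SubElement(document, tag).text = str(value)
    return ET.tostring(document, encoding="unicode")


# Illustrative values only; any tag names may be used in practice.
fields = {"id": "http://example.com/page1", "title": "Example page", "rate": 75}
xml_document = document_to_xml(fields)
json_document = json.dumps(fields)
```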

Transaction results are always returned from the Clusterpoint server as Clusterpoint XML / JSON reply messages.  Replies are formatted in XML or JSON again for easy parsing by any programming language (Java, PHP, .NET etc.).  

Replies for search queries also contain technical parameters for easy construction of multi-page navigation systems for Web database applications.  
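A minimal Python sketch of how such parameters could drive multi-page navigation is shown below. It assumes the reply provides the total number of hits and the offset of the first returned document; those parameter names are hypothetical and should be checked against the Clusterpoint API Specification.

```python
import math


def page_links(total_hits: int, docs_per_page: int, current_offset: int) -> dict:
    """Compute page count, current page number and the offset of each page."""
    page_count = math.ceil(total_hits / docs_per_page)
    current_page = current_offset // docs_per_page + 1
    offsets = [page * docs_per_page for page in range(page_count)]
    return {"page_count": page_count, "current_page": current_page, "offsets": offsets}


# 45 hits shown 10 per page, currently displaying the results starting at offset 20.
navigation = page_links(total_hits=45, docs_per_page=10, current_offset=20)
```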

You can apply your own CSS, XSLT or other styling rules to format the XML output as HTML or in any other form you need.

Please note that all data items in your XML / JSON documents can be de-normalized, fully human-readable text values in Clusterpoint DBMS. The full text search engine built into the core Clusterpoint Server software handles plain text values as well as encoded values without performance loss. There is therefore no need to heavily encode database data, which makes application software more complex and the database structure harder for other programmers to understand.

Please see more detailed description about Clusterpoint DBMS documents in section Developer's Guide / Understanding Clusterpoint Server Document Structure.  This section of Developer's Guide also describes all configuration attributes which you may apply to your database for customization of indexing and search rules (called Document policy - a small XML configuration file that contains your own custom indexing and search rules defined for each storage).  Please read more about Document policy also in section Information Ranking / Document Policy.

Despite the seemingly simple API and scalable cluster data storage concept described above, Clusterpoint offers a rich feature set for developers who would like to exploit the advanced programming options and functionality of Clusterpoint Server.

There are more than 160 software developer options and system configuration options in Clusterpoint Server. These options enable, for example, deep XPath filtering of data and fast full text search features usually available only in expensive enterprise search tools, including queries by simple keywords, phrases, wildcard templates, multi-level AND, OR, NOT Boolean expressions, proximity search and many other possibilities.

The platform also supports storing and querying data in multiple languages in the same storage using UTF-8, so you can use search terms in multiple languages in a single search query without setting up a customized database version for each language.

Because the Clusterpoint API is open and based only on web technology and the industry-standard XML data format (with optional JSON), you can use your favorite programming language to access Clusterpoint Server functionality. It is very similar to Web services, yet it does not require the more complex SOAP or Web-services document type definition schemes (DTDs). In fact, all you need in order to create, send, receive and parse Clusterpoint XML / JSON messages with API commands, and so start developing database applications for Clusterpoint Server, is your preferred programming language or framework: PHP, Java, C/C++, C#, JavaScript, Ruby on Rails or any other.

Most application development environments and programming languages support XML / JSON formatted data and represent it internally as objects, arrays, vectors or JSON parameterized strings. The majority have built-in functions to convert XML-formatted or JSON-formatted data to one of those object types and back. One can say that Clusterpoint Server 'speaks' the language of your programming environment, without any specific client software or drivers. Client software was commonly used to access the functionality of legacy database servers. Setting up client software and keeping it in working order across all computers is more complex than using a web-only API, which is open and does not require any client software. Clusterpoint supports the latter, less complex web API model.
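For example, in Python the standard library can turn an XML document into nested dictionaries and lists in a few lines. This is a simplified sketch under stated assumptions: it ignores XML attributes and mixed content, and the sample document is invented.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element: ET.Element):
    """Convert an element to a string (leaf) or a dict of its children."""
    children = list(element)
    if not children:
        return element.text or ""
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


sample = "<document><id>a1</id><tag>x</tag><tag>y</tag><meta><lang>en</lang></meta></document>"
as_dict = element_to_dict(ET.fromstring(sample))
```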

Below is a list of key Clusterpoint API commands supported by Clusterpoint Server.

Database Modification Commands

API command: Functionality description

- insert: adds a document with a unique Document ID to the Clusterpoint storage. The Document ID is any XML tag you specify whose string values are unique per database, such as a URL, a database primary key, an application-specific identification code, a user account number, a session ID or any similar string (values must also be unique across a distributed cluster database). You specify which XML tag of your custom document structure Clusterpoint Server should use as the Document ID in an XML configuration file called Document Policy. This small configuration file is supplied for each storage and customizes your database according to your own field naming schema, indexing preferences and data structure. See more about Document Policy.

- update: updates or adds a document in the Clusterpoint storage. If a document with the specified Document ID already exists in the database, it is replaced; otherwise a new document is added.

- replace: replaces (rewrites) the document in the Clusterpoint storage identified by a known Document ID.

- delete: deletes the document in the Clusterpoint storage identified by a known Document ID.

- search-delete: deletes the documents in the Clusterpoint storage that match a filter written in Clusterpoint search command syntax.

- clear: deletes all documents and all index files from a particular named storage. By default, this command preserves the Document Policy and Storage Configuration files; it only empties the database and deletes the indexes. However, you can also explicitly request removal of everything for the storage, including the configuration files. Once a storage is permanently deleted together with its configuration files, its storage-named directory no longer exists in the file system and all API commands for that storage return an error code.
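The difference between the modification commands above can be illustrated with a toy in-memory model in Python. This is not the server's implementation. Only the behavior of 'update' (replace if the Document ID exists, otherwise add) is stated in this guide; rejecting a duplicate 'insert', rejecting a 'replace' of an unknown ID and silently ignoring a 'delete' of an unknown ID are assumptions made for the sketch.

```python
class DuplicateDocumentError(KeyError):
    """Raised by the model when an inserted Document ID already exists."""


class MissingDocumentError(KeyError):
    """Raised by the model when a replaced Document ID does not exist."""


class StorageModel:
    """Toy model of a storage keyed by Document ID (illustration only)."""

    def __init__(self):
        self.documents = {}

    def insert(self, doc_id, document):
        # Assumption: inserting an existing Document ID is rejected.
        if doc_id in self.documents:
            raise DuplicateDocumentError(doc_id)
        self.documents[doc_id] = document

    def update(self, doc_id, document):
        # Stated in the guide: replace if the ID exists, otherwise add.
        self.documents[doc_id] = document

    def replace(self, doc_id, document):
        # Assumption: the Document ID must already exist.
        if doc_id not in self.documents:
            raise MissingDocumentError(doc_id)
        self.documents[doc_id] = document

    def delete(self, doc_id):
        # Assumption: deleting an unknown ID is silently ignored.
        self.documents.pop(doc_id, None)

    def clear(self):
        # Removes all documents; configuration is outside this model.
        self.documents.clear()
```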


Data Search & Retrieval Commands

API command: Functionality description

- search: processes database search queries in Clusterpoint. This is the most powerful and advanced Clusterpoint API command, with many features and options. Its query syntax is based on a simple XML-formatted search request and covers simple Internet-style keyword search (ad hoc search), structural search on XML data fields similar to SQL, and combinations of both. It also provides Boolean logic, multi-level nested query conditions and enterprise search options such as word template lookups, word stemming and proximity search. Please see the detailed description of the 'search' command in Developer's Guide / API command - SEARCH.

- retrieve: returns a document from the Clusterpoint storage by a known Document ID. The XML document is returned exactly as you stored it: we do not change the document's XML structure. This command uses the Document ID as the one and only primary key to identify and retrieve XML documents, which is why you must assign a Document ID tag in your XML data structure using the Document Policy configuration file. Clusterpoint Server searches for the Document ID within the specified storage (which can be a cluster storage spanning multiple server nodes), reads the complete document from disk and returns it to the application for processing (for example, by a data entry application or a reporting tool).

- retrieve-first: retrieves the first documents by Document rate tag; used to process documents forward, starting from the first added.

- retrieve-last: retrieves the last document by Document rate tag; used for time-stamp rated documents, to process documents backward, starting from the last added.

- lookup: searches for a document in the Clusterpoint storage and, if it exists, returns its Document ID together with the XML fields you specify. It is useful for checking that a document is present without retrieving its entire XML content, which can often be hundreds of kilobytes or even megabytes. It is much faster than the 'retrieve' command. Unlike 'retrieve', which returns the full original XML document, 'lookup' can be limited to just a few data items by listing the XML field names to be returned.

- select: searches for a list of document identifiers using identifiers or wildcards. Unlike the 'retrieve', 'lookup' and 'search' commands, this command always returns only Document IDs. It is convenient for application developers and for routine database tasks such as batch processing, report making and data importing/exporting.

- similar: searches the Clusterpoint storage for documents similar to given textual information (content). This command is useful for XML documents that contain text, such as news articles and Web pages. It uses statistical algorithms to determine the most similar documents by text content, based on the frequency of indexed words, their relative occurrence in other documents and so on. Because the algorithm is probabilistic, it gives satisfactory results only for very large databases, where the number of database objects in a particular language is in the range of millions; otherwise the database content may be statistically insignificant. For example, if you operate a large-scale Internet index, this command can provide "Similar pages" functionality for the indexed Web pages stored in the Clusterpoint database. Hence the name of the command.

- alternatives: returns spell-checking suggestions for words using accumulated index elements ('fuzzy search' functionality, which is common in enterprise search tools). This is a very powerful command, as it can provide 'Did you mean ...?' functionality, offering the correct spelling of search query keywords when end users mistype them or do not know their precise spelling. Importantly, this command uses the actual database index instead of linguistic vocabularies with all possible word forms. As a result, suggestions are always search terms that are actually present in your particular database index, and no time is wasted checking all possible lexicographic combinations. You can also tune this command for performance: sometimes there are too many suggestions to present them all, so Clusterpoint Server provides tools to limit them to the most useful ones.

- list-last: searches for the documents most recently added or modified using the 'insert', 'update' or 'replace' commands.

- list-first: searches for the documents first added or modified using the 'insert', 'update' or 'replace' commands.

- list-paths: returns all XML field names in a storage in XPath notation.

- list-facets: returns all unique facet values for a specified XML tag, if its index policy is facet.


Database Server Control Commands

API command: Functionality description

- reindex: tells Clusterpoint Server to start reindexing the entire database, taking into account the latest Document Policy configuration file changes and applying the new Information Ranking rules to each Clusterpoint index element.

  Please note that the Clusterpoint database index is always fully updated in real time by the API commands 'insert', 'update', 'delete' and 'replace'. In production, when the database is updated through your application transactions, 'reindex' is normally not necessary.

  However, during application development and testing this command is quite useful, because it re-applies Information Ranking rules to the whole existing database without reloading all the documents (which can run to many millions even for test databases). You can design and develop your own custom Document Policy, working out the best ranking rules for XML data items: relative relevancy weights for XML data fields, the XML tags used for information ranking (pre-sorted indexing), and the corresponding application algorithm rules for your business needs. When you consider the job finished, run 'reindex' so that the new Document Policy changes take effect. The command rebuilds the full database index, including the full text search index, processing all stored documents one by one. It may also be useful for quickly returning to normal database operations after an unexpected equipment crash, when you suspect the database index may be damaged.

- set-policy: assigns an XML configuration file defining the Document Policy to a particular Clusterpoint Server storage. The Document Policy instructs Clusterpoint Server how to apply Information Ranking rules to your custom XML data structure. It is used to assign relevancy flexibly to XML data fields within a document, select specific indexing methods (e.g., skip structure indexing and do only full text, or vice versa, or both), hide certain XML tags from indexing, create virtual XML meta tags that are not present in the document for special search or access needs, assign common alias names to different XML tags to enable consolidated search across multiple fields, and perform other customizations for index optimization and performance tuning. In the simplest case, the Document Policy for a storage contains only two items: the XML field containing Document ID values and the default policy rule to index everything (both structure and full text). All the other, more advanced indexing and relevance assignment rules are optional. The 'set-policy' command is also useful if you want to replace the Clusterpoint Manager user interface for editing the Document Policy XML configuration file with your own software tool, such as an XML editor, or to change the Document Policy dynamically from your application software.

- get-policy: retrieves the Document Policy configuration file from an existing Clusterpoint database storage. This command is useful if your application needs to know which indexing policy and which relevancies of XML data object parts apply to a specific database. It is also useful if you want to replace the Clusterpoint Manager user interface for editing the Document Policy XML configuration file with your own software tool, such as an XML editor, or to change the Document Policy dynamically from your application software.

- start: starts a Clusterpoint database server software instance servicing a particular named storage. When started, each Clusterpoint storage (database) has its own copy of the Clusterpoint Server process in RAM on each cluster node, isolated from all other storages. You can start a single database, or a cluster database on all cluster nodes, with a single 'start' command. In a cluster, all database storages with the same name are started, and Clusterpoint Server operates the entire database as a single logical database from the application's point of view.

- stop: stops a Clusterpoint database server software instance servicing a particular named storage. You can stop a single database, or a cluster database on all cluster nodes, with a single 'stop' command. In a cluster, all database storages with the same name are stopped, and Clusterpoint Server shuts down the entire database as a single logical database from the application's point of view. When a storage status is "inactive", no server software instances are running in RAM and client-server database transactions are not possible. This command is useful for a controlled shutdown of database operations or for scheduled downtime maintenance, as it writes all data buffers and transaction logs to disk before stopping the database server software. You should always stop the storage before a quick offline backup or restore of a database done by copying the content of the storage-named directory at file system level. Otherwise a running database server may compromise the integrity of the copy, because update transactions may still be changing the storage directory content. For online backup/restore without stopping database operations, please use the online backup/restore tools built into Clusterpoint Manager, which are slower but safe.

- set-configuration: changes the Storage Configuration, a second XML configuration file that exists for each storage. It is needed less frequently, as it specifies default values of internal technical parameters of a database storage, such as the size of pre-read buffers, the minimum length of common words to process at high speed in ad hoc search, delimiter characters for full text index elements, the number of concurrent requests to process on particular hardware, and many other parameters related to performance or RAM usage. They can be fine-tuned to specific application needs, primarily to find the combination of parameters that gives the highest database server performance. Once specified, these parameters are seldom changed for a particular storage. You can always change them manually in Clusterpoint Manager; this API command is provided for the convenience of application developers who would like to control everything from their application.

- get-configuration: retrieves the Storage Configuration XML file through the API, similarly to Document Policy retrieval. It can be used to modify the storage configuration on the fly, or to process it in an application-specific way, for example to inform users that certain short and common words in a database with billions of objects will be skipped in search queries for performance reasons unless explicitly specified as mandatory query terms.

- status: returns status information about each Clusterpoint server instance (storage) in a cluster. The status information includes many parameters useful to system administrators, database administrators and application developers: uptime since the last storage server activation, the number of documents in the Clusterpoint storage, the number of unique words in the vocabulary, the total number of words in the Clusterpoint storage, the number of API commands executed and the number of errors that have occurred since the last startup of the storage server instance, the software version, the storage indexing status and so on. This is one of the most frequently called API commands.

- cluster-status: returns information about cluster configuration and status, including data about mirrored storages, the hardware nodes servicing parts of storages in a striped configuration, and other data that may be useful for developing load sharing application logic, checking that everything is operating correctly in the cluster, tuning application performance in a heavy multi-user environment and so on.

- cluster: clustering control command that enables application software to join new cluster nodes on the fly, list storages, drop cluster nodes and so on. It has a set of sub-command operations that perform cluster-wide configuration changes. All options are described in more detail in section Clustering.

- backup: starts an online incremental or full backup.

- restore: restores data from an incremental or full backup.

- synchronize: synchronizes a cluster node's database storage with the mirror storage on another cluster node.
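The offline backup procedure described under 'stop' (stop the storage, copy its directory, start it again) can be sketched in Python as follows. The `send_command` callable is hypothetical: it stands for whatever function your application uses to send an API command for the storage. The directory paths are also placeholders.

```python
import shutil
from pathlib import Path
from typing import Callable


def offline_backup(storage_dir: Path, backup_dir: Path,
                   send_command: Callable[[str], None]) -> None:
    """Stop the storage, copy its directory, then start it again."""
    send_command("stop")       # flushes buffers and logs before the copy
    try:
        # copytree requires that backup_dir does not exist yet.
        shutil.copytree(storage_dir, backup_dir)
    finally:
        send_command("start")  # restart even if the copy fails
```

The `try`/`finally` block ensures the storage is started again even when the copy raises an error, so a failed backup does not leave the database offline.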


Welcome to Suggest New Features

The list of main Clusterpoint API commands is updated periodically as new functionality is added to Clusterpoint Server.

Please check the Documentation and release notes of the latest Clusterpoint versions for new API commands.

You are welcome to suggest new API commands or new parameters to existing API commands for Clusterpoint database server platform.  Please send your suggestions to Clusterpoint Support Email