Architecture
Overview

ScalienDB combines technologies of the scalable, highly-available distributed systems with the dependability and reliability of traditional relational databases, to ensure very high performance read and write transactions across a dynamically scalable cluster of commodity servers. A ScalienDB cluster consists of the following modules:

The controller nodes are the "brain" of the system and control the rest of the cluster. The data nodes are where user data is stored in a high-performance, linear on-disk storage engine that scales to terabytes of data. The client applications -- such as a webserver -- use the ScalienDB client library to connect to the database. The client library is "smart" and handles all ScalienDB features such as automatic failover, data migration and fast batched data loads and dumps. ScalienDB is open source, written in C++, and runs on Linux, Microsoft Windows and MacOS X.

Controllers

The controller nodes act as the "brain" of the ScalienDB cluster and communicate among themselves to ensure the fault-tolerance and high-availability of the entire distributed database system. Data nodes are managed by the controllers to achieve linear read and write scalability and failover in case a data node goes down. Internally, the controller nodes use the Paxos protocol to guarantee that the ScalienDB cluster is always in a well-defined, consistent state without the possibility of a "split-brain" scenario. One of the controller nodes acts as the master, which simplifies management tasks for the users and administrators of the database. The controllers contain the following modules:

It is possible to use one, two, three or more controller nodes in the ScalienDB setup depending on the level of fault-tolerance desired by the customer use-case. It is recommended to use one controller node in test environments and three or five nodes in production.

Data nodes

The data nodes are the workhorses of the distributed ScalienDB architecture. The data nodes, like the controllers, use a variant of the Paxos algorithm to ensure that the database is always in a consistent, well-defined state. The use of Paxos throughout the ScalienDB system guarantees single ACID and read-your-write semantics. Data nodes are assigned into quorums by the controllers for replication, which ensures exceptionally high-availability and fault-tolerance of the database. The data nodes use a high-performance linear on-disk storage engine hand-crafted for ScalienDB to store user data in a compact way. The ScalienDB storage engine is the best of both worlds, as it combines technologies from traditional architectures such as transactional redo logging to eliminate any possibility of data loss and cutting-edge technologies such as linearized disk I/O introduced by Google's Bigtable system. The ScalienDB runs exceptionally well both on traditional spinning disk media and new solid state drives (SSDs). This combination of technologies assures that ScalienDB delivers both the dependability of old-school databases while opening the door to unlimited data and throughput scaling on conventional commodity hardware running Linux.

Clients

The client library is a fully featured "smart client" that makes it easy for the programmer to quickly write ScalienDB-based applications without worrying about the distributed nature of the database. The client is fully aware of the state of the ScalienDB cluster and hides any failover or shard migration events from the user for maximum conveniance and minimal development overhead. The client supports distributing reads across quorum members for linear read scalability and distributing writes to different quorums for linear write scalability. Additionally, writes may be sent in a batched manner in case of bulk loading for maximum throughput. ScalienDB client libraries are available for Java, Python, PHP and C++, with each client library politely adhering to the idioms of its language for maximum programmer friendliness.

Data model

ScalienDB is a NoSQL database with a key-value data model. At the highest level, data is organized into databases, then into tables, just like in traditional SQL databases. Tables are namespaces for key-value data, which may be split into several shards, which are then mapped into quorums and stored on data nodes for scalability and durability. Keys and values in ScalienDB are not strongly typed, which allows for maximum flexibility when developing applications against our API. Keys are limited to 64K length, while values may be practically unlimited in length (2^32 bytes), which means ScalienDB can be used for practically all data management tasks. This architecture combines the best of relational and NoSQL models: it features databases and tables to organize data while maximizing the flexibility for user-data by the use key-values and avoiding any locking and other single points of failure.

Management console

Administrative tasks are performed on the web-based management console using a browser like Mozilla Firefox or Google Chrome. The management console is fully featured and one-click supports all ScalienDB operations such as:

ScalienDB has a built-in HTTP server for performing management operations. This allows administrators and developers to integrate these tasks into their broader management suite using the powerful ScalienDB JSON API.

Products   »   Architecture      Open Source      Downloads

Documentation   »   Wiki      Groups

Company   »   Contact

Social   »   Blog      Twitter