RainStor: Making Hadoop Fly

by Robin Bloor on February 21, 2012

RainStor claims to offer the most effective compression of all the “Big Data” databases, and in all probability it does. At the heart of any database is the engine that gets you data. Its role is to deliver to you (or, to be exact, to your query) the data you want as fast as possible. At the same time, because a database is a busy little beaver, it has to manage every other query that has been thrown at it just as effectively.

The strategy that the traditional relational database management system (RDBMS) product adopted was to store data on disk in indexed structures (usually B-trees) and then pull the data in as needed, often reading it directly from the index. Additionally, it cached data in memory and tried to make the most efficient use possible of the available CPU power. That kind of database architecture is fast becoming outdated. It worked fine for transactions, and fine for data warehousing, up to a point. The point where it broke was when data volumes grew too large for that arrangement of data to deliver adequate response times. In short, it didn’t scale.

So now there is a plethora of new database products emerging with distinctly different engines that scale far better than the old RDBMS products do. While a few of them are similar in their approach to achieving scalability, many of the new kids on the block are distinctly different. RainStor is one such product.

At the heart of RainStor lies a unique approach to data compression. To appreciate RainStor’s architecture, you need to have some understanding of how it compresses data. This is illustrated in a simple way in the diagram below.

RainStor Compression: A Simplified Diagram

Instead of storing database tables as tables, RainStor converts the tables into a kind of tree structure. The top diagram shows the first record in a table being stored: First Name – Peter, Last Name – Smith, Company Classification – Automotive, Salary – $40,000. Not only are the values stored but also the links between them, which preserve the fact that they form a distinct row. The second diagram shows the way the second record is stored: Paul, Smith, Finance, $35,000. Because the value Smith already exists, it doesn’t need to be stored again; it just needs to be linked to. The second record’s values are colored green in the diagram for the sake of clarity. The bottom diagram shows the addition of the third record, in yellow: John, Brown, Pharmaceutical, $40,000. This process continues with the addition of every new record. If a completely new table is added to the database, a new tree is built. RainStor employs additional techniques to compress the data further, applying byte-level compression within field values, but the fundamental approach is as we have described.
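
To make the store-each-value-once idea concrete, here is a minimal sketch in Python. The class and all its names are our own invention, purely for illustration; RainStor’s real structure is a linked tree with further byte-level compression, as described above, not a set of Python dictionaries.

```python
# A minimal sketch of store-each-value-once compression, loosely modeled
# on the description above. Illustrative only: this is not RainStor's
# actual on-disk format. Each column keeps one copy of every distinct
# value; each row is just a tuple of links into those value pools.

class DedupTable:
    def __init__(self, columns):
        self.columns = columns                    # column names
        self.pools = {c: [] for c in columns}     # one copy of each distinct value
        self.index = {c: {} for c in columns}     # value -> position in its pool
        self.rows = []                            # each row is a tuple of links

    def insert(self, record):
        links = []
        for col, value in zip(self.columns, record):
            pos = self.index[col].get(value)
            if pos is None:                       # first time we see this value:
                pos = len(self.pools[col])        # store it once...
                self.pools[col].append(value)
                self.index[col][value] = pos
            links.append(pos)                     # ...and merely link to it after that
        self.rows.append(tuple(links))

    def row(self, i):
        """Rebuild one row by following its links."""
        return tuple(self.pools[c][p] for c, p in zip(self.columns, self.rows[i]))

t = DedupTable(["first", "last", "sector", "salary"])
t.insert(("Peter", "Smith", "Automotive", 40000))
t.insert(("Paul", "Smith", "Finance", 35000))         # "Smith" is not stored again
t.insert(("John", "Brown", "Pharmaceutical", 40000))  # nor is 40000
print(t.row(1))         # ('Paul', 'Smith', 'Finance', 35000)
print(t.pools["last"])  # ['Smith', 'Brown'] -- one copy per distinct value
```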

By organizing the data in this way, RainStor reduces the space it requires considerably, because each column stores only as many values as that column’s cardinality (its number of distinct values). This compression technique has the additional advantage that data ingest is extremely fast.

As far as we are aware, RainStor’s data compression scheme is at least four times as economical as, for example, that of the typical column-store database. And compression is not the only advantage of storing the data in this way. RainStor can retrieve the results of any query on the data simply by “walking the tree.” So it compresses the data considerably without damaging its ability to query it. There are many ways to compress data, and they all carry a penalty when data is decompressed. With RainStor the penalty is very small.
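
Continuing the hypothetical sketch above, the function below suggests why querying the compressed form carries so small a penalty: a predicate such as Salary = $40,000 resolves the value to its stored position once, matching is then just integer comparison on the links, and only the rows that actually qualify are ever rebuilt.

```python
# Continues the DedupTable sketch above (again, an illustration, not
# RainStor's implementation). The predicate value is looked up once;
# after that, matching is integer comparison on the row links.

def select_where(table, col, value):
    pos = table.index[col].get(value)
    if pos is None:                  # value absent: no row can possibly match
        return []
    c = table.columns.index(col)
    return [table.row(i)             # rebuild only the rows that qualify
            for i, links in enumerate(table.rows)
            if links[c] == pos]

print(select_where(t, "salary", 40000))
# [('Peter', 'Smith', 'Automotive', 40000), ('John', 'Brown', 'Pharmaceutical', 40000)]
```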

RainStor and Data Archiving

After its initial release, RainStor quickly established a market for itself in the area of data archiving/data retention, where it offered a unique capability: what can be thought of as online archiving. Data was stored in RainStor, where it was held economically on disk but was still available for query through RainStor’s SQL interface. Used in this mode, data archived from a typical relational database could be compressed to occupy (roughly) one fortieth of the disk space it previously consumed. It was an attractive alternative to relegating data to tape back-ups, where it was no longer accessible quickly, if at all.

RainStor could have been used, and indeed can be used, as an in-memory database, but at the time of its first release (several years ago) there was little enthusiasm for in-memory database products, so the company pursued a niche market where no other product could match its capabilities. The opportunity for RainStor broadened significantly with the advent of Hadoop – allowing it to enter a market where its underlying architecture and method of storage can pay huge dividends.

RainStor and Hadoop

RainStor was quick to take advantage of Hadoop. Because of its tree-based data structures, RainStor can easily distribute data across multiple servers. RainStor’s processing kernel is very lightweight, and thus it is possible to run it on every server. So as long as you break down SQL requests and any associated processing to the individual server level, you have a highly scalable configuration.

In its Hadoop-based implementation, RainStor simply distributes its data trees through the Hadoop file system (HDFS), which, as you may be aware, has built-in redundancy, so data can be recovered quickly in the event of any node failure. RainStor maintains a common metadata map that it uses to break down requests written in SQL, Apache Pig (a high-level data analysis language), or both. It distributes subqueries to the nodes in a Hadoop cluster or grid, processes them locally at each node, and then aggregates the results to produce the answer. In such a configuration, RainStor’s benchmarks suggest that it will outperform a MapReduce configuration by a wide margin.
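
As a rough illustration of that scatter/gather pattern (the shard layout and function names here are our own, not RainStor’s API), each node computes a partial answer over its local data and a coordinator combines the partials, shown below for an average-salary query.

```python
# A sketch of scatter/gather query processing, assuming each "node" holds
# a local shard of rows (plain dicts here, for brevity). The coordinator
# splits the query, each node answers locally, and partials are combined.

shards = [  # two hypothetical nodes' local data
    [{"sector": "Automotive", "salary": 40000},
     {"sector": "Finance", "salary": 35000}],
    [{"sector": "Pharmaceutical", "salary": 40000}],
]

def node_partial(shard):
    """Runs locally on each node: partial sum and count for AVG(salary)."""
    return sum(r["salary"] for r in shard), len(shard)

def distributed_avg(shards):
    """Coordinator: scatter the subquery, gather and combine the partials."""
    partials = [node_partial(s) for s in shards]  # in reality, in parallel over the network
    total = sum(t for t, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None

print(distributed_avg(shards))  # 38333.33...
```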

The speed that RainStor is capable of varies according to how much of the data any given workload needs to access. For example, if it is a batch query that touches all the data, RainStor outperforms MapReduce by a factor of about three. However, for lighter ad-hoc queries against the same data, it is much faster, possibly by two orders of magnitude (i.e., a factor of about 100).

There are two other advantages that RainStor provides in such a context, aside from blistering speed.

  • First, far fewer servers are needed because RainStor is so economical in its use of disk space.
  • Second, there is no need for a programmer to write Java code or grapple with the complexities of the MapReduce approach. Queries can all be presented in Pig and SQL.

It is also possible to use RainStor in a distributed in-memory manner (backing data off to disk) in such a configuration, if you’re willing to invest in the memory and you desperately need the speed.

Those companies that have a genuine need for the large scale processing that Hadoop is capable of would do well, in our opinion, to take a close look at RainStor.

 

Comments

Jefferson Braswell March 14, 2012 at 10:20 pm

This is an interesting approach to compressing data, and it is clear why it can leverage Hadoop to its advantage. One could term this approach something like “Row/Reduce” perhaps.

However, this approach also has some properties that limit the scope of tasks for which it is appropriate:

1. Because data elements which are (at load time) identical are reduced to a single linked element, when/if a data element of a row is updated, a significant tree-balancing workload is encountered. A search for an equivalent existing data element that matches the updated value would need to be performed, and — if one did not exist — a new element would have to be created and the link in the tree updated. When scaled up, this architecture would have bottlenecks similar to those of the first generations of Teradata tree-partitioned hardware technology: good for read-only, but updates are laborious.

2. Saving space on disk may make sense for read-only archival uses, but it would be one of the lowest benefits or priorities in most environments today because of the low cost of disk. This out-of-date objective reminds me of the original Oracle physical storage model (i.e., versions 1 and 2 in the 1980s) which stored variable-length fields with a null delimiter, in variable-length rows instead of allocating disk space for the full field width. Yes, some expensive 1980s disk space was saved, but the time it took to locate a data field in a row to update suffered because (1) the row had to be scanned to locate the field (no fixed offsets), and (2) if the updated field did not fit in the original truncated field, the row had to be copied to secondary extents in order to save the now-larger row.

3. I don’t think the ‘far fewer servers are needed because RainStor is so economical in its use of disk space’ is necessarily an advantage, because servers have other resources besides disk which are needed to scale distributed processing systems, namely CPU cores and RAM. Fewer servers, predicated on lower disk storage needs, mean fewer CPUs and less total RAM — and those are resources that have a significant effect on total system performance. Depending on the need for speed (as opposed to the equivalent of tertiary lookup storage on magtape mounted by operators), reducing servers would quite likely have a net negative effect on total system performance.

Again, suitability depends on the application. Based on my limited knowledge of RainStor and the above description of its storage strategy, however, it would not appear to be a general-purpose database platform alternative.


Ramon Chen March 15, 2012 at 5:21 am

Hi Jefferson,

Thanks for your insightful comments.
If I may clarify, RainStor is an append-only (read-only) Big Data database. So points 1 and 2 do not apply, since the data, once loaded into RainStor, is not updatable.

We architected the database to target use cases where data is historical (has completed its transactional lifecycle) or immediately historical (like most machine-generated data from sensors, smartphones, etc.). Most “Big Data” use cases involve data that will never be modified once generated.

The concept of RainStor is that retaining and keeping data online and accessible for compliance or historical analysis is still not affordable, even at the dramatically lower price of disk. Additionally, because the total cost of ownership of running and operating large numbers of nodes/servers on an annual basis far exceeds the purchase price of the hardware, the reduction in physical storage footprint through our compression leads to TCO reductions in the millions when the data to be retained hits petabyte scale.

I completely agree with your 3rd point: the reduction in servers, by reducing storage, may not necessarily be an advantage. It does depend upon the CPU and memory requirements of the intended use case. However, in scenarios where Big Data dictates the growth in servers/nodes, the CPUs are generally significantly underutilized. By reducing the physical storage footprint, RainStor provides the option of having fewer, more powerful servers with larger specs if desired to meet processing needs.

RainStor’s unique compression (unlike byte-level compression) has no re-inflation penalty and so acts as a bandwidth and disk performance magnifier, reducing I/O-bound dependencies. The intelligence of the metadata layer created upon load allows data to be retrieved across the large distributed data sets without a typical RDBMS index (which adds further to storage needs) or “brute force” parallelism across all the data on all nodes.

I hope this clarifies why we architected RainStor in this manner. Our goal was to effectively provide a database that stores data in the most compressed format possible, by sacrificing updates and actually imposing immutability as a benefit. Once the data is as small as possible, it allows for more flexibility in server configurations by reducing physical storage and boosting data access.

I would be happy to give you more details, if you are interested. You can reach me at [email protected]
Thanks for your comments.

Regards
Ramon Chen
VP Product Management
RainStor


Jefferson Braswell March 15, 2012 at 5:18 pm

Thanks, Ramon, for the clarification. The structured field, row-compression design of RainStor is clearly applicable for read-only, append-only use.

Given that you have mentioned the objective of positioning RainStor to be able to efficiently ingest ‘big data’, I have a few additional observations/questions which I will pose for your response or clarification:

1. The compression technique used by RainStor is based on not duplicating data storage for structured fields that have the same value across a set of structured rows (based on my interpretation of the instructive graphic that Robin has provided).

It would appear, therefore, that RainStor is not designed to handle data such as
(1) variable, complex, or hierarchical documents and
(2) unstructured text

2. Sticking to structured row data for a moment, does RainStor apply the field compression technique only across rows of the same structure, or are links to equivalent data values generated for any and all text field values regardless of the row structure of the data being ingested? (If the latter, that would theoretically allow RainStor to ingest variable-format hierarchical documents as well, within certain limits on what constitutes a data ‘field’ in the document, presumably.)

3. If RainStor *is* intended to archive unstructured text (e.g., messages, tweets, posts, etc.), I assume you do not attempt to apply compression to individual words in the text (in a variant of map/reduce), as the linkage overhead of pointing to individual words would likely be equivalent to the space saved by not duplicating the words, yes?

4. How do you handle the case where the data fields are largely numeric as opposed to textual? Floating-point numbers are rarely if ever identical, save for certain degenerate cases (such as zero or null), and equivalent integer-value compression would share the same overhead as individual words in free text, only worse, given that only 4 bytes are typically needed for simple integers.

Thanks again for your response — I look forward to further clarification of my questions.


Ramon Chen March 15, 2012 at 7:54 pm

Thanks again Jefferson, we’ll chat more about your questions offline. I’ll reach out to you.

-Ramon


Raoul Mangoensentono March 15, 2012 at 11:35 am

Good points made. I guess RainStor would then be best used in situations where:
- very fast results for very complex queries are needed
- fast inserts are needed
- data is not regularly updated
Kinda makes me think of data warehouses.
But reading your statement “it would not appear to be a general-purpose database platform alternative, however,” the question arises:
Is there such a thing as a general-purpose NoSQL database?


Jefferson Braswell March 15, 2012 at 8:09 pm

Raoul,

You are right, there is no such thing as a general purpose NoSQL database, given the wide variety of different types, architectures, and alternatives in the NoSQL world at the moment.

By general purpose, whether it be ACID or CAP, I was referring to a database platform that supports the full range of typical database operations regardless of the data types or physical storage model, namely: read/write/update/index/sort/search, etc.

