Mongo DB implementation of HistoricalTimeSeriesMaster

vineeth · February 11, 2012, 10:58pm

The following page suggests that a MongoDB implementation of HistoricalTimeSeriesMaster was tried out at some point:
http://docs.opengamma.com/display/DOC/Sources%2C+Masters%2C+and+Databases
Could you please point me to that code? It would help a lot.
Also, were there any particular difficulties in using MongoDB to store time series?

stephen · February 13, 2012, 5:36am

There was an attempt to implement the Master interfaces using MongoDB. What we found was that MongoDB (and probably other similar NoSQL databases) could not implement the Master interfaces accurately. The problem is that the interfaces represent data with both versions and corrections. In order to make that work, the SQL implementation uses transactions (to “end date” the old row when the new row is added). Since the MongoDB implementations did not work fully, they have been deleted. We would recommend the SQL versions for production use.
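To make the "end date" step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the OpenGamma schema: the table `hts_point`, its columns, and the helpers `write_version` and `value_as_of` are hypothetical, and it tracks only a single version axis rather than both versions and corrections. The point is that closing the old row and inserting the new one must succeed or fail together.

```python
import sqlite3

# Hypothetical schema: one row per version of a time-series data point.
# ver_to IS NULL marks the row that is currently the latest version.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hts_point ("
    " series_id TEXT, point_date TEXT, value REAL,"
    " ver_from INTEGER, ver_to INTEGER)"
)

def write_version(conn, series_id, point_date, value, now):
    """End-date the current row and insert its successor atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE hts_point SET ver_to = ? "
            "WHERE series_id = ? AND point_date = ? AND ver_to IS NULL",
            (now, series_id, point_date),
        )
        conn.execute(
            "INSERT INTO hts_point VALUES (?, ?, ?, ?, NULL)",
            (series_id, point_date, value, now),
        )

def value_as_of(conn, series_id, point_date, instant):
    """Return the value that was current at the given version instant."""
    row = conn.execute(
        "SELECT value FROM hts_point "
        "WHERE series_id = ? AND point_date = ? "
        "AND ver_from <= ? AND (ver_to IS NULL OR ver_to > ?)",
        (series_id, point_date, instant, instant),
    ).fetchone()
    return row[0] if row else None

write_version(conn, "TS1", "2012-02-10", 101.5, now=1)
write_version(conn, "TS1", "2012-02-10", 101.7, now=5)  # later correction
```

Without the transaction, a failure between the UPDATE and the INSERT would leave the point with no open row at all, which is the kind of inconsistency a store without multi-document transactions cannot rule out.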

vineeth · February 13, 2012, 6:17am

I didn’t fully understand the issue with using MongoDB.
What I understand from your post is that we won’t be able to track changes made to a time series. For example, if a day’s value is corrected later, we won’t be able to see the old value.
Do I have the right idea here?

vineeth · February 13, 2012, 6:49am

Also, can’t we implement a lock in the HistoricalTimeSeriesMaster implementation?

For example, make sure that only one of the update/remove/correct/get functions runs at a time for a particular index.

jim · February 13, 2012, 8:29am

What we found was that to store data in the complex versioned format that we’ve adopted (with both versions and corrections to versions), we required the use of more than one collection. Because Mongo doesn’t support transactions, or any real locking (or at least didn’t when we last used it), we stopped using it. Now theoretically you could add your own locking layer, and we did look into that, but we came to the conclusion it was more trouble than it was worth. In particular, my experience with time series databases that require global locking (MySQL required internal global locks on its auto-increment id columns, as sequences were not supported) showed that this was a very bad idea. Insert performance became a serious problem as the database grew. There are strategies for overcoming this, but as I said, we didn’t think it was worth the trouble.
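For illustration only, a home-grown locking layer of the kind discussed above might look like the sketch below. Everything in it is hypothetical (`PerSeriesLocks`, `correct`, and the two dicts standing in for two collections), and it only serialises writers inside a single process. It gives no protection across several servers and no atomicity if the process dies between the two writes, which is part of why such a layer is more trouble than it looks.

```python
import threading

class PerSeriesLocks:
    """Hands out one lock per series id (hypothetical helper, in-process only)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def lock_for(self, series_id):
        with self._guard:  # protects the lock registry itself
            return self._locks.setdefault(series_id, threading.Lock())

# Two plain dicts stand in for the two collections a versioned store needs.
current = {}   # series_id -> latest value
history = {}   # series_id -> list of superseded values
locks = PerSeriesLocks()

def correct(series_id, value):
    """Move the current value to history, then store the new value."""
    with locks.lock_for(series_id):
        if series_id in current:
            history.setdefault(series_id, []).append(current[series_id])
        current[series_id] = value

threads = [
    threading.Thread(target=correct, args=("TS1", float(i))) for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Locking per series rather than globally keeps writers to different series from blocking each other, which is the contention problem described above for global locks.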

jim · February 13, 2012, 8:36am

To be clear about the older version, this was a much simpler implementation that didn’t support proper versioning and wouldn’t now be compatible with the architecture of the rest of the system.

vineeth · February 13, 2012, 9:34am

Got it.
Thanks Jim and Stephen.

kirk · February 13, 2012, 9:50am

Using MongoDB (or any other NoSQL system) as the primary Master implementation for any type of data has the problems described above, because the “correct” storage of that data requires traditional ACID-style semantics in the OpenGamma Platform. But that’s not the only potential use of these systems in our architecture.

The rigorous use of Source and Master interfaces throughout the system means that in any real-world use case you will have a cascade of implementations between client code and the ultimate data source. A typical production client might have:

A CachingSource, on top of
A RemoteSource, communicating over a network connection with
A Servlet/other container, communicating with
A CachingSource, on top of
A DatabaseSource
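As a rough illustration of how such a cascade composes, the sketch below stacks a caching layer on a database-backed layer behind the same `get` method. The classes are simplified stand-ins written for this example, not the OpenGamma interfaces; the remote and servlet layers are left out, and the `lookups` counter exists only to show when the bottom layer is actually hit.

```python
class DatabaseSource:
    """Stand-in for the bottom layer; a dict plays the role of the RDBMS."""

    def __init__(self, rows):
        self._rows = rows
        self.lookups = 0  # counts how often the "database" is queried

    def get(self, key):
        self.lookups += 1
        return self._rows[key]


class CachingSource:
    """Wraps any object with a get(key) method and remembers its answers."""

    def __init__(self, underlying):
        self._underlying = underlying
        self._cache = {}  # in a larger deployment this could be an external store

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._underlying.get(key)
        return self._cache[key]


database = DatabaseSource({"TS1": [101.5, 101.7]})
source = CachingSource(CachingSource(database))  # layers stack freely

first = source.get("TS1")
second = source.get("TS1")  # served from the outer cache
```

Because every layer exposes the same method, client code does not need to know how many layers sit between it and the database.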

There are many layers in between the ultimate client code and the actual database code. This gives maximum flexibility in configuring your particular application for your particular runtime topology.

Where we believe that NoSQL technologies are likely to have the greatest impact is in the layers in between the ultimate client code and the RDBMS implementation (see the CachingSources above). There, we think that systems like Voldemort or Cassandra may prove to be superior layers above the canonical RDBMS-based Master implementations in large-scale implementations, using RDBMS sharding underneath.
