Live Data to historical time series


I’m just getting started with OpenGamma at the request of one of my clients, and I’m VERY rusty with Java after programming in C# for the last 10 years or so, so go easy on me :)

I’m preparing to implement Live Data servers for a number of streaming ticker data sources. From looking at the Bloomberg code, Live Data seems to be based on a subscription model, whereas I need to keep the connections live at all times and record the data as it occurs, or at least the last received value at some set interval (1 hour or less), because the sources do not provide any means to access historical data.

Is there a way to instruct OpenGamma to collect the data, or do I have to build something into the Live Data server to record the historical data?

Another problem is that several of the exchanges I will be interfacing with trade the same “product”, but frequently at different prices. I expect that will require me to set up separate symbols in the system for each exchange, even though the same “product” is being traded?



So am I correct in thinking you have more of a ‘firehose’ feed coming into your system?

We have written several handlers for feeds like that, e.g. from Tullett Prebon and ICAP. These feeds are based on a different underlying implementation known as ‘COGDA’ (see com.opengamma.livedata.cogda.*), which is intended to be scalable and fault tolerant.

COGDA basically converts a firehose feed into a pub/sub feed. It splits the handling up into several processes: a chunker, which breaks off pieces of the feed and chops them up into ticks; a distributor, which does normalisation and updates a last-known-value (LKV) store; and the live data server, which queries the LKV store and maintains a subscription list.
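To make the split concrete, here is a rough in-memory sketch of the three roles. It is not the COGDA code: the class name, the `symbol=price` line format, and all of the method names are made up for illustration, and the real thing runs these stages as separate processes rather than static methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: hypothetical names, not the com.opengamma.livedata.cogda API. */
public class FirehoseToPubSubSketch {

    /** One parsed update from the raw feed (hypothetical type). */
    static final class Tick {
        final String symbol;
        final double price;

        Tick(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    /** Last-known-value store shared by the distributor and the server side. */
    static final Map<String, Double> LKV = new ConcurrentHashMap<>();

    /** Chunker role: splits a raw block (assumed "symbol=price" per line) into ticks. */
    static List<Tick> chunk(String rawBlock) {
        List<Tick> ticks = new ArrayList<>();
        for (String line : rawBlock.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines in the block
            }
            String[] parts = trimmed.split("=");
            ticks.add(new Tick(parts[0].trim(), Double.parseDouble(parts[1].trim())));
        }
        return ticks;
    }

    /** Distributor role: normalises the symbol and overwrites the stored value. */
    static void distribute(List<Tick> ticks) {
        for (Tick t : ticks) {
            LKV.put(t.symbol.toUpperCase(Locale.ROOT), t.price);
        }
    }

    /** Server role: answers a subscriber from the LKV store (null if never seen). */
    static Double snapshot(String symbol) {
        return LKV.get(symbol.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        distribute(chunk("eurusd=1.10\ngbpusd=1.25\neurusd=1.11"));
        System.out.println(snapshot("EURUSD")); // latest value wins
    }
}
```

The point of the LKV store in the middle is that a new subscriber can be given a value immediately, without waiting for the next tick on that symbol.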

There is currently no out-of-the-box tick storage, although it’s something we’ve been thinking of adding for some time. We did write a basic storage and replay implementation to support demos (see the ‘replay’ package in OG-Bloomberg), but we no longer use it and it should in no way be considered suitable for production.
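If you end up building the recording yourself, one simple shape for the "last received value at a set interval" requirement is sketched below. Everything here is hypothetical (the class, the `onTick` hook, the in-memory list standing in for a database); it is not part of OpenGamma and you would need to wire `onTick` into your own feed handler.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustrative only: records the latest value per symbol at a fixed interval. */
public class IntervalSnapshotRecorder {

    /** One stored sample (hypothetical type). */
    public static final class Sample {
        public final Instant time;
        public final String symbol;
        public final double value;

        Sample(Instant time, String symbol, double value) {
            this.time = time;
            this.symbol = symbol;
            this.value = value;
        }
    }

    private final Map<String, Double> lastValues = new ConcurrentHashMap<>();
    private final List<Sample> store = new ArrayList<>(); // stand-in for a real database

    /** Called by the feed handler for every tick; keeps only the latest value. */
    public void onTick(String symbol, double value) {
        lastValues.put(symbol, value);
    }

    /** Writes one sample per known symbol, stamped with the given time. */
    public synchronized void recordSnapshot(Instant now) {
        for (Map.Entry<String, Double> e : lastValues.entrySet()) {
            store.add(new Sample(now, e.getKey(), e.getValue()));
        }
    }

    /** Returns a copy of everything recorded so far. */
    public synchronized List<Sample> samples() {
        return new ArrayList<>(store);
    }

    /** Starts periodic recording; the caller shuts the returned scheduler down. */
    public ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> recordSnapshot(Instant.now()), period, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        IntervalSnapshotRecorder recorder = new IntervalSnapshotRecorder();
        // A short period for the demo; the interval in the question would be 1 hour or less.
        ScheduledExecutorService scheduler = recorder.start(100, TimeUnit.MILLISECONDS);
        recorder.onTick("XYZ", 10.0);
        recorder.onTick("XYZ", 10.5);
        Thread.sleep(350);
        scheduler.shutdownNow();
        System.out.println(recorder.samples().size() + " samples recorded");
    }
}
```

Because the feed handler only ever overwrites the latest value, the connection can stay live and busy while the storage side sees one write per symbol per interval.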

The current historical time series storage system is only used for storing daily samples (although you can have multiple ‘observation times’ and ‘data providers’).

As for your issue about products: this is a common problem when merging multiple feeds. Typically the solution is either to use different symbology to select different data providers (e.g. like Bloomberg’s ‘@’ mechanism) or to route different symbols to different data providers using a routing layer. Some data providers also produce ‘composite’ values based on multiple quotes (like Bloomberg’s CMP[L|N|T]).
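As a rough idea of what a routing layer can look like, here is a small sketch that assumes a made-up `PRODUCT@VENUE` naming convention. The convention, the class and the method names are all hypothetical; this is not Bloomberg's syntax and not an OpenGamma API, just the general pattern.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: routes "PRODUCT@VENUE" identifiers to per-venue providers. */
public class VenueRouterSketch {

    // venue code -> function that returns the latest price for a product on that venue
    private final Map<String, Function<String, Double>> providers = new HashMap<>();

    public void register(String venue, Function<String, Double> provider) {
        providers.put(venue, provider);
    }

    /** Splits the identifier at the last '@' and delegates to that venue's provider. */
    public Double latest(String qualifiedSymbol) {
        int at = qualifiedSymbol.lastIndexOf('@');
        if (at <= 0 || at == qualifiedSymbol.length() - 1) {
            throw new IllegalArgumentException("Expected PRODUCT@VENUE but got: " + qualifiedSymbol);
        }
        String product = qualifiedSymbol.substring(0, at);
        String venue = qualifiedSymbol.substring(at + 1);
        Function<String, Double> provider = providers.get(venue);
        if (provider == null) {
            throw new IllegalArgumentException("No provider registered for venue: " + venue);
        }
        return provider.apply(product);
    }

    public static void main(String[] args) {
        VenueRouterSketch router = new VenueRouterSketch();
        // Two hypothetical venues quoting the same product at different prices.
        router.register("EXA", product -> 100.0);
        router.register("EXB", product -> 100.5);
        System.out.println(router.latest("WIDGET@EXA"));
        System.out.println(router.latest("WIDGET@EXB"));
    }
}
```

The same product then shows up as one symbol per venue, which matches what you were expecting to have to do; a composite could be added as one more "venue" whose provider combines the others.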

I hope that helps.