InterSystems And Caché 2010

InterSystems is one of those companies that largely passes under the radar of many analysts and commentators and yet it continues to quietly grow its user base and partner community.

Indeed, as a company with a strong financial base it took the opportunity of the recent recession to invest in expansion where its competitors stagnated. That’s always a good sign. In 2009, for example, it took on around 250 new staff, so that it now has over 1,000 employees, offices in 33 countries and revenues approaching $300m.

Historically the company has been regarded as technology driven and that is still true to some extent but there has been an increasing focus on end-user applications, especially in healthcare, over the last few years. Its TrakCare product has now been sold in some 25 markets around the world.

On the technology front the company has similarly expanded over the last couple of years from Caché and Ensemble to include DeepSee, which supports embedded real-time business intelligence capabilities, and InterSystems is now looking to move into the master data management arena, though more as a platform for MDM than anything else.

Of course, the foundation for all InterSystems products is Caché. This is sometimes listed, for example on Wikipedia, as a NoSQL database, which is fine provided you take “no” to mean “not only” but not if you think it means “not”, because Caché has supported SQL for years.

Anyway, the latest release of Caché is Caché 2010. It has arguably the most elegant implementation of high availability that I have seen in any database. It works using what the company calls database mirroring (as opposed to mirrored disks), built on low-cost redundant servers and storage, with changes on the primary server propagated to the back-up server using logical, synchronous replication.

The big advantage of this approach, apart from the low cost, is that you don’t need any cluster management software. If you are using the company’s Enterprise Cache Protocol (ECP) for communications then, in the event of a failover, you lose not only no transactions but no locks either.
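To make the idea concrete, here is a minimal sketch of synchronous logical replication in Java. All the names (`MirrorMember`, `Primary`, `write`) are hypothetical, invented for illustration — this is not the InterSystems implementation or API. The point it shows is that the primary does not acknowledge a write until the backup has applied the same logical change, which is why a failover loses nothing:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A mirror member holds a copy of the data and can apply a logical change.
class MirrorMember {
    final Map<String, String> data = new LinkedHashMap<>();
    void apply(String key, String value) { data.put(key, value); }
}

// The primary applies each change locally, ships the logical record to the
// backup, and only then would acknowledge the write to the client.
class Primary extends MirrorMember {
    private final MirrorMember backup;
    Primary(MirrorMember backup) { this.backup = backup; }

    void write(String key, String value) {
        apply(key, value);        // local update
        backup.apply(key, value); // synchronous: backup applies before we return
        // acknowledgement to the client happens only after this point
    }
}

public class MirrorDemo {
    public static void main(String[] args) {
        MirrorMember backup = new MirrorMember();
        Primary primary = new Primary(backup);
        primary.write("patient", "Smith");
        // If the primary fails now, the backup already has the change:
        System.out.println(backup.data.get("patient")); // prints Smith
    }
}
```

Because the replication is synchronous, there is no window in which a committed change exists only on the primary — which is what makes the low-cost, no-cluster-software approach viable.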

Compare that to Oracle’s RAC, where you can get a cluster freeze when a node goes down, precisely because the lock lists have to be rebuilt. If you are using JDBC rather than ECP then you get rollback to the last transaction boundary. I’ll be writing about this in more detail in due course, but for the time being I think this is seriously interesting.

The other major new feature in Caché 2010 is the introduction of eXTreme for Java. Historically, Caché has had a server-side development model, but this new capability effectively provides a tunnel into the server side to support external development. This is done through a multi-dimensional API sitting on top of the Java Native Interface (JNI), and it should help to attract new developers towards using Caché and, potentially, introduce InterSystems to new markets.
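For readers unfamiliar with Caché’s multi-dimensional model, the sketch below mimics in plain Java the kind of structure such an API exposes: a value is addressed by an ordered list of subscripts (as in a Caché global such as `^Trade("IBM","2010-06-15",1)`) rather than by a flat key. The `Node` class and its methods are hypothetical, for illustration only — they are not the eXTreme API itself:

```java
import java.util.TreeMap;

// A sparse multi-dimensional tree: each level is an ordered map of
// subscripts, and any node may carry a value. This mirrors the shape of a
// Caché global, where data is addressed by a chain of subscripts.
class Node {
    private final TreeMap<String, Node> children = new TreeMap<>();
    private Object value;

    private Node child(String subscript) {
        return children.computeIfAbsent(subscript, s -> new Node());
    }

    // Equivalent in spirit to: set ^root(subs...) = v
    void set(Object v, String... subscripts) {
        Node n = this;
        for (String s : subscripts) n = n.child(s);
        n.value = v;
    }

    // Equivalent in spirit to: $get(^root(subs...)), null if undefined
    Object get(String... subscripts) {
        Node n = this;
        for (String s : subscripts) {
            n = n.children.get(s);
            if (n == null) return null;
        }
        return n.value;
    }
}
```

Usage would look like `trade.set(103.25, "IBM", "2010-06-15", "1")` followed by `trade.get("IBM", "2010-06-15", "1")`. The appeal for Java developers is that this navigational, subscript-based access avoids object-relational mapping overhead, which is part of what makes the very low latencies discussed below plausible.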

In particular, it is called eXTreme because it has been designed specifically for very high-volume, very low-latency environments, for which you might otherwise choose a complex event processing (CEP) solution — except that CEP solutions offer no persistence beyond playback unless you license a separate database. The company is particularly looking (with partners) at areas such as capital markets and smart grids.

I have to make a comment here. You wouldn’t normally regard a database technology, even one integrated with a development environment, as a suitable platform for CEP. You certainly couldn’t do it with Oracle, IBM or Microsoft, and I’d be hard pushed to think of another database product with the ingestion and processing capabilities that would make it possible.

However, I think InterSystems may just be capable of it. It is an extremely high-performing product and it is suitable for complex applications. Of course the proof will be in the pudding, but if it succeeds it will stir up the CEP market, not to mention the company’s own fortunes.

Philip Howard is Research Director (Data Management) at Bloor Research. Data management refers to the management, movement, governance and storage of data and involves diverse technologies that include (but are not limited to) databases and data warehousing, data integration (including ETL, data migration and data federation), data quality, master data management, metadata management and log and event management. Philip also tracks spreadsheet management and complex event processing.