Helping Bloomberg build a real-time news search engine with Luwak

Bloomberg is one of the world’s leading providers of financial news via the Bloomberg Terminal, an almost ubiquitous presence on the desks of finance professionals. As you might expect, their systems depend heavily on effective search. Over the last few years they have become increasingly involved in the open source community, sponsoring events such as Lucene Revolution and also helping me to run (and often hosting) the London Lucene/Solr Meetup. They also now employ no fewer than three Apache Solr committers and have contributed features including an XML query parser and an analytics component.

The scale of Bloomberg’s systems is significant: 320,000 subscribers carry out 8 million searches every day across an archive of 400 million stories. A million new stories are published every day, and in the financial sector response time is paramount, so they want new stories available within 100 milliseconds.

One component of their platform is a large-scale news alerting framework, handling around 1.5 million stored searches created both internally and by their subscribers. Some of these stored searches are highly complex Boolean expressions. As part of a migration away from a commercial solution, they have recently built a new alerting system based on the open source Luwak library, which we developed for media monitoring applications.
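To make the idea of stored searches concrete, here is a toy Python sketch of "reverse search": queries are registered up front and each incoming story is matched against them. It is not Luwak's API and says nothing about how Bloomberg's system is built. The names `StoredQuery` and `AlertMatcher` are hypothetical, tokenisation is a plain whitespace split, and the queries are limited to simple required/excluded terms rather than the complex Boolean expressions described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    """Toy stored search: every `must` term present, no `must_not` term present."""
    query_id: str
    must: frozenset
    must_not: frozenset = frozenset()

    def matches(self, terms):
        # Subset test for required terms, empty intersection for excluded terms.
        return self.must <= terms and not (self.must_not & terms)


class AlertMatcher:
    """Hypothetical matcher: holds stored queries and tests each new story against them."""

    def __init__(self):
        self._queries = {}   # query id -> StoredQuery
        self._by_term = {}   # required term -> ids of queries that require it

    def register(self, query):
        if not query.must:
            raise ValueError("this sketch needs at least one required term per query")
        self._queries[query.query_id] = query
        for term in query.must:
            self._by_term.setdefault(term, set()).add(query.query_id)

    def match(self, text):
        terms = frozenset(text.lower().split())
        # Only queries sharing a required term with the story can match,
        # so the full check is skipped for everything else.
        candidates = set()
        for term in terms:
            candidates |= self._by_term.get(term, set())
        return sorted(qid for qid in candidates if self._queries[qid].matches(terms))


matcher = AlertMatcher()
matcher.register(StoredQuery("oil-alerts", frozenset({"oil", "opec"})))
matcher.register(StoredQuery("fed-not-rumor", frozenset({"fed"}), frozenset({"rumor"})))

print(matcher.match("OPEC cuts oil output"))  # ['oil-alerts']
print(matcher.match("Fed rumor lifts oil"))   # []
```

The term lookup is the part that matters at scale: with many stored searches, checking every query against every story is wasteful, so the sketch first narrows the field to queries that share at least one required term with the story.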

Initially, Luwak depended on a (rather large) patch for Lucene to add positional information to the index, but Bloomberg kindly funded the integration of this into trunk Lucene and as of version 5.3 it is part of the main release. We’ve also been working with them to develop and tune Luwak’s capabilities to address their performance and accuracy requirements.

Daniel Collins, who has led the alerting system development, recently spoke in New York about the use of Luwak in their alerting system. You can watch the video of his talk and read a short article covering their journey. He writes:

Corporate technology has become highly complex. At the lower levels of the stack, innovators know that proprietary software can cause more problems than it solves. A lot of companies are deciding they can’t sit behind closed doors any more, and they need to get more involved in open source.

We’re very grateful for Bloomberg’s support of the Luwak project and we are continuing to develop it. Do let us know if you would like to know more about how to use it in your application.
