Out with the old – and in with the new Lucene query parser?

Over the years we've dealt with quite a few migration projects where the query syntax of the client's existing search engine must be preserved. This might be because other systems (or users) depend on it, or because a large number of stored expressions exist and it would be difficult or uneconomic to translate them all by hand. Our usual approach is to write a query parser that understands the current syntax but creates a ... Continue reading

Helping Bloomberg build a real-time news search engine with Luwak

Bloomberg is one of the world's leading providers of financial news via the Bloomberg Terminal, an almost ubiquitous presence on the desks of finance professionals. As you might expect, their systems depend heavily on effective search, and over the last few years they have become increasingly involved in the open source community, sponsoring events such as Lucene Revolution and also he... Continue reading

London Text Analytics Meetup – Making sense of text with Lumi, Signal & Bloomberg

This month's London Text Analytics Meetup, hosted by Bloomberg in their spectacular Finsbury Square offices, was only the second such event this year, but crammed in three great talks and attracted a wide range of people from both academia and business. We started with Gabriella Kazai o...Continue reading

Out and about in search & monitoring – Autumn 2015

It's been a very busy few months for events - so busy that it's quite a relief to be back in the office! Back in late November I travelled to Vienna to speak at the FIBEP World Media Intelligence Congress with our client Infomedia about how we've helped them to migrate their media monitoring platform from the elderly, unsupported and hard to scale ... Continue reading

Luwak 1.3.0 released

The latest version of Luwak, our open-source streaming query engine, has been released on the Sonatype Nexus repository and will be making its way to Maven Central in the next few hours. Here's a summary of the new features and improvements we've made: Batch processing I... Continue reading

Talks: Replacing Autonomy IDOL with Solr, Elasticsearch for e-commerce & relevancy tuning

I'll be speaking at several events over the next few weeks, in the UK and abroad. On the 19th of November I'll be at the FIBEP World Media Intelligence Congress in Vienna, to talk about how we helped our client Infomedia migrate from a closed-source search engine (Autonomy IDOL and Verity) to a new platform based on Apache L...Continue reading

Elasticsearch Percolator & Luwak: a performance comparison of streamed search implementations

Most search applications work by indexing a relatively stable collection of documents and then allowing users to perform ad-hoc searches to retrieve relevant documents. However, in some cases it is useful to turn this model on its head, and match individual documents against a collection of saved queries. I shall refer to this model as "streamed search". One example of streamed search is in media monitoring. The monitoring agency's ...Continue reading

Media monitoring with open source search – 20 times faster than before!

We're happy to announce we've just finished a successful project for a division of the Australian Associated Press to replace a closed-source search engine with a considerably more powerful open-source solution. You can read the press release here. As our client had a large investment in stored searches (which repr... Continue reading