Elasticsearch Percolator & Luwak: a performance comparison of streamed search implementations

Most search applications work by indexing a relatively stable collection of documents and then allowing users to perform ad-hoc searches to retrieve relevant documents. However, in some cases it is useful to turn this model on its head, and match individual documents against a collection of saved queries. I shall refer to this model as "streamed search". One example of streamed search is in media monitoring. The monitoring agency's ...Continue reading

A review of Stephen Arnold’s CyberOSINT & Next Generation Information Access

Stephen Arnold, whose blog I enjoy due to its unabashed cynicism about overenthusiastic marketing of search technology, was kind enough to send me a copy of his recent report on CyberOSINT & Next Generation Information Access (NGIA), the latter being a term he has recently coined. OSINT itself refers to intelligence gathered from open, publicly available sources, not anything to do with software license...Continue reading

Searching for opportunities in Real-Time Analytics

I spent a day last week at a new event from UNICOM, a conference on Real-Time Analytics. Mike Ferguson chaired the event and was kind enough to spend time with me over lunch exploring how search software might fit into the mix, something that has been on my mind since hearing about the Unified Log...Continue reading

Out and about in January and February

We're speaking at a couple of events soon, and if you're in London and interested in Apache Lucene/Solr we're also planning another London User Group Meetup. Firstly, my colleague Alan Woodward is speaking with Martin Kleppmann at FOSDEM in Brussels (31st January-1st February) on ...Continue reading

London Search Meetup – Serious Solr at Bloomberg & Elasticsearch 1.0

The financial information service Bloomberg hosted last Friday's London Search Meetup in their offices on Finsbury Square - the venue had to be seen to be believed, furnished as it is with neon, chrome, modern art and fishtanks. A slight step up from the usual room above a pub! The first presenter was Ramkumar Aiyengar of Bloomberg on their new search...Continue reading

ElasticSearch London Meetup – a busy and interesting evening!

I was lucky enough to attend the London ElasticSearch User Group's Meetup last night - around 130 people came to the Goldman Sachs offices in Fleet Street with many more on the waiting list. It signifies quite how much interest there is in ElasticSearch these days and the event didn't disappoint, with some fascinating talks. Hugo Pickford-Wardle from ...Continue reading

Introducing Luwak, a library for high-performance stored queries

A few weeks ago we spoke in Dublin at Lucene Revolution 2013 on our work in the media monitoring sector for various clients including Gorkana and Australian Associated Press. These organisations handle a huge number (sometimes hundreds of thousands) of news articles every day ...Continue reading

Lucene Revolution 2013, Dublin: day 2

A slow start to the day, possibly due to the aftereffects of the conference party the night before, but the stadium was still buzzing. I went to Rafal Kuć's talk on SolrCloud which is becoming the standard way to build scalable Solr installations (we have two projects underway that use it). The shard splitting features in recent releases of Solr were interesting - previous...Continue reading

Search Solutions 2012 – a review

Last Thursday I spent the day at the British Computer Society's Search Solutions event, run by their Information Retrieval Specialist Group. Unlike some events I could mention, this isn't a forum for sales pitches, over-inflated claims or business speak - just some great presentations on all aspects of search and some lively networking or discussion. It's one of my favourite events of t...Continue reading