Elasticsearch Percolator & Luwak: a performance comparison of streamed search implementations

Most search applications work by indexing a relatively stable collection of documents and then allowing users to perform ad-hoc searches to retrieve relevant documents. However, in some cases it is useful to turn this model on its head, and match individual documents against a collection of saved queries. I shall refer to this model as "streamed search". One example of streamed search is in media monitoring. The monitoring agency's ...

Media monitoring with open source search – 20 times faster than before!

We're happy to announce we've just finished a successful project for a division of the Australian Associated Press to replace a closed source search engine with a considerably more powerful open source solution. You can read the press release here. As our client had a large investment in stored searches (which repr...

An open source replacement for the dtSearch closed source search engine

We've been working on a client project where we needed to replace the dtSearch closed source search engine, which doesn't perform that well at scale in this case. As the client has a significant investment in stored queries (it's for a monitoring application), they were keen that the new engine spoke exactly the same query language as the old - so we've built a version of Apache Lucene to replace dtSearch. There are a ...

Search backwards – media monitoring with open source search

We're working with a number of clients on media monitoring solutions, which are a special case of search application (we've worked on this previously for Durrants). In standard search, you apply a single query to a large number of documents, expecting to get a ranked list of documents that match your query as a result. However, in media monitoring you need to ...

Next-generation media monitoring with open source search

Media monitoring is not a traditional search application: for a start, instead of searching a large number of documents with a single query, a media monitoring application must search every incoming news story with potentially thousands of queries, searching for words and terms relevant to client requirements. This can be difficult to scale, especially when accuracy must be maintained - a client won't be happy if their media monitors miss relevant stories or send them news that isn't relevant. ...