Boosts Considered Harmful – adventures with badly configured search

During a recent client visit we encountered a common problem in search: over-application of 'boosts', which can be used to weight the influence of matches in one particular field. For example, you might sensibly use a boost to make results that match a query on their title field rank higher in search results. However, in this case we saw huge boost values (numbers in the hundreds), which were probably swamping everything else, and it wasn't at all clear where the values had come from, be it ex...
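As a rough illustration of how a very large boost can swamp every other signal, here is a minimal sketch. It assumes a simplified model in which a document's score is just the boosted sum of its per-field scores; real engines combine scores in more involved ways, and the field scores, document names and boost values below are all hypothetical.

```python
# Hypothetical per-field relevance scores for two documents.
# These numbers are made up for illustration, not taken from a real engine.
docs = {
    "weak_title_match": {"title": 0.2, "body": 0.1},
    "strong_body_match": {"title": 0.0, "body": 0.9},
}


def score(field_scores, boosts):
    """Combine per-field scores as a boosted sum (unlisted fields get boost 1.0)."""
    return sum(boosts.get(field, 1.0) * s for field, s in field_scores.items())


def rank(documents, boosts):
    """Return document names ordered from highest to lowest boosted score."""
    return sorted(documents, key=lambda name: score(documents[name], boosts), reverse=True)


modest_boosts = {"title": 2.0}   # a gentle preference for title matches
huge_boosts = {"title": 300.0}   # "numbers in the hundreds"

print(rank(docs, modest_boosts))
print(rank(docs, huge_boosts))
```

Under these assumptions the modest boost still lets the strong body match win, while the boost of 300 means any title match at all, however weak, outranks it.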

Out with the old – and in with the new Lucene query parser?

Over the years we've dealt with quite a few migration projects where the query syntax of the client's existing search engine must be preserved. This might be because other systems (or users) depend on it, or a large number of stored expressions exist and it is difficult or uneconomic to translate them all by hand. Our usual approach is to write a query parser, which understands the current syntax but creates a ...

London Text Analytics Meetup – Making sense of text with Lumi, Signal & Bloomberg

This month's London Text Analytics Meetup, hosted by Bloomberg in their spectacular Finsbury Square offices, was only the second such event this year, but crammed in three great talks and attracted a wide range of people from both academia and business. We started with Gabriella Kazai o...

Out and about in search & monitoring – Autumn 2015

It's been a very busy few months for events - so busy that it's quite a relief to be back in the office! Back in late November I travelled to Vienna to speak at the FIBEP World Media Intelligence Congress with our client Infomedia about how we've helped them to migrate their media monitoring platform from the elderly, unsupported and hard to scale ...

Talks: Replacing Autonomy IDOL with Solr, Elasticsearch for e-commerce & relevancy tuning

I'll be speaking at several events over the next few weeks, in the UK and abroad. On the 19th of November I'll be at the FIBEP World Media Intelligence Congress in Vienna, to talk about how we helped our client Infomedia migrate from a closed-source search engine (Autonomy IDOL and Verity) to a new platform based on Apache L...

Building a new press cuttings service for the Financial Times

Those of you who read my slides from Search Solutions 2010 will have spotted a case study on our work for the Financial Times, one of the world’s leading business news organisations. When the Financial Times decided to bring their digital press cuttings in-house in summer 2010, they asked us to build a powerful 'search server' that they could easily integrate into their existing product offerings. We built an indexer for...

Finding French TV with Flax

We've recently been working with mySkreen, who, like Hulu in the U.S., provide a service for finding and viewing television programmes via your web browser. mySkreen is the brainchild of Frédéric Sitterlé, previously Head of New Media at the Le Figaro media group. mySkreen works with French-language content, and is currently indexing over 1.6 million programmes (and counting). Using Fla...