Meetup at Big Data London – One-click Solr & Factchecking with Solr

Posted on November 10, 2016 by Charlie Hull

Last week I spoke at the Big Data London conference, a very busy event with several thousand people attending. My session was on using open source search to make sense of Big Data - you can get slides here. In the evening we ran another Continue reading

Lucene Revolution 2016, Boston

Posted on October 26, 2016 by Charlie Hull

After our two successful hackdays, it was on to the main event of the week and the largest open source search event of the year. In between catching up with other Lucene/Solr folks on the first day I enjoyed Chris 'Hossman' Hostetter's talk on Hidden Gems of Apache Sol... Continue reading

A tale of two cities (and two Lucene Hackdays)

Posted on October 21, 2016 by Charlie Hull

To mark Flax's 15th anniversary we ran two Lucene Hackdays recently, in London and Continue reading

Not one, but three Lucene hackdays coming soon!

Posted on August 24, 2016 by Charlie Hull

We're always keen to get more people involved in the Lucene search community - there's always lots to do, from deep hacking of the core code, to testing with different frameworks and clients, to creating documentation and examples. It's also just over fifteen years since Tom Mortimer and I founded Flax, and we thought we should mark this birthday with some kind of event! So I'm very happy to announce we'll be involved in three Lucene hackday events over the next two months: Firstly, Continue reading

Boosts Considered Harmful – adventures with badly configured search

Posted on August 19, 2016 by Charlie Hull

During a recent client visit we encountered a common problem in search - over-application of 'boosts', which can be used to weight the influence of matches in one particular field. For example, you might sensibly use this to make results that match a query on their title field come higher in search results. However, in this case we saw huge boost values used (numbers in the hundreds) which were probably swamping everything else - and it wasn't at all clear where the values had come from, be it ex... Continue reading
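
As a rough illustration of the kind of field boost described above, here is a minimal Python sketch that builds request parameters for Solr's eDisMax query parser, where the `qf` parameter takes `field^boost` pairs. The field names (`title`, `body`), the helper `build_edismax_params` and the `max_boost` sanity limit are assumptions made for this example, not values taken from the client system discussed in the post.

```python
from urllib.parse import urlencode


def build_edismax_params(query, field_boosts, max_boost=10.0):
    """Build Solr eDisMax parameters from a {field: boost} mapping.

    max_boost is a hypothetical sanity limit for this sketch: boosts in
    the hundreds tend to swamp every other relevance signal.
    """
    for field, boost in field_boosts.items():
        if boost <= 0 or boost > max_boost:
            raise ValueError(
                f"boost {boost} on field '{field}' is outside (0, {max_boost}]"
            )
    # qf lists the fields to search, each with an optional ^boost suffix
    qf = " ".join(f"{field}^{boost:g}" for field, boost in field_boosts.items())
    return {"q": query, "defType": "edismax", "qf": qf}


# A modest boost so that title matches rank somewhat higher than body matches
params = build_edismax_params("open source search", {"title": 2.0, "body": 1.0})
query_string = urlencode(params)  # ready to append to a /select URL
```

Keeping boosts small and checking them in one place makes it easier to see where each value came from, which was exactly what was missing in the case above.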

Out with the old – and in with the new Lucene query parser?

Posted on May 13, 2016 by Charlie Hull

Over the years we've dealt with quite a few migration projects where the query syntax of the client's existing search engine must be preserved. This might be because other systems (or users) depend on it, or a large number of stored expressions exist and it is difficult or uneconomic to translate them all by hand. Our usual approach is to write a query parser, which understands the current syntax but creates a Continue reading

Can you make a contribution to Apache Solr core development?

Posted on April 26, 2016 by Charlie Hull

As any regular reader of this blog will be aware, we use almost exclusively open source software on customer projects. To meet their requirements, we often have to extend the functionality of the software (e.g. XJOIN in Solr). As far as possible, with the agreement of the customer, we like to then contribute these changes b... Continue reading

Running out of disk space with Elasticsearch and Solr: a solution

Posted on April 21, 2016 by Tom

We recently did a proof-of-concept project for a customer which ingested log events from various sources into a Kafka - Logstash - Elasticsearch - Kibana stack. This was configured with Ansible and hosted on about a dozen VMs inside the customer's main... Continue reading

Apache Kafka London Meetup – Real time search and insights

Posted on April 14, 2016 by Charlie Hull

The rise of Apache Kafka as a streaming data solution is something we've been watching for a while - as part of a collection of Big Data tools, it provides a 'TiVo for data' feature. We've begun to use it in client projects covering both search and log analysis... Continue reading

Unified Log Meetup – Scaling up with Skyscanner, Samza and Samsara

Posted on February 18, 2016 by Charlie Hull

Last night I dropped in on the Unified Log Meetup at JustEat's offices (of course, they provided lots of pizza for us all!). I've written about this Meetup before - as a rule the events cover logging and analytics at massive scale, with search being only part of the picture. Joseph Francis from Continue reading