Tuning and improving elasticsearch for the Government Digital Service

The exciting GOV.UK project is getting close to its first release date of October 17th, and the team asked us to help with some search tuning as they migrate from Apache Solr to elasticsearch. Although elasticsearch has some great features, there are still some areas where it lags Solr, such as the lack of spelling suggestion and proximity boost features. Alan from Flax spent a couple of days working with the GDS team and has blogged about how proximity boosting in particular can be implemented – at least for terms that are relatively close to each other rather than being separated by a page or so.
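As a rough illustration of what proximity boosting means in practice, here is a small Python sketch that builds an elasticsearch query body: a `match` clause selects documents, and an optional `match_phrase` clause with `slop` raises the score when the terms sit near each other. This is not necessarily the approach Alan describes or the code used for GOV.UK. The field name `description`, the helper `proximity_boost_query` and the default slop and boost values are assumptions for illustration, and the clause names follow the current query DSL, which may differ in older elasticsearch releases.

```python
import json


def proximity_boost_query(text, field="description", slop=10, boost=2.0):
    """Build a query body that matches on `text` and scores near-together terms higher.

    Hypothetical helper: the field name and default values are illustrative only.
    """
    return {
        "query": {
            "bool": {
                # Required clause: the document has to match the query terms.
                "must": [{"match": {field: {"query": text}}}],
                # Optional clause: adds to the score only when the terms can be
                # lined up as a phrase using at most `slop` position moves.
                "should": [
                    {
                        "match_phrase": {
                            field: {"query": text, "slop": slop, "boost": boost}
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to a search endpoint.
    print(json.dumps(proximity_boost_query("council tax"), indent=2))
```

Because the slop is bounded, a query shaped like this only rewards terms that are relatively close together, which fits the caveat above about terms separated by a page or so.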

If you’re interested in more details of how we fixed this and a few other elasticsearch issues, you may want to take a look at the code we worked on – one of the best things about working with the GOV.UK team is that the code was already published as open source software within a day (yes, you read that right – code paid for by the taxpayer is open source, as it should be!). We’re looking forward to launch day!

Update: changed ‘proximity search’ to ‘proximity boost’ – thanks Alan!


3 thoughts on “Tuning and improving elasticsearch for the Government Digital Service”

  1. Hi,
    Just came across this. With all the debates on Solr vs ES, I was just wondering what drove the decision to move from one to the other?

    • Hi Franek,

      I think you’d have to ask the GDS team that one! There seem to be lots of people moving from Solr to ES these days, and I think the main reason is ease-of-use regarding scalability and schemas. Of course ES still lags Solr in some functional areas and Solr’s community is considerably larger. The encouraging thing for end users (and for those of us building search applications for them) is that the ‘arms race’ between ES and Solr means new functionality is appearing often – for example Solr 4.5, out next week, will have a ‘schemaless’ option I believe.
