London Lucene/Solr Meetup: Query Pre-processing & SQL with Solr

Bloomberg kindly hosted the London Lucene/Solr Meetup last night and we were lucky enough to have two excellent speakers for the thirty or so attendees. René Kriegler kicked off with a talk about the Querqy library he has developed to provide a pre-processing layer for Solr (and soon, Elasticsearch) queries. This library was originally developed during a project for Germany’s largest department store Galeria Kaufhof and allows users to add a series of simple rules in a text file to raise or lower results containing certain words, filter out certain results, add synonyms and decompound words (particularly important for German!). We’ve seen similar rules-based systems in use at many of our e-commerce clients, but few of these work well with Solr (Hybris in particular has a poor integration with Solr and can produce some very strange Solr queries). In contrast, Querqy is open source and designed by someone with expert Solr knowledge. With the addition of a simple UI or an integration with a relevancy-testing framework such as Quepid, this could be a fantastic tool for day-to-day tuning of search relevance – without the need for Solr expertise. You can find Querqy on Github.

Michael Suzuki of Alfresco talked next about the importance of being bilingual (actually he speaks 4 languages!) and how new features in Solr version 6 allow one to use either Solr syntax, SQL expressions or a combination of both. This helps hide Solr’s complexity and also allows easy integration with database administration and reporting tools, while allowing use of Solr by the huge number of developers and database administrators familiar with SQL syntax. Using a test set from the IMDB movie archive he demonstrated how SQL expressions can be used directly on a Solr index to answer questions such as ‘what are the highest grossing film actors’. He then used visualisation tool Apache Zeppelin to produce various graphs based on these queries and also showed dbVisualizer, a commonly used database administration tool, connecting directly to Solr via JDBC and showing the index contents as if they were just another set of SQL tables. He finished by talking briefly about the new statistical programming features in Solr 6.6 – a powerful new development with features similar to the R language.

We continued with a brief Q&A session . Thanks to both our speakers – we’ll be back again soon!

Leave a Reply

Your email address will not be published. Required fields are marked *