We've been working on a project for a customer that uses Logstash to read messages from Kafka and write them to Elasticsearch. It also parses the messages into fields and, depending on the content type, does DNS lookups (both forward and reverse).
While performance testing I noticed that adding caching to the Logstash DNS filter actually reduced performance, contrary to expectations. With four filter worker threads, and the following configuration:
JSON has been the lingua franca of data exchange for many years. It's human-readable, lightweight and widely supported. However, the JSON spec does not define what parsers should do when they encounter a duplicate key in an object, e.g.:
Implementations are free to interpret this how they like. When different systems have different interpretations this can cause problems.
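As a minimal sketch of how two consumers can disagree, the snippet below uses Python's standard `json` module, which by default keeps the last value for a repeated key, alongside a stricter parse that rejects duplicates. The document string and the `reject_duplicates` helper are illustrative assumptions, not taken from the systems discussed here.

```python
import json


def reject_duplicates(pairs):
    """Hypothetical strict hook: build the object, failing on a repeated key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Illustrative document with a repeated key (assumed, not from the original post).
doc = '{"id": 1, "id": 2}'

# Default behaviour of Python's json module: the last value silently wins.
last_wins = json.loads(doc)

# Strict behaviour: the same document is rejected outright.
try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(last_wins)
print(strict_error)
```

A parser in another language might keep the first value instead, so two systems can read the same bytes and see different data without either reporting an error.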
We recently encounter...Continue reading
Last night the estimable Martin White, intranet and enterprise search expert and author of many books on the subject, flagged up two surprising articles from Forrester, who have declared that Cognitive Search (we'll define this using their own terms in...Continue reading
I visited Aberdeen before Easter to speak at Industry Day, a part of the European Conference on Information Retrieval. Following a reception at Aberdeen's Town House (a wonderful building) hosted by the Lord Provost I spent an evening with various information retrieval luminaries including Professor Udo Kruschwitz of the University of Essex. We had a chance to discuss the book we're co-authoring (draft title 'Searching the Enterprise', designed as a review of t...Continue reading
A small crowd for this month's London Lucene/Solr Meetup, kindly hosted by Barclays in their sumptuous Canary Wharf offices. I introduced the Meetup and spoke briefly on how Flax is currently looking for team members (want to work on a variety of cutting-edge open source search projects in the UK and abroad? Get in touch!) before introducing Flax's Alan Woodwar...Continue reading
We're sometimes asked by clients to examine not just their technical implementation of search, but also the wider picture: how search functionality is exposed to users, how it compares to competitors' websites and best practice. This process usually takes us ten days to two weeks and results in a highly detailed report with clear recommendations for improvement.
This process is slightly different each time, but usually includes the following steps, shown with some examples of questions we mig...Continue reading
It won't have escaped your notice that factchecking is very much in the news recently due to last year's political upheavals in both the US and UK and the suspected influence of fake news on voters. Both traditional and social media organisations are making efforts in this area; examples include Channel 4 and Faceboo...Continue reading
Last week I attended Search Solutions, one of my favourite annual events, where all aspects of search are covered from web to intranet to enterprise. The first speaker, Sebastian Blohm from Microsoft, spoke about a new personalised Clutter folder for email and how his team had first devel...Continue reading