Defining relevance engineering part 4: tools

Posted on November 15, 2018 by Charlie Hull

Relevance Engineering is a relatively new concept, but companies such as Flax and our partners Open Source Connections have been carrying out relevance engineering for many years. So what is a relevance engineer and what do they do? In this series of blog posts I'll try to explain what I see as a...

Posted in Blog, Technical | Tagged elasticsearch, NCDG, quepid, relevance engineering, rre, SOLR, tools, tuning | 2 Replies

Highlights of Search, Store, Scale & Stream – Berlin Buzzwords 2018

Posted on June 18, 2018 by Charlie Hull

I spent last week in a sunny Berlin for the Berlin Buzzwords event (and subsequently MICES 2018, of which more later). This was my first visit to Buzzwords, which was held in an arts & culture complex in an old brewery north of the city centre. The event was larger than I was expecting, at around 550 people, with three main tracks of talks. Although due to so...

Posted in Blog, Events, Presentations, Technical | Tagged analytics, autoscaling, events, intention, query understanding, replicas, SOLR, tika | Leave a reply

London Lucene/Solr Meetup – Java 9 & 1 Beeelion Documents with Alfresco

Posted on February 8, 2018 by Charlie Hull

This time Pivotal were our kind hosts for the London Lucene/Solr Meetup, providing a range of goodies including some frankly enormous pizzas - thanks Costas and colleagues, we couldn't have done it without you! Our first talk was from Uwe Schindler, Lucene committer, who started with...

Posted in Blog, Events, Meetups, Technical | Tagged clustering, java, lucene, meetup, scaling, SOLR | Leave a reply

Finding the Bad Actor: Custom scoring & forensic name matching with Elasticsearch

Posted on February 1, 2018 by Charlie Hull

Posted in Events, Media Monitoring, Meetups, News & Media, Presentations, Technical | Tagged arachnys, client, elasticsearch, forensics, lucene, media monitoring, plugin, scoring, span query | Leave a reply

A search-based suggester for Elasticsearch with security filters

Posted on November 16, 2017 by Tom

Both Solr and Elasticsearch include suggester components, which can be used to provide search engine users with suggested completions of queries as they type. Query autocomplete has become an expected part of the search experience. Its benefits to the user include les...
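To make the idea in the title concrete, here is a minimal sketch of a suggester that only offers completions the current user is allowed to see. It is plain Python, not the Solr or Elasticsearch suggester API and not the approach from the full post (the excerpt is truncated before any detail is given). The `Doc` class, the `allowed_groups` field, the sample data and the `suggest` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """A hypothetical indexed document with an access-control list."""
    title: str
    allowed_groups: frozenset


# Illustrative data only; a real system would query a search index.
DOCS = [
    Doc("solr suggester setup", frozenset({"staff", "public"})),
    Doc("solr security audit", frozenset({"admin"})),
    Doc("elasticsearch suggester", frozenset({"staff"})),
]


def suggest(prefix, user_groups, docs=DOCS, limit=5):
    """Return titles starting with `prefix` that the user may see.

    The security filter is applied before any suggestion is returned,
    so a user never sees a completion drawn from a document they
    cannot access.
    """
    prefix = prefix.lower()
    groups = set(user_groups)
    matches = [
        doc.title
        for doc in docs
        if doc.title.lower().startswith(prefix) and doc.allowed_groups & groups
    ]
    return sorted(matches)[:limit]
```

Calling `suggest("solr", {"public"})` in this sketch returns only the title tagged for the `public` group, while a user in both `admin` and `staff` would see both Solr titles.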

Posted in Blog, Technical | Tagged access control, elasticsearch, security, SOLR, suggester | 2 Replies

Worth the wait – Apache Kafka hits 1.0 release

Posted on November 2, 2017 by Charlie Hull

We've known about Apache Kafka for several years now - we first encountered it when we developed a prototype streaming Boolean search engine for media monitoring with our own library Luwak. Kafka is a distributed streaming platform with some simple but powerful concepts - everything it deals with is a stream ...

Posted in Blog, Business, Media Monitoring, News & Media, Sectors, Technical | Tagged kafka, kibana, log, luwak, streaming data | Leave a reply

Better performance with the Logstash DNS filter

Posted on August 17, 2017 by Tom

We've been working on a project for a customer which uses Logstash to read messages from Kafka and write them to Elasticsearch. It also parses the messages into fields and, depending on the content type, does DNS lookups (both forward and reverse). While performance testing I noticed that adding caching to the Logstash DNS filter actually reduced performance, contrary to expectations. With four filter worker threads, and the following configuration:
dns { resolve => [ ...

Posted in Blog, Technical | Tagged DNS, elastic, elasticsearch, kafka, logstash | 2 Replies

Elasticsearch, Kibana and duplicate keys in JSON

Posted on August 3, 2017 by Tom

JSON has been the lingua franca of data exchange for many years. It's human-readable, lightweight and widely supported. However, the JSON spec does not define what parsers should do when they encounter a duplicate key in an object, e.g.:
{ "foo": "spam", "foo": "eggs", ... }
Implementations are free to interpret this how they like. When different systems have different interpretations this can cause problems. We recently encounter...
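As one illustration of how a single parser resolves this, the sketch below uses Python's standard `json` module, which by default keeps the last value for a repeated key. It says nothing about how Elasticsearch or Kibana behave (the excerpt is truncated before that point). The `DuplicateKeyError` class and the `reject_duplicates` and `has_duplicates` helpers are hypothetical names introduced here; only `json.loads` and its `object_pairs_hook` parameter come from the standard library.

```python
import json

RAW = '{"foo": "spam", "foo": "eggs"}'


class DuplicateKeyError(ValueError):
    """Raised by reject_duplicates when an object repeats a key."""


def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def has_duplicates(text):
    """Return True if any JSON object in `text` repeats a key."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicates)
    except DuplicateKeyError:
        return True
    return False


# Default behaviour of Python's json module: the last value wins.
last_wins = json.loads(RAW)

# Keeping the raw pairs instead shows that both values were present.
all_pairs = json.loads(RAW, object_pairs_hook=list)
```

Here `last_wins` holds only the `"eggs"` value, while `all_pairs` preserves both key/value pairs in order. A strict hook like `reject_duplicates` is one way to surface the ambiguity early instead of letting two systems silently disagree.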

Posted in Blog, Technical | Tagged curl, elasticsearch, indexing, json, kibana, sense | Leave a reply

London Lucene/Solr Meetup: Query Pre-processing & SQL with Solr

Posted on June 2, 2017 by Charlie Hull

Bloomberg kindly hosted the London Lucene/Solr Meetup last night and we were lucky enough to have two excellent speakers for the thirty or so attendees. René Kriegler kicked off with a talk about the ...

Posted in Events, Meetups, Technical | Tagged database, ecommerce, lucene, quepid, querqy, query, SOLR, sql, test driven relevance, tuning | Leave a reply

Release 1.0 of Marple, a Lucene index detective

Posted on February 24, 2017 by Tom

Back in October, at our London Lucene Hackday, Flax's Alan Woodward started to write Marple, a new open source tool for inspecting Lucene indexes. Since then we have made nearly 240 commits to the Marple GitHub repository, and are now happy to announce its first release.

Posted in Blog, Events, Meetups, Technical | Tagged lucene, marple, open source, relevance, tools | Leave a reply

OpenSource Connections Limited registered in England: 11736223
Registered Office: C/O Freeths Llp, Routeco Office Park, Davy Avenue, Knowlhill, Milton Keynes, United Kingdom, MK5 8HJ

Apache Lucene, Apache Solr, Apache Kafka, Apache Hadoop and their respective logos are trademarks of the
Apache Software Foundation. Elasticsearch is a trademark of Elasticsearch BV,
registered in the U.S. and in other countries.