Working with Hadoop, Kafka, Samza and the wider Big Data ecosystem

We've been working on a number of projects recently involving open source software often described as 'Big Data' solutions, so here's a quick overview of them. The grandfather of them all, of course, is Apache Hadoop, now not so much a single project as an ecosystem including storage and processing for potentially huge amounts of data, spread across clusters of machines. Interestingly, Hadoop was originally created by D...Continue reading

Flax Newsletter November 2015

In this month's Flax Newsletter:

  • Building an open source search team is hard - let us help with training & mentoring on Solr and Elasticsearch
  • RS Components: Flax & Quepid help us to make "crucial" data driven decisions for tuning search
  • 40x faster indexing with Elasticsearch for Hadoop - over a gigabyte per second!
...Continue reading

Finding the elephant in the room: open source search & Hadoop grow closer together

I've been lucky enough to attend two talks on Hadoop in the last few weeks, which have made me take a closer look at this technology. In case you didn't know, Hadoop is an Apache top-level open source project comprising a framework for distributed computing and storage, originally created by Doug Cutting (also the creator of Apache Lucene) while at Yahoo! in 2005. Distributed computing is carried out using...Continue reading

Cambridge Search Meetup – Search for publication success and low-cost apps

After a short break the Cambridge Search Meetup returned last night with our usual mix of presentations, questions, networking, beer and snacks. We had a few issues with the projector and cables (one of these is on the shopping list for next time) so thanks to both presenters and audience for their patience! First up was Li...Continue reading