Helping Bloomberg build a real-time news search engine with Luwak

Posted on March 8, 2016 by Charlie Hull

Bloomberg is one of the world's leading providers of financial news via the Bloomberg Terminal, an almost ubiquitous presence on the desks of finance professionals. As you might expect, their systems depend heavily on effective search, and over the last few years they have become increasingly involved in the open source community, sponsoring events such as Lucene Revolution and also he...Continue reading

Working with Hadoop, Kafka, Samza and the wider Big Data ecosystem

Posted on March 3, 2016 by Charlie Hull

We've been working on a number of projects recently involving open source software often described as 'Big Data' solutions - here's a quick overview of them. The grandfather of them all is, of course, Apache Hadoop, now not so much a single project as an ecosystem including storage and processing for potentially huge amounts of data, spread across clusters of machines. Interestingly, Hadoop was originally created by D...Continue reading

London Lucene/Solr Meetup – Learning to Rank and Hibernate Search

Posted on February 24, 2016 by Charlie Hull

Back to the very impressive Bloomberg lecture theatre for this month's Lucene/Solr Meetup, with a good turnout (I'm guessing 60-70 people). Our first talk came from Diego Ceccarelli of Bloomberg on how his team have created a Solr implementation of Continue reading

Unified Log Meetup – Scaling up with Skyscanner, Samza and Samsara

Posted on February 18, 2016 by Charlie Hull

Last night I dropped in on the Unified Log Meetup at JustEat's offices (of course, they provided lots of pizza for us all!). I've written about this Meetup before - as a rule the events cover logging and analytics at massive scale, with search being only part of the picture. Joseph Francis from Continue reading

Better search for life sciences at the BioSolr Workshop, day 2 – Elasticsearch & others

Posted on February 15, 2016 by Charlie Hull

Over the last 18 months we've been working closely with the European Bioinformatics Institute on a project to improve their use of open source search engines, funded by the BBSRC. The project was originally named BioSolr but has since grown to encompass Continue reading

Posted in Biotechnology, Blog, Events | Tagged bioinformatics, biology, biosolr, DIH, django, elasticsearch, indexing, lucene, python, redis, scaling, SOLR, sphinx, sql | Leave a reply

Better search for life sciences at the BioSolr Workshop, day 1 – Apache Lucene/Solr

Posted on February 10, 2016 by Charlie Hull

Over the last 18 months we've been working closely with the European Bioinformatics Institute on a project to improve their use of open source search engines, funded by the BBSRC. The project was originally named BioSolr but has since grown to encompass Continue reading

Posted in Biotechnology, Blog, Events, Presentations | Tagged bioinformatics, biology, biosolr, EBI, EMBL-EBI, federated search, high availability, indexing, lucene, MySQL, NCBI, SOLR, xjoin | Leave a reply

Time to replace your Google Search Appliance with open source search

Posted on February 9, 2016 by Charlie Hull

As many others have noted, Google have recently announced their Google Search Appliance (GSA) will not be available for sale from 2017. Search gurus Miles Kehoe and Martin White have written an insightful analysis of the move with some recommendations as to what to do - because your GSA will simply stop working once the 2-year license expires. I don't agree with Lauren...Continue reading

Posted in Blog, Business | Tagged elasticsearch, google, google search appliance, gsa, lucene, market, open source, SOLR | 1 Reply

XJoin for Solr, part 2: a click-through example

Posted on January 29, 2016 by Tom Winch

In my last blog post, I demonstrated how to set up and configure Solr to use the new XJoin search components we've developed for the BioSolr project, using an example from an e-commerce setting. This time, I'll show...Continue reading

Posted in Biotechnology, Blog, E-commerce, Reference, Technical | Tagged biosolr, click-through, e-commerce, example, filtering, indexing, lucene, SOLR | Leave a reply

The fun and frustration of writing a plugin for Elasticsearch for ontology indexing

Posted on January 27, 2016 by Matt Pearce

As part of our work on the BioSolr project, I have been continuing to work on the various Elasticsearch ontology annotation plugins (note that even though the project started with a focus on Solr - thus the name - we have also been developing some features for Ela...Continue reading

Posted in Biotechnology, Blog, Reference, Technical | Tagged bioinformatics, biosolr, elasticsearch, indexing, ontology, plugins, SOLR | 6 Replies

XJoin for Solr, part 1: filtering using price discount data

Posted on January 25, 2016 by Tom Winch

In this blog post I want to introduce you to a new Apache Solr plugin component called XJoin. I'll show how we can use this to solve a common problem in e-commerce - how to use price discount data, provided by an external web API, to either filter the results of a product search or boost scores. A further post will show another example, using click-through data to influence the score of subsequent searches.

What is XJoin?

...Continue reading

Posted in Biotechnology, Blog, E-commerce, Technical | Tagged bioinformatics, ecommerce, example, filtering, indexing, java, lucene, patch, python, SOLR, xjoin | 5 Replies
