durrants – Flax (http://www.flax.co.uk): The Open Source Search Specialists

Search backwards – media monitoring with open source search
http://www.flax.co.uk/blog/2012/03/08/search-backwards-media-monitoring-with-open-source-search/
Thu, 08 Mar 2012 12:00:45 +0000

The post Search backwards – media monitoring with open source search appeared first on Flax.

We’re working with a number of clients on media monitoring solutions, which are a special case of search application (we’ve worked on this previously for Durrants). In standard search, you apply a single query to a large number of documents and expect a ranked list of matching documents in return. In media monitoring, however, you need to search each incoming document (for example, a news article or blog post) with many queries representing what the end user wants to monitor. You also need to do this quickly, as you may have tens or hundreds of thousands of articles to monitor in close to real time (Durrants have over 60,000 client queries to apply to half a million articles a day). This ‘backwards’ search isn’t really what search engines were designed to do, so performance could potentially be very poor.

There are several ways around this problem. For example, in most cases you don’t need to monitor every article for every client, as they will have told you they’re only interested in certain sources (a car manufacturer might want to keep an eye on car magazines and the reviews in the back page of the Guardian Saturday magazine, but doesn’t care about the rest of the paper or fashion magazines). However, pre-filtering queries in this way can be complex, especially when there are so many potential sources of data.
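As a rough illustration of the pre-filtering idea, the sketch below inverts a table of client subscriptions so that each incoming article only triggers the queries registered for its source. The query ids, source names and the `None` convention for "monitor everything" are all invented for this example and are not taken from any real client system.

```python
from collections import defaultdict

# Hypothetical subscriptions: query id -> sources the client cares about.
# None is an assumed convention meaning "monitor every source".
subscriptions = {
    "car-maker-brand": {"car-magazine", "guardian-saturday-magazine"},
    "fashion-label": {"fashion-magazine"},
    "all-news-watch": None,
}


def build_source_index(subscriptions):
    """Invert the subscriptions into source -> query ids, plus a catch-all set."""
    by_source = defaultdict(set)
    everywhere = set()
    for query_id, sources in subscriptions.items():
        if sources is None:
            everywhere.add(query_id)
        else:
            for source in sources:
                by_source[source].add(query_id)
    return by_source, everywhere


def queries_for(source, by_source, everywhere):
    """Return the ids of the queries that must run for an article from `source`."""
    return by_source.get(source, set()) | everywhere


by_source, everywhere = build_source_index(subscriptions)
print(sorted(queries_for("car-magazine", by_source, everywhere)))
```

Even in this toy form the bookkeeping is visible: every new source or change of client interest means updating the mapping, which is part of why pre-filtering gets complicated at scale.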

We’ve recently developed a brute-force method for searching incoming articles, based on Apache Lucene, which is performing very well in early tests: around 70,000 queries applied to a single article in around a second on a standard MacBook. On suitable server hardware this would be even faster, and of course you have all the other features of Lucene potentially available, such as phrase queries, wildcards and highlighting. We’re looking forward to being able to develop some powerful, and economically scalable, media monitoring solutions based on this core.
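To make the ‘backwards’ direction concrete, here is a minimal Python sketch of the general idea: index one incoming article in memory, then run every stored query against that tiny index. This is not the Lucene-based implementation described above, which isn’t shown in this post; the function names, the phrase-only query model and the sample data are assumptions made purely for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def index_article(text):
    """Build a small positional index (term -> list of positions) for one article."""
    positions = defaultdict(list)
    for pos, token in enumerate(tokenize(text)):
        positions[token].append(pos)
    return positions


def matches(query, positions):
    """True if the query's words occur consecutively in the article.

    A one-word query is a plain term lookup; longer queries act as phrases.
    """
    words = tokenize(query)
    if not words or any(word not in positions for word in words):
        return False
    # Try each occurrence of the first word as a possible phrase start.
    return any(
        all(start + offset in positions[word] for offset, word in enumerate(words))
        for start in positions[words[0]]
    )


def monitor(article, stored_queries):
    """Apply every stored query to one article and return the ids that match."""
    positions = index_article(article)
    return [qid for qid, query in stored_queries.items() if matches(query, positions)]


# Hypothetical stored queries keyed by an invented id.
stored_queries = {
    "q1": "electric car",
    "q2": "car electric",
    "q3": "magazine",
    "q4": "fashion",
}
article = "The new electric car was reviewed in the Saturday magazine."
print(monitor(article, stored_queries))
```

The article is indexed once and then reused for every query, so the per-query cost is a handful of dictionary lookups; a real engine adds wildcards, highlighting and far richer query types on top of the same basic loop.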

Just the job for a recruitment client
http://www.flax.co.uk/blog/2011/10/18/just-the-job-for-a-recruitment-client/
Tue, 18 Oct 2011 11:05:16 +0000

The post Just the job for a recruitment client appeared first on Flax.

We’re pleased to announce our work with Reed Specialist Recruitment, one of the UK’s largest recruitment companies, where we helped them implement an Apache Solr-powered application to allow their 3000+ staff to search for and match candidates to jobs. We built an innovative indexing framework, a configuration tool and a performance monitoring system for Reed, and the system launched on time and under budget, a great testament to the flexibility and power of this open source software. The new system responds in under a second, a massive improvement on the previous response time of several minutes. You can read the press release here.

If you’d like to hear more, I’ll be giving a presentation on the project at Lucene Eurocon in Barcelona tomorrow (Wednesday 19th October at 1.30 p.m.); slides and a video will be online after the event.

If you can’t make it to Barcelona, I’ll also be talking in London on the business benefits of open source search, at around 10am on Tuesday 25th October, with our client Stephen Wicks, CTO of Gorkana Group, as part of Enterprise Search Europe. There are still tickets available, and you can even get a 20% discount if you join the Cambridge or London Enterprise Search Meetups, who are hosting a joint event on the Monday evening of the conference.

A busy Autumn – forthcoming events
http://www.flax.co.uk/blog/2011/09/06/a-busy-autumn-forthcoming-events/
Tue, 06 Sep 2011 13:32:55 +0000

The post A busy Autumn – forthcoming events appeared first on Flax.

The diary is filling up quickly already after the summer break (which turned out not to be much of a break at all, what with the HP/Autonomy news and everything). Here’s where you can hear us speak over the next few months:

Hope to meet some of you at these exciting events (do get in touch if you’d like to arrange something more formal). There’s certainly a lot to talk about!

Whitepaper – Why you should be considering open source search
http://www.flax.co.uk/blog/2011/06/22/whitepaper-why-you-should-be-considering-open-source-search/
Wed, 22 Jun 2011 09:49:50 +0000

The post Whitepaper – Why you should be considering open source search appeared first on Flax.

I’ve uploaded a whitepaper I wrote a short while ago:

“In these rapidly changing times we don’t know what we will need to search tomorrow – so it’s important to be adaptable, flexible and able to cope with data volumes that may not scale linearly. Maintaining control over the future of your search software is also key. Open source search has come of age and every modern business should be aware of its advantages.”

The year open source search got serious
http://www.flax.co.uk/blog/2010/12/17/the-year-open-source-search-got-serious/
Fri, 17 Dec 2010 10:22:35 +0000

The post The year open source search got serious appeared first on Flax.

It’s been an interesting and busy twelve months here at Flax – we’ve worked on some fantastic customer projects, spoken at conferences at home and abroad and made some great alliances and partnerships. We are talking to more people than ever before about the advantages of open source search and we’ve even started a local Meetup group.

This has been the year when open source search moved out of the shadows and became a force to reckon with – whether handling billions of queries or millions of customers, powering innovative new APIs for open content from forward-looking media companies or simply making it easier for search applications to be developed. Commercial support is now available to rival anything offered by the closed source world and there are now fully packaged solutions built on open source. In some sectors open source may even become the default choice (see what IDC said about the embedded/OEM market).

There’s still significant change to come in the search sector – I expect a few vendors will be in trouble by this time next year as they realise their business models (often built on per-document charges) are out-of-date, and we might also see further acquisitions by the usual behemoths. All this leads to reduced choice and increased costs for customers, and this is where open source can help – you can build your search solution in-house, or engage companies like ours to help, but you’re no longer locked in to a vendor’s roadmap and shackled to their business plan (or the consequences of its failure!).

I’ll leave the final word to Matt Asay of Canonical, who says: “Open source is how we do business 10 years into this new millennium.”

Next-generation media monitoring with open source search
http://www.flax.co.uk/blog/2010/12/13/next-generation-media-monitoring-with-open-source-search/
Mon, 13 Dec 2010 14:05:40 +0000

The post Next-generation media monitoring with open source search appeared first on Flax.

Media monitoring is not a traditional search application: for a start, instead of searching a large number of documents with a single query, a media monitoring application must search every incoming news story with potentially thousands of queries, searching for words and terms relevant to client requirements. This can be difficult to scale, especially when accuracy must be maintained – a client won’t be happy if their media monitors miss relevant stories or send them news that isn’t relevant.

We’ve been working with Durrants Ltd. of London for a while now on replacing their existing (closed source) search engine with a system built on open source. This project, which you can read more about in a detailed case study (PDF), has reduced the hardware requirements significantly and led to huge accuracy improvements (in some cases where 95% of the results passed through to human operators were irrelevant ‘false positives’, the new system is now 95% correct).

The new system is built on Xapian and Python and supports all the features of the previous engine, to ease migration – it even copes with errors introduced during automated scanning of printed news. The new system scales easily and cost effectively.

As far as we know this is one of the first large-scale media monitoring systems built on open source, and a great example of search as a platform, which we’ve discussed before.
