News – Flax http://www.flax.co.uk The Open Source Search Specialists Thu, 10 Oct 2019 09:03:26 +0000 en-GB

Flax joins OpenSource Connections http://www.flax.co.uk/blog/2018/12/21/flax-joins-opensource-connections/ Fri, 21 Dec 2018 12:09:24 +0000

We have some news!

From February 1st 2019 Flax’s Managing Director Charlie Hull will be joining OpenSource Connections (OSC), Flax’s long-standing US partner, as a senior Managing Consultant. Charlie will manage a new UK division of OSC, which will also acquire some of Flax’s assets and brands. OSC are a highly regarded organisation in the world of search and relevance: they wrote the seminal book Relevant Search and run the popular Haystack relevance conference. Their clients include the US Patent Office, the Wikimedia Foundation and Under Armour, and their services include comprehensive training, Discovery engagements, Trusted Advisor consulting and expert implementation.

Lemur Consulting Ltd., which as most of you will know trades as Flax, will continue to operate and to complete current projects but will not be taking on any new business after January 2019. We will forward all new Flax enquiries to OSC, where Charlie will as ever be very happy to discuss requirements and how OSC’s expert team (which may include some familiar faces!) might help.

We are all very excited about this new development as it will create a larger team of independent search & relevance experts with a global reach. We fully expect to build on Flax’s 17-year history of providing high-quality search solutions as part of OSC. We intend to continue managing the London Lucene/Solr Meetup and running, attending and speaking at other events on search-related topics.

If you have any questions about the above please do contact us. Merry Christmas and best wishes for the New Year!

The post Flax joins OpenSource Connections appeared first on Flax.

Just the facts with Solr & Luwak http://www.flax.co.uk/blog/2017/01/04/just-facts-solr-luwak/ Wed, 04 Jan 2017 15:58:19 +0000

It won’t have escaped your notice that factchecking has been very much in the news recently due to last year’s political upheavals in both the US and UK and the suspected influence of fake news on voters. Both traditional and social media organisations are making efforts in this area; examples include Channel 4 and Facebook.

At our recent London Lucene/Solr Meetup, UK charity Full Fact spoke eloquently on the need for automated factchecking tools to help identify and correct stories that are demonstrably false. They’ve also published a great report on The State of Automated Factchecking which mentions both Apache Solr and our powerful stored query library Luwak as components of their platform. We’ve been helping Full Fact with their prototype factchecking tools for a while now, but during the Meetup I suggested we might run a hackday to develop these further.

Thus I’m very pleased to announce that Facebook have offered us a venue in London for the hackday on January 20th (register here). Many Solr developers, including several committers and PMC members, are signed up to attend already. We’ll use Full Fact’s report and their experiences of factchecking newspapers, TV’s Question Time and Hansard to design and build practical, useful tools and identify a future roadmap. We’ll aim to publish what we build as open source software which should also benefit factchecking organisations across the world.

If you’re concerned about the impact of fake news on the political process and want to help, join the Meetup and/or donate to Full Fact.

The post Just the facts with Solr & Luwak appeared first on Flax.

Talks: Replacing Autonomy IDOL with Solr, Elasticsearch for e-commerce & relevancy tuning http://www.flax.co.uk/blog/2015/11/04/talks-replacing-autonomy-idol-with-solr-elasticsearch-for-e-commerce-relevancy-tuning/ Wed, 04 Nov 2015 11:48:33 +0000

I’ll be speaking at several events over the next few weeks, in the UK and abroad. On the 19th of November I’ll be at the FIBEP World Media Intelligence Congress in Vienna, to talk about how we helped our client Infomedia migrate from a closed-source search engine (Autonomy IDOL and Verity) to a new platform based on Apache Lucene/Solr and our own Luwak stored search library. Infomedia are Denmark’s leading provider of media monitoring and analysis and wanted to future-proof their search platform: we’ll talk about how open source makes this possible, how we implemented stored search and handled highly complex queries, and how the new platform is scalable and flexible.

On the 25th I’ll be presenting at the London Elasticsearch Usergroup with our client Westcoast, who we have been helping with an Elasticsearch implementation. Westcoast are a B2B supplier of electronics and white goods with yearly revenues of over £1 billion, and we’ve helped them implement a powerful new search engine for their website. E-commerce is one sector where good search is an essential part of driving revenue.

Next, on the 26th I’ll be talking at one of my favourite events of the year, the British Computer Society Information Retrieval Specialist Group’s Search Solutions, on how we might improve how search engine relevance is tested. I’ll suggest a more formal process of test-based relevance tuning and show some useful tools. Our client NLA media access are also talking about the new Clipshare platform we built on Apache Lucene/Solr.

Do let me know if you’re attending and would like to chat – I’ll also be publishing slides and more information about the projects above soon.

The post Talks: Replacing Autonomy IDOL with Solr, Elasticsearch for e-commerce & relevancy tuning appeared first on Flax.

Rebrands and changing times for Elasticsearch http://www.flax.co.uk/blog/2015/03/11/rebrands-and-changing-times-for-elasticsearch/ Wed, 11 Mar 2015 14:42:12 +0000

I’ve always been careful to distinguish between Elasticsearch (the open source search server based on Lucene) and Elasticsearch (the company formed by the authors of the former) and it seems someone was listening, as the latter has now rebranded as simply Elastic. This was one of the big announcements during their first conference, the other being that after acquiring Norwegian company Found they are now offering a fully hosted Elasticsearch-as-a-service (congratulations to Alex and others at Found!). As Ben Kepes of Forbes writes, this may be something to do with ‘managing tensions within the ecosystem’ (I’ve written previously on how this ecosystem is expanding to include closed-source commercial products, which may make open source enthusiasts nervous) but it’s also an attempt to move away from ‘search’ into a wider area encompassing the buzzwords du jour of Big Data Analytics.

In any case, it’s clear that Elastic (the company, and that’s hopefully the last time I’ll have to write this!) have a clear strategy for the future – to provide many different commercial options for Elasticsearch and its related projects for as many different use cases as possible. Of course, you can still take the open source route, which we’re helping several clients with at present – I hope to be able to present a case study on this very soon.

Meanwhile, Martin White has identified how a recent book on Elasticsearch describes literally hundreds of features and that ‘The skill lies in knowing which to implement given the nature of the content and the type of query that will be used’ – effective search, as ever, remains a difficult thing to get right, no matter what technology option you choose.

UPDATE: It seems that www.elasticsearch.org, the website for the open source project, is now redirecting to the commercial company website. There is now a new GitHub page for open source code at https://github.com/elastic

The post Rebrands and changing times for Elasticsearch appeared first on Flax.

How not to predict the future of search http://www.flax.co.uk/blog/2014/05/15/how-not-to-predict-the-future-of-search/ Thu, 15 May 2014 09:10:16 +0000

I’ve just seen an article titled Enterprise Search: 14 Industry Experts Predict the Future of Search which presents a list of somewhat contradictory opinions. I’m afraid I have some serious issues with the experts chosen and the undeniably blinkered views some of them have presented.

Firstly, if you’re going to ask a set of experts to write about Enterprise Search, don’t choose an expert in SEO as part of your list. SEO is not Enterprise Search, in fact a lot of the time it isn’t anything at all (except snake oil) – it’s a way of attempting to game the algorithms of web search engines. Secondly, at least make some attempt to prevent your experts from just listing the capabilities of their own companies in their answers: in fact one ‘expert’ was actually a set of PR-friendly answers from a company rather than a person, including listing articles about their own software. The expert from Microsoft rather predictably failed to notice the impact of open source on the search market, before going on to put a positive spin on the raft of acquisitions of search companies over the last few years (and it’s certainly not all good, as a recent writedown has proved). Apparently the acquisition of specialist search companies by corporate behemoths will drive innovation – that is, unless that specialist knowledge vanishes into the behemoth’s Big Data strategy, never to be seen again. Woe betide the past customers that have to get used to a brand new pricing, availability and support plan as well.

Luckily it wasn’t all bad – there were some sensible viewpoints on the need for better interaction with the user, the rise of semantic analysis and how the rise of open source is driving out inefficiency in the market – but the article is absolutely peppered with buzzwords (Big Data being the most prevalent, of course) and contains some odd cliches: “I think a generation of people believes the computer should respond like HAL 9000” … didn’t HAL 9000 kill most of the crew and attempt to lock the survivor outside the airlock?

I’m pretty sure this isn’t a feature we want to replicate in an Enterprise Search system.

The post How not to predict the future of search appeared first on Flax.

ISKO UK – Taming the News Beast http://www.flax.co.uk/blog/2014/04/02/isko-uk-taming-the-news-beast/ Wed, 02 Apr 2014 11:55:31 +0000

I spent yesterday afternoon at UCL for ISKO UK’s event on Taming the News Beast – I’m not sure if we found out how to tame it but we certainly heard how to festoon it with metadata and lock it up in a nice secure ontology. There were around 90 people attending from news, content, technology and academic organisations, including quite a few young journalism students visiting London from Missouri.

The first talk was by Matt Shearer of BBC News Labs who described how they are working on automatically extracting entities from video/audio content (including verbatim transcripts, contributors using face/voice recognition, objects using audio/image recognition, topics, actions and non-verbal events including clapping). Their prototype ‘Juicer’ extractor currently works with around 680,000 source items and applies 5.7 million tags – which represents around 9 man-years for a manual tagger. They are using Stanford NLP and DBpedia heavily, as well as an internal BBC project ‘Mango’ – I hope that some of the software they are developing is eventually open sourced as, after all, this is a publicly funded broadcaster. His colleague Jeremy Tarling was next and described a News Storyline concept they had been working on as a new basis for the BBC News website (which apparently hasn’t changed much in 17 years, and still depends on a lot of manual tagging by journalists). The central concept of a storyline (e.g. ‘US spy scandal’) can form a knowledge graph, linked to events (‘Snowden leaves airport’), videos, ‘explainer’ stories, background items etc. Topics can be used to link storylines together. This was a fascinating idea, well explained and something other news organisations should certainly take note of.

Next was Rob Corrao of LAC Group describing how they had helped ABC News revolutionize their existing video library which contains over 2 million assets. They streamlined the digitization process, moved little-used analogue assets out of expensive physical storage, re-organised teams and shift patterns and created a portal application to ease access to the new ‘video library as a service’. There was a focus on deep reviews of existing behaviour and a pragmatic approach to what did and didn’t need to be digitized. This was a talk more about process and management than technology but the numbers were impressive: at the end of the project they were handling twice the volume with half the people.

Ian Roberts from the University of Sheffield then described AnnoMarket, a cloud-based market platform for text analytics, which wraps the rather over-complex open source GATE project in an API with easy scalability. As they have focused on precision over recall, AnnoMarket beats other cloud-based NLP services such as OpenCalais and TextRazor in terms of accuracy, and can process impressive volumes of documents (10 million in a few hours was quoted). They have developed custom pipelines for news, biomedical and Twitter content, with the news pipeline linked into the Press Association’s ontology (PA is a partner in AnnoMarket). For those wanting to carry out entity extraction and similar processes on large volumes of content at low cost AnnoMarket certainly looks attractive.
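To make the precision/recall trade-off concrete, here is a minimal sketch in Python. The entity sets and the two imaginary extractors ("cautious" and "eager") are invented for illustration and have nothing to do with AnnoMarket's actual output or API.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted entities
    measured against a hand-checked 'gold' set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold-standard entities for one news story.
gold = {"Chris Evans", "Press Association", "London", "Sheffield"}

# A cautious extractor only emits entities it is sure about...
cautious = {"Press Association", "London"}
# ...while an eager one finds everything, plus some noise.
eager = {"Chris Evans", "Press Association", "London", "Sheffield",
         "Evans", "Association"}

p_cautious, r_cautious = precision_recall(cautious, gold)
p_eager, r_eager = precision_recall(eager, gold)
print(f"cautious: precision={p_cautious:.2f} recall={r_cautious:.2f}")
print(f"eager:    precision={p_eager:.2f} recall={r_eager:.2f}")
```

The cautious extractor never tags anything wrongly but misses half the entities; the eager one misses nothing but a third of its tags are wrong. Favouring precision means accepting the first kind of behaviour.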

Next was Pete Sowerbutts of PA on the prototype interface he had helped develop for tagging all of PA’s 3000 daily news stories with entity information. I hadn’t known how influential PA is in the UK news sector – apparently 30% of all UK news is a direct copy of a PA feed and they estimate 70% is influenced by PA’s content. The UI showed how entities that have been automatically extracted can be easily confirmed by PA’s staff, allowing for confirmation that the right entity is being used (the example being Chris Evans, who could be a UK MP, a television personality or an American actor). One would assume the extractor produces some kind of confidence measure, which raises the question of whether every single entity must be manually confirmed – but then again, PA must retain their reputation for high quality.
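As a purely hypothetical sketch of that idea (I don't know how PA's extractor actually works, and the `Entity` class, the `triage` function, the scores and the 0.9 threshold are all assumptions of mine), a confidence score could be used to send only the doubtful entities to a human:

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str          # the string found in the story
    candidate: str     # the ontology entry the extractor picked (hypothetical)
    confidence: float  # assumed score between 0 and 1


def triage(entities, threshold=0.9):
    """Split entities into those accepted automatically and those
    queued for manual confirmation, based on an assumed threshold."""
    auto_accepted, needs_review = [], []
    for entity in entities:
        if entity.confidence >= threshold:
            auto_accepted.append(entity)
        else:
            needs_review.append(entity)
    return auto_accepted, needs_review


entities = [
    Entity("Chris Evans", "Chris Evans (television presenter)", 0.55),
    Entity("Press Association", "Press Association (news agency)", 0.98),
]
auto_accepted, needs_review = triage(entities)
print("accepted:", [e.text for e in auto_accepted])
print("review:  ", [e.text for e in needs_review])
```

Where to set the threshold is the real question: too low and wrong entities slip through unchecked, too high and the staff are back to confirming everything by hand.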

The event finished with a brief open discussion featuring some of the speakers on an informal panel, followed by networking over drinks and snacks. Thanks to all at ISKO especially Helen Lippell for organising what proved to be a very interesting day.

The post ISKO UK – Taming the News Beast appeared first on Flax.

Autonomy & HP – a technology viewpoint http://www.flax.co.uk/blog/2012/11/21/autonomy-hp-a-technology-viewpoint/ Wed, 21 Nov 2012 10:37:36 +0000

I’m not going to comment on the various financial aspects of the recent news about HP’s write-down of the value of its Autonomy acquisition – others are able to do this far better than me – but I would urge anyone interested to re-read the documents Oracle released earlier this year. However, I am going to write about the IDOL technology itself (I’d also recommend Tony Byrne’s excellent post).

Autonomy’s ability to market its technology has never been in doubt: aggressive and fearless, it painted IDOL as unique and magical, able to understand the meaning of data in multiple forms. However, this has never been true; computers simply don’t understand ‘meaning’ like we do. IDOL’s foundation was just a search engine using Bayesian probabilistic ranking. Although most other search technologies use the vector space model, there are a few other examples of this approach. Muscat, a company founded a few years before and literally across the hall from Autonomy in a Cambridge incubator, grew to a £30m business with customers including Fujitsu and the Daily Telegraph newspaper. Sadly Muscat was a casualty of the dot-com years but it is where the founders of Flax first met and worked together on a project to build a half-billion-page web search engine.

Another even less well-known example is OmniQ, eventually acquired and subsequently shelved by Sybase. Digging in the archives reveals some familiar-sounding phrases such as “automatically capture and retrieve information based on concepts”.

Originally developed at Muscat, the open source library Xapian also uses Bayesian ranking and we’ve used this successfully to build systems for the Financial Times, Newspaper Licensing Agency and Tait Electronics. Recently, Apache Lucene/Solr version 4.0 has introduced the idea of ‘pluggable’ ranking models, with one option being the Bayesian BM25. It’s important to remember though that Bayesian ranking is only one way to approach a search problem and is, in many cases, simply unnecessary.
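For the curious, here is a minimal sketch of BM25 scoring in plain Python. It is a textbook form of the formula, not the code used by Xapian, Lucene/Solr or IDOL; the function name `bm25_scores`, the toy documents and the parameter values (`k1=1.2`, `b=0.75`, which are commonly used defaults) are my own choices for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms carry more weight; the +1 inside the log is a
            # common variant that keeps the weight positive.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5))
            # Repeated terms give diminishing returns, and longer
            # documents are penalised relative to the average length.
            norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores


docs = [
    "open source search engine".split(),
    "bayesian ranking for search".split(),
    "newspaper licensing and media monitoring".split(),
]
scores = bm25_scores(["bayesian", "search"], docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {' '.join(doc)}")
```

The second document matches both query terms and ranks first, the first matches only the more common term, and the third scores zero. The whole thing is a few lines of arithmetic over term counts.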

It certainly isn’t magic.

The post Autonomy & HP – a technology viewpoint appeared first on Flax.

Searching for (and finding) open source in the UK Government http://www.flax.co.uk/blog/2012/02/17/searching-and-finding-open-source-in-uk-government/ Fri, 17 Feb 2012 10:30:46 +0000

There have been some very encouraging noises recently about increased use of open source software by the UK Government: for example we’ve seen the creation of an Open Source Procurement Toolkit by the Cabinet Office, which lists Xapian and Apache Lucene/Solr as alternatives to the usual closed source options. The CESG, the “UK Government’s National Technical Authority for Information Assurance”, has clarified its position on open source software, which has led to the Cabinet Office dispelling some of the old myths about security and open source. We know that the Cabinet Office’s ‘skunkworks’, the Government Digital Service, are using Solr for several of their projects. Francis Maude MP was recently in the USA with some of the GDS team and visited amongst others our US partners Lucid Imagination.

The British Computer Society have helped organise a series of Awareness Events for civil servants and I’m glad to be speaking at the first of these next Tuesday 21st February on open source search – hopefully this will further increase the momentum and make it even more clear that a modern Government needs to consider this modern, flexible and economically scalable approach to software.

The post Searching for (and finding) open source in the UK Government appeared first on Flax.

Mixed reactions as HP buys Autonomy http://www.flax.co.uk/blog/2011/08/19/mixed-reactions-as-hp-buys-autonomy/ Fri, 19 Aug 2011 10:43:09 +0000

The blogotweetosphere has been positively buzzing since last night’s announcement that Hewlett Packard will be buying Autonomy for £7.1bn, while divesting itself of its PC business. Many commentators have put a positive spin on this, pointing to Autonomy’s meteoric rise from a small office in Cambridge to the behemoth it is today. It’s undoubtedly good news for Autonomy’s shareholders. Dave Kellogg correctly identifies Autonomy as a “finance company dressed in (meaning-based) technology company clothing” with a “happy ending”.

However the reaction isn’t all positive – the FT implies this deal is at the “lunatic end of the valuation spectrum”. Law Technology News says “Autonomy’s e-discovery revenue stream is high-end but unsustainable” and quotes users of the system with problems: “We had a lot of issues with the applications crashing, the documents tending not to get checked in” … “[Autonomy sales staff] were pricey, arrogant, and they couldn’t care less about us. … It cannot get any worse.”

HP will have to work hard to integrate Autonomy into both its corporate culture and software frameworks – a problem currently faced by Microsoft since its acquisition of FAST a short while ago. Stephen Arnold thinks this process will be “risky”. What it means for the rest of the search sector is harder to guess, although Martin White of Intranet Focus says this deal indicates HP can see a “future in search applications” and, interestingly, “A number of privately-held search vendors are probably working out what their valuation would be”.

My view is that this is just the latest in a series of huge shifts in the enterprise search market, partly spurred on by the rise of open source options and the gradual realisation that the huge license fees charged by some vendors may be unsustainable. I don’t think Autonomy will be the last company looking for a safe haven in the years to come.

The post Mixed reactions as HP buys Autonomy appeared first on Flax.

UK Government IT – a closed shop to SMEs and OSS? http://www.flax.co.uk/blog/2011/03/18/government-it-a-closed-shop-to-smes-and-oss/ Fri, 18 Mar 2011 11:42:32 +0000

There’s a lot of buzz currently around the UK government and its approach to IT projects (which has been historically rather poor in terms of delivery, schedules and cost). We’ve written before about an Action Plan that recommends open source and open standards, but it seems that actually implementing these is more of a problem, especially when you consider (flexible and more agile) smaller suppliers such as ourselves who may not even get a chance to compete for the business.

There’s an inquiry running currently that promises to look at this, and they have invited various people to put their views across. Unfortunately with one laudable exception these people were from (or mainly represent) very large IT companies who already supply the government and whose interest lies in maintaining the status quo.

As Mark Taylor of Sirius has already pointed out, this situation isn’t going to change until government procurement itself becomes an open process, so that we can all see how much could be wasted on outdated project management methods and overpriced closed source software.

The post UK Government IT – a closed shop to SMEs and OSS? appeared first on Flax.
