Archive for the ‘Business’ Category
I spent last night at a networking event organised by the Business Leaders Network on the subject of Open Source Business Models – this isn’t the usual sort of event I attend, being held in a very posh law firm’s offices overlooking the Thames and with some fellow attendees from venture capital firms and investment banks. Although the panel included speakers from Canonical, Rackspace and the Raspberry Pi Foundation (the gently amusing Jack Lang, a Cambridge luminary whom I could have happily listened to for the full hour) the theme was generally non-technical.
Questions from the floor (and via Twitter) showed that many outside the technical sector (and probably a few within it) are still bemused at how one can build a thriving business on open source, when the panel admitted that it can involve making your intellectual property available to your competitors, giving your product away for nothing and investing heavily in community building. One of the most interesting responses from the panel indicated that an open source entrant to an existing market can shrink that market by 40-50%. A venture capitalist I spoke to afterwards couldn’t understand why this can be a positive thing. However, if a market is dominated by big players selling overpriced solutions, some disruptive deflation can re-shape the market considerably. This is certainly what we’ve seen in the search sector recently, and investment in the right place and time can still reap considerable rewards (consider Elasticsearch’s recent funding).
The panel also made the point that a key part of open source success is investment in people – both within a business and in the wider community. Another question about what an open source business is actually selling prompted a range of answers: a brand, peace of mind, happiness, experience and a platform. It was clear that the discussion could have continued for a lot longer as the audience were keen to hear more, and the BLN may thus be running further open source themed events – the appetite for knowledge about open source business models outside the technical community is large.
Thanks to Mark Littlewood for organising such an interesting evening and particular thanks for the free Raspberry Pi – we have a cunning plan about what to do with it so watch this space!
The most well known open source search engine, Apache Lucene/Solr, has a rival in Elasticsearch, also based on Apache Lucene. Or maybe it doesn’t. I’m not convinced that there’s an actual battle going on here, above and beyond the fact that the commercial companies formed to support each technology (Lucidworks and Elasticsearch [the company]) are obviously competitors. Let’s look at the evidence:
- Elasticsearch contains (by some measures) 64 years of effort, Solr only 55 years…a point to Elasticsearch!
- Elasticsearch commits are 31% down on last year, Solr commits are 85% up…a point to Solr!
- There are more books about Solr than Elasticsearch…a point to Solr!
- Elasticsearch, sorry elasticsearch, has a cool lower case logo and fancy website…a point to Elasticsearch!
This is of course before we get to any actual technical differences in terms of performance, scalability, ease-of-use etc. which are probably a lot more important than the list above. There are vocal critics and supporters of each project on Twitter and other media, but the great thing in our view is that there is a choice of two such excellent search technologies, both open source, so for real world applications one can try both at little cost and choose whichever is most appropriate (there are even proven migration routes between the two – we’ve helped one client with this process).
2012 has been a fascinating and stormy year for those of us in the search business. We’ve seen a raft of further acquisitions of commercial closed source search companies by bigger players, some convinced that what used to be called Enterprise Search is now a solution to Big Data (like Stephen Arnold we wonder what will succeed Big Data as the next marketing term – I love his phrase “In a quest for revenue, the vendors will wrap basic ideas in a cloud of unknowing”). One acquisition hasn’t gone so smoothly: Autonomy, bought by HP for a price that no-one in the search business thought was remotely sensible, has been accused of being oversold vapourware: this is a story that will continue to develop in 2013. If you want a great overview of the current market read Martin White’s latest research note.
Here in the slightly calmer waters of open source search, we’ve seen a huge rise in enquiries from often blue-chip companies, no longer needing persuasion that open source is a serious contender for even the largest search and content projects. Often these companies have considered large commercial solutions but are put off by both the price and high-pressure marketing tactics – in a world of reduced budgets you simply can’t sell magic beans for a pile of gold. We’ve also seen increased interest in related technologies such as machine learning and automatic categorisation – search really isn’t just about search any more.
At Flax we’re busier than we have ever been and we’re expecting the trend to continue. We’re looking forward to running more Cambridge Search Meetups, visiting and helping organise conferences such as Enterprise Search Europe and Lucene Revolution, building our network of carefully chosen partners and of course working on exciting and cutting-edge development projects.
As the storms in our sector continue to rage overhead we’ll simply be getting on with what we do best, building effective search.
I’m not going to comment on the various financial aspects of the recent news about HP’s write-down of the value of its Autonomy acquisition – others are able to do this far better than me – but I would urge anyone interested to re-read the documents Oracle released earlier this year. However, I am going to write about the IDOL technology itself (I’d also recommend Tony Byrne’s excellent post).
Autonomy’s ability to market its technology has never been in doubt: aggressive and fearless, it painted IDOL as unique and magical, able to understand the meaning of data in multiple forms. However, this has never been true; computers simply don’t understand ‘meaning’ like we do. IDOL’s foundation was just a search engine using Bayesian probabilistic ranking; although most other search technologies use the vector space model there are a few other examples of this approach: Muscat, a company founded a few years before and literally across the hall from Autonomy in a Cambridge incubator, grew to a £30m business with customers including Fujitsu and the Daily Telegraph newspaper. Sadly Muscat was a casualty of the dot-com years but it is where the founders of Flax first met and worked together on a project to build a half-billion-page web search engine.
Another even less well-known example is OmniQ, eventually acquired and subsequently shelved by Sybase. Digging in the archives reveals some familiar-sounding phrases such as “automatically capture and retrieve information based on concepts”.
Originally developed at Muscat, the open source library Xapian also uses Bayesian ranking and we’ve used this successfully to build systems for the Financial Times, Newspaper Licensing Agency and Tait Electronics. Recently, Apache Lucene/Solr version 4.0 has introduced the idea of ‘pluggable’ ranking models, with one option being the Bayesian BM25. It’s important to remember though that Bayesian ranking is only one way to approach a search problem and in many cases, simply unnecessary.
It certainly isn’t magic.
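To show how little magic is involved, here is a minimal sketch of BM25-style probabilistic ranking in plain Python. It is only an illustration: it uses one common textbook form of the Okapi BM25 formula, the function name `bm25_scores`, the toy documents and the `k1` and `b` values are our own assumptions, and it is not a description of how IDOL, Xapian or Lucene/Solr implement their ranking.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Return one BM25-style score per document (hypothetical helper).

    docs is a list of documents, each already split into a list of terms.
    k1 and b are commonly used tuning values, assumed here for illustration.
    """
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in docs) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms carry more weight than common ones.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates, and long documents are penalised.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy collection, invented purely for this example.
docs = [
    "open source search engine".split(),
    "enterprise search is not magic".split(),
    "raspberry pi for everyone".split(),
]
scores = bm25_scores(["search", "magic"], docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The whole thing is counting words and applying a little arithmetic: the second document ranks first because it matches both query terms, and the third scores zero because it matches neither.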
We’re very happy to announce we’ve partnered with Sirius Corporation. Sirius are the leading U.K. provider of managed services, support and training for open source software with an impressive and growing list of clients including Canonical, Médecins Sans Frontières and the Met Office. We’ve recently carried out a major project for which Sirius will be providing ongoing support on a 24/7 SLA basis and we’re looking forward to further collaboration with this energetic, highly professional and skilled company.
We’re also happy to announce that Flax and Sirius will be co-hosting a free, half day event on Open Source Enterprise Search on Friday 20th July from 9.30 a.m. Held at the Sirius Corporation offices in Weybridge, Surrey, this will be an opportunity to find out how open source search can directly benefit your business. Whether you need search over documents on an intranet or database, pages on a website or more specialised applications such as media monitoring, taxonomy and classification, open source technologies can offer an economical and highly scalable route to success. The event will feature focussed briefings, networking and discussion with leading experts in the field. It’s completely free to attend and breakfast, refreshments and a riverside barbeque lunch will be provided.
You can register and find out more online.
There’s been a recent flurry of activity from search vendors (and those larger companies that have been buying them) around the theme of Big Data, which has become the fashionable marketing term for a sheaf of technologies including search, machine learning, MapReduce and scalability in general. If anyone impertinently asks why company X bought company Y the answer seems to be ‘because they have capability in Big Data and our customers will need this’.
Search companies like ours have been working with large datasets since the beginning – back in 1999/2000 the founders of Flax led a team to build a half-billion-page Web search engine, which as I recall ran on a cluster of 30 or so servers. Since then we’ve worked with other collections of tens or hundreds of millions of items. Even a relatively small company can have a few million files on their intranet, if you count all those emails, customer records and Powerpoint presentations. So yes, you could say we can do Big Data – we certainly know how to design and build systems that scale.
However it makes me nervous when a set of technologies that could (in theory) be used together are simply lumped together for marketing purposes as the Next Big Thing. The devil is as always in the detail (and the integration) and it’s important to remember that just because you can fit all your data into a system doesn’t mean that system will help you make any sense of it. A recent term for unstructured data (which of course we search developers have been working with for decades) is Dark Data, which implies that it is mysterious and hidden – but that doesn’t mean it has any actual value. Those considering a Big Data project should be aware that in any computer system GIGO is still an issue.
The blogotweetosphere has been positively buzzing since last night’s announcement that Hewlett Packard will be buying Autonomy for £7.1bn, while divesting itself of its PC business. Many commentators have put a positive spin on this, pointing to Autonomy’s meteoric rise from a small office in Cambridge to the behemoth it is today. It’s undoubtedly good news for Autonomy’s shareholders. Dave Kellogg correctly identifies Autonomy as a “finance company dressed in (meaning-based) technology company clothing” with a “happy ending”.
However the reaction isn’t all positive – the FT implies this deal is at the “lunatic end of the valuation spectrum”. Law Technology News says “Autonomy’s e-discovery revenue stream is high-end but unsustainable” and quotes users of the system with problems: “We had a lot of issues with the applications crashing, the documents tending not to get checked in” … “[Autonomy sales staff] were pricey, arrogant, and they couldn’t care less about us. … It cannot get any worse.”
HP will have to work hard to integrate Autonomy into both its corporate culture and software frameworks – a problem currently faced by Microsoft since its acquisition of FAST a short while ago. Stephen Arnold thinks this process will be “risky”. What it means for the rest of the search sector is harder to guess, although Martin White of Intranet Focus says this deal indicates HP can see a “future in search applications” and, interestingly, “A number of privately-held search vendors are probably working out what their valuation would be”.
My view is that this is just the latest in a series of huge shifts in the enterprise search market, partly spurred on by the rise of open source options and the gradual realisation that the huge license fees charged by some vendors may be unsustainable. I don’t think Autonomy will be the last company looking for a safe haven in the years to come.
This week I was passed a link to a European Commission report on the Enterprise Search market, which I’ve just finished ploughing through (it’s 123 pages and not exactly light reading). It provides an overview of the history of the market and some current trends, but sadly misses out almost completely the rapidly growing open source sector. The authors say “…open source solutions have been disregarded because they do not seem yet to be a real alternative for company use…” – a point of view both I and our satisfied clients would disagree with. The report does at least acknowledge that “open source components are frequently used and integrated in some commercial solutions”.
However there are some very interesting numbers in the latter part of the report. For example, we hear that an Exalead customer, the automotive logistics specialist Gefco, paid 700,000 Euros for the solution built for them to track around 100,000 events a day regarding 1 million vehicles. Appendix 2 has a list of various search vendors and associated costs: for example “The average selling price for the [Autonomy] IDOL tool is $375,000” and “The price for the Oracle Secure Enterprise Search is $34,500 per processor and $70 per referenced user (with a minimum of 100 users).”
I would question whether these prices are sustainable given that alternative solutions based on proven, scalable open source software are now available at a fraction of the cost. Perhaps the authors of the report should have considered more deeply how this might impact the enterprise search market.
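To get a feel for how the Oracle figures quoted from the report add up, here is a rough sketch in Python. It assumes our own reading of the pricing (a flat fee per processor plus a per-user fee, with at least 100 users billed); the function name and the four-processor, 50-user deployment are hypothetical and do not come from the report.

```python
def oracle_ses_licence_cost(processors, users,
                            per_processor=34_500, per_user=70, min_users=100):
    """Estimate licence cost in US dollars from the prices quoted in the report.

    Assumption: the 'minimum of 100 users' means at least 100 users are
    always billed, however few actually use the system.
    """
    billable_users = max(users, min_users)
    return processors * per_processor + billable_users * per_user


# Hypothetical small deployment: four processors, fifty users.
example_cost = oracle_ses_licence_cost(processors=4, users=50)
print(f"${example_cost:,}")
```

On those assumptions even a small installation comes to well over $100,000 before any hardware, integration or support costs are considered.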
We’re very happy to announce that we’ve been selected as an Authorized Partner by Lucid Imagination, the commercial company for Lucene and Solr. You can read the press release as a PDF here.
Apache Lucene and Solr, available as open source software from the Apache Software Foundation, are powerful, scalable, reliable and fully-featured search technologies. Solr is the Lucene Search Server, making it easy to build search applications for the enterprise.
With our long experience of customising, installing and supporting open source search engines, this partnership is a natural fit for us, and we’re excited by the opportunities it presents. In addition to our current offerings, Flax will now offer installation, integration and commercial support packages for Lucene and Solr, backed by Lucid Imagination.
We’ve now completely redesigned the Flax website – we hope you like it. We’ve tried to focus more on explaining exactly what we do and how the Flax open source search platform might be able to help your business.
Of course, there are sure to be teething problems – if you find anything that doesn’t work do let us know!