Posts Tagged ‘partner’

BioSolr – building better search for bioinformatics

The entire Flax technical team spent the day at the European Bioinformatics Institute yesterday discussing an exciting new project we’ll begin this coming September, BioSolr. Funded by the BBSRC this collaboration between Flax and the EBI aims “to significantly advance the state of the art with regard to indexing and querying biomedical data with freely available open source software”. Here we are with Dr. Sameer Valenkar and Gautier Koscielny of the EBI.

The EBI, located on the Wellcome Trust Genome Campus near Cambridge, maintains the world’s most comprehensive range of freely available and up-to-date molecular databases and is already using Apache Lucene/Solr extensively, for example in the Protein Databank in Europe which indexes over 100,000 items derived from experimental research – but this is just one of the many complex collections they provide. The BioSolr project will run for a full year, during which members of the Flax team will work directly with the EBI team to run workshops, demonstrate and document best practises in search application design, create, improve and extend open source software and learn a lot about the specialist search requirements of bioinformatics. This is a fantastic opportunity for us to push the boundaries of what is possible with Solr and associated software, to work with some incredibly rich data and to do all of this in the open to encourage collaboration from the wider software and biology communities.

We’ll be creating various open resources (software repositories, Wikis, blogs) to support the project later this year – do let us know if you would like to be involved and we will keep you informed.

As Hadoop gains, does Lucene benefit?

The last few weeks have seen a rush of investment in companies that offer Hadoop-powered Big Data platforms – the most recent being Intel’s investment in Cloudera, but Hortonworks has also snorted up $100m.

Gartner correctly explains that Hadoop isn’t just one project, but an ecosystem comprising an increasing number of open source projects (and some closed source distributions and add-ons). Once you’ve got your Big Data in a HDFS-shaped pile, there are many ways to make sense of it – and one of those is a search engine, so there’s been a lot of work recently trying to add Lucene-powered search engines such as Apache Solr and Elasticsearch into the mix. There’s also been some interesting partnerships.

I’m thus wondering whether this could signal a significant boost to the development of these search projects: there are already Lucene/Solr committers working at Hadoop-flavoured companies who have been working on distributed search and other improvements to scalability. Let’s hope some of the investment cash goes to search!

Solr and the changing landscape of search

This morning I was told about the launch of a new US-based search company, Heliosearch, founded by the creator of Apache Solr, Yonik Seeley. It seems the landscape of open source search and in particular Solr is changing again – Heliosearch are planning their own ‘certified’ distribution of Solr plus a raft of support, consulting and services. In the meantime, the company Yonik co-founded (and our partners) LucidWorks are recently launched an ‘App Store’ for search, the Solr Marketplace, offering add-ons to the core engine from both themselves and others.

What we’re seeing here is the further growth of an ecosystem based around what has almost become the default choice for new and migrating search applications. Some clients will want a packaged distribution of Solr, some will be happy to download the source from Apache, some will need help getting started and some will just need help when things get complicated, or support for a running application. We’ve seen all of these requirements and more in the last year.

Next week the largest conference on open source search, Lucene Revolution is held in Dublin, and four of the Flax team are attending. Do let us know if you’d like to meet up – I don’t think there’s going to be a lack of things to talk about!

Tags: , , , ,

Posted in News, events

October 29th, 2013

No Comments »

Finding the elephant in the room: open source search & Hadoop grow closer together

I’ve been lucky enough to attend two talks on Hadoop in the last few weeks which has made me take a closer look at this technology. In case you didn’t know, Hadoop is an Apache top level open source project comprising a framework for distributed computing and storage, originally created by Doug Cutting (also the creator of Apache Lucene) while at Yahoo! in 2005. Distributed computing is carried out using MapReduce (roughly speaking, the ‘map’ bit involves splitting a processing task up into chunks and distributing these among various processing nodes, the ‘reduce’ bit brings all the results together again) and the storage uses the Hadoop Distributed File System (HDFS). There are other parts of Hadoop including a database (HBase), data warehouse with SQL-like language (Hive), scripting language (Pig) and more.

Those I’ve spoken to who have attempted to build applications on Hadoop have said that it’s very much a kit of parts rather than an integrated platform, so not that easy to get started with – which has led to the emergence of various vendors providing ‘curated’ distributions and support, much as Lucidworks does for Apache Lucene/Solr. Cloudera, Hortonworks, and MapR are just some of the best-known of these vendors. With everyone jumping on the BigData bandwagon these days some of these vendors have attracted significant interest and funding.

As you might expect full-text search is often required for these distributed systems and there have been various attempts to bring Hadoop and search closer together. Hortonworks support integration with Elasticsearch, although this currently appears to mean that you can use Hive or Pig to move data from Hadoop on or off a separate Elasticsearch cluster, rather than the search engine running on the cluster itself. Cloudera’s integration of Hadoop with Solr appears to be tighter, with Solr storing its indexes on HDFS directly (perhaps not surprising considering Lucene/Solr committer Mark Miller, who is responsible for most recent SolrCloud development, works for Cloudera). Cloudera even has its own data conditioning framework Flume (yes, it seems we need yet another data conditioning/pipelining solution!) and allows for distributed indexing. MapR have partnered with LucidWorks and integrated LucidWorks Search into their distribution. All these vendors are heavy contributors to Hadoop of course and most also contribute to Lucene/Solr or Elasticsearch.

Since Hadoop has been linked with search from the beginning one can hope that these integration efforts will continue – applications that require distributed search are becoming increasingly common and Hadoop, despite its nature as a kit of parts requiring assembly, is a good foundation to build on.

New Year predictions: further search storms ahead!

2012 has been a fascinating and stormy year for those of us in the search business. We’ve seen a raft of further acquisitions of commercial closed source search companies by bigger players, some convinced that what used to be called Enterprise Search is now a solution to Big Data (like Stephen Arnold we wonder what will succeed Big Data as the next marketing term – I love his phrase “In a quest for revenue, the vendors will wrap basic ideas in a cloud of unknowing”). One acquisition hasn’t gone so smoothly: Autonomy, bought by HP for a price that no-one in the search business thought was remotely sensible, has been accused of being oversold vapourware: this is a story that will continue to develop in 2013. If you want a great overview of the current market read Martin White’s latest research note.

Here in the slightly calmer waters of open source search, we’ve seen a huge rise in enquiries from often blue-chip companies, no longer needing persuasion that open source is a serious contender for even the largest search and content projects. Often these companies have considered large commercial solutions but are put off by both the price and high-pressure marketing tactics – in a world of reduced budgets you simply can’t sell magic beans for a pile of gold. We’ve also seen increased interest in related technologies such as machine learning and automatic categorisation – search really isn’t just about search any more.

At Flax we’re busier than we have ever been and we’re expected the trend to continue. We’re looking forward to running more Cambridge Search Meetups, visiting and helping organise conferences such as Enterprise Search Europe and Lucene Revolution, building our network of carefully chosen partners and of course working on exciting and cutting-edge development projects.

As the storms in our sector continue to rage overhead we’ll simply be getting on with what we do best, building effective search.

Tags: , , , , ,

Posted in Business, News

January 3rd, 2013

No Comments »

Trading-up to open source – a safer route to effective search

It hasn’t taken long for some of Autonomy’s rivals to attempt to capitalise on the recent bad PR around HP’s acquisition – OpenText has offered a ’software trade-in’, Recommind has offered a ‘trade-up’ and Swiss company RSD has offered a free license for their governance software to Autonomy customers. No word yet from Exalead, Oracle (Endeca), Microsoft (FAST) or any of the other big commercial search companies but I’m sure their salespeople are making the most of the situation.

Migrating a search engine from one technology to another is rarely trouble-free: data must be re-indexed, query architectures rewritten, integration with external systems re-done, relevancy checked…however with sufficient forethought it can be done successfully. We’ve just helped one client migrate from a commercial engine to Apache Solr in a matter of weeks: although at first glance Solr didn’t seem to support all of the features the commercial engine provided, it proved possible to simulate them using multiple queries and with careful design for scalability, query performance is comparable.

Choosing one closed source engine to replace another doesn’t remove the risk that future corporate mergers & acquisitions will cause exactly the same lack of confidence that is no doubt affecting Autonomy customers – or huge increases in license fees, a drop in the quality of available support or the end of the product line altogether – and we’ve heard of all of these effects over the last few years. Moving to an open source search engine gives you freedom and control of the future of the technology your business is reliant upon, with a wealth of options for migration assistance, development and support.

So here’s our offer – we’d be happy to talk, for free (by phone or face-to-face for customers within reach of our Cambridge offices), to any Autonomy customers considering migration and to help them consider the open source options (some of these even have the Bayesian, probabilistic search features Autonomy IDOL provides) – and together with our partners we can also provide a level of ongoing support comparable to any closed source vendor. We don’t have salespeople, we don’t have a product to sell you and you’ll be talking directly to experts with decades of experience implementing search – and there’s no obligation to take things any further. We’d simply like to offer an alternative (and we believe, safer) route to effective search.

Tags: , , , , , , ,

Posted in News

December 5th, 2012

No Comments »

Flax partners with open source support specialists Sirius Corporation

We’re very happy to announce we’ve partnered with Sirius Corporation. Sirius are the leading U.K. provider of managed services, support and training for open source software with an impressive and growing list of clients including Canonical, Médecins Sans Frontières and the Met Office. We’ve recently carried out a major project for which Sirius will be providing ongoing support on a 24/7 SLA basis and we’re looking forward to further collaboration with this energetic, highly professional and skilled company.

We’re also happy to announce that Flax and Sirius will be co-hosting a free, half day event on Open Source Enterprise Search on Friday 20th July from 9.30 a.m. Held at the Sirius Corporation offices in Weybridge, Surrey, this will be an opportunity to find out how open source search can directly benefit your business. Whether you need search over documents on an intranet or database, pages on a website or more specialised applications such as media monitoring, taxonomy and classification, open source technologies can offer an economical and highly scalable route to success. The event will feature focussed briefings, networking and discussion with leading experts in the field. It’s completely free to attend and breakfast, refreshments and a riverside barbeque lunch will be provided.

You can register and find out more online.

Tags: , , , ,

Posted in Business, News

June 28th, 2012

No Comments »

Strange bedfellows? The rise of cloud based search

Last night our US partners Lucid Imagination announced that LucidWorks, their packaged and supported version of Apache Lucene/Solr, is available on Microsoft’s Azure cloud computing service. It seems like only a few weeks since Amazon announced their own CloudSearch system and no doubt other ’search as a service’ providers are waiting in the wings (we’re going to need a new acronym as SaaS is already taken!). At first the combination of a search platform based on open source Java code with Microsoft hosting might seem strange, and it raises some interesting questions about the future of Microsoft’s own FAST Search technology – is this final proof that FAST will only ever be part of Sharepoint and never a standalone product? However with search technology becoming more and more of a commodity this is a great option for customers looking for search over relatively small numbers of documents.

Lucid’s offering is considerably more flexible and full-featured than Amazon’s, which we hear is pretty basic with a lack of standard search features like contextual snippets and a number of bugs in the client software. You can see the latter in action at Runar Buvik’s excellent OpenTestSearch website. With prices for the Lucid service ranging from free for small indexes, this is certainly an option worth considering.

The year open source search got serious

It’s been an interesting and busy twelve months here at Flax – we’ve worked on some fantastic customer projects, spoken at conferences at home and abroad and made some great alliances and partnerships. We are talking to more people than ever before about the advantages of open source search and we’ve even started a local Meetup group.

This has been the year when open source search moved out of the shadows and became a force to reckon with – whether handling billions of queries or millions of customers, powering innovative new APIs for open content from forward-looking media companies or simply making it easier for search applications to be developed. Commercial support is now available to rival anything offered by the closed source world and there are now fully packaged solutions built on open source. In some sectors open source may even become the default choice (see what IDC said about the embedded/OEM market).

There’s still significant change to come in the search sector – I expect a few vendors will be in trouble by this time next year as they realise their business models (often built on per-document charges) are out-of-date, and we might also see further acquisitions by the usual behemoths. All this leads to reduced choice and increased costs for customers, and this is where open source can help – you can build your search solution in-house, or engage companies like ours to help, but you’re no longer locked in to a vendor’s roadmap and shackled to their business plan (or the consequences of its failure!).

I’ll leave the final word to Matt Asay of Canonical, who says: “Open source is how we do business 10 years into this new millennium.”

More about LucidWorks Enterprise

If you’re considering a Lucene/Solr powered search solution, you may be interested in LucidWorks Enterprise, produced by our partners Lucid Imagination. They’ve taken Lucene/Solr and added a powerful admin GUI, ReST API, web spiders, file crawlers, database connectors, alerts, a clickthrough framework and more. All this comes with a range of excellent support options backed by the experts at Lucid.

If you’d like to know more read this downloadable PDF or contact us for more information and a demo.

Tags: , , ,

Posted in Reference

November 5th, 2010

No Comments »