networking – Flax
http://www.flax.co.uk
The Open Source Search Specialists
Thu, 10 Oct 2019 09:03:26 +0000

London Lucene/Solr Usergroup – website search and indexing the cloud
http://www.flax.co.uk/blog/2015/09/11/london-lucenesolr-usergroup-website-search-and-indexing-the-cloud/
Fri, 11 Sep 2015 08:58:52 +0000

This week’s London Lucene/Solr Meetup was hosted by asset management company BlackRock, who also provided our first speakers. BlackRock manages an astonishing $4.7 trillion in assets (that’s more than the GDP of Germany) and operates 90 different websites with around 250,000 content items, so a good and accurate website search engine is essential. Although BlackRock use HP Autonomy’s content management system and IDOL search engine, the latter is hard to tune (‘not deterministic, and why it ranks the way it does can be mysterious’), so Ife Nkechukwu and Erica Sundberg have been investigating Apache Solr as an alternative: being open source and with powerful debugging features, Solr allows a complete understanding of why a particular result is scored and ranked as it is.
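For anyone who hasn’t tried Solr’s debugging, here is a minimal sketch of asking for a score explanation. The `debugQuery=true` request parameter is standard Solr; the host, the collection name (`website`) and the query text are assumptions made up for illustration and have nothing to do with BlackRock’s actual setup.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_debug_url(base_url, query, rows=5):
    """Build a Solr select URL that also asks for score explanations."""
    params = {
        "q": query,
        "rows": rows,
        "fl": "id,score",      # return each document's id and its score
        "debugQuery": "true",  # ask Solr to include its scoring explanation
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical local Solr instance and collection name.
url = build_debug_url("http://localhost:8983/solr/website", "index funds")
print(url)
```

With this parameter set, the response should carry an extra debug section alongside the results, which is where the per-document explanation of the score lives.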

Starting with this great video (it’s from Google not BlackRock, but amusing and worth a look), Ife and Erica gave an engaging and clear presentation of their journey with Solr: how they explored the various options for crawling (Nutch and Heritrix were mentioned), how Analyzers are used to condition content for indexing and how Solr scoring and ranking are actually calculated. This was one of the best ‘how to get started with Solr’ presentations I have seen and I was also very pleased to hear Ife say ‘you can’t just build search and forget it – you have to tune search like an instrument’ – entirely consistent with our own experience.
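To give a flavour of what ‘conditioning content’ means, here is a toy analysis chain in plain Python. It is not Lucene’s or Solr’s Analyzer API, just an illustration of the usual idea of tokenising, lowercasing and dropping stopwords; the stopword list and function names are made up for this sketch.

```python
import re

STOPWORDS = {"a", "an", "and", "of", "the", "to"}  # tiny illustrative list

def tokenize(text):
    """Split text into runs of letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def analyze(text):
    """Tokenize, lowercase, then drop stopwords, in that order."""
    tokens = [t.lower() for t in tokenize(text)]
    return [t for t in tokens if t not in STOPWORDS]

print(analyze("The Future of Index Funds"))  # ['future', 'index', 'funds']
```

The same chain has to be applied to the query text as well as to the content, otherwise a search for ‘Funds’ would never meet the indexed term ‘funds’.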

After a quick pizza break, Jim Liddle of Storage Made Easy was next up. Jim’s company provides appliances that connect to a myriad of cloud storage systems and provide a number of services (collaboration, sharing, governance, search) accessible via any computing or mobile device. Jim told us how they’d integrated Solr into their system to provide deep content search and filtering. Interestingly, Storage Made Easy chose Solr over Elasticsearch because they are ‘not quite sure where Elastic will end up in terms of commercials’ – even though Jim worked with Shay Banon (creator of Elasticsearch) at Gigaspaces. You can see Jim’s slides here where he explains how the hardest task was indexing permissions data. I was particularly interested in the ‘visual query builder’ they had developed for clients with very complex search requirements – this chimed with our own experience of working with complex media monitoring queries.

We finished with a Solr Q&A (Upayavira was kind enough to provide many of the answers) – BlackRock had kindly provided a prize for the best question (a mini quadcopter) – our winner was very happy! Thanks again to our hosts and presenters and I look forward to seeing you all again soon.

The post London Lucene/Solr Usergroup – website search and indexing the cloud appeared first on Flax.

Open source search events roundup for late 2015
http://www.flax.co.uk/blog/2015/07/29/open-source-search-events-roundup-for-late-2015/
Wed, 29 Jul 2015 10:47:06 +0000

Although it’s still high summer here in the UK (which means it’s probably raining) we’re already looking forward to the autumn and the events across the world we’re attending. In early September we’re running another free-to-attend London Lucene/Solr Usergroup Meetup, sponsored this time by BlackRock, who are talking about using Solr for websites. At the end of September there is another Elasticsearch London Meetup which we will also attend (and may be speaking at this time).

October brings the biggest event in the Lucene/Solr calendar, Lucene Revolution in Austin, Texas, a 4-day event with training and a conference. We’re happy to announce that Alan Woodward and Matt Pearce from Flax will be presenting “Searching the Stuff of Life: BioSolr” about our work with the European Bioinformatics Institute where we’ve been developing Solr features for use by bioinformaticians (and any others who find them useful of course!), for example ontology indexing and external JOINs.

A week later we’ll be at Enterprise Search Europe, where I’ll be delivering the keynote on The Future of Search (you can see an earlier version of this talk from the IKO Singapore conference last month). We’re also running a Meetup on the evening of the 20th open to both conference attendees and others – an informal chance to chat with other search folks. During the conference itself I’m particularly looking forward to hearing from Ian Williams of NHS Wales on Powering the Single Patient Record in NHS Wales with Apache Solr – this is a very large scale and exciting project using Solr for healthcare data.

Looking further ahead, in November I’m speaking on Test Driven Relevancy at Search Solutions 2015, a great one-day event in London which I highly recommend, and we are running a workshop on Taming Enterprise Search in Singapore together with a partner. As ever, do let us know if you would like to meet up at an event and talk open source search!

The post Open source search events roundup for late 2015 appeared first on Flax.

Lucene/Solr London Meetup – BioSolr and Query Deep Dive
http://www.flax.co.uk/blog/2015/04/24/lucenesolr-london-meetup-biosolr-and-query-deep-dive/
Fri, 24 Apr 2015 10:26:34 +0000

This week we held another Lucene/Solr London User Group event, kindly hosted by Barclays at their funky Escalator space in Whitechapel. First to talk were two colleagues of mine, Matt Pearce and Tom Winch, on the BioSolr project: funded by the BBSRC, this is an opportunity for us to work with bioinformaticians at the European Bioinformatics Institute on improving search facilities for systems including the Protein Data Bank in Europe (PDBe). Tom spoke about how we’ve added features to Solr for autocompleting searches using facets, and about a new way of integrating external similarity systems with Solr searches – in this case an EBI system that works with protein data – which we’ve named XJoin. Matt then spoke about various ways to index ontology data and how we’re hoping to work towards a standard method for working with ontologies using Solr. The code we’ve developed so far is available in our GitHub repository and the slides are available here.
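As a rough idea of how facets can drive autocomplete in stock Solr (this is the generic `facet.prefix` approach, not necessarily what the BioSolr code does), the request below asks for no documents at all, only the facet values that start with what the user has typed. The field name `protein_name` and the typed prefix are assumptions for illustration.

```python
from urllib.parse import urlencode, parse_qs

def facet_autocomplete_params(field, typed, limit=10):
    """Request parameters that return facet values beginning with `typed`."""
    return {
        "q": "*:*",            # match everything; only the facets matter here
        "rows": 0,             # no documents needed, just facet counts
        "facet": "true",
        "facet.field": field,
        "facet.prefix": typed, # keep only values starting with the typed text
        "facet.limit": limit,
        "facet.mincount": 1,   # hide values that match no documents
    }

# Hypothetical field name and user input.
params = facet_autocomplete_params("protein_name", "hemo")
print(urlencode(params))
```

One thing to watch is that the prefix is compared against the indexed terms as they are stored, so the field’s analysis (lowercasing in particular) decides whether ‘Hemo’ and ‘hemo’ find the same suggestions.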

Next was Upayavira of Odoko Ltd., expert Solr trainer and Apache Foundation member, with an engaging talk about Solr queries. Amongst other things he showed us some clever ways to parameterize queries so that a Solr endpoint can be customized for a particular purpose and how to combine different query parsers. His slides are available here.
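A small sketch of the kind of thing this makes possible, using Solr’s local params and `$name` parameter dereferencing: the main query is handed to one parser (edismax) while a filter uses another (the term parser). The parameter names `search_fields` and `user_query`, the field names and the filter value are all assumptions invented for this example, not taken from Upayavira’s slides.

```python
from urllib.parse import urlencode, parse_qs

def endpoint_params(user_query):
    """Parameters for a purpose-built search endpoint; only user_query varies."""
    return {
        # Local params choose the parser; $name pulls in other request parameters.
        "q": "{!edismax qf=$search_fields v=$user_query}",
        "search_fields": "title^3 body",  # hypothetical fields and boost
        "user_query": user_query,
        # A second parser for the filter: an exact, unanalysed term match.
        "fq": "{!term f=type}article",
    }

params = endpoint_params("solar panels")
print(urlencode(params))
```

In practice everything except `user_query` could be set as defaults on a request handler, so the client only ever sends the words the user typed.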

Thanks to all our speakers, to Barclays for providing the venue and for some very tasty food and to all who attended. We’re hoping the next event will be in the first week of June and will feature talks on measuring and improving relevancy with Solr.

The post Lucene/Solr London Meetup – BioSolr and Query Deep Dive appeared first on Flax.

Elasticsearch London Meetup: Templates, easy log search & lead generation
http://www.flax.co.uk/blog/2015/01/30/elasticsearch-london-meetup-templates-easy-log-search-lead-generation/
Fri, 30 Jan 2015 14:01:05 +0000

After a long day at a Real Time Analytics event (of which more later) I dropped into the Elasticsearch London User Group, hosted by Red Badger and provided with a ridiculously huge amount of pizza (I have a theory that you’ll be able to spot an Elasticsearch developer in a few years by the size of their pizza-filled belly).

First up was Reuben Sutton of Artirix, describing how his team had moved away from the Elasticsearch Ruby libraries (which can be very slow, mainly due to the time taken to decode/encode data as JSON) towards the relatively new Mustache templating framework. This has allowed them to remove anything complex to do with search from their UI code, although they have had some trouble with Mustache’s support for partial templates. They found documentation was somewhat lacking, but they have contributed some improvements to this.
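To show the idea of keeping query structure in a template and passing in only the parameters, here is a deliberately tiny stand-in written in Python. It handles plain `{{name}}` placeholders only (no sections or partials) and is neither the real Mustache engine nor Artirix’s code; the template, parameter names and values are assumptions for illustration.

```python
import json
import re

# A query body with two placeholders; the UI never sees this structure.
TEMPLATE = '{"query": {"match": {"title": "{{query_string}}"}}, "size": {{size}}}'

def render(template, params):
    """Replace each {{name}} with its JSON-escaped value."""
    def substitute(match):
        value = params[match.group(1)]
        encoded = json.dumps(value)
        # String placeholders already sit inside quotes in the template,
        # so drop the outer quotes that json.dumps adds.
        return encoded[1:-1] if isinstance(value, str) else encoded
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)

body = json.loads(render(TEMPLATE, {"query_string": 'solr "in action"', "size": 10}))
print(body)
```

The point of the design is that the front end only supplies `query_string` and `size`; the shape of the query can change on the search side without touching UI code.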

Next was David Laing of CityIndex describing Logsearch, a powerful way to spin up clusters of ELK (Elasticsearch+Logstash+Kibana) servers for log analysis. Based on the BOSH toolchain and open sourced, this allows CityIndex to create clusters in minutes for handling large amounts of data (they are currently processing 50GB of logs every day). David showed how the system is resilient to server failure and will automatically ‘resurrect’ failed nodes, and interestingly how this enables them to use Amazon spot pricing at around a tenth of the cost of the more stable AWS offerings. I asked how this powerful system might be used in the general case of Elasticsearch cluster management but David said it is targeted at log processing – but of course according to some everything will soon be a log anyway!

The last talk was by Alex Mitchell and Francois Bouet of Growth Intelligence who provide lead generation services. They explained how they have used Elasticsearch at several points in their data flow – as a data store for the web pages they crawl (storing these in both raw and processed form using multi-fields), for feature generation using the term vector API and to encode simple business rules for particular clients – as well as to power the search features of their website, of course.

A short Q&A with some of the Elasticsearch team followed: we heard that the new Shield security plugin has had some third-party testing (the details of which I suggested are published if possible) and a preview of what might appear in the 2.0 release – further improvements to the aggregations features including derivatives and anomaly detection sound very useful. A swift drink and natter about the world of search with Mark Harwood and it was time to get the train home. Thanks to all the speakers and of course Yann for organising as ever – see you next time!

The post Elasticsearch London Meetup: Templates, easy log search & lead generation appeared first on Flax.

Enterprise Search & Discovery 2014, Washington DC
http://www.flax.co.uk/blog/2014/11/12/enterprise-search-discovery-2014-washington-dc/
Wed, 12 Nov 2014 10:49:57 +0000

Last week I attended Enterprise Search & Discovery 2014, part of the KMWorld conference in Washington DC. I’d been asked to speak on Turning Search Upside Down and luckily had the first slot after the opening keynote: thanks to all who came and for the great feedback (there are slides available to conference attendees, I’ll publish them more widely soon, but this talk was about media monitoring, our Luwak library and how we have successfully replaced Autonomy IDOL and Verity with a powerful open source solution for a Scandinavian monitoring firm).
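For those who couldn’t make the talk, the ‘upside down’ idea is simply that the queries are stored and the documents arrive later, so each new document is tested against the stored queries rather than the other way round. Here is a toy sketch in Python, assuming queries are just sets of required words; it is not Luwak’s API, and the class name, query ids and text are invented for illustration.

```python
import re

class StoredQueryMatcher:
    """Hold queries ahead of time and test each incoming document against them."""

    def __init__(self):
        self.queries = {}  # query id -> set of required lowercase terms

    def register(self, query_id, terms):
        self.queries[query_id] = {t.lower() for t in terms}

    def match(self, document):
        """Return the ids of all stored queries whose terms all occur in the document."""
        words = set(re.findall(r"\w+", document.lower()))
        return sorted(qid for qid, terms in self.queries.items() if terms <= words)

matcher = StoredQueryMatcher()
matcher.register("acme-merger", ["acme", "merger"])  # hypothetical client queries
matcher.register("solr-news", ["solr"])
print(matcher.match("Acme announces merger with rival"))  # ['acme-merger']
```

This version checks every stored query against every document, which would not scale to the hundreds of thousands of complex queries a monitoring firm might hold; a real engine needs some way of quickly ruling out queries that cannot possibly match.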

Since ESSDC is co-located with KMWorld, SharePoint Symposium and Taxonomy Bootcamp, it feels like a much larger event than the similar Enterprise Search Europe, although total numbers are probably comparable. It was clear to me that the event is far more focused on a business rather than technical audience, with most of the talks being high-level (and some being simply marketing pitches, which was a little disappointing). Mentions of open source search were common (from Dion Hinchcliffe’s use of it as an example of a collaborative community, to Kamran Kahn’s example of Apache Solr being used for very large scale search at the US National Archives). Unfortunately a lot of the presenters started with the ‘search sucks, everyone hates search’ theme (before explaining of course that their own solution would suck less) which I’m personally becoming a little tired of – if we as an industry continue pursuing this negative sentiment we’re unlikely to raise the profile of enterprise search: perhaps we should concentrate on more positive stories, as they certainly do exist.

I spent a lot of time networking with other attendees and catching up with some old contacts (a shout out to Miles Kehoe, Eric Pugh, Jeff Fried and Alfresco founder John Newton, great to see you all again). My favourite presentation was Dave Snowden’s fantastic and very funny debunking of knowledge management myths (complete with stories about London taxi drivers and a dig at American football) and I also enjoyed Raytion’s realistic case studies (‘no-one is searching for the sake of searching – except us [search integrators] of course’). Presentations I enjoyed somewhat less included Brainspace (who stressed Transparency as a key value, then when I asked if their software was thus open source, explained that they would love it to be so but then they wouldn’t be able to get any investment – has anyone told Elasticsearch?) and Hewlett Packard, who tried to tell us that their new API to the venerable IDOL search engine was ‘free software’ – not by any definition I’m aware of, sorry. Other presentation themes included graph/semantic search – maybe this is finally something we can consider seriously, many years after Tim Berners-Lee’s seminal paper.

Thanks to Information Today, Marydee Ojala and all others concerned for organising the event and making me feel so welcome.

The post Enterprise Search & Discovery 2014, Washington DC appeared first on Flax.

Autumn events roundup – ESS DC, Solr vs Elasticsearch & a new Meetup
http://www.flax.co.uk/blog/2014/10/27/autumn-events-roundup-ess-dc-solr-vs-elasticsearch-a-new-meetup/
Mon, 27 Oct 2014 16:05:24 +0000

It’s looking like a busy Autumn for search events – first, I’m presenting at Enterprise Search & Discovery 2014 in Washington DC on November 5th, talking about ‘Turning Search Upside Down with open source software’. I’ll be describing how we’ve replaced various underperforming, big name closed source search engines with faster & more scalable open source technology, including our own Luwak stored query engine. Do let me know if you’re in DC, I’d be very happy to meet up. The week after this is Lucene Revolution, which sadly we won’t be attending this year, but it is recommended if you’re interested in Lucene and Solr.

Towards the end of November there’s Search Solutions, a great day of presentations about all aspects of search held at the British Computer Society in Covent Garden. This year Tom Mortimer from Flax will be presenting some research we’ve done into performance comparisons between Lucene/Solr and Elasticsearch, and there are also presentations from Thomson Reuters, the British Library, Microsoft, Yahoo! and Google. I highly recommend this event, it’s always worth attending.

We’re also starting a new Meetup in London, a group for users of Apache Lucene/Solr (there’s an Elasticsearch London user group but strangely no equivalent for the other popular stack). Our first event is on November 28th, kindly hosted by Bloomberg (who are no strangers to Lucene/Solr themselves) and featuring Shalin Mangar, a Lucene/Solr committer from Lucidworks who is visiting Europe that week. We’re hoping that we can run these events every few months, but we need help from the community, so if you could talk, sponsor or host the Meetups do let us know.

In December we’ll be holding another Cambridge Search Meetup and will be talking about our work with the European Bioinformatics Institute on the BioSolr project – the date to be confirmed. Busy times!

The post Autumn events roundup – ESS DC, Solr vs Elasticsearch & a new Meetup appeared first on Flax.

Cambridge Search Meetup – Elasticsearch Hackday
http://www.flax.co.uk/blog/2014/10/03/cambridge-search-meetup-elasticsearch-hackday/
Fri, 03 Oct 2014 12:32:00 +0000

Last Friday we hosted a hackday featuring Elasticsearch in Cambridge, following a similar event last year focused on Apache Lucene/Solr. Around 20 people attended from organisations working in sectors including analytics, digital music, bioinformatics and e-commerce, and all the Flax team were there as well.

We started with a brief presentation on Elasticsearch and asked around the room for any data collections we might be able to use. Lee from Elasticsearch (the company) had brought collections of UK crime data and the complete works of Shakespeare; we also had several million rows of digital music metadata, Wikipedia edit data for all UK MPs (to follow last year’s theme!) and several years of data describing Premier League football. Unlike our Solr hackday where each team worked on the same general task, this time we split into four different teams who worked on all of the above except the Wikipedia edits. We’d also been provided with a very high-performance Elasticsearch cluster by BigStep for our use, which meant it was very quick to index the above data and start working with it.

By lunchtime (the food was sponsored by Elasticsearch, who also provided stickers, plush ELKs and lollipops – thanks guys!) we had some very basic information about the various datasets – such as which scene in which Shakespeare play has the most characters on stage (the answer is 21 in Richard III), and which football teams seemed to gain the most advantage from playing at home. Note that we had already moved beyond basic search functionality to use Elasticsearch as an analytic platform, answering particular questions using features such as aggregations.
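To make the analytics point concrete, here is roughly what the Shakespeare question boils down to, written as plain Python over a handful of made-up rows rather than as an Elasticsearch request: group the lines by scene, count the distinct speakers in each group, and take the largest. The rows below are invented for illustration and are not the hackday dataset, and counting speakers is only an approximation of ‘characters on stage’.

```python
from collections import defaultdict

# Made-up rows of (play, scene, speaker); not the hackday data.
LINES = [
    ("Hamlet", "1.1", "Bernardo"),
    ("Hamlet", "1.1", "Francisco"),
    ("Hamlet", "1.1", "Bernardo"),
    ("Hamlet", "1.2", "Claudius"),
    ("Macbeth", "1.1", "First Witch"),
    ("Macbeth", "1.1", "Second Witch"),
    ("Macbeth", "1.1", "Third Witch"),
]

def busiest_scene(lines):
    """Return ((play, scene), distinct speaker count) for the busiest scene."""
    speakers = defaultdict(set)
    for play, scene, speaker in lines:
        speakers[(play, scene)].add(speaker)  # one bucket per scene
    key = max(speakers, key=lambda k: len(speakers[k]))
    return key, len(speakers[key])

print(busiest_scene(LINES))  # (('Macbeth', '1.1'), 3)
```

An aggregation does the same bucketing and counting inside the search engine, so the answer comes back without pulling every line out of the index first.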

We continued during the afternoon to develop the various applications and finished with a ‘show and tell’. Some of the teams had managed to develop user interfaces for Elasticsearch, the most polished being a clickable Google Map that would show you which types of crime were significantly above and below the national average for the area you selected – unsurprisingly in Cambridge, stolen bicycles were very common! By the end of the day, everyone had gained experience of Elasticsearch, some for the first time. We finished the day, as is traditional, with a swift pint and further networking.
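A back-of-envelope version of the comparison behind that map might look like the sketch below: work out each crime type’s share of all crime locally and nationally, and divide one by the other. The figures are invented, the function is not the team’s code, and a plain ratio like this says nothing about statistical significance.

```python
def relative_rates(local_counts, national_counts):
    """Ratio of local share to national share for each crime type.

    Assumes every national count is positive. A ratio above 1 means the
    crime type makes up more of local crime than of crime nationally.
    """
    local_total = sum(local_counts.values())
    national_total = sum(national_counts.values())
    ratios = {}
    for crime, national in national_counts.items():
        national_share = national / national_total
        local_share = local_counts.get(crime, 0) / local_total
        ratios[crime] = local_share / national_share
    return ratios

# Invented figures purely for illustration.
national = {"bicycle theft": 100, "burglary": 400, "vehicle crime": 500}
local = {"bicycle theft": 30, "burglary": 30, "vehicle crime": 40}

for crime, ratio in sorted(relative_rates(local, national).items()):
    print(f"{crime}: {ratio:.2f}x the national share")
```

With these made-up numbers bicycle theft comes out at about three times its national share, which is the sort of result the map highlighted for Cambridge.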

Thanks to Cambridge Business Lounge (a highly recommended co-working space) for the venue, BigStep for hosting and Elasticsearch for sponsoring lunch and providing the swag, and of course to all who attended. We’ll return with a further Cambridge Search Meetup soon!

The post Cambridge Search Meetup – Elasticsearch Hackday appeared first on Flax.

Cambridge Search Meetup – Knowledge Discovery & Wayfinding
http://www.flax.co.uk/blog/2014/07/03/cambridge-search-meetup-knowledge-discovery-wayfinding/
Thu, 03 Jul 2014 11:53:56 +0000

We were lucky enough to have two speakers from Cambridge text mining company Linguamatics at last night’s Meetup. Robin Newton kicked us off with an amusing and idiosyncratic view of the uses and mis-uses of search – starting with the problem that when you have text search software, every problem can look like search might solve it. He gave an example of his recent search for a new job: although matching his skills on paper with a potential employer’s needs is one thing, he also wants to be sure the employer ‘isn’t a crook’! With reference to Tyler Tate’s talks on Information Wayfinding, which in turn quotes urban planner Kevin Lynch, Robin told us how he felt that search ‘journeys’ weren’t always the most efficient way to discover an answer: his assertion was that finding a person who could tell you was more useful. Since even in the most efficient and well-run organisation not all information is held in documents one might agree that finding an ‘expert’ is the best way to get the answers one needs. He finished with a welcome message that informal networking in pubs and cafes (much like our Meetup) helps share a lot of very useful information – and this is how he eventually decided that Linguamatics was going to be a great place to work.

Next was CTO and co-founder of Linguamatics, Dr David Milward, who described his company’s capability in text mining, Natural Language Processing (NLP) and search. He described the challenges of extracting ‘concepts’ from text – how words and acronyms with multiple potential meanings are difficult to parse automatically without contextual knowledge. Linguamatics’ approach has been described as ‘Agile NLP’ and allows the quick development of new patterns for concept extraction. A powerful example he gave was how by specifying a relationship between two entities, in this case one company acquiring another, structured data can be extracted from unstructured text. Other examples focused on the medical and bioscience field (a particular interest of ours at present due to the upcoming BioSolr project) and showed how their software can cluster facts and find connections between disparate pieces of data (‘which X relates to Y via Z’). This process can also be used to generate new facets for searching from free text, including for numeric ranges, and these can even be tailored for different user groups. It’s clear that Linguamatics are experts in this area and David’s talk was of great interest to many in the room, including several from the European Bioinformatics Institute.
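The acquisition example can be imitated, very crudely, with a regular expression: describe the relationship as a pattern between two entities and emit a structured record for each match. This is only a sketch of the general idea and not how Linguamatics’ software works; the pattern, the company names and the sentence are all made up, and real NLP would recognise entities properly instead of relying on capital letters.

```python
import re

# Capitalised word sequences stand in for company names in this toy pattern.
COMPANY = r"[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*"
ACQUISITION = re.compile(rf"({COMPANY}) (?:acquired|bought|has acquired) ({COMPANY})")

def extract_acquisitions(text):
    """Turn 'X acquired Y' style phrases into structured records."""
    return [{"acquirer": a, "target": b} for a, b in ACQUISITION.findall(text)]

# Invented sentence with invented companies.
text = "Last year Acme Corp acquired Widget Works. Separately, Globex bought Initech."
for record in extract_acquisitions(text):
    print(record)
```

Each record could then be indexed as fields in its own right, which is how free text can end up feeding new facets.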

We finished with the usual chat, networking and drinks. Thanks to both our speakers – and do let me know if you have a suggestion for a presentation at a future event!

The post Cambridge Search Meetup – Knowledge Discovery & Wayfinding appeared first on Flax.

Cambridge Search Meetup – Cassandra & Solr
http://www.flax.co.uk/blog/2014/05/15/cambridge-search-meetup-cassandra-solr/
Thu, 15 May 2014 07:32:23 +0000

A sunny evening last night for the latest Cambridge Search Meetup, which featured a couple of talks from Datastax on the highly scalable NoSQL database Apache Cassandra and how it is integrated with Apache Lucene/Solr. Jeremy Hanna started us off with a brief history of the Facebook-incubated Cassandra, which is a fully distributed, highly reliable system used by many including Netflix and Spotify with some customers running thousands of nodes in multiple data centres. Cassandra has its own SQL-like language, CQL3 and some basic collections such as Lists and Maps, but due to its fully distributed nature does lack some traditional features such as JOINs. Datastax themselves are now responsible for most of the ongoing work on Cassandra and offer the usual array of training, support, management services and tools. One common application mentioned was high speed and reliable recording of sensor data, increasingly important now with the rise of the Internet of Things.

After a short break for drinks and snacks (which this time were kindly sponsored by Datastax) Sergio Bossa told us how Solr is integrated with Cassandra, also running in a distributed fashion. Interestingly, this integration doesn’t use the same Zookeeper system as SolrCloud (the standard way to run clusters of Solr servers) but relies instead on Cassandra’s own internal scaling systems, passing data about using ‘gossip’ between nodes. Zookeeper is not always the easiest thing to get running so an alternative is very interesting! Data can be added to the system over HTTP or the aforementioned CQL3 and after being entered into Cassandra’s tables is subsequently indexed by Solr. Queries can then be made over HTTP as usual. Some work is still necessary to prevent duplication of effort (at present one needs to create data structures in Cassandra and subsequently in Solr).

It was pleasing to see that so much care has been taken with this integration process and also that Datastax offer their Datastax Enterprise Search stack not only free for non-production use, but free to startups. Thanks to Jeremy, Sergio and all who came along and we’ll be back with another Search Meetup soon.

The post Cambridge Search Meetup – Cassandra & Solr appeared first on Flax.

Enterprise Search Europe 2014 day 1 – Decisions, research and a Meetup quiz
http://www.flax.co.uk/blog/2014/05/01/enterprise-search-europe-2014-day-1-decisions-research-and-a-meetup-quiz/
Thu, 01 May 2014 15:59:38 +0000

This year’s Enterprise Search Europe was held near Victoria train station in London and unfortunately coincided with a two day strike on the London Underground – worrying for the organisers, but apart from a few notable absences it didn’t seem to affect the attendance too much. We started with a keynote from Dale Roberts, whose book on Decision Sourcing inspired a talk about a ‘rational decision making model’. When examining traditional relational database applications Dale said ‘if you peer at it long enough you can see the rows and columns’ and his point was that modern consumer social networking applications don’t exhibit this old pattern – so this is where search application designers should look for inspiration. His co-presenter Rooven Pakkiri said that Enterprise Search should attempt to ‘release the information from inside our heads’, which of course social networking might help with, connecting you with colleagues. I’m not sure that one can easily take lessons learnt from consumer applications and apply them to business use, and some later speakers agreed with me, but this was a high-energy and thought-provoking start.

Next I chaired the Open Source track, where we started with Cedric Ulmer of France Labs, who talked about a search application they built for a consultancy business with around 40 employees. Using Apache Solr, Apache ManifoldCF and their own Datafari open source framework they turned this project around very quickly – interestingly, the end clients needed no training to use the new system, which implies a very well designed UI. Our second talk from Ronald Hobbs of Reed Business International described a project on a much larger scale: 100 million documents, 72 business units and up to 190 queries per second – this was originally served by the FAST ESP engine but they moved to an Apache Solr system, replacing the FAST processing pipeline with Search Technologies’ Aspire project. His five steps for an effective migration (Prepare, Get the right tools, Get the right team, Migrate in chunks, Clean up) I can only agree with from our own experience of such projects, including one from FAST ESP to Solr. I was amused by his description of the Apache Zookeeper project as ‘a bipolar manic depressive’, although it seemed this was eventually overcome with a successful deployment on Amazon EC2. Next was Galina Hinova of Intrafind on an aftersales search application for MAN Truck and Bus – again at serious scale (MAN have around 1 billion vehicles in existence with 100-150 documents related to each). Interestingly the Euro6 regulations for emissions and standardized EU terms for automobile parts were direct drivers of the project, with Apache Lucene as the base technology. No longer is open source search just for small-scale projects it seems!

After a short break during which I chatted to John Newton, founder of Alfresco, and his team, we returned to hear Dan Jackson give a description of how UCL had improved their website search – with a chaotic mix of low quality content and an ‘awful’ content management system, the challenges were myriad but with the help of experts such as our associate Tony Russell-Rose they have made significant improvements. Next was what was to prove a very popular talk from Nick Brown of AstraZeneca on a huge, well funded project to build applications to support research and development – again, this was at large scale with 75 million documents (including ‘all the patents and all the research papers’). The key here was their creation of many well-targeted ‘apps’ to enable particular uses of the Sinequa search engine they chose for the back end, including mobile apps to help find others in the company (or external to it) who are also working on a particular drug or disease. This presentation showed just what can be achieved if companies really understand the potential of search technology – knowledge sharing and discovery of previously unknown information.

After a short drinks reception we retired to a nearby pub for the combined Cambridge and London Search Meetup – I’d prepared a short quiz (feel free to have a go!) which was won by Tony Russell-Rose’s team. Networking and chatting continued long into the evening, with some people from the wider UK search community also attending.

To be continued! You can see most of the slides here.

The post Enterprise Search Europe 2014 day 1 – Decisions, research and a Meetup quiz appeared first on Flax.