Archive for February, 2011

Bicycles, beer and bands – the first Cambridge Enterprise Search Meetup

Last night we held the first of what we hope will be a series of Meetups in our home town of Cambridge, U.K. Attending were researchers, developers and entrepreneurs in the field of search. As is the norm in Cambridge, many had cycled to the venue, and there was a friendly and informal feel to the group.

We started with my presentation on “Searching news media with open source software”, where I talked about our work for the NLA, Financial Times and others. We followed with John Snyder of Grapeshot on “Using Search to Connect Multiple Variants of An Object to One Central Object”. John showed a Grapeshot project for Virgin where different media assets can be automatically grouped together even if they have different metadata – for example an episode of the TV show “Heroes” is basically the same object whether it is broadcast, video-on-demand or a repeat, but differs from the Bowie album of the same name.
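To make the idea of one central object with many variants concrete, here is a minimal sketch. It is not Grapeshot's method: the `Asset` fields, the `canonical_key` helper and the sample data are all hypothetical, and it only does exact matching on a few clean fields, whereas the project described in the talk copes with metadata that differs between sources.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """A hypothetical media asset record (field names are assumptions)."""
    title: str
    kind: str      # e.g. "tv_episode" or "album"
    episode: str   # e.g. "S01E01"; empty for non-episodic items
    channel: str   # e.g. "broadcast", "vod", "repeat"


def canonical_key(asset):
    """Key that ignores the delivery channel but keeps what the asset is."""
    return (asset.title.strip().lower(), asset.kind, asset.episode)


def group_variants(assets):
    """Group assets that share a canonical key under one central object."""
    groups = defaultdict(list)
    for asset in assets:
        groups[canonical_key(asset)].append(asset)
    return dict(groups)


assets = [
    Asset("Heroes", "tv_episode", "S01E01", "broadcast"),
    Asset("heroes ", "tv_episode", "S01E01", "vod"),
    Asset("Heroes", "tv_episode", "S01E01", "repeat"),
    Asset("Heroes", "album", "", "download"),
]
groups = group_variants(assets)
```

With this sample data the three TV variants end up in one group and the album stays in a group of its own, because the asset kind is part of the key while the channel is not.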

We then broke up for discussion (and beer) – great to catch up with some ex-colleagues and meet others for the first time. Downstairs there was live music and one of our colleagues even joined the band for a spell on drums! From the feedback we received there’s definitely interest in repeating the event. If you’d like to attend next time, please join the Meetup group.

Posted in events

February 17th, 2011

Cambridge Enterprise Search Meetup tomorrow

A quick reminder that our first Cambridge Enterprise Search Meetup is tomorrow, February 16th from 6.30pm. More details in my previous post. We now have two talks, one from myself on “Open Source Search for News” and one from John Snyder of Grapeshot on “Using Search to Connect Multiple Variants of An Object to One Central Object”.

If you’re able to come please let us know using the Meetup website so we can organise enough refreshments!

Posted in events

February 15th, 2011

Enterprise Search London – Financial applications, SBA book and Solr searching 120m documents

Another excellent evening as part of the Enterprise Search London Meetup series; very busy as usual.

Amir Dotan started us off with details of his work in designing user interfaces for the financial services sector, describing some of the challenges involved in designing for a high-pressure and highly regulated environment. Although he didn’t talk about search specifically, we heard a lot about how to design useful interfaces. Two quotes stood out: “The right user interface can help make billions” and, as a way to get feedback, “find someone nice in the business and never let them go”.

Gregory Grefenstette of Exalead was next, talking about his new book on Search Based Applications. He explained how SBAs have advantages over traditional databases in the three areas of agility, usability and performance and went on to show some examples, before an unfortunate combination of a broken slide deck and a failing laptop battery brought him to a halt: in retrospect a great advertisement for a physical book over a computer!

Upayavira of Sourcesense was next with details of a new search built for online news aggregator Moreover. This dealt with scaling Lucene/Solr to cope with indexing 2 million new documents a day, for a rolling two-month index. He showed how some initial memory and performance problems had been solved with a combination of pre-warming caches, tweaks to the JVM and Java garbage collector and eventually profiling of their custom code. Particularly interesting was how they had developed a system for spinning up a complete copy of the searchable database (for load balancing purposes) on the Amazon EC2 cloud – from a standing start they can allocate servers, install software and copy across searchable indexes in around 40 minutes. This was a great demonstration of the power of the open source model – no more licenses to buy! Search performance over this large collection is pretty good as well, with faceted queries returning in a second or two and unfaceted queries in half a second.
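The 120m figure in the title follows from the numbers quoted in the talk. The sketch below shows the arithmetic, plus one way a rolling window might decide which days have aged out. The 60-day window is an assumption (the talk said only "2 month"), and the function names are illustrative rather than anything from the Moreover system.

```python
from __future__ import annotations

from datetime import date, timedelta

DOCS_PER_DAY = 2_000_000  # figure quoted in the talk
WINDOW_DAYS = 60          # assumption: "2 month" taken as 60 days


def steady_state_size(docs_per_day: int, window_days: int) -> int:
    """Number of documents held once the rolling window is full."""
    return docs_per_day * window_days


def expired_days(indexed_days: list[date], today: date, window_days: int) -> list[date]:
    """Days whose documents have fallen out of the rolling window."""
    cutoff = today - timedelta(days=window_days)
    return [day for day in indexed_days if day < cutoff]


total_docs = steady_state_size(DOCS_PER_DAY, WINDOW_DAYS)
stale = expired_days([date(2011, 1, 1), date(2011, 2, 1)], date(2011, 3, 10), WINDOW_DAYS)
```

Under these assumptions `total_docs` comes to 120,000,000, and on 10th March 2011 only the documents from 1st January would be due for removal.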

We also heard from Martin White about an exciting new search-related conference to be held in London in October this year, in association with Information Today, Inc., and I managed a quick plug for our inaugural Cambridge Enterprise Search Meetup on Wednesday 16th February.

Posted in events

February 10th, 2011

Open Source action in UK government

I’ve been reading the revised Open Source, Open Standards and ReUse: Government Action Plan – it’s surprising (and heartening) to see this has existed in one form or another since as far back as 2004.

The key changes for this version are:

  • suppliers have to show evidence they’ve considered open source options – hopefully this will be more than a quick trawl through SourceForge
  • ‘shadow license costs’ have to be shown in calculations to take account of previous purchases of ‘perpetual’ licenses – apparently in some cases this could make software license fees for a project appear as zero!
  • all purchases have to be on the basis of re-use across the government sector – so no need to pay again if a system moves to the cloud in the future

This all sounds great for the open source community; let’s also hope that increased openness in government means that we’ll be able to check the Action Plan is actually being followed!

By the way, a great example of open source in action on government data is They Work For You, which cleans up Hansard and makes it more accessible – search is powered by Xapian.

Posted in News

February 2nd, 2011