Archive for the ‘events’ Category

Search events for 2013

Here’s a quick roundup of search-related events coming soon:

Next week Lucene/Solr Revolution is to be held in San Diego, with a couple of days of training on April 29th & 30th and the main event on the 1st and 2nd May. This is probably the biggest event dedicated to Apache Lucene/Solr and features a huge array of presentations from Etsy, Wells Fargo, Lucidworks and even Microsoft who are increasingly supporting open source technologies.

Enterprise Search Europe is next on 15th and 16th May with a day of workshops on the 14th, including one from the Flax team. I’m looking forward to the various open source panels and presentations of course, and hearing from people from Ernst & Young, Nielsen Norman Group, Oracle and the University of Manchester. We’re also running a Meetup event on the first evening, open to all, with the usual informal mix of beer, snacks and search!

Some of the Flax team are hoping to attend Berlin Buzzwords on June 3rd & 4th – this conference promises to address “search”, “store” and “scale” – certainly sounds interesting! We know there will be lots of talks on elasticsearch and Lucene/Solr.

There’s more to come in the Autumn of course – more details when we know them. Hope to meet you at one of these great events!

Why we won’t pay to play at conferences

One unedifying result of having been asked to speak on open source search at various events and conferences over the last few years is the discovery that not all events are equal – some genuinely wish to create a programme of interesting talks of value to the audience, and some simply wish to sell as much sponsorship as possible to those who would like to present. Some of the larger analyst firms are guilty of this behaviour – their Summits and Forums are often packed with talks by big-budget solution providers (and their industry sector reports similarly reflect the fact that if you pay, you play). At Flax we don’t have much budget for sponsorship so we’re often excluded, even though the talks we give are seldom if ever pushing any particular solution – a benefit of the open source model is that even if you hear about it from us you can still go and download and use the software yourself without paying us or anyone else a penny.

Luckily there are events that don’t work like this – the excellent Search Solutions day run in late Autumn by the British Computer Society and of course Enterprise Search Europe (disclaimer: I’m on the programme committee for the latter). My view is that this means we get a higher quality set of talks, presenters who know and can discuss their subject rather than just reading out the company-approved PowerPoint deck, and attendees who can see a wider range of views and options.

Cambridge Search Meetup – a night of crawling and scraping

Last night was the busiest ever Cambridge Search Meetup, with two excellent talks and a lot of discussion and networking. First was Harry Waye of Arachnys, who provide access to data on emerging markets that no-one else has, using a variety of custom crawling technology and heavy use of tools such as Google Translate. If you want to trawl the Greek corporate registry or find out financial news from Kazakhstan a standard Google search is little help: Harry talked about how Arachnys have experimented with Google Custom Search Engine and the ‘headless browser’ PhantomJS to crawl sites.

Our second talk was from Shane Evans, who I first met when he led software development for our client Mydeco. While there he first worked on the development of an open source Python crawling framework, Scrapy: Shane showed how easy it is to get a Scrapy web spider running in a few lines of code, and how extensible and customisable Scrapy is for a huge variety of crawling and scraping situations. There’s even a fully hosted version at Scrapinghub with graphical tools for setting up web crawling and page scraping. We’re big fans of Scrapy at Flax and we’ve used it in a number of projects, so it was good to see an overview of why Scrapy exists and how it can be used.

Thanks to both our speakers, who travelled from out of town as did several other attendees. We’re pleased to say this was our 15th Meetup and we now have 100 members – we’re already planning further events, one of which will be on the evening of the first day of the Enterprise Search Europe conference.

Posted in Technical, events

February 22nd, 2013

Search Meetups return with news of two search books

Last night the London Search Meetup returned after a year’s absence: it’s great to see it back. The venue was at St Pancras with the room overlooking Eurostar trains and statues, inside the beautifully restored station building.

The speakers were both there to talk about their recent books: prolific author Martin White of Intranet Focus has written a book on Enterprise Search with the strapline ‘Enhancing Business Performance’. Martin has decades of experience in the sector, an enviable collection of war stories from inside the enterprise and was as ever an engaging speaker.

Next up were Tony Russell-Rose and Tyler Tate to talk about their new book which focuses on the user experience of search. ‘Designing the Search Experience’ promises to be a rich resource on how, why and where people use search and how this impacts the design of user interfaces.

The evening ended with some lively discussion and a promise that after its long absence this Meetup will now be happening on a more regular basis. We’re also running our own Cambridge search meetup – the next event is on February 21st where we’ll be hearing about web crawling and scraping. Another date for your diary is the Enterprise Search Europe conference on May 15th & 16th this year – the programme has just been published and features speakers from Ernst & Young and Oracle. I’ll also be running a workshop the day before the conference on Getting the Best from Open Source Search.

Posted in events

February 12th, 2013

Business Leaders, Open Source and free Pi

I spent last night at a networking event organised by the Business Leaders Network on the subject of Open Source Business Models – this isn’t the usual sort of event I attend, being held in a very posh law firm’s offices overlooking the Thames and with some fellow attendees from venture capital firms and investment banks. Although the panel included speakers from Canonical, Rackspace and the Raspberry Pi foundation (the gently amusing Jack Lang, a Cambridge luminary who I could have happily listened to for the full hour) the theme was generally non-technical.

Questions from the floor (and via Twitter) showed that many outside the technical sector (and probably a few within it) are still bemused at how one can build a thriving business on open source, when the panel admitted that it can involve making your intellectual property available to your competitors, giving your product away for nothing and investing heavily in community building. One of the most interesting responses from the panel indicated that an open source entrant to an existing market can shrink that market by 40-50% – a venture capitalist I spoke to afterwards couldn’t understand why this can be a positive thing. However, if a market is dominated by big players selling overpriced solutions, some disruptive deflation can re-shape the market considerably. This is certainly what we’ve seen in the search sector recently, and investment in the right place and time can still reap considerable rewards (consider Elasticsearch’s recent funding).

The panel also made the point that a key part of open source success is investment in people – both within a business and in the wider community. Another question about what an open source business is actually selling prompted a range of answers: a brand, peace of mind, happiness, experience and a platform were all suggested. It was clear that the discussion could have continued for a lot longer as the audience were keen to hear more, and the BLN may thus be running further open source themed events – the appetite for knowledge about open source business models outside the technical community is large.

Thanks to Mark Littlewood for organising such an interesting evening and particular thanks for the free Raspberry Pi – we have a cunning plan about what to do with it so watch this space!

Posted in Business, events

February 7th, 2013

Search Solutions 2012 – a review

Last Thursday I spent the day at the British Computer Society’s Search Solutions event, run by their Information Retrieval Specialist Group. Unlike some events I could mention, this isn’t a forum for sales pitches, over-inflated claims or business speak – just some great presentations on all aspects of search and some lively networking and discussion. It’s one of my favourite events of the year.

Milad Shokouhi of Microsoft Research started us off showing us how he’s worked on query trend analysis for Bing: he showed us how some queries are regular, some spike and go and some spike and remain – and how these trends can be modelled in various ways. Alex Jaimes of Yahoo! Barcelona talked about a human centred approach to search – I agree with his assertion that “we’re great at adapting to bad technology” – still sadly true for many search interfaces! Some of the demographic approaches have led to projects such as Yahoo! Clues which is worth a look.

Martin White of Intranet Focus was up next with some analysis of recent surveys and research, leading to some rather doom-laden conclusions about just how few companies are investing sufficiently in search. Again some great quotes: “Information Architects think they’ve failed if users still need a search engine” and a plea for search vendors (and open source exponents) to come clean about what search can and can’t do. Emma Bayne of the National Archives was next with a description of their new Discovery catalogue, a similar presentation to the one she gave earlier in the year at Enterprise Search Europe. Kristian Norling of Findwise finished with a laconic and amusing treatment of the results from Findwise’s survey on enterprise search – indicating that those who produce systems that users are “very satisfied” with usually do the same things, such as regular user testing and employing a specialist internal search team.

Stella Dextre Clark talked next about a new ISO standard for thesauri, taxonomies and their interoperability with other vocabularies – some great points on the need for thesauri to break down language barriers, help retrieval in enterprise situations where techniques such as PageRank aren’t so useful and to access data from decades past. Leo Sauermann was next with what was my personal favourite presentation of the day, about a project to develop a truly semantic search engine both for KDE Linux and currently the Cloud. This system, if more widely adopted, promises a true revolution in search, as relationships between data objects are stored directly by the underlying operating system. I spoke next about our Clade taxonomy/classification system and our Flax Media Monitor, which I hope was interesting.

Nicholas Kemp of DSTL was up next exploring how they research new technologies and approaches which might be of interest to the defence sector, followed by Richard Morgan of Funnelback on how to empower intranet searchers with ways to improve relevance. He showed how Funnelback’s own intranet allows users to adjust multiple factors that affect relevance – of course it’s debatable how these may be best applied to customer situations.

The day ended with a ‘fishbowl’ discussion during which a major topic was of course the Autonomy/HP debacle – there seemed to be a collective sense of relief that perhaps now marketing and hype wouldn’t dominate the search market as much as it had previously…but perhaps also that’s just my wishful thinking! All in all this was as ever an interesting and fun day and my thanks to the IRSG organisers for inviting me to speak. Most of the presentations should be available online soon.

Cambridge Search Meetup – Search for publication success and low-cost apps

After a short break the Cambridge Search Meetup returned last night with our usual mix of presentations, questions, networking, beer and snacks. We had a few issues with the projector and cables (one of these is on the shopping list for next time) so thanks to both presenters and audience for their patience!

First up was Liang Shen with a description of Journal Selector, a system for helping those publishing academic papers to find the correct journals to approach. The system allows one to copy and paste a chunk of a paper to a website and find which journals best match the subject matter, based on what they have published in the past. Running on the Amazon EC2 cloud the service indexes journals from feeds, HTML webpages and other sources, processes and stores this data in Amazon’s Hadoop-compatible database, indexes it with Apache Solr and then presents the results via the Drupal CMS. The results are impressive, allowing users to see exactly on what basis the system has recommended a journal to approach. You can see the presentation slides here.
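To give a flavour of the matching idea, here is a minimal sketch in Python. It is not how Journal Selector itself works (that relies on Apache Solr, as described above); it simply assumes each journal can be summarised as a bag of words from its past papers, scores a pasted snippet by cosine similarity, and reports the shared terms as the basis for the recommendation. The journal names and text are made up for illustration.

```python
import math
from collections import Counter


def vectorise(text):
    """Turn text into a bag-of-words term frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(snippet, journals):
    """Rank journals by similarity to the snippet, keeping the shared terms as the explanation."""
    query = vectorise(snippet)
    ranked = []
    for name, published in journals.items():
        profile = vectorise(published)
        shared = sorted(query.keys() & profile.keys())
        ranked.append((cosine(query, profile), name, shared))
    ranked.sort(key=lambda row: row[0], reverse=True)
    return ranked


# Hypothetical journals, each summarised by words from past papers.
journals = {
    "Journal A": "protein folding enzyme structure protein",
    "Journal B": "search engine ranking query index",
}
ranked = recommend("query ranking for a search engine", journals)
for score, name, shared in ranked:
    print(f"{name}: {score:.2f} (shared terms: {shared})")
```

A real service would weight rare terms more heavily and strip common words, but even this toy version shows why a recommendation can be explained: the shared terms are there to inspect.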

Next was Rich Marr, who bravely offered to live-code a demonstration of his low-cost prototyping methodology for startups needing both NoSQL data storage and search across this data. In only 20 lines or so of code he showed us how to use Node.js to build a simple server that could accept messages (over Telnet, although HTTP or even IMAP would be as easy), store them in a CouchDB database and index them for searching (using a different message) with Elasticsearch. Rich’s demo prompted a lively discussion of how commoditized and componentized search technology is becoming, with open source components that allow one to build a prototype search engine in minutes.
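The shape of that pipeline can be sketched in a few lines of Python. This is not Rich's code, which used Node.js, CouchDB and Elasticsearch: the two classes below are in-memory stand-ins for the database and the search engine, and the 'search ...' command is a hypothetical convention for telling a query apart from a message to be stored.

```python
import itertools


class MessageStore:
    """In-memory stand-in for the document database (CouchDB in the talk)."""

    def __init__(self):
        self._docs = {}
        self._ids = itertools.count(1)

    def save(self, text):
        doc_id = next(self._ids)
        self._docs[doc_id] = text
        return doc_id

    def get(self, doc_id):
        return self._docs[doc_id]


class SearchIndex:
    """In-memory stand-in for the search engine (Elasticsearch in the talk)."""

    def __init__(self):
        self._postings = {}

    def add(self, doc_id, text):
        for term in text.lower().split():
            self._postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # Every query term must appear in a message for it to match.
        term_sets = [self._postings.get(t, set()) for t in query.lower().split()]
        if not term_sets:
            return []
        return sorted(set.intersection(*term_sets))


def handle(line, store, index):
    """Handle one incoming line: 'search <terms>' runs a query, anything else is stored and indexed."""
    text = line.strip()
    command, _, rest = text.partition(" ")
    if command == "search":
        return [store.get(doc_id) for doc_id in index.search(rest)]
    doc_id = store.save(text)
    index.add(doc_id, text)
    return doc_id


store, index = MessageStore(), SearchIndex()
handle("open source search is fun", store, index)
handle("beer and snacks", store, index)
hits = handle("search open source", store, index)
print(hits)
```

Swapping the stand-ins for real clients is where the prototype would start to resemble the one shown on the night: the handler stays the same size, which is rather the point about componentized search.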

Thanks to both our speakers – and the Meetups continue, with Rich Marr’s own London Open Source Search Social meeting on Tuesday 23rd October, and in Cambridge the Data Insights Meetup where I’ll be talking on November 1st.

Search and other events for Autumn 2012

The diary is beginning to fill up – here are a few events we’ll be involved with over the next few months. Firstly we’re running another Cambridge Search Meetup on October 17th. This is an informal gathering of people interested in search: we have one great talk already on ‘Making search accessible to low cost apps’ and another to be confirmed, plus snacks, beer and even some live music afterwards. If you’re in Cambridge or nearby (it’s only an hour or so from London by train) do come along.

We’ll be briefly visiting the trade stands at FIBEP 2012 on October 4th in the historic town of Krakow, Poland – this is part of a major media monitoring event, the 45th FIBEP Congress. We’re looking forward to meeting companies in the media monitoring sector and talking about some of our projects in that area.

On November 29th we’re planning to attend Search Solutions 2012 at the BCS in Covent Garden, London – this is an excellent one-day event on all the technical aspects of search. You can read my review of last year’s event to find out more about what to expect.

There’s sure to be more to come!

Posted in events

September 18th, 2012

Eleven years of open source search

It’s now eleven years since we started Flax (initially as Lemur Consulting Ltd) in late July 2001, deciding to specialise in search application development with a focus on open source software. At the time the fallout from the dotcom crash was still evident and like today the economic picture was far from rosy. Since few people even knew what a search engine was (Google was relatively new and had only started selling advertising a year before) it wasn’t always easy for us to find a market for our services.

When we visited clients they would list their requirements and we would then tell them how we believed open source search could help (often having to explain the open source movement first). Things are different these days: most of our enquiries come from those who have already chosen open source search software such as Apache Lucene/Solr but need our help in installing, integrating or supporting it. There’s also a rise in those clients considering applications and techniques outside the traditional site search or intranet search – web scraping and crawling for data aggregation, taxonomies and automatic classification, automatic media monitoring and of course massive scalability, distributed processing and Big Data. Even the UK government are using open source search.

So after all this time I’m tending to agree with Roger Magoulas of O’Reilly: open source won, and we made the right choice all those years ago.

An open day on open source search from Sirius & Flax

We spent Friday at the riverside offices of Sirius Corporation, our support partners, for the first and hopefully not the last of their Open Days on open source enterprise search. We were lucky to have Mike Davis, a very well known and highly experienced analyst, to open the talks – despite suffering from flu he gave an engaging talk on why open source enterprise search software should be your first port of call, and how you should only consider closed source options when you need particular features they provide.

We then gave a quick Introduction to Open Source Search, detailing the various packages available (from Apache Lucene/Solr to Xapian and Sphinx) and showing a quick Solr-powered demo we’d built to search some pages from the BBC Music website. Using the programmer’s first choice for an example query (the ever reliable ‘foo*’) we discovered the wonderfully named Original Rabbit Foot Spasm Band – which interestingly you can’t find via the BBC’s own site search engine due to lack of wildcard support.
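For anyone wondering what a trailing-wildcard query like ‘foo*’ involves, here is a minimal Python sketch. It assumes a simple inverted index held in memory with its terms kept in sorted order, so every term sharing a prefix sits in one contiguous run. It is an illustration of the idea only, not a description of how Solr handles wildcards internally, and apart from the Original Rabbit Foot Spasm Band the sample titles are our own invention rather than data from the demo.

```python
from bisect import bisect_left


def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def prefix_search(index, sorted_terms, query):
    """Return ids of documents containing any term that starts with the prefix in 'prefix*'."""
    prefix = query.rstrip("*").lower()
    # Binary search finds the first term that could match the prefix.
    start = bisect_left(sorted_terms, prefix)
    matches = set()
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later term can match
        matches |= index[term]
    return matches


# Illustrative sample data.
docs = {
    1: "Original Rabbit Foot Spasm Band",
    2: "Foo Fighters",
    3: "The Rolling Stones",
}
index = build_index(docs)
sorted_terms = sorted(index)
result = sorted(prefix_search(index, sorted_terms, "foo*"))
print(result)  # [1, 2]
```

The query matches ‘foot’ as well as ‘foo’, which is how a band with ‘Foot’ in its name turns up for ‘foo*’.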

Andrew Savory, Sirius’ CTO and Apache Foundation member, then gave a presentation on what an Apache project actually is and how best to engage with an open source community – very useful for those considering open source for the first time. The morning finished with a delicious barbeque on the riverbank provided by Sirius. We thought the event went very well and we’d love to confirm the rumour that this will become a regular event. Thanks to all at Sirius for organising and hosting the day and we look forward to returning.