How we built a search engine for UK MP tweets with Solr, Python & StanfordNLP

Matt Pearce writes: We recently released UKMP, a search application built on work done at last year's Enterprise Search hack day. It presents the tweets of UK Members of Parliament with search options including filtering by party, retweet and favourite count, and entities (people, locations and organisations) ex...Continue reading

Cambridge Search Meetup – a night of crawling and scraping

Last night was the busiest ever Cambridge Search Meetup, with two excellent talks and a lot of discussion and networking. First was Harry Waye of Arachnys, who provide access to data on emerging markets that no-one else has, using a variety of custom crawling technology and heavy use of tools such as Google Translate. If you want to trawl the Greek corporate registry or find out financial news...Continue reading

Open source search engines and programming languages

So you're writing a search-related application in your favourite language, and you've decided to choose an open source search engine to power it. So far, so good - but how are the two going to communicate? Let's look at two engines, Xapian and Lucene, and compare how this might be done. Lucene is written in Java, Xapian in C++ - so if you're using those languages respectively, everything should be relatively simple - j...Continue reading

Packaged solutions and customisability, the Python way

With any large-scale software installation, there is going to be some customisation and tweaking necessary, and enterprise search systems are no exception. Whatever features are packaged with a system, some of those you need will be missing and some won't be used at all. It's rare to see a situation where the search engine can just be installed straight out of the box. Our Flax system is based on the Xapian core, which has a set of bindings to various differe...Continue reading

Python and Flax presentation

My colleague Richard Boulton will be presenting at EuroPython in Birmingham, U.K. next week, specifically at 15.30 on Tuesday 30th June - an abstract is available. He'll be talking about Xapian, Xappy and Flax, and showing examples of these in action including one using a Django integration layer. Update: you can now ...Continue reading