Archive for July, 2009

Search Patterns, a great collection

Peter Morville has created a Flickr collection of ‘search patterns’, showing the different kinds of search interface available. I can highly recommend taking a look if you’d like some good examples of clustering, faceted navigation, auto-suggest and interfaces for particular sectors such as e-commerce. We often find these concepts difficult to explain to customers without some real-world examples.

Posted in Reference

July 30th, 2009

Open Source Search event in Cambridge on 29th September

We’re sponsoring a one-day event on open source search – details here. More will be announced soon. Hope some of you can make it!

Posted in News

July 27th, 2009

New Events page

You can now see a list of events and conferences we’ll be attending – hope to meet some of you there!

Posted in News

July 14th, 2009

Whitepaper on enterprise search

Our technical partners Cognidox have released a whitepaper detailing their view of the enterprise search market, titled “Why you can’t just ‘Google’ for Enterprise Knowledge” – it’s well worth a read. You can download the PDF from their archive.

Posted in News

July 13th, 2009

Enterprise search – for free

We recently helped a small marine consultancy, running a Windows network, implement a completely free enterprise search solution. Even SMEs are now finding it hard to keep on top of the information they produce, and there are few low-cost options for searching their documents. Read the case study here (PDF).

Posted in Business

July 10th, 2009

Xapian compared

Vik Singh has been comparing various open source solutions for search. He only spent a weekend on the comparison, which is probably not enough time to get any search software performing at its best, and his results reflect this. Xapian was marked down for being slow at indexing (he says 5x slower than SQLite – but then again, SQLite isn’t a search engine, it’s an RDBMS, and really isn’t suitable for search applications) and for producing large index files, much bigger than Lucene’s.

The reason for the larger index files is that Xapian stores different information to Lucene. For example, the full term list (un-inverted index) is retained, which makes it possible to do relevance feedback. Also, Lucene handles deletes by maintaining a separate list of deleted documents, which is merged at the next optimise step. This means that the internal statistics are inaccurate until that point, and that updates can be more complicated, as an updated document needs a new ID.
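To make the deferred-delete point concrete, here is a minimal toy sketch in Python. It is not Lucene’s or Xapian’s actual implementation; the class and method names (`TombstoneIndex`, `doc_freq`, `optimise`) are invented for illustration. It only shows how a separate deleted-documents list can leave term statistics stale until a merge, and why an update ends up with a new ID.

```python
from collections import defaultdict


class TombstoneIndex:
    """Toy inverted index that defers deletes (hypothetical, for illustration)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document IDs
        self.deleted = set()              # IDs marked deleted but not yet merged
        self.next_id = 0

    def add(self, text):
        doc_id = self.next_id
        self.next_id += 1
        for term in set(text.lower().split()):
            self.postings[term].add(doc_id)
        return doc_id

    def delete(self, doc_id):
        # Only record the delete; the postings are left untouched for now.
        self.deleted.add(doc_id)

    def update(self, doc_id, text):
        # An update is a delete plus an add, so the document gets a new ID.
        self.delete(doc_id)
        return self.add(text)

    def doc_freq(self, term):
        # Raw statistic: still counts documents that are marked deleted.
        return len(self.postings.get(term, ()))

    def search(self, term):
        # Deleted documents are filtered out at query time.
        return sorted(self.postings.get(term, set()) - self.deleted)

    def optimise(self):
        # Merge step: physically drop deleted documents from the postings.
        for term in list(self.postings):
            self.postings[term] -= self.deleted
            if not self.postings[term]:
                del self.postings[term]
        self.deleted.clear()


index = TombstoneIndex()
first = index.add("xapian search engine")
second = index.update(first, "lucene search engine")
stale = index.doc_freq("search")   # still includes the deleted document
index.optimise()
fresh = index.doc_freq("search")   # accurate again after the merge
```

In this toy model the document frequency for “search” is overstated between the update and the call to `optimise()`, which is the kind of temporary inaccuracy described above.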

Neither approach is wrong and both have advantages – Lucene certainly has smaller index files. Some judicious use of the XAPIAN_FLUSH_THRESHOLD parameter, as suggested in some of the comments on the article, would certainly have sped up Xapian indexing. We can also look forward to the release of the new Xapian ‘Chert’ backend, which will produce indexes at least 50% smaller than the current ‘Flint’ backend. It’s also hard to say how important index sizes are in these days of cheap storage.
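As a rough sketch of how that tuning might be applied: XAPIAN_FLUSH_THRESHOLD is an environment variable which, as we understand it, sets how many document changes Xapian buffers before flushing automatically to disk. The helper name `indexing_env`, the value 100000 and the `my_indexer.py` script below are assumptions for illustration only, and the best value depends on available memory.

```python
import os


def indexing_env(flush_threshold, base_env=None):
    """Return a copy of the environment with XAPIAN_FLUSH_THRESHOLD set.

    Hypothetical helper: it only prepares the environment for an indexing
    process and does not call Xapian itself.
    """
    if flush_threshold <= 0:
        raise ValueError("flush_threshold must be positive")
    env = dict(os.environ if base_env is None else base_env)
    env["XAPIAN_FLUSH_THRESHOLD"] = str(flush_threshold)
    return env


# 100000 is an arbitrary illustrative value, not a recommendation.
env = indexing_env(100000, base_env={"PATH": "/usr/bin"})

# The indexer could then be launched with this environment, for example:
#   subprocess.run(["python", "my_indexer.py"], env=env)
# where my_indexer.py stands in for whatever script does the indexing.
```

Setting the variable in the environment of the indexing process, rather than in code after the database is opened, avoids any doubt about when the library reads it.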

On the search side, Xapian performed comparably to Lucene in terms of relevance and search speed (both were ahead of all the other solutions on these metrics, especially SQLite). There are some other metrics he quoted, such as a ‘support’ figure, given as a score out of 5, which he admits is entirely subjective – you’d have to ask our customers about that one! There’s also no comparison of features, ease of integration or scalability to very large collections.

We’ve talked before about performance metrics. Vik should be applauded for his article and for releasing his test framework as open source. Hopefully this can be a foundation for some more in-depth studies.

Perl client for Flax Search Server

Flax Search Server now has a Perl client, thanks to the guys at Cognidox, who have blogged about why they needed to improve the search facility for their powerful document management system.

Posted in Uncategorized

July 1st, 2009
