We’ve been working on a client project to replace the closed-source dtSearch search engine, which wasn’t performing well at the scale this client needs. Because the client has a significant investment in stored queries (it’s for a monitoring application), they were keen that the new engine speak exactly the same query language as the old one, so we’ve built a version of Apache Lucene to replace dtSearch. We had to make a few other modifications as well, such as returning positional information from deep within the Lucene code. This is particularly important in monitoring, as you want to show clients where the keywords they were interested in appeared in an article: they may be checking their media coverage in detail, and position on the page is important.
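To make the idea of positional information concrete, here is a minimal Python sketch of the kind of data a monitoring front end needs for each keyword hit. It is purely illustrative and is not the code from this project: Lucene records positions and offsets inside its index, whereas this sketch simply re-scans the article text. The function name `keyword_positions` and the simple `\w+` tokenization are assumptions made for the example.

```python
import re
from typing import Dict, List, Tuple


def keyword_positions(text: str, keywords: List[str]) -> Dict[str, List[Tuple[int, int, int]]]:
    """Map each keyword to a list of (token_position, start_offset, end_offset) hits.

    Hypothetical helper: tokens are runs of word characters, compared
    case-insensitively. A real analyzer would apply its own tokenization rules.
    """
    wanted = {k.lower() for k in keywords}
    hits: Dict[str, List[Tuple[int, int, int]]] = {k: [] for k in wanted}
    for position, match in enumerate(re.finditer(r"\w+", text)):
        token = match.group().lower()
        if token in wanted:
            # Character offsets let a UI highlight the hit in the article;
            # the token position says how far into the article it appears.
            hits[token].append((position, match.start(), match.end()))
    return hits


article = "Acme shares rose as Acme announced results"
print(keyword_positions(article, ["acme"]))
```

The token position tells you how early in the article a keyword appears, and the character offsets are what you would use to highlight it on the page.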
First, we developed a new Lucene Analyzer that speaks the same syntax as dtSearch, allowing us to index text input. On the search side we have a Lucene QueryParser that shares this syntax. To make it easier to use, we’ve wrapped the whole lot in a modified Solr server. As we needed some features of very recent Lucene code, our modifications are based on a patch to Lucene trunk, so the source code isn’t for the faint-hearted. If you need it, let us know, but we’re not currently providing it for download.
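To give a flavour of what query-language compatibility involves, here is a minimal Python sketch that rewrites one dtSearch operator, the `w/N` "within N words" proximity operator, into the sloppy-phrase syntax of Lucene’s classic query parser (`"a b"~N`). It is an illustration under stated assumptions, not the parser built for this project: the function name `translate_proximity` is hypothetical, only single-word operands are handled, and Lucene’s slop is not an exact equivalent of dtSearch’s word distance (for example, terms in reversed order consume extra slop).

```python
import re

# Matches "<term> w/<N> <term>". Only single-word operands are handled
# in this sketch; nested expressions and chained operators are not.
_PROXIMITY = re.compile(r"(\w+)\s+w/(\d+)\s+(\w+)", re.IGNORECASE)


def translate_proximity(query: str) -> str:
    """Rewrite 'a w/N b' as the Lucene sloppy phrase '"a b"~N'.

    Hypothetical helper for illustration. Everything else in the query
    string is passed through unchanged.
    """
    return _PROXIMITY.sub(
        lambda m: f'"{m.group(1)} {m.group(3)}"~{m.group(2)}',
        query,
    )


print(translate_proximity("apple w/5 pear AND banana"))
```

A string rewrite like this only goes so far. Matching the old engine’s behaviour exactly, as the stored queries in this project required, means parsing the query language properly and building the corresponding Lucene queries directly.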
We’re not sure if there’s anyone else out there who needs an open source alternative to dtSearch, but if there is, please contact us.
UPDATE: We’ve had many people contact us in the 6 years since this article was written asking for the query parser code – I’m afraid the original code is very out of date and certainly wouldn’t work with modern Lucene or Lucene-based search engines like Solr/Elasticsearch. We’re thus not able to provide it. However, if you do have a dtSearch migration project we may be able to help you on a consultancy basis (we have carried out several similar projects for our clients) – do contact us for details.
More generally, what this project demonstrates is that even if you have significant investment in your existing search infrastructure it is entirely possible to move to an open source alternative, which may be faster and will almost certainly be more economically scalable. Does anyone else have a search engine they’d like to replace?