We have some news! From February 1st 2019 Flax's Managing Director Charlie Hull will be joining OpenSource Connections (OSC), Flax's long-standing US partner, as a senior Managing Consultant. Charlie will manage a new UK division of OSC, which will also acquire some of Flax's assets and brands. OSC are a highly regarded organisation in the world of search and relevance, wrote the seminal book Continue reading
Tag Archives: flax
Meetups, genomes and hack days: Grant Ingersoll visits the UK
Lucene/Solr committer, Mahout co-creator, LucidWorks co-founder and general all-round search expert Grant Ingersoll visited us last week on his way to the SIGIR conference in Dublin. We visited the Continue reading
Open source intranet search over millions of documents with full security
Last year my colleague Tom Mortimer talked about indexing security information within an open source enterprise search application, and we're happy to announce more details of the project. Our client is an international radio supplier, who had considered both closed source products and search appliances, but chose open source for greater flexibility and the much lower cost of scaling...Continue reading
Building a new press cuttings service for the Financial Times
Those of you who read my slides from Search Solutions 2010 will have spotted a case study on our work for the Financial Times, one of the world’s leading business news organisations. When the Financial Times decided to bring their digital press cuttings in-house in summer 2010, they asked us to build a powerful 'search server' that they could easily integrate into their existing product offerings. We built an indexer...Continue reading
Open source search engines and programming languages
So you're writing a search-related application in your favourite language, and you've decided to choose an open source search engine to power it. So far, so good - but how are the two going to communicate? Let's look at two engines, Xapian and Lucene, and compare how this might be done. Lucene is written in Java, Xapian in C/C++ - so if you're using those languages respectively, everything should be relatively simple - j...Continue reading
flax.crawler arrives
We've recently uploaded a new crawler framework to the Flax code repository. This is designed for use from Python to build a web crawler for your project. It's multithreaded and simple to use; here's a minimal example:
import crawler
crawler.dump = MyContentDumperImplementati...Continue reading
flax.core 0.1 available
Charlie wrote previously that we try and work with flexible, lightweight frameworks: flax.core is a Python library for conveniently adding functionality to Xapian projects. The current (and first!) version is 0.1, which can be checked out from the flaxcode repository. This version supports named fields for inde...Continue reading
Packaged solutions and customisability, the Python way
With any large scale software installation, there is going to be some customisation and tweaking necessary, and enterprise search systems are no exception. Whatever features are packaged with a system, some of those you need will be missing and some won't be used at all. It's rare to see a situation where the search engine can just be installed straight out of the box. Our Flax system is based on the Xapian core, which has a set of bindings to various differe...Continue reading
The Times they are a-changing….
News International have announced they will be charging for access to their Times and Sunday Times newspaper websites within a few months. At the same time we have the announcement that the Independent newspaper is to be bought by a Russian oligarch, and may end up as a free publication. This divergence of business models is interesting, but what concerns us at Flax is how ...Continue reading
Some new open source file filters & previewers
We've just released an early version of Flax Filters, which allow basic conversion of various proprietary formats to plain text ready for indexing. Currently the filters support Microsoft Word, Excel and PowerPoint, the OpenOffice equivalent formats, Adobe PDF, plain text and HTML, but we'll be adding more in the future (of course, we'd welcome contributions from third parties). We're already using these filte...Continue reading