<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Flax Blog &#187; Add new tag</title>
	<atom:link href="http://www.flax.co.uk/blog/tag/add-new-tag/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.flax.co.uk/blog</link>
	<description>Open source &#38; enterprise search</description>
	<lastBuildDate>Wed, 25 Jan 2012 14:56:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Hiring</title>
		<link>http://www.flax.co.uk/blog/2009/08/04/hiring/</link>
		<comments>http://www.flax.co.uk/blog/2009/08/04/hiring/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 12:11:58 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Add new tag]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=195</guid>
		<description><![CDATA[<p>We&#8217;re finding more and more clients interested in the advantages of a powerful open source enterprise search engine. Thus, we&#8217;re looking at <a href="http://www.flax.co.uk/hiring.shtml">expanding the team</a> &#8211; can you help?</p>
]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re finding more and more clients interested in the advantages of a powerful open source enterprise search engine. Thus, we&#8217;re looking at <a href="http://www.flax.co.uk/hiring.shtml">expanding the team</a> &#8211; can you help?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/08/04/hiring/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enterprise search &#8211; for free</title>
		<link>http://www.flax.co.uk/blog/2009/07/10/enterprise-search-for-free/</link>
		<comments>http://www.flax.co.uk/blog/2009/07/10/enterprise-search-for-free/#comments</comments>
		<pubDate>Fri, 10 Jul 2009 16:06:13 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=172</guid>
		<description><![CDATA[<p>We recently helped a small marine consultancy, running a Windows network, implement a completely free enterprise search solution. Even SMEs are now finding it hard to keep on top of the information they produce, and there are few low-cost options&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>We recently helped a small marine consultancy, running a Windows network, implement a completely free enterprise search solution. Even SMEs are now finding it hard to keep on top of the information they produce, and there are few low-cost options for searching their documents. Read the case study <a href="http://www.flax.co.uk/downloads/free_enterprise_search.pdf">here</a> (PDF).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/07/10/enterprise-search-for-free/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Xapian compared</title>
		<link>http://www.flax.co.uk/blog/2009/07/07/xapian-compared/</link>
		<comments>http://www.flax.co.uk/blog/2009/07/07/xapian-compared/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 10:03:44 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[xapian]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=167</guid>
		<description><![CDATA[<p>Vik Singh has been <a href="http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/">comparing various open source solutions for search</a>. He only spent a weekend performing the comparison, which is probably not enough time to get any search software performing at its best, and his results reflect this.&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Vik Singh has been <a href="http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/">comparing various open source solutions for search</a>. He only spent a weekend performing the comparison, which is probably not enough time to get any search software performing at its best, and his results reflect this. Xapian was marked down for being slow at indexing (he says 5x slower than SQLite &#8211; but then again, SQLite isn&#8217;t a search engine, it&#8217;s an RDBMS, and really isn&#8217;t suitable for search applications) and for producing large index files, much bigger than Lucene&#8217;s.</p>
<p>The reason for this is that Xapian stores different information to Lucene. For example, the full term list (un-inverted index) is retained, which makes it possible to do <a href="http://en.wikipedia.org/wiki/Relevance_feedback">relevance feedback</a>. Also, Lucene handles deletes by maintaining a separate list of deleted documents, which is merged at the next optimise step &#8211; which means that the internal statistics are wrong until this point, and that updates can be more complicated, as an updated document needs a new ID. </p>
<p>Neither approach is wrong and both have advantages &#8211; Lucene certainly has smaller index files. Some judicious use of the XAPIAN_FLUSH_THRESHOLD parameter, as suggested in some of the comments on the article, would have certainly speeded up Xapian indexing. We can also look forward to the release of the new Xapian &#8216;Chert&#8217; backend, which will produce indexes at least 50% smaller than the current &#8216;Flint&#8217; backend. It&#8217;s also hard to say how important index sizes are in these days of cheap storage.</p>
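<p>As a minimal sketch of the XAPIAN_FLUSH_THRESHOLD tuning mentioned above: the variable is read from the environment, so one way to raise it is to set it for the indexing process before it starts. The helper name, the value of 100000 and the script name in the note below are illustrative assumptions, not taken from Vik&#8217;s test framework.</p>

```python
import os


def indexer_environment(flush_threshold, base_env=None):
    """Return a copy of the environment with XAPIAN_FLUSH_THRESHOLD set.

    flush_threshold is the number of documents to buffer between flushes.
    The helper itself is hypothetical; only the variable name comes from Xapian.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["XAPIAN_FLUSH_THRESHOLD"] = str(int(flush_threshold))
    return env


# Build an environment for an indexing run that flushes every 100000 documents.
env = indexer_environment(100000, base_env={"PATH": "/usr/bin"})
```

<p>The resulting dictionary can be passed as the env argument of subprocess.run when launching an indexing script (for example a hypothetical index_documents.py). A larger threshold trades memory for fewer flushes, so the right value depends on the machine.</p>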
<p>On the search side, Xapian performed comparably to Lucene in terms of relevance and search speed (both were ahead of all the other solutions on these metrics, especially SQLite). There are some other metrics he quoted, such as a &#8216;support&#8217; figure, given as a score out of 5, which he admits is entirely subjective &#8211; you&#8217;d have to ask our <a href="http://www.flax.co.uk/customers.shtml">customers</a> about that one! There&#8217;s also no comparison of features, ease of integration and scalability to very large collections.</p>
<p>We&#8217;ve talked before about <a href="http://www.flax.co.uk/blog/2009/03/04/performance-metrics/">performance</a> <a href="http://www.flax.co.uk/blog/2009/03/13/more-on-performance-metrics/">metrics</a>. Vik should be applauded for his article and for releasing his test framework as open source; hopefully this can be a foundation for some more in-depth studies.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/07/07/xapian-compared/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Perl client for Flax Search Server</title>
		<link>http://www.flax.co.uk/blog/2009/07/01/perl-client-for-flax-search-server/</link>
		<comments>http://www.flax.co.uk/blog/2009/07/01/perl-client-for-flax-search-server/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 11:54:05 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[client]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=160</guid>
		<description><![CDATA[<p>Flax Search Server now has a <a href="http://code.google.com/p/flaxcode/source/browse/#svn/trunk/flax_search_clients/perl">Perl client</a>, thanks to the guys at <a href="http://www.cognidox.com">Cognidox</a>, who have <a href="http://www.cognidox.com/company/blog/Search-and-the-Flax-search-client.html">blogged</a> about why they needed to improve the search facility for their powerful document management system.</p>
]]></description>
			<content:encoded><![CDATA[<p>Flax Search Server now has a <a href="http://code.google.com/p/flaxcode/source/browse/#svn/trunk/flax_search_clients/perl">Perl client</a>, thanks to the guys at <a href="http://www.cognidox.com">Cognidox</a>, who have <a href="http://www.cognidox.com/company/blog/Search-and-the-Flax-search-client.html">blogged</a> about why they needed to improve the search facility for their powerful document management system.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/07/01/perl-client-for-flax-search-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python and Flax presentation</title>
		<link>http://www.flax.co.uk/blog/2009/06/25/python-and-flax-presentation/</link>
		<comments>http://www.flax.co.uk/blog/2009/06/25/python-and-flax-presentation/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 09:49:25 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[xapian]]></category>
		<category><![CDATA[xappy]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=154</guid>
		<description><![CDATA[<p>My colleague Richard Boulton will be presenting at <a href="http://www.europython.eu/">Europython</a> in Birmingham, U.K. next week, specifically at 15.30 on Tuesday 30th June &#8211; an <a href="http://www.europython.eu/talks/talk_abstracts/index.html#talk55">abstract</a> is available. He&#8217;ll be talking about Xapian, Xappy and Flax, and showing examples of&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>My colleague Richard Boulton will be presenting at <a href="http://www.europython.eu/">Europython</a> in Birmingham, U.K. next week, specifically at 15.30 on Tuesday 30th June &#8211; an <a href="http://www.europython.eu/talks/talk_abstracts/index.html#talk55">abstract</a> is available. He&#8217;ll be talking about Xapian, Xappy and Flax, and showing examples of these in action including one using a <a href="http://www.djangoproject.com/">Django</a> integration layer. </p>
<p><strong>Update</strong>: you can now <a href="http://flaxcode.googlecode.com/svn/trunk/flax_search_service/docs/presentations/europython2009.odp">download</a> the slides for Richard&#8217;s talk in OpenOffice format.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/06/25/python-and-flax-presentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Please don&#8217;t compete!</title>
		<link>http://www.flax.co.uk/blog/2009/04/21/please-dont-compete/</link>
		<comments>http://www.flax.co.uk/blog/2009/04/21/please-dont-compete/#comments</comments>
		<pubDate>Tue, 21 Apr 2009 08:46:13 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=104</guid>
		<description><![CDATA[<p>Microsoft have been asking open source companies not to compete on cost, but rather on value, according to <a href="http://blogs.zdnet.com/microsoft/?p=2353&#038;tag=mncol;txt">ZDNet</a>. Unfortunately the response to this hasn&#8217;t exactly been positive, as <a href="http://news.cnet.com/8301-13505_3-10222336-16.html">CNET</a> reports. I doubt many open source vendors will&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Microsoft have been asking open source companies not to compete on cost, but rather on value, according to <a href="http://blogs.zdnet.com/microsoft/?p=2353&#038;tag=mncol;txt">ZDNet</a>. Unfortunately the response to this hasn&#8217;t exactly been positive, as <a href="http://news.cnet.com/8301-13505_3-10222336-16.html">CNET</a> reports. I doubt many open source vendors will be taking much notice of what Microsoft would like them to do, and suspect they will happily continue to make the point that if customers are looking at buying software &#038; services, taking the cost of software completely out of the equation is almost certain to save them money.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/04/21/please-dont-compete/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More on performance metrics</title>
		<link>http://www.flax.co.uk/blog/2009/03/13/more-on-performance-metrics/</link>
		<comments>http://www.flax.co.uk/blog/2009/03/13/more-on-performance-metrics/#comments</comments>
		<pubDate>Fri, 13 Mar 2009 10:40:07 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[xapian]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=57</guid>
		<description><![CDATA[<p>Anurag Goel recently carried out a <a href="http://www.anur.ag/blog/2009/03/xapian-and-solr/">comparative test of Xapian/Flax and Lucene/Solr</a>. Some interesting results here: it seems Lucene is faster at building indexes, but Xapian is faster and possibly more accurate at searching. We can expect some further&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Anurag Goel recently carried out a <a href="http://www.anur.ag/blog/2009/03/xapian-and-solr/">comparative test of Xapian/Flax and Lucene/Solr</a>. Some interesting results here: it seems Lucene is faster at building indexes, but Xapian is faster and possibly more accurate at searching. We can expect some further speed improvements over the next few months as a new, more compact backend to Xapian is released.</p>
<p>By the way, the article mentions Xappy: this is a Python interface to Xapian that is a major part of our Flax enterprise search platform. You can get Xappy <a href="http://code.google.com/p/xappy/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/03/13/more-on-performance-metrics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Performance metrics</title>
		<link>http://www.flax.co.uk/blog/2009/03/04/performance-metrics/</link>
		<comments>http://www.flax.co.uk/blog/2009/03/04/performance-metrics/#comments</comments>
		<pubDate>Wed, 04 Mar 2009 15:08:32 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[Add new tag]]></category>
		<category><![CDATA[autonomy]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=48</guid>
		<description><![CDATA[<p><a href="http://arnoldit.com/wordpress/2009/03/03/autonomy-idol-metrics/">Stephen Arnold recently posted</a> some rather impressive performance figures for Autonomy&#8217;s IDOL search engine. This kind of data is all very well, but without independent testing and more detail it&#8217;s hard to know how these figures apply to the&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://arnoldit.com/wordpress/2009/03/03/autonomy-idol-metrics/">Stephen Arnold recently posted</a> some rather impressive performance figures for Autonomy&#8217;s IDOL search engine. This kind of data is all very well, but without independent testing and more detail it&#8217;s hard to know how these figures apply to the real world.</p>
<p>So here&#8217;s an idea. Why not create an openly available collection of test data, a set of searches and a set of conditions, then compare the performance of the various available engines for indexing and searching? Recording the software and hardware used as well, of course. Making the data and conditions public would allow for independent verification.</p>
<p>I&#8217;m not sure commercial search vendors would ever agree to this, but it&#8217;s a nice idea.</p>
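<p>The open benchmark idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: ToyEngine is a hypothetical in-memory stand-in for a real search engine, and the documents, queries and report fields are made up for the example.</p>

```python
import platform
import time
from collections import defaultdict


class ToyEngine:
    """Hypothetical stand-in for a real engine: a tiny in-memory inverted index."""

    name = "toy-engine"

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, documents):
        # Map each lower-cased term to the set of documents containing it.
        for doc_id, text in documents.items():
            for term in text.lower().split():
                self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: a document must contain every query term.
        term_sets = [self.postings.get(t, set()) for t in query.lower().split()]
        return sorted(set.intersection(*term_sets)) if term_sets else []


def run_benchmark(engine, documents, queries):
    """Time indexing and searching, and record the conditions alongside."""
    start = time.perf_counter()
    engine.index(documents)
    index_seconds = time.perf_counter() - start

    start = time.perf_counter()
    results = {query: engine.search(query) for query in queries}
    search_seconds = time.perf_counter() - start

    return {
        "engine": engine.name,
        "documents": len(documents),
        "queries": len(queries),
        "index_seconds": index_seconds,
        "search_seconds": search_seconds,
        "results": results,
        # Record software and hardware so others can reproduce the run.
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
    }


# Made-up test collection and query set; a real benchmark would publish these.
DOCUMENTS = {
    1: "open source enterprise search",
    2: "commercial enterprise search engine",
    3: "open benchmark data for search engines",
}
QUERIES = ["enterprise search", "open search", "benchmark"]

report = run_benchmark(ToyEngine(), DOCUMENTS, QUERIES)
```

<p>Any engine exposing the same index and search methods could be passed to run_benchmark, which is the point of publishing the data and conditions: the same collection and queries run against every engine, with the environment recorded next to the timings.</p>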
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/03/04/performance-metrics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

