<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Flax Blog &#187; media</title>
	<atom:link href="http://www.flax.co.uk/blog/tag/media/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.flax.co.uk/blog</link>
	<description>Open source &#38; enterprise search</description>
	<lastBuildDate>Wed, 25 Jan 2012 14:56:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Search Solutions 2011 review</title>
		<link>http://www.flax.co.uk/blog/2011/11/17/search-solutions-2011-review/</link>
		<comments>http://www.flax.co.uk/blog/2011/11/17/search-solutions-2011-review/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 16:29:38 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[events]]></category>
		<category><![CDATA[archive]]></category>
		<category><![CDATA[enterprise search]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[networking]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[pipeline]]></category>
		<category><![CDATA[SOLR]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=663</guid>
		<description><![CDATA[<p>I spent yesterday at the British Computer Society Information Retrieval Specialist Group&#8217;s annual <a href="http://irsg.bcs.org/SearchSolutions/2011/sse2011.php">Search Solutions</a> conference, which brings together theoreticians and practitioners to discuss the latest advances in search. </p>
<p>The day started with a talk by <a href="http://johntait.net/">John</a>&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>I spent yesterday at the British Computer Society Information Retrieval Specialist Group&#8217;s annual <a href="http://irsg.bcs.org/SearchSolutions/2011/sse2011.php">Search Solutions</a> conference, which brings together theoreticians and practitioners to discuss the latest advances in search. </p>
<p>The day started with a talk by <a href="http://johntait.net/">John Tait</a> on the challenges of patent search where different units are concerned &#8211; where for example a search for a plastic with a melting point of 200°C wouldn&#8217;t find a patent that uses °F or Kelvin. John presented a solution from <a href="http://www.max-recall.com/">max.recall</a>, a plugin for <a href="http://lucene.apache.org/solr/">Apache Solr</a> that promises to solve this issue. We then heard from Lewis Crawford of the <a href="http://www.webarchive.org.uk/ukwa/">UK Web Archive</a> on their very large index of 240m archived webpages &#8211; some great features were shown including a postcode-based browser. The system is based on Apache Solr and they are also using &#8216;big data&#8217; projects such as <a href="http://hadoop.apache.org/">Apache Hadoop</a> &#8211; which by the sound of it they&#8217;re going to need as they&#8217;re expecting to be indexing a lot more websites in the future, up to 4 or 5 million. The third talk in this segment came from Toby Mostyn of <a href="http://polecat.co/">Polecat</a> on their MeaningMine social media monitoring system, again built on Solr (a theme was beginning to emerge!). MeaningMine implements an iterative query method, using a form of <a href="http://en.wikipedia.org/wiki/Relevance_feedback">relevance feedback</a> to help users contribute more useful query information.</p>
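<p>As a rough illustration of the unit problem (a minimal sketch, and not a description of how the max.recall plugin works), one common approach is to normalise every temperature to a single unit at both index time and query time, so that 200°C and 392°F produce the same value. The pattern and the function names below are hypothetical.</p>

```python
import re

# Hypothetical pattern: a number, an optional degree sign, then C, F or K.
TEMPERATURE = re.compile(r"(-?\d+(?:\.\d+)?)\s*°?\s*(C|F|K)\b")


def to_kelvin(value, unit):
    """Convert a temperature given in C, F or K to Kelvin."""
    if unit == "C":
        return value + 273.15
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    return value  # already Kelvin


def temperatures_in_kelvin(text):
    """Return every temperature mentioned in text, normalised to Kelvin."""
    return [
        round(to_kelvin(float(number), unit), 2)
        for number, unit in TEMPERATURE.findall(text)
    ]


if __name__ == "__main__":
    # The same physical quantity written three ways should normalise identically.
    print(temperatures_in_kelvin("a plastic with a melting point of 200°C"))
    print(temperatures_in_kelvin("melts at 392 °F"))
    print(temperatures_in_kelvin("melts at 473.15 K"))
```

<p>Indexing the normalised value alongside the original text would let a range or exact-match query find a patent whichever unit its author used.</p>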
<p>Before lunch we heard from <a href="http://research.yahoo.com/Ricardo_Baeza-Yates">Ricardo Baeza-Yates</a> of Yahoo! on moving beyond the &#8216;ten blue links&#8217; model of web search, with some fascinating ideas around how we should consider a Web of <em>objects</em> rather than web pages. <a href="http://www.gabriella-kazai.com/">Gabriella Kazai</a> of Microsoft Research followed, talking about how best to gather high-quality relevance judgements for testing search algorithms, using crowdsourcing systems such as Amazon&#8217;s <a href="https://www.mturk.com/mturk/welcome">Mechanical Turk</a>. Some good insights here as to how a high-quality task description can attract high-quality workers.</p>
<p>After lunch we heard from <a href="http://www.daedalusinfosystems.com/resume.htm">Marianne Sweeney</a> with a refreshingly candid treatment of how best to tune enterprise search products that very rarely live up to expectations &#8211; I liked one of her main points that &#8220;the product is never what was used in the demo&#8221;. Matt Taylor from <a href="http://www.funnelback.com/">Funnelback</a> followed with a brief overview of his company&#8217;s technology and some case studies. </p>
<p>The last section of the day featured Iain Fletcher of Search Technologies on the value of metadata and on their interesting new pipeline framework, <a href="http://www.searchtechnologies.com/aspire.html">Aspire</a>. (As an aside, Iain has also joined the <a href="http://www.meetup.com/SearchPipelines/">Pipelines meetup group</a> I set up recently). Next up was Jared McGinnis of the <a href="http://www.pressassociation.com/">Press Association</a> on their work on Semantic News &#8211; it was good to see an openly available <a href="http://data.press.net/ontology/">news ontology</a> as a result. <a href="http://www.social-tv.net/conference/speakers/265-ian-kegel-head-of-future-content-group-bt.html">Ian Kegel</a> of British Telecom came next with a talk about TV program recommendation systems, and we finished with <a href="http://sys64738.se/">Kristian Norling</a>&#8217;s talk on a healthcare information system that he worked on before joining <a href="http://www.findwise.se/">Findwise</a>. We ended with a brief Fishbowl discussion which asked amongst other things what the main themes of the day had been &#8211; my own contribution being &#8220;everyone&#8217;s using Solr!&#8221;.</p>
<p>It&#8217;s rare to find quite so many search experts in one room, and the quality of discussions outside the talks was as high as the quality of the talks themselves &#8211; congratulations are due to the organisers for putting together such an interesting programme. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2011/11/17/search-solutions-2011-review/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bicycles, beer and bands &#8211; the first Cambridge Enterprise Search Meetup</title>
		<link>http://www.flax.co.uk/blog/2011/02/17/bicycles-beer-and-bands-the-first-cambridge-enterprise-search-meetup/</link>
		<comments>http://www.flax.co.uk/blog/2011/02/17/bicycles-beer-and-bands-the-first-cambridge-enterprise-search-meetup/#comments</comments>
		<pubDate>Thu, 17 Feb 2011 10:31:14 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[events]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[networking]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=516</guid>
		<description><![CDATA[<p>Last night we held the <a href="http://www.meetup.com/Enterprise-Search-Cambridge-UK/events/16034930/">first</a> of what we hope will be a series of Meetups in our home town of Cambridge, U.K. Attending were researchers, developers and entrepreneurs in the field of search &#8211; as is the norm&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Last night we held the <a href="http://www.meetup.com/Enterprise-Search-Cambridge-UK/events/16034930/">first</a> of what we hope will be a series of Meetups in our home town of Cambridge, U.K. Attending were researchers, developers and entrepreneurs in the field of search &#8211; as is the norm in Cambridge many had cycled to the venue, and there was a friendly and informal feel to the group.</p>
<p>We started with my presentation on <em>&#8220;Searching news media with open source software&#8221;</em>, where I talked about our work for the <a href="http://www.nla.co.uk">NLA</a>,  <a href="http://presscuttings.ft.com/presscuttings/search.htm">Financial Times</a> and others. We followed with John Snyder of <a href="http://www.grapeshot.co.uk">Grapeshot</a> on <em>&#8220;Using Search to Connect Multiple Variants of An Object to One Central Object&#8221;</em>. John showed a Grapeshot <a href="http://www.virginmedia.com/player/">project for Virgin</a> where different media assets can be automatically grouped together even if they have different metadata &#8211; for example an episode of the TV show &#8220;Heroes&#8221; is basically the same object whether it is broadcast, video-on-demand or a repeat, but differs from the Bowie album of the same name. </p>
<p>We then broke up for discussion (and beer) &#8211; great to catch up with some ex-colleagues and meet others for the first time. Downstairs there was live music and one of our colleagues even joined the band for a spell on drums! From the feedback we received there&#8217;s definitely interest in repeating the event; if you&#8217;d like to attend next time, please join the <a href="http://www.meetup.com/Enterprise-Search-Cambridge-UK/">Meetup group</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2011/02/17/bicycles-beer-and-bands-the-first-cambridge-enterprise-search-meetup/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Next-generation media monitoring with open source search</title>
		<link>http://www.flax.co.uk/blog/2010/12/13/next-generation-media-monitoring-with-open-source-search/</link>
		<comments>http://www.flax.co.uk/blog/2010/12/13/next-generation-media-monitoring-with-open-source-search/#comments</comments>
		<pubDate>Mon, 13 Dec 2010 14:05:40 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[durrants]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[platform]]></category>
		<category><![CDATA[scaling]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=456</guid>
		<description><![CDATA[<p>Media monitoring is not a traditional search application: for a start, instead of searching a large number of documents with a single query, a media monitoring application must search every incoming news story with potentially thousands of queries, searching for&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Media monitoring is not a traditional search application: for a start, instead of searching a large number of documents with a single query, a media monitoring application must search every incoming news story with potentially thousands of queries, searching for words and terms relevant to client requirements. This can be difficult to scale, especially when accuracy must be maintained &#8211; a client won&#8217;t be happy if their media monitors miss relevant stories or send them news that isn&#8217;t relevant.</p>
<p>We&#8217;ve been working with <a href="http://durrants.co.uk/">Durrants Ltd.</a> of London for a while now on replacing their existing (closed source) search engine with a system built on open source. This project, which you can read more about in a detailed <a href="http://www.flax.co.uk/downloads/durrants_case_study_091210.pdf">case study</a> (PDF), has reduced the hardware requirements significantly and led to huge accuracy improvements (in some cases where 95% of the results passed through to human operators were irrelevant &#8216;false positives&#8217;, the new system is now 95% correct).</p>
<p>The new system is built on <a href="http://www.xapian.org">Xapian</a> and <a href="http://www.python.org">Python</a> and supports all the features of the previous engine, to ease migration &#8211; it even copes with errors introduced during automated scanning of printed news. The new system scales easily and cost effectively.</p>
<p>As far as we know this is one of the first large-scale media monitoring systems built on open source, and a great example of search as a platform, which we&#8217;ve <a href="http://www.flax.co.uk/blog/2010/10/19/when-search-isnt-just-search-at-the-guardian/">discussed before</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2010/12/13/next-generation-media-monitoring-with-open-source-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Autumn events</title>
		<link>http://www.flax.co.uk/blog/2010/09/10/autumn-events/</link>
		<comments>http://www.flax.co.uk/blog/2010/09/10/autumn-events/#comments</comments>
		<pubDate>Fri, 10 Sep 2010 15:41:03 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[events]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=358</guid>
		<description><![CDATA[<p>Autumn seems to be conference season: first is the <a href="http://www.lucenerevolution.com">Lucene Revolution</a> event in Boston, USA from October 7th-8th, where I&#8217;ll be on the closing panel whose subject is &#8220;Data Crossroads &#8211; At The Intersection Of Search And Open Source&#8221;.&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Autumn seems to be conference season: first is the <a href="http://www.lucenerevolution.com">Lucene Revolution</a> event in Boston, USA from October 7th-8th, where I&#8217;ll be on the closing panel whose subject is &#8220;Data Crossroads &#8211; At The Intersection Of Search And Open Source&#8221;. </p>
<p>Next is the British Computer Society&#8217;s <a href="http://irsg.bcs.org/SearchSolutions/2010/sse2010.php">Search Solutions 2010</a> in London on October 21st, where I&#8217;m giving a presentation titled &#8220;What&#8217;s the story with open source? &#8211; Searching and monitoring news media with open-source technology&#8221;. </p>
<p>Both events feature a wide range of other speakers from organisations such as Cisco, LinkedIn, Twitter, Google and Microsoft.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2010/09/10/autumn-events/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Times they are a-changing&#8230;.</title>
		<link>http://www.flax.co.uk/blog/2010/03/26/273/</link>
		<comments>http://www.flax.co.uk/blog/2010/03/26/273/#comments</comments>
		<pubDate>Fri, 26 Mar 2010 11:44:38 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[client]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=273</guid>
		<description><![CDATA[<p>News International have announced <a href="http://news.bbc.co.uk/1/hi/business/8588432.stm">they will be charging for access to their Times and Sunday Times newspaper websites</a> within a few months. At the same time we have the announcement that the <a href="http://news.bbc.co.uk/1/hi/business/8587469.stm">Independent newspaper is to be bought</a>&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>News International have announced <a href="http://news.bbc.co.uk/1/hi/business/8588432.stm">they will be charging for access to their Times and Sunday Times newspaper websites</a> within a few months. At the same time we have the announcement that the <a href="http://news.bbc.co.uk/1/hi/business/8587469.stm">Independent newspaper is to be bought by a Russian oligarch</a>, and may end up as a free publication. This divergence of business models is interesting, but what concerns us at Flax is how technology will help newspaper websites differentiate themselves. </p>
<p>The NLA&#8217;s <a href="http://www.nla-clipshare.com/static/AboutClipShare.htm">ClipShare</a> and <a href="http://info.clipsearch.co.uk">ClipSearch</a> services, which are powered by Flax, are good models for monetizing newspaper content, and are already in use at some of the U.K.&#8217;s largest publishers. If you need to quickly find a particular story, see related articles and grasp an overview of coverage, you need scalable, highly accurate search technology. Users have been conditioned to expect search to &#8216;just work&#8217;, and they <strong>simply won&#8217;t pay</strong> for anything that doesn&#8217;t come up to scratch. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2010/03/26/273/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finding French TV with Flax</title>
		<link>http://www.flax.co.uk/blog/2009/11/26/finding-french-tv-with-flax/</link>
		<comments>http://www.flax.co.uk/blog/2009/11/26/finding-french-tv-with-flax/#comments</comments>
		<pubDate>Thu, 26 Nov 2009 11:22:52 +0000</pubDate>
		<dc:creator>charlie</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[flax]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[media]]></category>

		<guid isPermaLink="false">http://www.flax.co.uk/blog/?p=225</guid>
		<description><![CDATA[<p>We&#8217;ve recently been working with <a href="http://www.myskreen.com">mySkreen</a>, who like <a href="http://www.hulu.com/">Hulu</a> in the U.S. provide a service for finding and viewing television programs via your web browser. mySkreen is the brainchild of Frédéric Sitterlé, previously Head of New Media at&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve recently been working with <a href="http://www.myskreen.com">mySkreen</a>, who like <a href="http://www.hulu.com/">Hulu</a> in the U.S. provide a service for finding and viewing television programs via your web browser. mySkreen is the brainchild of Frédéric Sitterlé, previously Head of New Media at the <a href="http://www.lefigaro.fr/">Le Figaro</a> media group.</p>
<p>mySkreen works with French-language content, and is currently indexing over 1.6 million programmes (and counting). Using Flax, you can search using programme title, actors, genres or time periods. We also added some innovative query parsing to translate fuzzy queries such as &#8216;tomorrow evening&#8217; into more exact time periods, and some clever ranking so that &#8216;more easily available&#8217; programmes appear higher in the search results. We also added faceted search and automatic spelling correction.</p>
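<p>To give a flavour of this kind of fuzzy time parsing, here is a minimal Python sketch. It is an assumption-laden illustration rather than the code delivered to mySkreen: the vocabularies are English, the function name is hypothetical, and the hours chosen for &#8216;evening&#8217; and the other day parts are arbitrary.</p>

```python
from datetime import date, datetime, time, timedelta

# Hypothetical vocabularies; the boundaries of each day part are assumptions.
DAY_OFFSETS = {"today": 0, "tonight": 0, "tomorrow": 1}
DAY_PARTS = {
    "morning": (time(6, 0), time(12, 0)),
    "afternoon": (time(12, 0), time(18, 0)),
    "evening": (time(18, 0), time(23, 0)),
    "tonight": (time(18, 0), time(23, 0)),
}


def parse_time_period(query, today):
    """Translate a phrase such as 'tomorrow evening' into (start, end) datetimes.

    Returns None when the query contains no recognised day word.
    """
    words = query.lower().split()
    offsets = [DAY_OFFSETS[word] for word in words if word in DAY_OFFSETS]
    if not offsets:
        return None
    day = today + timedelta(days=offsets[0])
    parts = [DAY_PARTS[word] for word in words if word in DAY_PARTS]
    # With no day part given, fall back to the whole day.
    start, end = parts[0] if parts else (time(0, 0), time(23, 59))
    return datetime.combine(day, start), datetime.combine(day, end)


if __name__ == "__main__":
    print(parse_time_period("tomorrow evening", date(2009, 11, 26)))
```

<p>The resulting pair of datetimes could then be applied as a range filter on the broadcast time alongside the ordinary text query.</p>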
<p>This was a fast-moving project with a very quick turnaround: we first visited mySkreen in Paris in August and delivered customised code to them less than four weeks later; the flexibility of Flax and the open source model helped to make this possible.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.flax.co.uk/blog/2009/11/26/finding-french-tv-with-flax/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

