information retrieval – Flax (http://www.flax.co.uk), The Open Source Search Specialists

ECIR 2017 Industry Day, our book & a demo of live TV factchecking
Mon, 24 Apr 2017
http://www.flax.co.uk/blog/2017/04/24/ecir-2017-industry-day-book-demo-live-tv-factchecking/

The post ECIR 2017 Industry Day, our book & a demo of live TV factchecking appeared first on Flax.

I visited Aberdeen before Easter to speak at Industry Day, part of the European Conference on Information Retrieval. Following a reception at Aberdeen’s Town House (a wonderful building) hosted by the Lord Provost, I spent an evening with various information retrieval luminaries including Professor Udo Kruschwitz of the University of Essex. We had a chance to discuss the book we’re co-authoring: its draft title is ‘Searching the Enterprise’, it is designed as a review of the subject for those considering a PhD or those in business wanting to know the current state of the art, and it should be out later this year. I also caught up with our associate Tony Russell-Rose of UXLabs.

Industry Day started with a talk by Peter Mika of Norwegian media group Schibsted on modelling user behaviour for delivering personalised news. It was interesting to hear his views on Facebook and the recent controversy about their removal of a photo posted by a Schibsted group newspaper, and how this might be a reason Schibsted carry out their own internal developments rather than relying on the algorithms used by much larger companies. Edgar Meij was up next talking about search at Bloomberg (which we’ve been involved in), and I was pleased to hear that they might be contributing some of their alerting infrastructure back to Apache Lucene/Solr. James McMinn of startup Scoop Analytics followed, talking about real-time news monitoring. They have built a prototype system based on PostgreSQL rather than a search engine, indexing around half a billion tweets, that allows one to spot breaking news much earlier than the main news outlets might report it.

The next session started with Michaela Regneri of OTTO on Newsleak.io, a project in collaboration with Der Spiegel “producing a piece of software that allows to quickly and intuitively explore large amounts of textual data”. She stressed how important it is to have a common view of what is ‘good’ performance in collaborative projects like this. Richard Boulton (who worked at Flax many years ago) was next in his role as Head of Software Engineering at the Government Digital Service, talking about the ambitious project to create a taxonomy for all UK government content. So far, his team have managed to create an alpha version of this for educational content – note that they don’t have the time or resources in-house to tag content, so must work with the relevant departments to do so. They have created various software tools to help, including an automatic topic tagger using Latent Dirichlet Allocation – which, given this is the GDS, is of course open source and available.

Unfortunately I missed a session after this due to a phone call, but managed to catch some of Elizabeth Daly of IBM talking about automatic claim detection using the Watson framework. Using Wikipedia as a source, this can identify statements that support or oppose a particular claim in an argument and tag them as ‘pro’ or ‘con’. This topic led neatly on to Will Moy of Full Fact, whom we have been working with recently, in a ‘sandwich’ session with myself. Will talked about how Full Fact has been working for many years to develop neutral, unbiased factchecking tools and services. I then spoke about the hackday we ran recently for Full Fact, and particularly about our Luwak library and how it can be used to spot known claims by politicians in streaming news. Will then surprised me and impressed the audience by showing a prototype service that watches several UK television channels in real time, extracts the subtitles and checks them against a list of previously factchecked claims – using the Luwak backend we built at the hackday. Yes, that’s live factchecking of television news, very exciting!
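
For readers unfamiliar with the stored-query idea behind this kind of monitoring, here is a minimal sketch in Python. It is not Luwak’s API (Luwak is a Java library built on Lucene) and not the code from the hackday: the `ClaimMonitor` class, its method names and the example claims are all hypothetical, and matching is simplified to “every term of the claim appears in the subtitle line”.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ClaimMonitor:
    """Toy stored-query matcher (hypothetical, for illustration only).

    Claims are registered once; each incoming document, such as a
    subtitle line, is then checked against all of them.
    """

    def __init__(self):
        self.claims = {}                    # claim id -> set of required terms
        self.term_index = defaultdict(set)  # term -> ids of claims using it

    def register(self, claim_id, phrase):
        """Store a claim as the set of terms that must all be present."""
        terms = set(tokenize(phrase))
        self.claims[claim_id] = terms
        for term in terms:
            self.term_index[term].add(claim_id)

    def match(self, document):
        """Return the sorted ids of claims whose terms all occur in the document."""
        doc_terms = set(tokenize(document))
        # Preselect only the claims that share at least one term with
        # the document, so most stored claims are never examined.
        candidates = set()
        for term in doc_terms:
            candidates |= self.term_index.get(term, set())
        return sorted(cid for cid in candidates if self.claims[cid] <= doc_terms)


# Made-up claims and subtitle text, purely for demonstration
monitor = ClaimMonitor()
monitor.register("claim-1", "crime fallen ten percent")
monitor.register("claim-2", "unemployment lowest level")
print(monitor.match("Crime has fallen by ten percent since last year"))
# prints ['claim-1']
```

The point of the design is the inversion: the claims are indexed and the documents are the queries, so each new subtitle line only has to be compared with the few claims that share a term with it.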

Thanks to Professor Kruschwitz and Tony Russell-Rose for putting together the agenda and inviting both me and Will to speak – it was great to be able to talk about the exciting work we’re doing with Full Fact and to hear about the other projects.

Not one, but three Lucene hackdays coming soon!
Wed, 24 Aug 2016
http://www.flax.co.uk/blog/2016/08/24/not-one-three-lucene-hackdays-coming-soon/

The post Not one, but three Lucene hackdays coming soon! appeared first on Flax.

We’re always keen to get more people involved in the Lucene search community – there’s always lots to do, from deep hacking of the core code, to testing with different frameworks and clients, to creating documentation and examples. It’s also just over fifteen years since Tom Mortimer and I founded Flax and we thought we should mark this birthday with some kind of event! So I’m very happy to announce we’ll be involved in three Lucene hackday events over the next two months:

Firstly, Dr. Leif Azzopardi has kindly invited us to speak and participate in the Lucene4IR Workshop to be held at the University of Strathclyde in Glasgow on 8th & 9th September 2016. The event is aimed at those in academia wanting to get more involved in practical applications of Lucene, and we are hoping they will also contribute ideas from cutting-edge information retrieval research. We’ll be giving the keynote talk on how we use Lucene-based search engines in industry and also getting involved in the coding sessions later. There are some free places (although registration is only £69) and there are even some travel grants available.

A month later on 7th October we’re running a Lucene Hackday in London as part of our London Lucene/Solr Usergroup (note that Elasticsearch users are also very welcome to this and the other events mentioned). Bloomberg are kindly providing a venue and we’ll have Lucene committers on hand to guide us.

The next week is the largest Lucene event of the year, Lucene Revolution – but we’ll be in Boston a couple of days early on Tuesday 10th October to run a Boston Lucene Hackday. BA Insight are our hosts this time and we’re hoping some of those coming to Revolution later in the week will be able to participate.

So all we need is you – bring a laptop and your ideas for new things to add to or do with Lucene. We’ll even provide cake for Flax’s birthday at the latter two events! Feel free to suggest what we should hack on in the comments below.
