Meetups – Flax
http://www.flax.co.uk
The Open Source Search Specialists
Thu, 10 Oct 2019 09:03:26 +0000

Flax joins OpenSource Connections
http://www.flax.co.uk/blog/2018/12/21/flax-joins-opensource-connections/
Fri, 21 Dec 2018 12:09:24 +0000

We have some news!

From February 1st 2019 Flax’s Managing Director Charlie Hull will be joining OpenSource Connections (OSC), Flax’s long-standing US partner, as a senior Managing Consultant. Charlie will manage a new UK division of OSC, which will also acquire some of Flax’s assets and brands. OSC are a highly regarded organisation in the world of search and relevance: they wrote the seminal book Relevant Search and run the popular Haystack relevance conference. Their clients include the US Patent Office, the Wikimedia Foundation and Under Armour, and their services include comprehensive training, Discovery engagements, Trusted Advisor consulting and expert implementation.

Lemur Consulting Ltd., which as most of you will know trades as Flax, will continue to operate and to complete current projects but will not be taking on any new business after January 2019. We will be forwarding all future Flax enquiries to OSC, where Charlie will as ever be very happy to discuss requirements and how OSC’s expert team (which may include some familiar faces!) might help.

We are all very excited about this new development as it will create a larger team of independent search & relevance experts with a global reach. We fully expect to build on Flax’s 17-year history of providing high quality search solutions as part of OSC. We intend to continue managing the London Lucene/Solr Meetup and running, attending and speaking at other events on search-related topics.

If you have any questions about the above please do contact us. Merry Christmas and best wishes for the New Year!

The post Flax joins OpenSource Connections appeared first on Flax.

Lucene Hackdays in London & Montreal
http://www.flax.co.uk/blog/2018/10/23/lucene-hackdays-in-london-montreal/
Tue, 23 Oct 2018 09:35:13 +0000

We ran a couple of Lucene Hackdays over the last couple of weeks: a chance to get together with other people working on open source search, learn from each other and to try and improve both Lucene and associated software.

Our first Hackday was in London, hosted by Mimecast at their offices near Moorgate. Despite a fire alarm practice (during which we ended up under some flats at the Barbican, whose residents may have been a little surprised at quite how many people were milling around under their balconies) we had a busy day. We split into three groups: one to look at tools for inspecting Lucene indexes, one to work on various outstanding bugs and issues with Lucene and Solr, and one to review a well-known issue where different Solr replicas can provide slightly different result ordering. By 5.30 p.m., when we were scheduled to finish, we were still frantically hacking on some last-minute JavaScript to add a feature to our Marple index inspector – luckily, a few minutes later and to a collective sigh of relief, we had it working and we repaired to a local pub for food and drink (kindly sponsored by Elastic).

The next week a number of us were in Montreal for the Activate conference (previously known as Lucene/Solr Revolution but now sprinkled with cutting-edge AI fairy dust!). Our second Hackday was hosted by Netgovern and we worked on various Lucene/Solr issues, made some improvements to our Harahachibu proxy (which attempts to block Solr updates when disk space is low) and discussed in depth how to improve the Solr onboarding experience. Pizza (sponsored by OneMoreCloud) and coffee fuelled the hacking and we also added some new features including a Query Parser for MinHash queries. Many Lucene/Solr committers attended and afterwards we met up for a drink & food nearby (thanks to Searchstax for sponsoring this!) where we were joined by a few others – including Yonik Seeley, creator of Solr.
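
For readers wondering what 'blocking updates when disk space is low' amounts to, here is a minimal sketch of the check involved. It is not Harahachibu's actual code: the function names and the 10% threshold are our own assumptions for illustration.

```python
import shutil


def should_block_update(free_bytes: int, total_bytes: int,
                        min_free_fraction: float = 0.10) -> bool:
    """Return True when the free share of the disk is below the threshold.

    The 10% default is an illustrative assumption, not a Harahachibu setting.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (free_bytes / total_bytes) < min_free_fraction


def disk_is_low(path: str = ".", min_free_fraction: float = 0.10) -> bool:
    """Check the volume holding `path` using the standard library."""
    usage = shutil.disk_usage(path)
    return should_block_update(usage.free, usage.total, min_free_fraction)
```

A proxy built this way would call something like `disk_is_low()` on the index volume before forwarding an update request, and reject the request if it returns True.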

Next it was time for Activate – of which more later! Thanks to everyone who attended – you can see some notes and links about what we worked on here. Work will be continuing on these issues I’m sure.

The post Lucene Hackdays in London & Montreal appeared first on Flax.

London Lucene/Solr Meetup – Introducing Marple & Solr Classification
http://www.flax.co.uk/blog/2017/03/27/london-lucenesolr-meetup-introducing-marple-solr-classification/
Mon, 27 Mar 2017 13:16:36 +0000

A small crowd for this month’s London Lucene/Solr Meetup, kindly hosted by Barclays in their sumptuous Canary Wharf offices. I opened the Meetup and spoke briefly on how Flax is currently looking for team members (want to work on a variety of cutting-edge open source search projects in the UK and abroad? Get in touch!) before handing over to Flax’s Alan Woodward, who introduced our new Lucene index inspection tool, Marple.

Alan told us how Marple was conceived at the Lucene4IR event in Glasgow last year and how coding started at our Lucene Hackday in London. Although the well-known tool Luke allows one to dive deep into Lucene indexes, it hasn’t kept up with recent additions to Lucene index structures, and we also wanted to build a tool with a RESTful API and separate GUI to allow it to be run easily on our clients’ indexes in a read-only mode. Alan demonstrated Marple’s features including how it allows one to see the ‘hidden’ Lucene index fields that Elasticsearch creates. The first release of Marple is out and we’d welcome any feedback and contributions.

Next up was Alessandro Benedetti with an engaging talk about Solr’s built-in document classification features, useful for everything from spam filtering to automatic product categorisation. Unlike many classification methods, this uses the Lucene index itself as the training set – this index must contain some documents with manually assigned classification fields. Either the K-Nearest-Neighbour or the Naive Bayes algorithm can be used to perform the classification via Solr’s UpdateRequestProcessor chain, in Solr versions after 6.1. You can read more detail on Alessandro’s excellent blog.
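
To illustrate the idea of using already-labelled documents in the index as the training set, here is a small K-Nearest-Neighbour sketch. It is not Solr's implementation: the Jaccard word overlap used as the similarity measure, the sample documents and the function names are all assumptions made for the example.

```python
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity between two token sets (an assumed stand-in)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_classify(text: str, labelled_docs: list, k: int = 3) -> str:
    """Label `text` by majority vote among its k most similar labelled docs."""
    tokens = set(text.lower().split())
    ranked = sorted(
        labelled_docs,
        key=lambda doc: jaccard(tokens, set(doc[0].lower().split())),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# A toy 'index' whose documents already carry a manually assigned class.
INDEX = [
    ("cheap pills buy now", "spam"),
    ("win money now click here", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lucene meetup agenda and slides", "ham"),
]
```

With this toy index, `knn_classify("buy cheap pills now", INDEX)` votes among the three closest documents, all of which are labelled as spam.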

We concluded with a brief Q&A session and then popped downstairs to a pub for some snacks and drinks. Thanks to both our speakers, our hosts and all who came – we’ll return in a couple of months with talks that will include René Kriegler on his neat Querqy query processor.

The post London Lucene/Solr Meetup – Introducing Marple & Solr Classification appeared first on Flax.

Meetup at Big Data London – One-click Solr & Factchecking with Solr
http://www.flax.co.uk/blog/2016/11/10/meetup-big-data-london-one-click-solr-factchecking-solr/
Thu, 10 Nov 2016 11:22:26 +0000

Last week I spoke at the Big Data London conference, a very busy event with several thousand people attending. My session was on using open source search to make sense of Big Data – you can get slides here.

In the evening we ran another Lucene/Solr London Usergroup event with speakers Upayavira and Full Fact. After a brief but friendly fight with the Datastax team over pizza we settled down to see Upayavira show us his method for creating a fully functional SolrCloud stack and search application with a single command line using tools such as Docker, Rancher and Exhibitor. Upayavira’s system only needs to be given details of an Amazon Web Services cloud hosting account and it will create host instances, install and start Zookeeper, wait for a quorum to be established, install and start Solr and create a SolrCloud cluster and finally install and start a search application. The whole thing is managed by his own script Uberstack and is undeniably impressive.

Our second talk (and I think my favourite talk from all our Solr Meetups) was from Will Moy and Mevan Babakar of Full Fact, a charity who monitor the news for accuracy (something we increasingly require in these ‘post-truth’ days). Will told us how false and misleading claims can be amplified by the media and may end up directly influencing government policy, even though the underlying facts are wrong. Full Fact are attempting to build open source, freely available systems for automating the factchecking process using Apache Lucene/Solr and our own stored query library Luwak, and Flax have been donating some time to help them with this process. Their Hawk system currently indexes over 70 million sentences. This project is a wonderful example of how free, open source software can be used to create tools that benefit us all and at the end of this inspiring talk many of the audience offered ideas and even direct assistance with the project. I urge you to read Full Fact’s recent report on automated factchecking and get involved if you can. One idea was to run a Hackday for Full Fact – more details when we have them.

Thanks to Big Data London for inviting me to speak and hosting the Meetup and to Elsevier for sponsoring pizza and drinks. We’ll be back with another Meetup soon!

The post Meetup at Big Data London – One-click Solr & Factchecking with Solr appeared first on Flax.

London Lucene Solr Meetup – Enterprising attitudes to open source search & query completion strategies
http://www.flax.co.uk/blog/2016/05/18/london-lucene-solr-meetup-enterprising-attitudes-open-source-search-query-completion-strategies/
Wed, 18 May 2016 15:34:23 +0000

Last night the London Lucene Solr Meetup was hosted by Elsevier in their Finsbury Square offices. Our first speaker was Martin White, expert consultant, author of many books about enterprise search and intranets and visiting professor at the University of Sheffield (oh, and Flax partner). Martin showed us some scary numbers about the terribly low level of satisfaction with enterprise search, drawing on research from AIIM and Findwise (I highly recommend you contribute to their ongoing survey if you can, it’s a great resource). An example is that around 55% of people in enterprises find it ‘very difficult’ to find information which can have a huge effect on productivity. Martin suggested that there is a huge opportunity for open source search in the enterprise market, but that we need a way of communicating the benefits to non-technical staff – as these people are generally the ones in charge of budgets. He ended with a suggestion that a trade association for smaller, independent search companies could be formed, an idea I’m going to further explore.

After a short break we continued with Tomasz Sobczak of Findwise (who had travelled from Poland especially to speak) on query completion strategies – you’ll have seen this feature where a search system suggests endings for the query you’ve begun to type. He described the various applications of this (including completing place names in map searches and available products in e-commerce) and described the many ways it can be implemented in Solr: facet.prefix, facet.contains, using N-grams, Shingles, the Suggester component, queries using synonyms and the Terms component. He noted the various pros and cons of each approach including how they may affect performance and suggested how a separate Solr index might be used purely for query completion. Data for query completion should also be clean and secure (you don’t want to show something the user isn’t allowed to know exists via query completion!). He finished with an example from Findwise’s work for Ericsson.
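
As a concrete example of the first approach on that list, here is a minimal sketch of building a facet.prefix completion request. The Solr parameters (`facet`, `facet.field`, `facet.prefix`, `facet.limit`, `rows`) are standard, but the base URL, the field name and the assumption that the field is lowercased at index time are ours, not from Tomasz's talk.

```python
from urllib.parse import urlencode


def completion_url(base_url: str, field: str, prefix: str, limit: int = 5) -> str:
    """Build a Solr select URL that returns indexed terms starting with `prefix`.

    rows=0 suppresses the documents themselves; only the facet counts for
    `field` are wanted. The prefix is lowercased on the assumption that the
    field's analysis chain lowercases terms at index time.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix.lower()),
        ("facet.limit", str(limit)),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical collection and field names, for illustration only.
EXAMPLE_URL = completion_url("http://localhost:8983/solr/places", "name_lower", "Lon")
```

The facet counts in the response can then be shown as suggestions, most frequent first. As the talk noted, any such list still needs filtering so that it never reveals terms from documents the user is not allowed to see.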

After the talks we had a brief discussion around how some of the less exciting features of Solr might be improved (we’ve blogged about our search for sponsorship for fixing some of these issues) and the suggestion arose that we might run some more Solr hackdays, in London or even the U.S.A. We’ll be looking into this possibility.

Thanks to our hosts, speakers and indeed everyone who came – see you next time!

The post London Lucene Solr Meetup – Enterprising attitudes to open source search & query completion strategies appeared first on Flax.

Apache Kafka London Meetup – Real time search and insights
http://www.flax.co.uk/blog/2016/04/14/apache-kafka-london-meetup-real-time-search-insights/
Thu, 14 Apr 2016 09:50:05 +0000

The rise of Apache Kafka as a streaming data solution is something we’ve been watching for a while – as part of a collection of Big Data tools, it provides a ‘TiVo for data’ feature. We’ve begun to use it in client projects covering both search and log analysis and we’ve recently partnered with Confluent, founded by the creators of Kafka.

Last night we spoke at the Apache Kafka London Meetup – hosted by British Gas Connected Homes, it was well supplied with drinks, pizza and snacks and also very well attended – there was a great buzz of conversation before the talks had even started! Alan Woodward of Flax started with an updated talk about our proof-of-concept integration of Kafka, Apache Samza and our own Luwak streaming search library (slides are available here). This allows full-text search within a Kafka stream, with the search queries supplied as another stream, for a truly real-time solution – as opposed to the more usual (and much higher latency) approach of indexing the endpoint of a stream. Alan has also tried the very new Kafka Streams feature which can be used as an alternative to Apache Samza – there is some very early code available, although note that this still needs some work! (We’ll update this blog when it’s finished).
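
The 'search turned inside out' idea is easy to sketch: the queries are stored up front and each document arriving on the stream is matched against them. The toy below is not Luwak, Samza or Kafka code; it treats a query as a set of required terms, and the function names and query ids are invented for illustration.

```python
def match_stored_queries(document: str, stored_queries: dict) -> list:
    """Return the ids of stored queries whose terms all occur in the document."""
    tokens = set(document.lower().split())
    return sorted(qid for qid, terms in stored_queries.items() if terms <= tokens)


def search_stream(documents, stored_queries: dict):
    """Yield (document, matching query ids) for each document as it arrives."""
    for doc in documents:
        yield doc, match_stored_queries(doc, stored_queries)


# Hypothetical stored queries: each is a set of lowercase required terms.
QUERIES = {
    "q-kafka": {"kafka"},
    "q-both": {"kafka", "samza"},
    "q-solr": {"solr"},
}
```

In the real integration both the documents and the queries arrive as streams, so the query set can change while documents are flowing. Here it is a plain dictionary to keep the sketch short.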

The second talk was by one of our hosts, Josep Casals, on how British Gas have used Kafka, Spark Streaming and Apache Cassandra to build a platform for analyzing data from smart meters, boilers and thermostats. Over 2 million smart meters are installed across the UK and there are also over 300,000 connected thermostats, plus many other data sources, and these devices can report every 30 minutes and 2 minutes respectively, so their system has to cope with around 30,000 messages/second. One interesting feature for me was how machine learning is used to disaggregate power consumption data, so the consumption for, say, a fridge can be split out from the overall figure. Apache Samza is also used in this system to provide estimates of consumption and interpolate between readings, allowing data to be fed back to an app on the customer’s mobile device. Further use cases include spotting outlier events, which might indicate failing heating devices or even unusual patterns in an elderly person’s home to alert relatives or carers.
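
The talk didn’t go into how the interpolation between readings is done, so the following is only a sketch of the simplest possibility, a straight line between two meter readings. The function name and the choice of linear interpolation are our assumptions.

```python
def interpolate_reading(t: float, t0: float, v0: float, t1: float, v1: float) -> float:
    """Estimate the meter value at time t from readings (t0, v0) and (t1, v1).

    Assumes consumption grows linearly between the two readings, which is
    an illustrative simplification rather than British Gas's method.
    """
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    if not t0 <= t <= t1:
        raise ValueError("t must lie between the two readings")
    fraction = (t - t0) / (t1 - t0)
    return v0 + fraction * (v1 - v0)
```

For a meter that reports every 30 minutes, calling this with `t` halfway between two reports gives the midpoint of the two values, which is the kind of estimate an app could show between readings.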

Both talks were live streamed and you can watch them here.

We concluded with some informal discussion and a chance to meet some of Confluent’s UK-based team. Thanks to the organisers and hosts and we look forward to returning! If you have a Kafka project and you’d like any help or advice, do let us know.

The post Apache Kafka London Meetup – Real time search and insights appeared first on Flax.

Elasticsearch London Meetup – Exploring the Graph API & SearchKit UI components
http://www.flax.co.uk/blog/2016/03/24/elasticsearch-london-meetup-exploring-graph-api-searchkit-ui-components/
Thu, 24 Mar 2016 11:14:44 +0000

This month’s Elasticsearch Meetup was hosted by Argos at their Victoria Digital Hub with a relatively small crowd this time – I suspect quite a few who registered didn’t actually turn up or release their tickets, which is a shame as there was a waiting list.

Mark Harwood of Elastic was first with a talk about the new Graph API and visualisation components, which will shortly be available to Elastic subscription customers. Mark’s talks are always fascinating and entertaining and this one was no exception, covering how to derive network graphs from data in Elasticsearch and discover how indexed items are connected. Using publicly available data he showed us how a Swedish metal band had proportionally more listeners in Finland than in Sweden (and how many bands of this genre seem to be named after unpleasant medical conditions), how clickthrough data can reveal who is buying food mixers and who is buying audio mixers and, amusingly, how a mysterious person called ‘Ravi’ has registered for hundreds of different Meetup events without attending a single one (as far as we know). Building on the significant terms aggregation, these graph features are a powerful tool for discovery (especially in a forensics context) of real and unexpected connections within your data.
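
The ‘proportionally more’ part is the heart of significant terms: a term is interesting when it is more common in a subset than in the data as a whole. Here is a minimal sketch of one way to score that, modelled on the JLH heuristic which, as far as we know, is the default for the significant terms aggregation. The function name is ours and any counts fed to it below are invented, not Mark’s data.

```python
def jlh_score(fg_count: int, fg_total: int, bg_count: int, bg_total: int) -> float:
    """Score how over-represented a term is in a foreground subset.

    fg_* describe the subset (e.g. listeners in one country), bg_* the whole
    data set. The score is the absolute change in frequency multiplied by the
    relative change; terms that are not over-represented score zero.
    """
    if fg_total <= 0 or bg_total <= 0:
        raise ValueError("totals must be positive")
    fg = fg_count / fg_total
    bg = bg_count / bg_total
    if bg == 0 or fg <= bg:
        return 0.0
    return (fg - bg) * (fg / bg)
```

For example, a band making up 5% of plays in the subset but only 1% overall scores roughly 0.2, while a band that is equally common in both scores zero.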

Siavash and Joseph from TenEleven then showed us their component library for building Elasticsearch user interfaces, SearchKit. It is based on React and allows one to “rapidly create beautiful search applications using declarative components, and without being an ElasticSearch expert.” They showed us a range of impressive demos with search interfaces created with only a few lines of configuration. SearchKit is open source under the Apache 2 license and they have seen huge interest – as of today the project has attracted over 1500 stars on GitHub! We’ll certainly be considering SearchKit for future Elasticsearch projects and we think the project has a bright future.

The evening ended with a Q&A session – thanks to our hosts Argos and both speakers, see you next time!

The post Elasticsearch London Meetup – Exploring the Graph API & SearchKit UI components appeared first on Flax.

London Lucene/Solr Meetup – Learning to Rank and Hibernate Search
http://www.flax.co.uk/blog/2016/02/24/london-lucenesolr-meetup-learning-rank-hibernate-search/
Wed, 24 Feb 2016 10:49:38 +0000

Back to the very impressive Bloomberg lecture theatre for this month’s Lucene/Solr Meetup, with a good turnout (I’m guessing 60-70 people). Our first talk came from Diego Ceccarelli of Bloomberg on how his team have created a Solr implementation of Learning to Rank, an improved way to rank search results using machine learning. Diego first took us through the basics of Lucene’s ranking methods, based on the venerable TF/IDF algorithm (although note that BM25 will be the default very soon). Bloomberg’s implementation first retrieves 1000 search results using standard TF/IDF (which is fast) and then extracts ‘features’ (a simple example might be ‘does the title match the search query?’) which are then fed to a machine learning model. This model is then used to re-rank the 1000 initial results and the top 10 supplied to the user. Interestingly, they have chosen to implement the features as Lucene queries, allowing for easy re-use. Initial tests have shown some metrics such as ‘clicks on the first result’ up by 10%, which is encouraging. There is now a Solr patch (SOLR-8542) which they hope to commit to Solr soon, and you can find slides and a video of a previous presentation on this topic online. I first heard about Learning to Rank from Microsoft Research some years ago and it’s great to see an open source implementation.
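
The retrieve-then-re-rank flow described above can be sketched in a few lines. This is not Bloomberg’s code or the SOLR-8542 API: the linear model, the two feature names and the weights are assumptions chosen to keep the example small.

```python
def extract_features(query: str, doc: dict) -> dict:
    """Compute illustrative features for one candidate document."""
    query_terms = set(query.lower().split())
    title_terms = set(doc["title"].lower().split())
    return {
        "base_score": doc["score"],  # score from the fast first-pass retrieval
        "title_match": 1.0 if query_terms & title_terms else 0.0,
    }


def rerank(query: str, candidates: list, weights: dict, top_n: int = 10) -> list:
    """Re-order first-pass candidates with a (hypothetical) linear model."""
    def model_score(doc: dict) -> float:
        features = extract_features(query, doc)
        return sum(weights[name] * value for name, value in features.items())

    return sorted(candidates, key=model_score, reverse=True)[:top_n]


# Two made-up first-pass results and made-up model weights.
CANDIDATES = [
    {"id": "a", "title": "Solr basics", "score": 2.0},
    {"id": "b", "title": "Learning to rank", "score": 1.5},
]
WEIGHTS = {"base_score": 1.0, "title_match": 2.0}
```

Here document "b" starts with the lower first-pass score but overtakes "a" once the title-match feature is weighed in. In a real system the weights would come from a model trained on click or judgement data rather than being set by hand.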

Next Sanne Grinovero of RedHat talked about Hibernate Search, an implementation of full-text search for users of this Java ORM. He gave us some great examples of how relational databases can be bad at full text search and thus the need for a full-text engine like Lucene. His implementation hides some of the finer details of Lucene but allows use of advanced Lucene API calls where necessary, and automatically keeps the Lucene index in sync with a relational database. A simple query DSL is available which he demonstrated in use for indexing and querying Twitter data. He then told us about Infinispan, a highly scalable key-value store which can also be used for storing Lucene indexes and mentioned ongoing work to add Elasticsearch and Solr integration.

We finished with a brief informal Q&A session outside; thanks to both presenters and to my co-hosts at Bloomberg for helping to organise the event. We hope to run another Meetup in a couple of months – as ever, offers of talks, a venue and sponsorship of snacks & drinks are very welcome!

The post London Lucene/Solr Meetup – Learning to Rank and Hibernate Search appeared first on Flax.

Enterprise Search Europe 2015 review – day 1
http://www.flax.co.uk/blog/2015/10/28/enterprise-search-europe-2015-review-day-1/
Wed, 28 Oct 2015 15:07:53 +0000

This year’s Enterprise Search Europe started early for me – I had been invited to give the opening keynote, so I arrived early enough to make sure my laptop would play nicely with the projector, always a worry! The keynote was well received and I’m very grateful for the opportunity to talk about Big Data Analytics and streaming search.

Next up were Hans-Josef Jeanrond of Sinequa and Steve Woodward of AstraZeneca, a return visit after an excellent presentation by Steve’s colleague Nick Brown last year. AstraZeneca have a committed approach to search led directly by their CTO office, running hackathons, pilots and larger projects to rapidly deliver a raft of applications built on their core search platform. One key feature was delivering ‘cards’ in some cases – like Google, a calculator ‘card’ when a maths query is entered, or a calendar when someone asks about booking meetings. AstraZeneca are also building mobile apps, including a ‘people search’ that allows one to call or email with a single click. It’s great to see a large company putting significant resources into enterprise search and the benefits this can bring.

Dayle Collins of PwC and Vince McNamara of Dahu were next with a talk about PwC’s Exalead-powered enterprise search across a range of business-critical content. Dayle talked about how analysis and interviews were carried out to identify recurring search patterns in the business and identify a strategic focus and Vince then explained some of the technical features developed, including custom relevance ranking. Interestingly, entity extraction is also used at query time to classify which type of query a user has entered – are they asking about a company, product or employee for example. They mentioned how a ‘gold standard’ for search relevance is being developed – it seems this is being recorded in spreadsheets currently: perhaps they should consider a more interactive tool.

The next talk came from Ian Williams of NHS Wales Informatics Service who are building a large scale patient record service using Apache Solr. Ian explained the pressures facing the NHS (austerity, difficulty with staffing, ageing populations) and how patient records are currently distributed across a number of locations and sometimes still paper-based. This exciting project (which should be an example both to the rest of the NHS and to other healthcare providers) uses Solr to create a single Welsh Clinical Portal, where healthcare providers can find information on 3 million patients in 135 hospitals and 400 GP practices across Wales. We’ve been lucky enough to work with Ian’s team on this project in a small way and it was very exciting to find out more details and hear about their future plans.

After lunch, Lesley Holmes of Nottinghamshire County Council told us about how they have attempted to improve search by focussing on metadata quality – using tools from ConceptSearching to automatically apply tags. Their content is spread across many servers and often duplicated but improving search can have huge value to their users, who often provide services for vulnerable people where accurate and up-to-date information is essential. Cedric Ulmer of FranceLabs was next, describing with Alban Ferignac of the IFCE a project to replace Exalead with Apache Solr. Interestingly this talk contained some concrete numbers – Exalead was costing €75,000 plus a €15,000 support fee for a maximum of 6 million documents (their target is 50m), with updates costing even more, and IFCE were finding it difficult to obtain responsive support. The open source Solr (supported by a worldwide community and under constant development) gave them far more flexibility and no effective limit on the number of documents indexed, and the migration process cost only €15,000 – as clear an indication of the benefits of open source search as I have seen.

Next I ran a roundtable discussion on implementing open source search which was well-attended and interactive – we discussed search engine pipelines for indexing thousands of sources amongst other subjects, and the discussions continued well after we had to vacate the room! I had to rush off soon afterwards to run the evening Meetup at a local pub, where I demonstrated the Quepid search relevance tool we’ve been using for client projects recently.

The post Enterprise Search Europe 2015 review – day 1 appeared first on Flax.

London Lucene/Solr Usergroup – website search and indexing the cloud
http://www.flax.co.uk/blog/2015/09/11/london-lucenesolr-usergroup-website-search-and-indexing-the-cloud/
Fri, 11 Sep 2015 08:58:52 +0000

This week’s London Lucene/Solr Meetup was hosted by asset management company BlackRock who also provided our first speakers. BlackRock manages an astonishing $4.7 trillion in assets (that’s more than the GDP of Germany) and operates 90 different websites with around 250,000 content items, so a good and accurate website search engine is essential. Although BlackRock use HP Autonomy’s content management system and IDOL search engine, the latter is hard to tune (‘not deterministic, and why it ranks the way it does can be mysterious’) and Ife Nkechukwu and Erica Sundberg have been investigating Apache Solr as an alternative: being open source and with powerful debugging features, Solr allows complete understanding of why a particular result is scored and ranked.

Starting with this great video (it’s from Google not BlackRock, but amusing and worth a look), Ife and Erica gave an engaging and clear presentation of their journey with Solr: how they explored the various options for crawling (Nutch and Heritrix were mentioned), how Analyzers are used to condition content for indexing and how Solr scoring and ranking are actually calculated. This was one of the best ‘how to get started with Solr’ presentations I have seen and I was also very pleased to hear Ife say ‘you can’t just build search and forget it – you have to tune search like an instrument’ – entirely consistent with our own experience.

After a quick pizza break, Jim Liddle of Storage Made Easy was next up. Jim’s company provides appliances that connect to a myriad of cloud storage systems and provide a number of services (collaboration, sharing, governance, search) accessible via any computing or mobile device. Jim told us how they’d integrated Solr into their system to provide deep content search and filtering. Interestingly, Storage Made Easy chose Solr over Elasticsearch because they are ‘not quite sure where Elastic will end up in terms of commercials’ – even though Jim worked with Shay Banon (creator of Elasticsearch) at Gigaspaces. You can see Jim’s slides here where he explains how the hardest task was indexing permissions data. I was particularly interested in the ‘visual query builder’ they had developed for clients with very complex search requirements – this chimed with our own experience of working with complex media monitoring queries.

We finished with a Solr Q&A (Upayavira was kind enough to provide many of the answers) – BlackRock had kindly provided a prize for the best question (a mini quadcopter) – our winner was very happy! Thanks again to our hosts and presenters and I look forward to seeing you all again soon.

The post London Lucene/Solr Usergroup – website search and indexing the cloud appeared first on Flax.
