real time – Flax
http://www.flax.co.uk
The Open Source Search Specialists
Thu, 10 Oct 2019 09:03:26 +0000

Enterprise Search Europe 2015 review – day 2
http://www.flax.co.uk/blog/2015/10/29/enterprise-search-europe-2015-review-day-2/
Thu, 29 Oct 2015 15:44:11 +0000

Not such an early start for me for Day 2 (I’d been up pretty late running the Meetup the night before) but I did manage to catch the very end of Findwise’s presentation on their annual Enterprise Search and Findability Survey. This is a unique and valuable benchmark of the state of enterprise search – I urge you to read it, if for no other reason than to be optimistic about the fact that in 2015 nearly 50% of the organisations surveyed have a strategy for search and findability – compared to only 20% in 2012.

Sadly I missed COWI’s talk on migrating from Autonomy to SharePoint 2013 (as you might expect, I would have asked why move from one closed source solution to another when open source options exist). I did however catch Kurt Kragh Sørensen of Intrateam talking about lessons learned from their Enterprise Search Community of Practice in Denmark and Sweden – one particular phrase that stood out for me was “If your colleagues have given up on your search function it will take a long time to re-establish trust in your search function again”. Next was Anita Wilcox of University College Cork with a talk on their implementation of an open source system, reSearcher, which I hadn’t heard of before. She also talked about a federated search built using exploreIT from Deep Web Technologies and added that one should focus on developing a minimum viable product rather than lots of ‘nice to have’ features. Note that UCC’s library is named after George Boole, father of the Boolean logic used in most search engines to construct complex queries!

Next was the presentation of the Tony Kent Strix Award to Professor Peter Ingwersen, which started with an amusing tale of how it may be difficult to take a statue of an owl through airport security. After lunch, I had to step out for a meeting so missed a talk about the Port of Antwerp, but was glad to return to hear from Paul Cleverley of Robert Gordon University who has done some fascinating work on the ‘why’ of enterprise search and how to measure the impact of search. I’ll be using his research to inform my forthcoming presentation at Search Solutions next month.

The day finished with a ‘search clinic’ panel chaired by Valentin Richter of Raytion. I was very glad to hear Steve Woodward of AstraZeneca say that he can see a role for real-time analytics driven by search, which confirms what I had said in my keynote the day before.

This year’s event was in my mind the best in terms of the content of the presentations – including some inspirational case studies from very large companies on how enterprise search can deliver better ways of working. I was also particularly pleased to see so many mentions of open source search software – back in 2011, at the first Enterprise Search Europe conference, this was still a relatively unknown option. Thanks as ever to the conference chair, Martin White, and Information Today for running the event.

The post Enterprise Search Europe 2015 review – day 2 appeared first on Flax.

Innovations in Knowledge Organisation, Singapore: a review
http://www.flax.co.uk/blog/2015/06/12/innovations-in-knowledge-organisation-singapore-a-review/
Fri, 12 Jun 2015 09:48:41 +0000

I’m just back from Singapore: my first visit to this amazing, dynamic and ever-changing city-state, at the kind invitation of Patrick Lambe, to speak at the first Innovations in Knowledge Organisation conference. I think this was probably one of the best organised and most interesting events I’ve attended in the last few years.

The event started with an enthusiastic keynote from Patrick, introducing the topics we’d discuss over the next two days: knowledge management, taxonomies, linked data and search, a wide range of interlinked and interdependent themes. Next was a series of quick-fire PechaKucha sessions – 20 slides, 20 seconds each – a great way to introduce the audience to the topics under discussion, although slightly terrifying to deliver! I spoke on open source search, covering Elasticsearch & Solr and how to start a project using them, and somehow managed to draw breath occasionally. I think my fellow presenters also found it somewhat challenging, although nobody lost the pace completely! Next was a quick, interactive panel discussion (roving mics rather than a row of seats) that set the scene for how the event would work – reactive, informal and exciting, rather than the traditional series of audience-facing PowerPoint presentations which don’t necessarily combine well with jetlag.

After lunch, showcasing Singapore’s multicultural heritage (I don’t think I’ve ever had pasta with Chinese peppered beef before, but I hope to again) we moved on to the first set of case studies. Each presenter had 6 minutes to sell their case study (my own was about how we helped Reed Specialist Recruitment build an open source search platform) and then attendees could choose which tables to join to discuss the cases further, for three 20-minute sessions. I had some great discussions including hearing about how a local government employment agency has used Solr. We then moved on to a ‘knowledge cafe’, with tables again divided up by topics chosen by the audience – so this really was a conference about what attendees wanted to discuss, not just what the presenters thought was important.

I was scheduled to deliver the keynote the next day, having been asked to speak on ‘The Future of Search’ – I chose to introduce some topics around Big Data and Streaming Analytics, and how search software might be used to analyze the huge volumes of data we might expect from the Internet of Things. I had some great feedback from the audience (although I’m pretty sure I inspired and confused them in equal measure) – perhaps Singapore was the right place to deliver this talk, as the government are planning to make it the world’s first ‘smart nation’ – handling data will be absolutely key to making this possible.

More case study pitches followed, and since I wasn’t delivering one myself this time I had a chance to listen to some of the studies. I particularly enjoyed hearing from Kia Siang Hock about the National Library Board Singapore’s OneSearch service, which allowed a federated search across tens of millions of items from many different repositories (e.g. books, newspaper articles, audio transcripts). The technologies used included Veridian, Solr, Vocapia for speech transcription and Mahout for building a recommendation system. In particular, Solr was credited for saving ‘millions of Singapore dollars’ in license fees compared to the previous closed source search system it replaced. Also of interest was Straits Knowledge’s system for capturing the knowledge assets of an organisation with a system built on a graph database, and Haliza Jailani on using named entity recognition and Linked Data (again for the National Library Board Singapore).

We then moved into the final sessions of the day, ‘knowledge clinics’ – like the ‘knowledge cafes’ these were table-based, informal and free-form discussions around topics chosen by attendees. Matt Moore then gave the last session of the day with an amusing take on Building Competencies, dividing KM professionals into individuals, tribes and organisations. Patrick and Maish Nichani then closed the event with a brief summary.

Singapore is a long way to go for an event, but I’m very glad I did. The truly international mix of attendees, the range of subjects and the dynamic and focused way the conference was organised made for a very interesting and engaging two days: I also made some great contacts and had a chance to see some of this beautiful city. Congratulations to Patrick, Maish and Dave Clarke on a very successful inaugural event and I’m looking forward to hearing about the next one! Slides and videos are already appearing on the IKO blog.

Searching for opportunities in Real-Time Analytics
http://www.flax.co.uk/blog/2015/02/02/searching-for-opportunities-in-real-time-analytics/
Mon, 02 Feb 2015 17:18:22 +0000

I spent a day last week at a new event from UNICOM, a conference on Real-Time Analytics. Mike Ferguson chaired the event and was kind enough to spend time with me over lunch exploring how search software might fit into the mix, something that has been on my mind since hearing about the Unified Log concept a few weeks ago.

Real-Time Analytics is a field where sometimes vast amounts of data in motion are gathered, filtered, cleaned and analysed to trigger various actions to benefit a business: building on earlier capabilities in Business Intelligence, the endgame is a business that adapts automatically to changing conditions in real-time – for example, automating the purchasing of extra stock based on changing behaviour of customers. The analysis part of this chain is driven by complex models, often based on sets of training data. Complex Event Processing or CEP is an older term for this kind of process (if you’re already suffering from buzzword overflow, Martin Kleppmann has put some of these terms in context for those more familiar with web paradigms). Tools mentioned included Amazon Kinesis and, from the Apache stable, Cassandra, Hadoop, Kafka, YARN, Storm and Spark. I particularly enjoyed Michael Cutler’s presentation on Tumra’s Spark-based system.

One of the central problems identified was that, due to the rapid growth of data (including from the fabled Internet of Things), it will shortly be impossible to store every data point produced – so we must somehow sort the wheat from the chaff. Options for the analysis part include SQL-like query languages and more complex machine learning algorithms. I found myself wondering if search technology, using a set of stored queries, could be used somehow to reduce the flow of this continuous stream of data, using something like this prototype implementation based on Apache Samza. One could use this approach to transform unstructured data (say, a stream of text-based customer comments) into more structured data for later timeline analysis, to split streams of events into several parts for separate processing, or just to watch for sets of particularly interesting and complex events. Although search platforms such as Elasticsearch are already being integrated into the various Real-Time Analytics frameworks, these seem to be used for offline processing rather than acting directly on the stream itself.

One potential advantage is that it might be a lot easier for analysts to generate a stored search than to learn SQL or the complexities of machine learning – just spend some time with a collection of past events and refine your search terms, facets and filters until your results are useful, and save the query you have generated.
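As a rough illustration of the stored-search idea, here is a minimal Python sketch. It does not use Samza, Elasticsearch or any real search library: `StoredQuery` and `tag_stream` are hypothetical names, and matching is reduced to required and excluded terms over whitespace tokens. It only shows the shape of the approach, where saved queries both thin out a stream and attach structure to what remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    """A saved search: all required terms must appear, no excluded term may."""
    name: str
    required: frozenset
    excluded: frozenset = frozenset()

    def matches(self, text: str) -> bool:
        tokens = set(text.lower().split())
        return self.required <= tokens and not (self.excluded & tokens)


def tag_stream(events, queries):
    """Yield (event, names of matching stored queries) for matching events only."""
    for event in events:
        hits = [q.name for q in queries if q.matches(event)]
        if hits:  # events that match nothing are dropped, reducing the flow
            yield event, hits


# Hypothetical saved searches an analyst might have refined on past data.
queries = [
    StoredQuery("delivery-complaint", frozenset({"delivery", "late"})),
    StoredQuery("refund-request", frozenset({"refund"}), frozenset({"thanks"})),
]
comments = [
    "my delivery was late again",
    "thanks for the quick refund",
    "love the new colours",
]
tagged = list(tag_stream(comments, queries))
print(tagged)
```

Only the first comment survives, now labelled with the name of the query it matched, which is the kind of structured record that could feed later timeline analysis.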

This was a very interesting introduction to a relatively new field and thanks to UNICOM for the invitation. We’re going to continue to explore the possibilities!

Elasticsearch London Meetup: Templates, easy log search & lead generation
http://www.flax.co.uk/blog/2015/01/30/elasticsearch-london-meetup-templates-easy-log-search-lead-generation/
Fri, 30 Jan 2015 14:01:05 +0000

After a long day at a Real Time Analytics event (of which more later) I dropped into the Elasticsearch London User Group, hosted by Red Badger and provided with a ridiculously huge amount of pizza (I have a theory that you’ll be able to spot an Elasticsearch developer in a few years by the size of their pizza-filled belly).

First up was Reuben Sutton of Artirix, describing how his team had moved away from the Elasticsearch Ruby libraries (which can be very slow, mainly due to the time taken to decode/encode data as JSON) towards the relatively new Mustache templating framework. This has allowed them to remove anything complex to do with search from their UI code, although they have had some trouble with Mustache’s support for partial templates. They found documentation was somewhat lacking, but they have contributed some improvements to this.
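To show why templating keeps search logic out of UI code, here is a minimal Python sketch. It is not Elasticsearch’s own template support, which runs on the server: `render` is a hypothetical toy that handles only simple `{{name}}` placeholders, and the `title` field and parameter names are assumptions made for illustration.

```python
import json
import re

# The query shape lives in one place; the UI never builds query JSON itself.
QUERY_TEMPLATE = """
{
  "query": {"match": {"title": "{{text}}"}},
  "size": {{size}}
}
"""


def render(template: str, params: dict) -> dict:
    """Substitute {{name}} placeholders, then parse the result as JSON."""
    def substitute(match):
        value = params[match.group(1)]
        encoded = json.dumps(value)  # escapes quotes and backslashes safely
        # String placeholders already sit inside quotes in the template.
        return encoded[1:-1] if isinstance(value, str) else encoded

    return json.loads(re.sub(r"\{\{(\w+)\}\}", substitute, template))


# The UI supplies parameters only.
query = render(QUERY_TEMPLATE, {"text": 'open "source" search', "size": 10})
print(json.dumps(query, indent=2))
```

Changing how the search works then means editing the template rather than every piece of UI code that issues a query.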

Next was David Laing of CityIndex describing Logsearch, a powerful way to spin up clusters of ELK (Elasticsearch+Logstash+Kibana) servers for log analysis. Based on the BOSH toolchain and open sourced, this allows CityIndex to create clusters in minutes for handling large amounts of data (they are currently processing 50GB of logs every day). David showed how the system is resilient to server failure and will automatically ‘resurrect’ failed nodes, and interestingly how this enables them to use Amazon spot pricing at around a tenth of the cost of the more stable AWS offerings. I asked how this powerful system might be used in the general case of Elasticsearch cluster management. David said it is targeted at log processing, although of course according to some everything will soon be a log anyway!

The last talk was by Alex Mitchell and Francois Bouet of Growth Intelligence who provide lead generation services. They explained how they have used Elasticsearch at several points in their data flow – as a data store for the web pages they crawl (storing these in both raw and processed form using multi-fields), for feature generation using the term vector API and to encode simple business rules for particular clients – as well as to power the search features of their website, of course.

A short Q&A with some of the Elasticsearch team followed: we heard that the new Shield security plugin has had some third-party testing (I suggested that the details be published if possible) and a preview of what might appear in the 2.0 release – further improvements to the aggregations features including derivatives and anomaly detection sound very useful. A swift drink and natter about the world of search with Mark Harwood and it was time to get the train home. Thanks to all the speakers and of course Yann for organising as ever – see you next time!

Out and about in January and February
http://www.flax.co.uk/blog/2015/01/27/out-and-about-in-january-and-february/
Tue, 27 Jan 2015 11:08:39 +0000

We’re speaking at a couple of events soon: if you’re in London and interested in Apache Lucene/Solr we’re also planning another London User Group Meetup soon.

Firstly my colleague Alan Woodward is speaking with Martin Kleppmann at FOSDEM in Brussels (31st January-1st February) on Searching over streams with Luwak and Apache Samza – about some fascinating work they’ve been doing to combine the powerful ‘reverse search’ facilities of our Luwak library with Apache Samza’s distributed, stream-based processing. We’re hoping this means we can scale Luwak beyond its current limits (although those limits are pretty accommodating, as we know of systems where a million or so stored searches are applied to a million incoming messages every day). If you’re interested in open source search the Devroom they’re speaking in has lots of other great talks planned.

Next I’m talking about the wider applications of this kind of reverse search in the area of media monitoring, and how open source software in general can help you turn your organisation’s infrastructure upside down, at the Intrateam conference event in Copenhagen from February 24th-26th. Scroll down to find my talk at 11.35 am on Thursday 26th.

If you’d like to meet us at either of these events do get in touch.

Searching & monitoring the Unified Log
http://www.flax.co.uk/blog/2014/12/05/searching-monitoring-the-unified-log/
Fri, 05 Dec 2014 11:10:25 +0000

This week I dropped into the Unified Log Meetup held at the rather hard to find offices of Just Eat (luckily there was some pizza left). The Unified Log movement is interesting and there’s a forthcoming book on the subject from Snowplow’s Alex Dean – the short version is this is all about massive scale logging of everything a business does in a resilient fashion and the eventual insights one might gain from this data. We’re considering streams of data here rather than the silos or repositories we usually index, and I was interested to see how search technology might fit into the mix.

The first talk by Ian Meyers from AWS was about Amazon Kinesis, a hosted platform for durable storage of stream data. Kinesis focuses on durability and massive volume – 1 MB/sec was mentioned as a common input rate, and data is stored across multiple availability zones. The price of this durability is latency (the delay from an HTTP PUT to the associated GET might be as much as three seconds) but you can be pretty sure that your data isn’t going anywhere unexpectedly. Kinesis also allows processing on the data stream and output to more permanent storage such as Amazon S3, or Elasticsearch for indexing. The analytics options allow for counting, bucketing and some filtering using regular expressions, for real-time stream analysis and dashboarding, but nothing particularly advanced from a search point of view.
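For a sense of what counting, bucketing and regular expression filtering amount to, here is a minimal Python sketch over an in-memory list. It does not use the Kinesis API: `bucket_counts`, the event format and the sample log lines are all assumptions made for illustration.

```python
import re
from collections import Counter


def bucket_counts(events, pattern, bucket_seconds=60):
    """Count events whose message matches `pattern`, grouped by time bucket.

    `events` is an iterable of (timestamp_in_seconds, message) pairs; each
    bucket is keyed by the timestamp at which it starts.
    """
    regex = re.compile(pattern)
    counts = Counter()
    for timestamp, message in events:
        if regex.search(message):
            counts[timestamp - timestamp % bucket_seconds] += 1
    return dict(counts)


# Hypothetical web server log lines ending in an HTTP status code.
events = [
    (0, "GET /checkout 200"),
    (15, "GET /checkout 500"),
    (59, "GET /home 500"),
    (61, "POST /checkout 500"),
    (130, "GET /checkout 200"),
]
errors_per_minute = bucket_counts(events, r" 5\d\d$")
print(errors_per_minute)
```

This is enough to drive a dashboard of server errors per minute, but a regular expression cannot express the phrase, proximity or relevance-style matching a search engine offers.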

Next up was Martin Kleppmann (taking a sabbatical from LinkedIn and also writing a book) to talk about some open source options for stream handling and processing, Apache Kafka and Apache Samza. Martin’s slides described how LinkedIn handles 7-8 million messages a second using Kafka, which can be thought of as an append-only file – to get data out again, you simply start reading from a particular place in the file, with all the reliable storage done for you under the hood. It’s a much simpler system than RabbitMQ which we’ve used on client projects at Flax in the past.
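The append-only file model can be sketched in a few lines of Python. This is a toy in-memory stand-in and not the Kafka client API: `AppendOnlyLog`, the consumer names and the messages are hypothetical.

```python
class AppendOnlyLog:
    """Toy append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Store a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> list:
        """Return every record at or after `offset`."""
        return self._records[offset:]


log = AppendOnlyLog()
for message in ["page_view", "click", "purchase"]:
    log.append(message)

# Each consumer only has to remember its own position in the log.
offsets = {"analytics": 0, "billing": 2}
analytics_batch = log.read_from(offsets["analytics"])
billing_batch = log.read_from(offsets["billing"])
print(analytics_batch, billing_batch)
```

Because reading never removes anything, a new consumer can start from offset 0 and replay the full history, while others carry on from wherever they left off.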

Martin explored how Samza can be used as a stream processing layer on top of Kafka, and even how oft-used databases can be moved into local storage within a Samza process. Interestingly, he described how a database can be expressed simply as a change log, with Kafka’s clever log compaction algorithms making this an efficient way to represent it. He then moved on to describe a prototype integration with our Luwak stored query library, allowing for full-text search within a stream, with the stored queries and matches themselves being of course just more Kafka streams.
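The idea of a database as a change log, and of compaction keeping it small, can be illustrated with a minimal Python sketch. It is not Kafka’s actual algorithm (which, among other things, retains delete markers for a while): `compact`, `replay` and the sample keys are hypothetical.

```python
def replay(changelog):
    """Rebuild the table by applying every change in order."""
    table = {}
    for key, value in changelog:
        if value is None:          # None marks a deletion
            table.pop(key, None)
        else:
            table[key] = value
    return table


def compact(changelog):
    """Keep only the most recent change for each key."""
    latest = {}
    for key, value in changelog:
        latest.pop(key, None)      # re-insert so order follows the last write
        if value is not None:
            latest[key] = value
    return list(latest.items())


changelog = [
    ("alice", "London"),
    ("bob", "Paris"),
    ("alice", "Cambridge"),  # overwrites the earlier value for alice
    ("bob", None),           # deletes bob
    ("carol", "Oslo"),
]
compacted = compact(changelog)
print(compacted)
```

Replaying the two-entry compacted log produces exactly the same table as replaying all five original changes, which is why a compacted log is an efficient way to represent the database.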

It’s going to be interesting to see how this concept develops: the Unified Log movement and stream processing world in general seem to lack this kind of advanced text matching capability, and we’ve already developed Luwak as a highly scalable solution for some of our clients who may need to apply a million stored queries to a million new stories a day. The volumes discussed at the Meetup are an order of magnitude beyond that of course but we’re pretty confident Luwak and Samza can scale. Watch this space!
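A quick back-of-envelope calculation shows the scale implied by a million stored queries and a million stories a day. The figures come from the paragraph above; the assumption that every query is tested against every story is mine, and is the naive worst case rather than a description of how Luwak works.

```python
SECONDS_PER_DAY = 24 * 60 * 60

stored_queries = 1_000_000
stories_per_day = 1_000_000

# Naive worst case: every stored query is evaluated against every story.
pairs_per_day = stored_queries * stories_per_day
pairs_per_second = pairs_per_day / SECONDS_PER_DAY

print(f"{pairs_per_day:.0e} query/story pairs per day")
print(f"about {pairs_per_second / 1e6:.1f} million pairs per second")
```

At roughly 11.6 million evaluations a second in the naive case, anything that avoids testing every query against every story makes a large difference.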

ElasticSearch London Meetup – a busy and interesting evening!
http://www.flax.co.uk/blog/2014/02/26/elasticsearch-london-meetup-a-busy-and-interesting-evening/
Wed, 26 Feb 2014 13:44:43 +0000

I was lucky enough to attend the London ElasticSearch User Group’s Meetup last night – around 130 people came to the Goldman Sachs offices in Fleet Street with many more on the waiting list. It signifies quite how much interest there is in ElasticSearch these days and the event didn’t disappoint, with some fascinating talks.

Hugo Pickford-Wardle from Rely Consultancy kicked off with a discussion about how ElasticSearch allows for rapid ‘hard prototyping’ – a way to very quickly test the feasibility of a business idea, and/or to demonstrate previously impossible functionality using open source software. His talk focussed on how a search engine can help to surface content from previously unconnected and inaccessible ‘data islands’ and can help promote re-use and repurposing of the data, and can lead clients to understand the value of committing to funding further development. Examples included a new search over planning applications for Westminster City Council. Interestingly, Hugo mentioned that during one project ElasticSearch was found to be 10 times faster than the closed source (and very expensive) Autonomy IDOL search engine.

Next was Indy Tharmakumar from our hosts Goldman Sachs, showing how his team have built powerful support systems using ElasticSearch to index log data. Using 32 single-core instances, the system they have built can store 1.2 billion log lines with a throughput up to 40,000 messages a second (the systems monitored produce 5TB of log data every day). Log data is queued up in Redis, distributed to many Logstash processes and indexed by Elasticsearch, with a Kibana front end. They learned that Logstash can be particularly CPU intensive but Elasticsearch itself scales extremely well. Future plans include considering Apache Kafka as a data backbone.

The third presentation was by Clinton Gormley of ElasticSearch, talking about the new cross field matching features that allow term frequencies to be summed across several fields, preventing certain cases where traditional matching techniques based on Lucene’s TF/IDF ranking model can produce some unexpected behaviour. Most interesting for me was seeing Marvel, a new product from ElasticSearch (the company), containing the Sense developer console allowing for on-the-fly experimentation. I believe this started as a Chrome plugin.
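Here is a sketch of what a cross field query body might look like, built as a plain Python dictionary. It assumes the feature described is the `cross_fields` type of Elasticsearch’s `multi_match` query; the talk summary does not name it, and the field names and search text are hypothetical. Nothing is sent to a server.

```python
import json


def cross_fields_query(text, fields):
    """Build a request body that treats several fields as one combined field."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "fields": list(fields),
                # Every term must appear, but in any of the listed fields.
                "operator": "and",
            }
        }
    }


body = cross_fields_query("peter smith", ["first_name", "last_name"])
print(json.dumps(body, indent=2))
```

The appeal for a name search like this is that ‘peter’ and ‘smith’ can each match in a different field without the per-field statistics skewing the ranking.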

The last talk, by Mark Harwood, again from ElasticSearch, was the most interesting for me. Mark demonstrated how to use a new feature (planned for the 1.1 release, or possibly later), an Aggregator for significant terms. This allows one to spot anomalies in a data set – ‘uncommon common’ occurrences as Mark described it. His prototype showed a way to visualise UK crime data using Google Earth, identifying areas of the country where certain crimes are most reported – examples including bike theft here in Cambridge (which we’re sadly aware of!). Mark’s Twitter account has some further information and pictures. This kind of technique allows for very powerful analytics capabilities to be built using Elasticsearch to spot anomalies such as compromised credit cards and to use visualisation to further identify the guilty party, for example a hacked online merchant. As Mark said, it’s important to remember that the underlying Lucene search library counts everything – and we can use those counts in some very interesting ways.
UPDATE Mark has posted some code from his demo here.
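The ‘uncommon common’ idea can be sketched in plain Python by comparing how often a term occurs in a subset with how often it occurs overall. This is not the scoring Elasticsearch uses and not Mark’s demo code: `significant_terms` is a hypothetical function and the crime counts are invented for illustration.

```python
from collections import Counter


def significant_terms(foreground, background, min_count=2):
    """Rank terms by how over-represented they are in the foreground set.

    The score is the term's share of the foreground divided by its share of
    the background, so 1.0 means 'as common here as everywhere else'.
    """
    fg, bg = Counter(foreground), Counter(background)
    scores = {}
    for term, count in fg.items():
        if count < min_count or bg[term] == 0:
            continue  # too rare to judge, or missing from the background
        scores[term] = (count * len(background)) / (len(foreground) * bg[term])
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented figures: reported crimes nationally and in one area.
national = (["burglary"] * 40 + ["vehicle crime"] * 30
            + ["shoplifting"] * 20 + ["bike theft"] * 10)
one_area = ["bike theft"] * 6 + ["burglary"] * 3 + ["shoplifting"] * 1

ranked = significant_terms(one_area, national)
print(ranked)
```

Burglary is the most common crime nationally, yet bike theft ranks first here because it is six times more frequent in the area than in the background data, which is the kind of anomaly the talk described.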

The evening closed with networking, pizza and beer with a great view over the City – thanks to Yann Cluchey for organising the event. We have our own Cambridge Search Meetup next week and we’re also featuring ElasticSearch, as does the London Search Meetup a few weeks later – hope to see you there!

Search backwards – media monitoring with open source search
http://www.flax.co.uk/blog/2012/03/08/search-backwards-media-monitoring-with-open-source-search/
Thu, 08 Mar 2012 12:00:45 +0000

We’re working with a number of clients on media monitoring solutions, which are a special case of search application (we’ve worked on this previously for Durrants). In standard search, you apply a single query to a large number of documents, expecting to get a ranked list of documents that match your query as a result. However in media monitoring you need to search each incoming document (for example, a news article or blog post) with many queries representing what the end user wants to monitor – and you need to do this quickly as you may have tens or hundreds of thousands of articles to monitor in close to real time (Durrants have over 60,000 client queries to apply to half a million articles a day). This ‘backwards’ search isn’t really what search engines were designed to do, so performance could potentially be very poor.

There are several ways around this problem: for example in most cases you don’t need to monitor every article for every client, as they will have told you they’re only interested in certain sources (for example, a car manufacturer might want to keep an eye on car magazines and the reviews in the back page of the Guardian Saturday magazine, but doesn’t care about the rest of the paper or fashion magazines). However, pre-filtering queries in this way can be complex especially when there are so many potential sources of data.

We’ve recently managed to develop a method for searching incoming articles using a brute-force approach based on Apache Lucene which in early tests is performing very well – around 70,000 queries applied to a single article in around a second on a standard MacBook. On suitable server hardware this would be even faster – and of course you have all the other features of Lucene potentially available, such as phrase queries, wildcards and highlighting. We’re looking forward to being able to develop some powerful – and economically scalable – media monitoring solutions based on this core.
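To make the ‘backwards’ search concrete, here is a minimal Python sketch that applies every client query to one incoming article, with the source pre-filtering mentioned earlier. It is not the Lucene-based method described in this post: `ClientQuery`, `monitor` and the sample clients are hypothetical, and matching is reduced to exact phrases over whitespace tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientQuery:
    client: str
    phrase: str          # matched as consecutive words, case-insensitively
    sources: frozenset   # the sources this client has asked to monitor


def tokenize(text):
    return text.lower().split()


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens appear consecutively somewhere in tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def monitor(article_text, source, queries):
    """Return the clients whose query matches this single article."""
    tokens = tokenize(article_text)  # tokenise once, test every query
    return [q.client for q in queries
            if source in q.sources                      # cheap pre-filter
            and contains_phrase(tokens, tokenize(q.phrase))]


queries = [
    ClientQuery("acme-motors", "electric hatchback", frozenset({"car-magazine"})),
    ClientQuery("style-house", "electric hatchback", frozenset({"fashion-weekly"})),
    ClientQuery("volt-cars", "hatchback electric", frozenset({"car-magazine"})),
]
matched = monitor("The new electric hatchback impressed our reviewers",
                  "car-magazine", queries)
print(matched)
```

Only the first client is notified: the second is not watching this source, and the third has the words in the wrong order for a phrase match.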

Enterprise Search Meetup: exploratory search, TravelMatch and Stephen Arnold
http://www.flax.co.uk/blog/2010/12/02/enterprise-search-meetup-exploratory-search-travelmatch-and-stephen-arnold/
Thu, 02 Dec 2010 11:34:07 +0000

Last night I went to another excellent Enterprise Search London Meetup, at Skinkers near London Bridge. I’d been at the Online show all day, which was rather tiring, so it was great to sit down with beer and nibbles and hear some excellent speakers.

Max Wilson kicked off with a talk on exploratory search and ‘searching for leisure’. His Search Interface Inspector looks like a fascinating resource, and we heard about how he and his team have been constructing a taxonomy for the different kinds of search people do, using Twitter as a data source.

Martina Schell was next with details of Travel Match, a holiday search engine that’s trying to do for holidays what our customer Mydeco is doing for interior design: scrape/feed/gather as much holiday data as you can, put it all into a powerful search engine and build innovative interfaces on top. They’ve tried various interfaces including a ‘visual search’, but after much user testing have reined back their ambitions somewhat – however they’re still unique in allowing some very complex queries of their data. Interestingly, one challenge they identified is how to inform users that one choice (say, airport to fly from) may affect the available range of other choices (say, destinations) – apparently users often click repeatedly on ‘greyed-out’ options, unsure as to why they’re not working…
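The dependent-choice problem can be illustrated with a short Python sketch that works out which destinations should be greyed out once an airport is chosen. It is not based on Travel Match’s system: `destination_options`, the airports and the routes are invented for illustration.

```python
def destination_options(holidays, airport):
    """Map each destination to True (selectable) or False (greyed out)."""
    all_destinations = {destination for _, destination in holidays}
    reachable = {destination for origin, destination in holidays
                 if origin == airport}
    return {destination: destination in reachable
            for destination in sorted(all_destinations)}


# Invented (departure airport, destination) pairs.
holidays = [
    ("Stansted", "Rome"),
    ("Stansted", "Oslo"),
    ("Gatwick", "Rome"),
    ("Gatwick", "Faro"),
]
options = destination_options(holidays, "Stansted")
print(options)
```

Since the interface knows which earlier choice ruled an option out, it could say so (for example ‘no holidays to Faro from Stansted’) rather than leaving users to click on a disabled control.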

The inimitable Stephen Arnold concluded the evening with a realistic treatment of the current fashion for ‘real-time’ search. His point was that unless you’re Google, with their fibre-connected, hardware-accelerated gigascale architecture, you’re not going to be able to do real-time web search or anything close to it; on a smaller scale, for financial trading, military and other serious applications you again need to rely on the hardware – so for proper real-time (that means very close to zero latency), your engineering capability, not your software capability, is what counts. I’m inclined to agree – I trained as an electronic engineer and worked on digital audio, back when this was also only possible with clever hardware design. Of course, eventually the commodity hardware gets fast enough to move away from specialised devices, and at this point even the laziest coder can create responsive systems, but we’re far away from that point. Perhaps the marketing departments of some search companies should take note – if you say you can do real-time indexing, we’re not going to believe you.

Thanks again to Tyler Tate and all at TwigKit for continuing to organise and support this excellent event.

Predictions
http://www.flax.co.uk/blog/2010/01/20/predictions/
Wed, 20 Jan 2010 11:17:23 +0000

A new year, and a chance to think about what might happen in the world of enterprise search over the next twelve months. I’ll make a stab at some predictions:

  1. Price cuts – possibly driven by even harsher competition between Google and Microsoft FAST, I can see prices coming down for packaged enterprise search. Autonomy will probably raise theirs 🙂
  2. Real time search matures – not just Twitter or Facebook, but real time data from many sources being part of enterprise search results
  3. More geolocation-aware search – in the U.K. at least, we’re seeing signs that the source data is finally being freed up, which should make it a lot simpler and cheaper to build location-aware solutions
  4. Fewer second-tier players in the market – it’s still difficult out there, and I’m afraid not every company will survive the next year.

You’re welcome to take any of these with a generous pinch of salt!
