How to remove a stored field in Lucene
Flax, The Open Source Search Specialists
Fri, 24 Jun 2011
http://www.flax.co.uk/blog/2011/06/24/how-to-remove-a-stored-field-in-lucene/

The post How to remove a stored field in Lucene appeared first on Flax.

While working on a customer project recently we found a very large field that was stored unnecessarily in the Lucene index, taking up a lot of space. As it would have taken a very long time to re-index (there are tens of millions of complex documents in this case) we looked for a way to remove the stored field in-place.

There’s an interesting set of slides from last year’s Apache Lucene Eurocon that discusses this kind of Lucene index post-processing, but we didn’t find any tools for this particular task (which doesn’t mean they don’t exist; Luke, for example, may be helpful). So we wrote our own, based on some examples in the ‘contrib’ directory of Solr 4. We override the document() methods of FilterIndexReader to remove the unwanted field from each returned Document’s field list. Terms aren’t touched, so the effect is the same as changing the field from stored to not stored; it is still indexed.

The code is available here. It’s written against Lucene 2.9.3 (which is contained in Solr 1.4.1).
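
For readers who just want the shape of the approach, here is a minimal, self-contained sketch of the filtering idea. It is not the code linked above and it does not use Lucene: `DocReader`, `InMemoryReader`, `FieldStrippingReader` and `StripStoredFieldDemo` are hypothetical stand-ins invented for illustration, with a plain map playing the role of a Document's stored fields. In Lucene 2.9.x the wrapper would instead extend FilterIndexReader and strip the field from the Document that document() returns, for example with Document.removeFields(String).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index reader: only stored-field retrieval is modelled.
interface DocReader {
    int maxDoc();
    Map<String, String> document(int n);
}

// Hypothetical in-memory "index" holding the stored fields of each document.
final class InMemoryReader implements DocReader {
    private final List<Map<String, String>> docs;

    InMemoryReader(List<Map<String, String>> docs) {
        this.docs = docs;
    }

    public int maxDoc() {
        return docs.size();
    }

    public Map<String, String> document(int n) {
        // Return a copy so callers cannot modify the underlying "index".
        return new LinkedHashMap<>(docs.get(n));
    }
}

// Plays the role of the FilterIndexReader subclass: it delegates to the
// wrapped reader and drops one stored field from every returned document.
final class FieldStrippingReader implements DocReader {
    private final DocReader in;
    private final String fieldToRemove;

    FieldStrippingReader(DocReader in, String fieldToRemove) {
        this.in = in;
        this.fieldToRemove = fieldToRemove;
    }

    public int maxDoc() {
        return in.maxDoc();
    }

    public Map<String, String> document(int n) {
        Map<String, String> doc = new LinkedHashMap<>(in.document(n));
        doc.remove(fieldToRemove); // only the stored value goes; terms are not modelled here
        return doc;
    }
}

final class StripStoredFieldDemo {
    // Stands in for whatever step copies documents from a reader into a new index.
    static List<Map<String, String>> rewrite(DocReader reader) {
        List<Map<String, String>> out = new ArrayList<>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            out.add(reader.document(i));
        }
        return out;
    }

    // Two sample documents with a small "id" field and a large "body" field.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        for (int i = 1; i <= 2; i++) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("id", String.valueOf(i));
            doc.put("body", "very large stored text for document " + i);
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        DocReader source = new InMemoryReader(sampleDocs());
        DocReader filtered = new FieldStrippingReader(source, "body");
        System.out.println(rewrite(filtered));
    }
}
```

The point of the wrapper is that the original index is never modified: anything that reads documents through the filtering reader simply never sees the stored value, so copying through it yields an index without that stored field.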
