Lucene Index Dumper
LuceneAnalyzer is a quick hack for dumping and inspecting a Lucene index, something for the ‘sort-uniq-cut-awk’ guys out there. :-)
- release 0.0.4 (for Lucene 3.1)
- release 0.0.3 (for Lucene 2.x)
Show global statistics of the index:
shell> ./luceneanalyzer -g /dir_to_some_lucene_index
Global Information:
===================
number of documents: 17
total number of features: 955
total number of tokens: 1442
version: 1328361447856
still current: true
maximal document number: 17
has deletions: false
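Because the statistics are printed as plain `key: value` lines, they are easy to post-process. The following is a minimal sketch, not part of LuceneAnalyzer: `avg_tokens_per_doc` is a hypothetical helper name, and it assumes the keys appear exactly as in the sample output above, without leading whitespace.

```shell
# Hypothetical helper, not part of LuceneAnalyzer.
# Reads the output of `luceneanalyzer -g` on stdin and prints the
# average number of tokens per document with two decimals.
avg_tokens_per_doc() {
  # LC_ALL=C keeps the decimal separator a "." regardless of locale.
  LC_ALL=C awk -F': *' '
    $1 == "number of documents"    { docs = $2 }
    $1 == "total number of tokens" { tokens = $2 }
    END { if (docs > 0) printf "%.2f\n", tokens / docs }
  '
}
```

Usage: `./luceneanalyzer -g /dir_to_some_lucene_index | avg_tokens_per_doc`. Nothing is printed if the document count is missing or zero.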
Show field information:
shell> ./luceneanalyzer -f /dir_to_some_lucene_index
Field Information:
==================
Fields of type 'ALL':
store_0_coordinate
text
...
Fields of type 'INDEXED_WITH_TERMVECTOR':
includes
Fields of type 'TERMVECTOR':
Fields of type 'TERMVECTOR_WITH_OFFSET':
Fields of type 'TERMVECTOR_WITH_POSITION':
Fields of type 'TERMVECTOR_WITH_POSITION_OFFSET':
includes
Fields of type 'UNINDEXED':
store
Show information about terms, statistics and positions:
shell> ./luceneanalyzer -t -vv /dir_to_some_lucene_index
Terms:
======
cat camera 12[0]
cat connector 3[0],4[0]
cat copier 11[0]
cat electronics 1[0],2[0],3[0],4[0],5[0],6[0],7[0],8[0],9[0],10[0],11[0],12[0],15[0],16[0]
...
text using 13[415]
text utf 14[3]
text v 8[2]
text va902b 9[1]
text valueselect 7[1]
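Since the term dump is line-oriented, it can be fed straight into the usual text tools. The following is a minimal sketch, not part of LuceneAnalyzer: `doc_freq` is a hypothetical helper name, and it assumes that each term line has the form `field term postings`, with one `doc[position]` entry per document in the postings column, as the sample output above suggests.

```shell
# Hypothetical helper, not part of LuceneAnalyzer.
# Reads the output of `luceneanalyzer -t -vv` on stdin and prints
# "count<TAB>field<TAB>term", terms found in the most documents first.
doc_freq() {
  awk '
    # Term lines have three columns; headers and "..." lines do not.
    NF == 3 {
      postings = $3
      # Count the "[" characters: one per document entry (assumed format).
      n = gsub(/\[/, "&", postings)
      if (n > 0) print n "\t" $1 "\t" $2
    }
  ' | sort -k1,1nr
}
```

Usage: `./luceneanalyzer -t -vv /dir_to_some_lucene_index | doc_freq | head`. Counting the brackets rather than splitting on commas keeps the count independent of how positions are separated inside the brackets, but it would overcount if a document ever contributed more than one bracket group.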
The Git repository is available at git://git.andreasbaumann.cc/LuceneAnalyzer.git (or browse it at http://git.andreasbaumann.cc/cgit/LuceneAnalyzer/).
In case of questions, contact me via email.