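Preface: While researching Elasticsearch at work today, I came across a blog post saying that Elasticsearch can repair indexes. Curious, I read it and found that the feature actually ships with Lucene Core. When I was studying the Lucene file formats I had wanted to build a tool that parses and checks index files, and had even written part of one, so finding an existing tool is a good chance to compare and learn.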

Index repair is mainly done by the CheckIndex class (CheckIndex.java); reading its main function is a good place to start.

1. Using CheckIndex

First, run the following command to see how the tool in the lucene-core jar is used:

192:lib rcf$ java -cp lucene-core-4.8-SNAPSHOT.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex 

ERROR: index path not specified

Usage: java org.apache.lucene.index.CheckIndex pathToIndex [-fix] [-crossCheckTermVectors] [-segment X] [-segment Y] [-dir-impl X]

  -fix: actually write a new segments_N file, removing any problematic segments
  -crossCheckTermVectors: verifies that term vectors match postings; THIS IS VERY SLOW!
  -codec X: when fixing, codec to write the new segments_N file with
  -verbose: print additional details
  -segment X: only check the specified segments.  This can be specified multiple
              times, to check more than one segment, eg '-segment _2 -segment _a'.
              You can't use this with the -fix option
  -dir-impl X: use a specific FSDirectory implementation. If no package is specified the org.apache.lucene.store package will be used.

**WARNING**: -fix should only be used on an emergency basis as it will cause
documents (perhaps many) to be permanently removed from the index.  Always make
a backup copy of your index before running this!  Do not run this tool on an index
that is actively being written to.  You have been warned!

Run without -fix, this tool will open the index, report version information
and report any exceptions it hits and what action it would take if -fix were
specified.  With -fix, this tool will remove any segments that have issues and
write a new segments_N file.  This means all documents contained in the affected
segments will be removed.

This tool exits with exit code 1 if the index cannot be opened or has any
corruption, else 0.

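Running java -cp lucene-core-4.8-SNAPSHOT.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex without an index path prints the help text above. Why such an odd-looking command? java -help shows that -cp is the same as -classpath, the search path for classes and jars, and that -ea is the same as -enableassertions; here -ea:org.apache.lucene... enables assertions for the org.apache.lucene package and its subpackages. The command therefore also works in the shorter form java -cp lucene-core-4.8-SNAPSHOT.jar org.apache.lucene.index.CheckIndex, but the checks are less thorough without assertions, and CheckIndex prints a note recommending -ea:org.apache.lucene... when they are off.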
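
To see what -ea changes, here is a minimal sketch of the usual Java idiom for detecting whether assertions are enabled: an assignment placed inside an assert statement only runs when assertions are on. The class name AssertionProbe is hypothetical and the snippet does not use Lucene; it only illustrates the JVM flag.

```java
// Hypothetical helper, not part of Lucene: reports whether this class
// was loaded with assertions enabled (-ea / -enableassertions).
final class AssertionProbe {

    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment is a side effect of the assert statement, so it
        // only executes when assertions are enabled for this class.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Started with plain java AssertionProbe the probe reports false; started with java -ea AssertionProbe it reports true.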

First, check the state of an index. The output is clear and easy to read:

userdeMacBook-Pro:lib rcf$ java -cp lucene-core-4.8-SNAPSHOT.jar -ea:org.apache.lucene.index...  org.apache.lucene.index.CheckIndex ../../../../../../solr/Solr/test/data/index

Opening index @ ../../../../../../solr/Solr/test/data/index

Segments file=segments_r numSegments=7 version=4.8 format= userData={commitTimeMSec=1411221019854}
  1 of 7: name=_k docCount=18001
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.493
    diagnostics = {timestamp=1411221019346, os=Mac OS X, os.version=10.9.4, mergeFactor=10, source=merge, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, mergeMaxNumSegments=-1, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [36091 terms; 54003 terms/docs pairs; 18001 tokens]
    test: stored fields.......OK [54003 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  2 of 7: name=_l docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.028
    diagnostics = {timestamp=1411221019406, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2002 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  3 of 7: name=_m docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.028
    diagnostics = {timestamp=1411221019478, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2002 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  4 of 7: name=_n docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.028
    diagnostics = {timestamp=1411221019552, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2002 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  5 of 7: name=_o docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.028
    diagnostics = {timestamp=1411221019629, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2002 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  6 of 7: name=_p docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.028
    diagnostics = {timestamp=1411221019739, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2002 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  7 of 7: name=_q docCount=1000
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.027
    diagnostics = {timestamp=1411221019863, os=Mac OS X, os.version=10.9.4, source=flush, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, os.arch=x86_64, java.version=1.7.0_60, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: check integrity.....OK
    test: check live docs.....OK
    test: fields..............OK [3 fields]
    test: field norms.........OK [1 fields]
    test: terms, freq, prox...OK [2001 terms; 3000 terms/docs pairs; 1000 tokens]
    test: stored fields.......OK [3000 total field count; avg 3 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

No problems were detected with this index.

Since my index is healthy, here is an example from the web that shows what a corrupted index looks like and what -fix does:

Source (another blogger): http://blog.csdn.net/laigood/article/details/8296678

Segments file=segments_2cg numSegments=26 version=3.6.1 format=FORMAT_3_1 [Lucene 3.1+] userData={translog_id=1347536741715}
  1 of 26: name=_59ct docCount=4711242
    compound=false
    hasProx=true
    numFiles=9
    size (MB)=6,233.694
    diagnostics = {mergeFactor=13, os.version=2.6.32-71.el6.x86_64, os=Linux, lucene.version=3.6.1 1362471 - thetaphi - 2012-07-17 12:40:12, source=merge, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.6.0_24, java.vendor=Sun Microsystems Inc.}
    has deletions [delFileName=_59ct_1b.del]
    test: open reader.........OK [3107 deleted docs]
    test: fields..............OK [25 fields]
    test: field norms.........OK [10 fields]
    test: terms, freq, prox...OK [36504908 terms; 617641081 terms/docs pairs; 742052507 tokens]
    test: stored fields.......ERROR [read past EOF: MMapIndexInput(path="/usr/local/sas/escluster/data/cluster/nodes/0/indices/index/5/index/_59ct.fdt")]
java.io.EOFException: read past EOF: MMapIndexInput(path="/usr/local/sas/escluster/data/cluster/nodes/0/indices/index/5/index/_59ct.fdt")
        at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readBytes(MMapDirectory.java:307)
        at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:400)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:253)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:492)
        at org.apache.lucene.index.IndexReader.document(IndexReader.java:1138)
        at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:852)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:581)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Stored Field test failed
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:593)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)


WARNING: 1 broken segments (containing 4708135 documents) detected
WARNING: 4708135 documents will be lost

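The results show that the index file _59ct.fdt in shard 5 is corrupted. The .fdt file holds the stored field data of a Lucene index, which is why the error shows up in the test: stored fields step.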

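The warnings at the end say that one broken segment was found, containing 4708135 documents (docCount 4711242 minus the 3107 deleted docs), and that these documents would be lost.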
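
When CheckIndex is run from a script, the summary warning can be picked out of the output. The sketch below assumes the message format shown above, which may differ between Lucene versions; the class and method names are hypothetical. According to the usage text, the process exit code (1 on corruption) is the more direct signal, so this is only a complement to it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the numbers from the CheckIndex summary warning.
final class CheckIndexSummary {

    // Pattern copied from the warning line in the sample output above.
    private static final Pattern BROKEN = Pattern.compile(
        "WARNING: (\\d+) broken segments \\(containing (\\d+) documents\\) detected");

    /** Returns {brokenSegments, documents}, or null if the line is not the summary warning. */
    static long[] parseBrokenSummary(String line) {
        Matcher m = BROKEN.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "WARNING: 1 broken segments (containing 4708135 documents) detected";
        long[] r = parseBrokenSummary(line);
        // prints "1 broken segment(s), 4708135 document(s) at risk"
        System.out.println(r[0] + " broken segment(s), " + r[1] + " document(s) at risk");
    }
}
```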

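Adding the -fix option to the original command repairs the index by dropping the broken segment. (Note: back up the index before repairing it, and do not run the repair on an index that is being written to.)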
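
The backup step can be as simple as copying the index directory while nothing is writing to it. The following is a minimal sketch using only the JDK; IndexBackup and its arguments are hypothetical names, and it assumes the index is an ordinary directory on the local file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper: copies an index directory before running CheckIndex -fix.
final class IndexBackup {

    /** Recursively copies indexDir into backupDir, creating backupDir if needed. */
    static void backup(Path indexDir, Path backupDir) throws IOException {
        // Files.walk visits a directory before its contents, so target
        // directories exist by the time their files are copied.
        try (Stream<Path> paths = Files.walk(indexDir)) {
            for (Path src : (Iterable<Path>) paths::iterator) {
                Path dst = backupDir.resolve(indexDir.relativize(src).toString());
                if (Files.isDirectory(src)) {
                    Files.createDirectories(dst);
                } else {
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: index directory, args[1]: backup directory
        backup(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Run it against the index directory first, and only then run CheckIndex with -fix.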

java -cp lucene-core-3.6.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /usr/local/sas/escluster/data/cluster/nodes/0/indices/index/5/index/ -fix
NOTE: will write new segments file in 5 seconds; this will remove 4708135 docs from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
  5...
  4...
  3...
  2...
  1...
Writing...
OK
Wrote new segments file "segments_2ch"

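You can also check only specific segments with -segment. Note that the index above has no segment named _9 (its segments are _k through _q), so in the run below no segment is actually checked and the "No problems" result says nothing about the index: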

userdeMacBook-Pro:lib rcf$ java -cp lucene-core-4.8-SNAPSHOT.jar -ea:org.apache.lucene.index...  org.apache.lucene.index.CheckIndex ../../../../../../solr/Solr/test/data/index -segment _9

Opening index @ ../../../../../../solr/Solr/test/data/index

Segments file=segments_r numSegments=7 version=4.8 format= userData={commitTimeMSec=1411221019854}

Checking only these segments: _9:
No problems were detected with this index.
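
The options from the usage text can also be assembled programmatically, for example to feed a ProcessBuilder. This is a sketch under assumptions: CheckIndexCommand is a hypothetical helper and /path/to/index is a placeholder. It only builds the argument list and enforces the rule from the usage text that -segment cannot be combined with -fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a CheckIndex command line from the documented options.
final class CheckIndexCommand {

    static List<String> build(String luceneJar, String indexPath, boolean fix, List<String> segments) {
        // The usage text states that -segment cannot be used together with -fix.
        if (fix && !segments.isEmpty()) {
            throw new IllegalArgumentException("-segment cannot be combined with -fix");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(luceneJar);
        cmd.add("-ea:org.apache.lucene...");
        cmd.add("org.apache.lucene.index.CheckIndex");
        cmd.add(indexPath);
        if (fix) {
            cmd.add("-fix");
        }
        for (String segment : segments) {
            cmd.add("-segment");
            cmd.add(segment);
        }
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("lucene-core-4.8-SNAPSHOT.jar", "/path/to/index",
                                 false, Arrays.asList("_k", "_q"));
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be passed to new ProcessBuilder(cmd); per the usage text, an exit code of 1 means the index could not be opened or is corrupt.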

The -verbose option prints additional details, which are not covered here.

2. The CheckIndex source code

Next, let's look at how the CheckIndex source implements the features above. The index check is concentrated in the checkIndex() method:

public Status checkIndex(List<String> onlySegments) throws IOException {
..
   final int numSegments = sis.size();                         // number of segments
   final String segmentsFileName = sis.getSegmentsFileName();  // name of the segments_N file
   // note: we only read the format byte (required preamble) here!
   IndexInput input = null;
   try {
     input = dir.openInput(segmentsFileName, IOContext.READONCE); // open the segments_N file
   } catch (Throwable t) {
     msg(infoStream, "ERROR: could not open segments file in directory");
     if (infoStream != null)
       t.printStackTrace(infoStream);
     result.cantOpenSegments = true;
     return result;
   }
   int format = 0;
   try {
     format = input.readInt();  // read the format header of segments_N
   } catch (Throwable t) {
     msg(infoStream, "ERROR: could not read segment file version in directory");
     if (infoStream != null)
       t.printStackTrace(infoStream);
     result.missingSegmentVersion = true;
     return result;
   } finally {
     if (input != null)
       input.close();
   }

   String sFormat = "";
   boolean skip = false;

   result.segmentsFileName = segmentsFileName; // name of the segments_N file
   result.numSegments = numSegments;      // number of segments
   result.userData = sis.getUserData();   // user data, e.g. userData={commitTimeMSec=1411221019854}
   String userDataString;
   if (sis.getUserData().size() > 0) {
     userDataString = " userData=" + sis.getUserData();
   } else {
     userDataString = "";
   }
    // build the version string, e.g. version=4.8
   String versionString = null;
   if (oldSegs != null) {
     if (foundNonNullVersion) {
       versionString = "versions=[" + oldSegs + " .. " + newest + "]";
     } else {
       versionString = "version=" + oldSegs;
     }
   } else {
     versionString = oldest.equals(newest) ? ( "version=" + oldest ) : ("versions=[" + oldest + " .. " + newest + "]");
   }

   msg(infoStream, "Segments file=" + segmentsFileName + " numSegments=" + numSegments
       + " " + versionString + " format=" + sFormat + userDataString);

   if (onlySegments != null) {
     result.partial = true;
     if (infoStream != null) {
       infoStream.print("\nChecking only these segments:");
       for (String s : onlySegments) {
         infoStream.print(" " + s);
       }
     }
     result.segmentsChecked.addAll(onlySegments);
     msg(infoStream, ":");
   }

   if (skip) {
     msg(infoStream, "\nERROR: this index appears to be created by a newer version of Lucene than this tool was compiled on; please re-compile this tool on the matching version of Lucene; exiting");
     result.toolOutOfDate = true;
     return result;
   }


   result.newSegments = sis.clone();
   result.newSegments.clear();
   result.maxSegmentName = -1;
   // iterate over the segments and check each one
   for(int i=0;i<numSegments;i++) {
     final SegmentCommitInfo info = sis.info(i); // commit info of this segment
     int segmentName = Integer.parseInt(info.info.name.substring(1), Character.MAX_RADIX);
     if (segmentName > result.maxSegmentName) {
       result.maxSegmentName = segmentName;
     }
     if (onlySegments != null && !onlySegments.contains(info.info.name)) {
       continue;
     }
     Status.SegmentInfoStatus segInfoStat = new Status.SegmentInfoStatus();
     result.segmentInfos.add(segInfoStat);
     // print the segment's position, name and document count, e.g. 1 of 7: name=_k docCount=18001
     msg(infoStream, "  " + (1+i) + " of " + numSegments + ": name=" + info.info.name + " docCount=" + info.info.getDocCount());
     segInfoStat.name = info.info.name;
     segInfoStat.docCount = info.info.getDocCount();
     
     final String version = info.info.getVersion();
     if (info.info.getDocCount() <= 0 && version != null && versionComparator.compare(version, "4.5") >= 0) {
       throw new RuntimeException("illegal number of documents: maxDoc=" + info.info.getDocCount());
     }

     int toLoseDocCount = info.info.getDocCount();

     AtomicReader reader = null;

     try {
       final Codec codec = info.info.getCodec(); // codec of the segment, e.g. codec=Lucene46
       msg(infoStream, "    codec=" + codec);
       segInfoStat.codec = codec;
       msg(infoStream, "    compound=" + info.info.getUseCompoundFile()); // compound file flag, e.g. compound=false
       segInfoStat.compound = info.info.getUseCompoundFile();
       msg(infoStream, "    numFiles=" + info.files().size());
       segInfoStat.numFiles = info.files().size(); // number of files in the segment, e.g. numFiles=10
       segInfoStat.sizeMB = info.sizeInBytes()/(1024.*1024.); // segment size, e.g. size (MB)=0.493
       if (info.info.getAttribute(Lucene3xSegmentInfoFormat.DS_OFFSET_KEY) == null) {
         // don't print size in bytes if its a 3.0 segment with shared docstores
         msg(infoStream, "    size (MB)=" + nf.format(segInfoStat.sizeMB));
       }
       // diagnostics, e.g. diagnostics = {timestamp=1411221019346, os=Mac OS X, os.version=10.9.4, mergeFactor=10, 
       // source=merge, lucene.version=4.8-SNAPSHOT Unversioned directory - rcf - 2014-09-20 21:11:36, 
       // os.arch=x86_64, mergeMaxNumSegments=-1, java.version=1.7.0_60, java.vendor=Oracle Corporation}
       Map<String,String> diagnostics = info.info.getDiagnostics();
       segInfoStat.diagnostics = diagnostics;
       if (diagnostics.size() > 0) {
         msg(infoStream, "    diagnostics = " + diagnostics);
       }
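       // report whether the segment has deleted documents: prints "no deletions" or, in this version,
       // "has deletions [delGen=...]" (the 3.6.1 output above printed delFileName=_59ct_1b.del instead)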
       if (!info.hasDeletions()) {
         msg(infoStream, "    no deletions");
         segInfoStat.hasDeletions = false;
       }
       else{
         msg(infoStream, "    has deletions [delGen=" + info.getDelGen() + "]");
         segInfoStat.hasDeletions = true;
         segInfoStat.deletionsGen = info.getDelGen();
       }
       
       // open a SegmentReader on the segment; if the constructor throws, the segment cannot be opened.
       // Example output: test: open reader.........OK
       if (infoStream != null)
         infoStream.print("    test: open reader.........");
       reader = new SegmentReader(info, DirectoryReader.DEFAULT_TERMS_INDEX_DIVISOR, IOContext.DEFAULT);
       msg(infoStream, "OK");

       segInfoStat.openReaderPassed = true;
       // checkIntegrity() verifies the integrity of the segment's files, e.g. test: check integrity.....OK
       // it relies on CodecUtil.checksumEntireFile() to validate each file's checksum.
       if (infoStream != null)
         infoStream.print("    test: check integrity.....");
       reader.checkIntegrity();
       msg(infoStream, "OK");
       
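       // Check the document counts. If the segment has deletions, verify that reader.numDocs()
       // == info.info.getDocCount() - info.getDelCount(), then count the set bits in liveDocs and compare the count with reader.numDocs().
       // Note: a segment without deletions normally has liveDocs == null.
       // In both cases reader.maxDoc() == info.info.getDocCount() must hold (checked after the if/else).
       // Example output: test: check live docs.....OK
       // The Solr admin UI shows the live, deleted and total document counts; they can be obtained the same way.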
       if (infoStream != null)
         infoStream.print("    test: check live docs.....");
       final int numDocs = reader.numDocs();
       toLoseDocCount = numDocs;
       if (reader.hasDeletions()) {
         if (reader.numDocs() != info.info.getDocCount() - info.getDelCount()) {
           throw new RuntimeException("delete count mismatch: info=" + (info.info.getDocCount() - info.getDelCount()) + " vs reader=" + reader.numDocs());
         }
         if ((info.info.getDocCount()-reader.numDocs()) > reader.maxDoc()) {
           throw new RuntimeException("too many deleted docs: maxDoc()=" + reader.maxDoc() + " vs del count=" + (info.info.getDocCount()-reader.numDocs()));
         }
         if (info.info.getDocCount() - numDocs != info.getDelCount()) {
           throw new RuntimeException("delete count mismatch: info=" + info.getDelCount() + " vs reader=" + (info.info.getDocCount() - numDocs));
         }
         Bits liveDocs = reader.getLiveDocs();
         if (liveDocs == null) {
           throw new RuntimeException("segment should have deletions, but liveDocs is null");
         } else {
           int numLive = 0;
           for (int j = 0; j < liveDocs.length(); j++) {
             if (liveDocs.get(j)) {
               numLive++;
             }
           }
           if (numLive != numDocs) {
             throw new RuntimeException("liveDocs count mismatch: info=" + numDocs + ", vs bits=" + numLive);
           }
         }
         
         segInfoStat.numDeleted = info.info.getDocCount() - numDocs;
         msg(infoStream, "OK [" + (segInfoStat.numDeleted) + " deleted docs]");
       } else {
         if (info.getDelCount() != 0) {
           throw new RuntimeException("delete count mismatch: info=" + info.getDelCount() + " vs reader=" + (info.info.getDocCount() - numDocs));
         }
         Bits liveDocs = reader.getLiveDocs();
         if (liveDocs != null) {
           // its ok for it to be non-null here, as long as none are set right?
            // Author's note: without deletions liveDocs is normally null; the check nevertheless
            // tolerates a non-null liveDocs as long as every bit is set.
           for (int j = 0; j < liveDocs.length(); j++) {
             if (!liveDocs.get(j)) {
               throw new RuntimeException("liveDocs mismatch: info says no deletions but doc " + j + " is deleted.");
             }
           }
         }
         msg(infoStream, "OK");
       }
       if (reader.maxDoc() != info.info.getDocCount()) {
         throw new RuntimeException("SegmentReader.maxDoc() " + reader.maxDoc() + " != SegmentInfos.docCount " + info.info.getDocCount());
       }

       // Test getFieldInfos()
       // read the field infos and report the number of fields, e.g. test: fields..............OK [3 fields]
       if (infoStream != null) {
         infoStream.print("    test: fields..............");
       }         
       FieldInfos fieldInfos = reader.getFieldInfos();
       msg(infoStream, "OK [" + fieldInfos.size() + " fields]");
       segInfoStat.numFields = fieldInfos.size();
       
       // Test Field Norms
       // check the field norms, e.g. test: field norms.........OK [1 fields]
       segInfoStat.fieldNormStatus = testFieldNorms(reader, infoStream);

       // Test the Term Index
       // check the terms and postings, e.g. test: terms, freq, prox...OK [36091 terms; 54003 terms/docs pairs; 18001 tokens]
       segInfoStat.termIndexStatus = testPostings(reader, infoStream, verbose);

       // Test Stored Fields
       // check the stored fields, e.g. test: stored fields.......OK [54003 total field count; avg 3 fields per doc]
       segInfoStat.storedFieldStatus = testStoredFields(reader, infoStream);

       // Test Term Vectors
       // check the term vectors, e.g. test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
       segInfoStat.termVectorStatus = testTermVectors(reader, infoStream, verbose, crossCheckTermVectors);

       // check the doc values, e.g. test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]
       segInfoStat.docValuesStatus = testDocValues(reader, infoStream);

       // Rethrow the first exception we encountered
       //  This will cause stats for failed segments to be incremented properly
       if (segInfoStat.fieldNormStatus.error != null) {
         throw new RuntimeException("Field Norm test failed");
       } else if (segInfoStat.termIndexStatus.error != null) {
         throw new RuntimeException("Term Index test failed");
       } else if (segInfoStat.storedFieldStatus.error != null) {
         throw new RuntimeException("Stored Field test failed");
       } else if (segInfoStat.termVectorStatus.error != null) {
         throw new RuntimeException("Term Vector test failed");
       }  else if (segInfoStat.docValuesStatus.error != null) {
         throw new RuntimeException("DocValues test failed");
       }

       msg(infoStream, "");

     } catch (Throwable t) {
       msg(infoStream, "FAILED");
       String comment;
       comment = "fixIndex() would remove reference to this segment";
       msg(infoStream, "    WARNING: " + comment + "; full exception:");
       if (infoStream != null)
         t.printStackTrace(infoStream);
       msg(infoStream, "");
       result.totLoseDocCount += toLoseDocCount;
       result.numBadSegments++;
       continue;
     } finally {
       if (reader != null)
         reader.close();
     }

     // Keeper
     result.newSegments.add(info.clone());
   }

   if (0 == result.numBadSegments) {
     result.clean = true;
   } else
     msg(infoStream, "WARNING: " + result.numBadSegments + " broken segments (containing " + result.totLoseDocCount + " documents) detected");

   if ( ! (result.validCounter = (result.maxSegmentName < sis.counter))) {
     result.clean = false;
     result.newSegments.counter = result.maxSegmentName + 1; 
     msg(infoStream, "ERROR: Next segment name counter " + sis.counter + " is not greater than max segment name " + result.maxSegmentName);
   }
   
   if (result.clean) {
     msg(infoStream, "No problems were detected with this index.\n");
   }

   return result;
 }
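
Two details in the loop above are easy to miss: segment names are base-36 numbers (Character.MAX_RADIX is 36), and the validCounter check requires the counter for the next segment name to be greater than every existing segment number. The sketch below reproduces both steps with plain JDK calls. SegmentNames and its methods are hypothetical names, and the counter values in main are made-up inputs for illustration.

```java
// Hypothetical helper mirroring the segment-name handling in checkIndex().
final class SegmentNames {

    /** Parses a segment name such as "_k" into its numeric value (base 36). */
    static int segmentNumber(String segmentName) {
        return Integer.parseInt(segmentName.substring(1), Character.MAX_RADIX);
    }

    /** Same condition as validCounter: max segment number must be below the counter. */
    static boolean counterIsValid(int counter, String... segmentNames) {
        int max = -1;
        for (String name : segmentNames) {
            max = Math.max(max, segmentNumber(name));
        }
        return max < counter;
    }

    public static void main(String[] args) {
        System.out.println(segmentNumber("_k"));     // 20
        System.out.println(segmentNumber("_q"));     // 26
        System.out.println(segmentNumber("_59ct"));  // 245405
        // Made-up counters: 27 is above the largest segment number (26), 26 is not.
        System.out.println(counterIsValid(27, "_k", "_l", "_m", "_n", "_o", "_p", "_q")); // true
        System.out.println(counterIsValid(26, "_k", "_l", "_m", "_n", "_o", "_p", "_q")); // false
    }
}
```

The N in segments_N is a base-36 generation as well, which is why -fix wrote segments_2ch after segments_2cg in the example above.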

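The live-docs check in checkIndex() comes down to simple counting: the number of set bits in liveDocs must equal docCount minus delCount, and a null liveDocs is only acceptable when there are no deletions. The following is a minimal sketch of that logic, with java.util.BitSet standing in for Lucene's Bits interface; the class and method names are hypothetical.

```java
import java.util.BitSet;

// Hypothetical stand-in for the live-docs consistency check in checkIndex().
final class LiveDocsCheck {

    /**
     * @param docCount total documents in the segment, including deleted ones
     * @param delCount number of deletions recorded for the segment
     * @param liveDocs bit i is set if document i is live; null means "no deletions"
     * @return the number of live documents
     */
    static int checkLiveDocs(int docCount, int delCount, BitSet liveDocs) {
        int expectedLive = docCount - delCount;
        if (liveDocs == null) {
            if (delCount != 0) {
                throw new IllegalStateException("segment should have deletions, but liveDocs is null");
            }
            return expectedLive;
        }
        int numLive = 0;
        for (int doc = 0; doc < docCount; doc++) {
            if (liveDocs.get(doc)) {
                numLive++;
            }
        }
        if (numLive != expectedLive) {
            throw new IllegalStateException(
                "liveDocs count mismatch: info=" + expectedLive + ", vs bits=" + numLive);
        }
        return numLive;
    }

    public static void main(String[] args) {
        BitSet liveDocs = new BitSet(5);
        liveDocs.set(0, 5);   // five documents, all live
        liveDocs.clear(2);    // document 2 is deleted
        System.out.println(checkLiveDocs(5, 1, liveDocs)); // 4
    }
}
```
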
I'll continue tomorrow with the source of testFieldNorms and the other test methods.