
Monitoring Indexing

To monitor the indexing progress of a document collection:

  1. Click Indexing from the Statistics page.
  2. Select the document collection for which you want to view indexing statistics.

Statistics are shown for both bulk indexing and incremental updates, whether automatically or manually initiated. To update the statistics while the indexing process is running, click Refresh.

Collection Information

The following information is provided in the page header of the Statistics page:

Collection

The name of the document collection for which you are viewing indexing statistics. To view statistics for another document collection, choose the name from the drop-down list.

Server

The server containing the documents in the document collection.

Database

The database containing the documents in the document collection.

Started Time

The time the last indexing process started. The time is displayed as soon as the indexing process starts.

Queued Time

The time the last indexing process was scheduled. The time is displayed as soon as indexing is scheduled.

Profile Count

The number of document profiles in the document collection.


Note: This count may be less than the count shown in the Total row in the Indexing Results table (and for your DMS saved search) if documents are duplicated in multiple collections and were previously processed by another (primary) document collection. These documents are accounted for in the Duplicate row in the Indexing Results table.



Indexing Results

The Indexing Results table shows the current indexing statistics for the selected document collection. For each indexing component, the number of documents at each status is indicated:

This status:

Shows how many documents in the collection:

Queued

Are currently scheduled and remain to be processed

Delegated

Are currently in process

Duplicate

Have already been processed as part of another (primary) document collection; to monitor any secondary processing on the duplicate document, refer to the indexing statistics for the document's primary document collection

Success

Have processed successfully

Failure

Have not been processed because of errors

Not applicable

Will not be processed because the component doesn’t apply or a component on which it is dependent has failed
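As a rough illustration of how the rows of the Indexing Results table relate to one another, the following Python sketch tallies documents by status for a single indexing component. The IndexingStatus enum, the tally function, and the sample data are hypothetical names and values invented for this example; they do not describe how West km stores or exposes its statistics.

```python
from collections import Counter
from enum import Enum


class IndexingStatus(Enum):
    """Statuses from the Indexing Results table (member names are illustrative)."""
    QUEUED = "Queued"                  # scheduled, remains to be processed
    DELEGATED = "Delegated"            # currently in process
    DUPLICATE = "Duplicate"            # processed by the primary collection
    SUCCESS = "Success"                # processed successfully
    FAILURE = "Failure"                # not processed because of errors
    NOT_APPLICABLE = "Not applicable"  # component does not apply


def tally(statuses):
    """Count documents at each status; statuses with no documents show 0."""
    counts = Counter(statuses)
    return {status.value: counts.get(status, 0) for status in IndexingStatus}


# Hypothetical sample: one status per document for one indexing component.
sample = [
    IndexingStatus.SUCCESS,
    IndexingStatus.SUCCESS,
    IndexingStatus.QUEUED,
    IndexingStatus.DUPLICATE,
    IndexingStatus.FAILURE,
]
results = tally(sample)
for name, count in results.items():
    print(f"{name}: {count}")
```

In this sketch every document has exactly one status per component, so the counts add up to the number of documents in the sample.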



Indexing Success Breakdown

The Indexing Success Breakdown table shows a breakdown of the current indexing statistics for the documents that successfully processed. For each indexing component, the number of documents at each status is indicated:


This status:


Shows how many documents in the collection:

Success

Have processed successfully

Analyzer partial analysis

Have processed successfully, but a failure occurred in the analyzer component while identifying a specific entity (for example, jurisdiction)

Identified as secured

Have not been processed because they are within a secured subfolder within an NTFS share

Invalid document structure

Have not been processed because the document's structure doesn't allow the document analysis component to proceed with the analysis

No cites - deleted

Have been deleted from West km during the synchronization process because they have no citations

No cites - pending deletion

Have been scheduled to be deleted from West km because they have no citations; documents will be deleted during the next synchronization process

RID partial error

Have processed successfully, but with an error in identifying citations with RID; at least one citation in each document was identified

Unsupported file type

Have not been processed by the document analysis component because the document file is an unsupported file type; for example, an unsupported version of an application format whose other versions are supported by other components

Work in progress

Have not been processed because they have been modified or added to the collection within the number of days set in the Work in Progress Time Span setting on the System Options, Indexing Settings page
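The Work in progress check can be pictured as a simple date comparison. The Python sketch below is only an illustration: the function name is_work_in_progress and its parameters are hypothetical, and it assumes that a document whose age equals the time span exactly still counts as work in progress, which the product may handle differently.

```python
from datetime import datetime, timedelta


def is_work_in_progress(last_modified, now, time_span_days):
    """Return True if the document was modified or added within time_span_days.

    Assumption: the boundary is inclusive; the product may treat it differently.
    """
    return now - last_modified <= timedelta(days=time_span_days)


now = datetime(2024, 1, 10, 12, 0)
recent = datetime(2024, 1, 9, 9, 0)  # changed a little over one day ago
older = datetime(2024, 1, 1, 9, 0)   # changed a little over nine days ago

print(is_work_in_progress(recent, now, time_span_days=3))  # True
print(is_work_in_progress(older, now, time_span_days=3))   # False
```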


Note: When a document collection is set to index only documents that contain one or more citations (the Citations Only check box is selected for the collection), documents with no citations are deleted from West km during the synchronization process. A document is identified as having no citations after the document is converted to HTML and processed by cite recognition (RID).


Indexing Failure Breakdown

The Indexing Failure Breakdown table shows a breakdown of the current indexing statistics for the documents that failed processing. For each indexing component, the number of documents at each status is indicated:


This status:

Shows how many documents in the collection:

Analyzer failure

Have not been processed because of a failure for unknown reasons in the document analysis component or the analyzer component.

Automatic reprocessing: Yes

Analyzer initialization failure

Have not been processed because the analyzer component (Litigation) failed to initialize.

Analyzer timeout

Have not been processed because the document analysis component or the analyzer component took longer than the specified timeout value of 60 minutes.

Document size exceeded

Have not been processed because the documents exceed the maximum size of files allowed into West km. (The default threshold is 50 megabytes.)

Full text abstraction failure

Have not been processed because the indexing service encountered a problem while communicating with the full-text searching service and could not copy or retrieve a document from the HTML storage location.

Automatic reprocessing: Yes

General error

Have not been processed because of a general failure.

Automatic reprocessing: Yes

HTML conversion error

Have not been processed because of a failure to convert them to HTML.

Identified as corrupt

Have not been processed because they are corrupt.

Partial analysis

Have been only partially processed by the document analysis component. Either the FFC failed to convert the native document or the HTML document to SGML, or the BKM mapping algorithm failed when the two SGML files were compared.

Note: This error could occur if too many processes are running simultaneously on your server.

Automatic reprocessing: Yes

RID error

Have not been processed because of a failure in RID to identify citations; no citations in the documents were identified.

Automatic reprocessing: Yes

RID timed out

Have not been processed because RID failed to communicate for 120 seconds.

Store is full

Have not been processed because the maximum was reached for your designated HTML document storage.

Automatic reprocessing: Yes

Unable to access file or physical file missing

Have not been processed because the document file could not be found.

Well-formed conversion error

Have not been processed because the HTML conversion of the document does not conform to the syntax rules of XML.


To view a listing of documents for a particular indexing component and status, click a hyperlinked number in the Indexing Success Breakdown or the Indexing Failure Breakdown table.

To manually reprocess documents from the listing, select the check box preceding each document in the list that you want to reprocess, and then click the reprocessing option you want from the Tasks list in the left frame.
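When deciding which failed documents to reprocess by hand, it can help to separate the failure statuses that the Indexing Failure Breakdown table marks for automatic reprocessing from those it does not. The Python sketch below hard-codes the marked statuses from that table; the function name split_by_reprocessing and the sample counts are hypothetical, and treating every unmarked status as a candidate for manual reprocessing is an assumption of this example, not a documented rule.

```python
# Failure statuses marked for automatic reprocessing in the
# Indexing Failure Breakdown table.
AUTOMATICALLY_REPROCESSED = {
    "Analyzer failure",
    "Full text abstraction failure",
    "General error",
    "Partial analysis",
    "RID error",
    "Store is full",
}


def split_by_reprocessing(failure_counts):
    """Split failure counts into marked (automatic) and unmarked statuses."""
    automatic = {}
    manual = {}
    for status, count in failure_counts.items():
        target = automatic if status in AUTOMATICALLY_REPROCESSED else manual
        target[status] = count
    return automatic, manual


# Hypothetical counts, as they might be read off the table for one component.
counts = {"RID error": 4, "RID timed out": 2, "Identified as corrupt": 1}
automatic, manual = split_by_reprocessing(counts)
print("Automatic reprocessing:", automatic)
print("Candidates for manual reprocessing (assumption):", manual)
```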


 
© 2005–2026 Thomson Reuters
West km is a trademark of West Publishing Corporation