Monitoring Indexing Results
You can monitor the indexing progress of a document collection. Indexing results are shown for both bulk indexing and incremental update processes, whether automatically or manually initiated.
To monitor the indexing progress of a document collection,
- Click Setup.
- Click the Indexing tab.
- In the left frame, click Indexing Results.
- From the drop-down list at the top of the page, select the document collection for which you want to view indexing results. To view results for all document collections, select All Collections.
Indexing results for documents in the selected document collection are displayed. To update the results while the indexing process is running, click Refresh. (The page is not refreshed automatically.)
Collection Information
Indexing Results
Success Breakdown
Failure Breakdown
Viewing and Reprocessing Documents
Collection Information
The following information is provided at the top of the Indexing Results page:
Collection | The name of the document collection for which you are viewing indexing results. To view results for another document collection, choose the name from the drop-down list. |
Server | The server containing the documents in the document collection. |
Library | The database or library containing the documents in the document collection. |
Start Time | The time the last indexing process started scheduling. The time is displayed as soon as the indexing process is started. |
Queue Time | The time the last indexing process was scheduled. The time is displayed as soon as indexing is scheduled. |
Profile Count | The number of document profiles in the document collection. |
Indexing Results
The Results table shows the current indexing results for the selected document collection. For each indexing component, the number of documents at each status is indicated:
This status: | Shows how many documents in the collection: |
Queued | Are currently scheduled and remain to be processed |
Delegated | Are currently in process |
Duplicate | Have already been processed as part of another (primary) document collection; to monitor any secondary processing on the duplicate document, refer to the indexing results for the document's primary document collection |
Success | Have processed successfully |
Failure | Have not been processed because of errors |
Not applicable | Will not be processed because the component doesn't apply or a component on which it is dependent has failed |
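The counts in the Results table can be thought of as a tally of documents per indexing component and status. The following is a minimal sketch of that idea, not West km code: the `Status` enum mirrors the statuses listed above, while the `(component, status)` record format and the `tally` and `still_running` helpers are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    """Statuses from the Results table; the enum itself is illustrative."""
    QUEUED = "Queued"
    DELEGATED = "Delegated"
    DUPLICATE = "Duplicate"
    SUCCESS = "Success"
    FAILURE = "Failure"
    NOT_APPLICABLE = "Not applicable"


def tally(records):
    """Count documents at each status for each indexing component.

    `records` is an iterable of (component, Status) pairs. This is a
    hypothetical export format, not something West km is known to provide.
    """
    table = {}
    for component, status in records:
        table.setdefault(component, Counter())[status] += 1
    return table


def still_running(table):
    """Return True while any document is still Queued or Delegated."""
    return any(
        counts[Status.QUEUED] > 0 or counts[Status.DELEGATED] > 0
        for counts in table.values()
    )
```

Under these assumptions, a check such as `still_running` corresponds to deciding whether it is worth clicking Refresh again, because documents that are Queued or Delegated have not yet reached a final status.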
Success Breakdown
The Success Breakdown table shows a breakdown of the current indexing results for the documents that successfully processed. For each indexing component, the number of documents at each status is indicated:
This status: | Shows how many documents in the collection: |
Success | Have processed successfully |
Analyzer partial analysis | Have processed successfully, but a failure occurred in the analyzer component while identifying a specific entity (for example, jurisdiction) |
Identified as secured | Have not been processed because they are within a secured subfolder within an NTFS share |
Invalid document structure | Have not been processed because the document's structure doesn't allow the document analysis component to proceed with the analysis |
No cites - deleted | Have been deleted from West km during the synchronization process because they have no citations |
No cites - pending deletion | Have been scheduled to be deleted from West km because they have no citations; documents will be deleted during the next synchronization process |
RID partial error | Have processed successfully, but with an error in identifying citations with RID; at least one citation in each document was identified |
Unsupported file type | Have not been processed by the document analysis component because the document file is an unsupported file type; for example, an unsupported version of an application format that is supported in other versions by other components |
Work in progress | Have not been processed because they have been modified or added to the collection within the number of days set in the Work in Progress Time Span setting at Setup, Indexing, Index Settings, General |
Note: When a document collection is set to index only documents that contain one or more citations (the Citations Only check box is selected for the collection), documents with no citations are deleted from West km during the synchronization process. A document is identified as having no citations after the document is converted to HTML and processed by cite recognition (RID).
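Two of the rules above lend themselves to a short sketch: the Work in progress check and the Citations Only deletion rule. The functions below are illustrative assumptions, not West km internals. In particular, the function and parameter names are hypothetical, and treating "within the number of days" as strictly less than the time span is a guess about the boundary case.

```python
from datetime import datetime, timedelta


def is_work_in_progress(last_modified, now, time_span_days):
    """Return True if the document changed within the Work in Progress Time Span.

    Assumption: a document exactly `time_span_days` old is no longer
    considered work in progress.
    """
    return now - last_modified < timedelta(days=time_span_days)


def no_cites_status(citation_count, citations_only, synchronized):
    """Return the no-cites status for a document, or None if the rule does not apply.

    `citation_count` stands for the number of citations found after HTML
    conversion and cite recognition (RID). `synchronized` is a hypothetical
    flag meaning the synchronization process has run since then.
    """
    if not citations_only or citation_count > 0:
        return None
    if synchronized:
        return "No cites - deleted"
    return "No cites - pending deletion"
```

For example, under these assumptions a document with no citations in a Citations Only collection is reported as "No cites - pending deletion" until the next synchronization process, and as "No cites - deleted" afterward.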
Failure Breakdown
The Failure Breakdown table shows a breakdown of the current indexing results for the documents that failed processing. For each indexing component, the number of documents at each status is indicated:
This status: | Shows how many documents in the collection: | Automatic reprocessing: |
Analyzer failure | Have not been processed because of a failure for unknown reasons in the document analysis component or the analyzer component. | X |
Analyzer initialization failure | Have not been processed because the analyzer component (Litigation) failed to initialize. | |
Analyzer timeout | Have not been processed because the document analysis component or the analyzer component took longer than the specified timeout value of 60 minutes. | |
Document size exceeded | Have not been processed because the documents exceed the maximum size of files allowed into West km. (The default threshold is 50 megabytes.) | |
Full text abstraction failure | Have not been processed because the indexing service encountered a problem while communicating with the full-text searching service and could not copy or retrieve a document from the HTML storage location. | X |
General error | Have not been processed because of a general failure. | X |
HTML conversion error | Have not been processed because of a failure to convert them to HTML. | |
Identified as corrupt | Have not been processed because they are corrupt. | |
Partial analysis | Have been only partially processed by the document analysis component. Either the FFC failed to convert the native document or the HTML document to SGML, or the BKM mapping algorithm failed when the two SGML files were compared. Note: This error could occur if too many processes are running simultaneously on your server. | X |
RID error | Have not been processed because of a failure in RID to identify citations; no citations in the documents were identified. | X |
RID timed out | Have not been processed because RID failed to communicate for 120 seconds. | |
Store is full | Have not been processed because the maximum was reached for your designated HTML document storage. | X |
Unable to access file or physical file missing | Have not been processed because the document file could not be found. | |
Well-formed conversion error | Have not been processed because the HTML conversion of the document does not conform to the syntax rules of XML. | |
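The Automatic reprocessing column can be summarized as a lookup from failure status to a yes/no flag. The sketch below transcribes the table above into a dictionary and lists the failure statuses that would need manual follow-up. It assumes that an X means the documents are reprocessed automatically and that a blank cell means they are not; the `needs_manual_attention` helper and its input format are hypothetical.

```python
# Transcribed from the Failure Breakdown table: True where the
# Automatic reprocessing column shows an X, False where it is blank.
AUTOMATIC_REPROCESSING = {
    "Analyzer failure": True,
    "Analyzer initialization failure": False,
    "Analyzer timeout": False,
    "Document size exceeded": False,
    "Full text abstraction failure": True,
    "General error": True,
    "HTML conversion error": False,
    "Identified as corrupt": False,
    "Partial analysis": True,
    "RID error": True,
    "RID timed out": False,
    "Store is full": True,
    "Unable to access file or physical file missing": False,
    "Well-formed conversion error": False,
}


def needs_manual_attention(failure_counts):
    """Return failure statuses with documents that are not reprocessed automatically.

    `failure_counts` maps a status name to a document count (a hypothetical
    format). Statuses missing from the table are treated as needing attention.
    """
    return sorted(
        status
        for status, count in failure_counts.items()
        if count > 0 and not AUTOMATIC_REPROCESSING.get(status, False)
    )
```

With these assumptions, `needs_manual_attention({"RID error": 4, "RID timed out": 2})` returns `["RID timed out"]`, because only RID error is marked for automatic reprocessing.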
Viewing and Reprocessing Documents
To view a listing of documents for a particular indexing component and status, on the Indexing Results page, click a hyperlinked number in the Success Breakdown or the Failure Breakdown table.
To manually reprocess documents from the listing,
- Select the check box preceding each document in the list that you want to reprocess. To select all documents in the list, select the check box in the heading.
- Click Actions and then click the reprocessing option you want.
  - To reprocess the selected documents for the indexing component being viewed, click Reprocess Component.
  - To reprocess the selected documents for all indexing components that have been set for that document collection, click Reprocess All Components.
Documents are rescheduled immediately, and indexing will start according to the collection's document indexing window.
Note: If you reprocess the HTML conversion component, all components that have been set for that document collection will be reprocessed.
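The reprocessing options and the HTML conversion note amount to a small decision rule, sketched below. This is an illustration only: the function name, the component name strings, and the idea of passing the collection's components as a list are all assumptions, not part of West km.

```python
def components_to_reprocess(selected, collection_components, reprocess_all=False):
    """Return the components that would be rescheduled for the selected documents.

    `selected` is the component being viewed. `collection_components` is the
    list of components set for the collection (hypothetical names).
    """
    # Reprocess All Components, or reprocessing HTML conversion,
    # triggers every component set for the collection.
    if reprocess_all or selected == "HTML conversion":
        return list(collection_components)
    # Reprocess Component affects only the component being viewed.
    return [selected]
```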
To filter the list of documents, type or select values at the top of the Indexing Results page and click Apply. In the Filter box, type a search string to retrieve documents that contain the search string in the document name or in the data source name. To display all documents, click Clear.
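The Filter box behavior described above can be approximated as a substring match against two fields. The sketch below assumes each document is a dictionary with `name` and `data_source` keys and that matching ignores case. Both are assumptions, since the text does not say how documents are represented or whether the filter is case-sensitive.

```python
def filter_documents(documents, search):
    """Keep documents whose name or data source name contains the search string.

    Case-insensitive matching is an assumption. An empty search string
    matches every document, which resembles clicking Clear.
    """
    needle = search.casefold()
    return [
        doc
        for doc in documents
        if needle in doc["name"].casefold()
        or needle in doc["data_source"].casefold()
    ]
```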
To edit the collection to which a document belongs, click View in the Collection column.