Using H2O Document AI - Viewer

H2O Document AI - Viewer is an H2O AI Cloud (HAIC) application that lets you process documents. Your documents are processed using pipelines that are built and published in H2O Document AI - Publisher, so the H2O Document AI - Publisher and H2O Document AI - Viewer applications are closely linked.

Understanding the dashboard

The H2O Document AI - Viewer dashboard shows your available pipelines as well as your processed documents and their information. From this dashboard you can add more documents to be scored by an available pipeline.

The H2O Document AI - Viewer dashboard. You can see available pipelines and previously uploaded documents.

Pipelines

The dashboard shows the scored documents of all pipelines by default. To see the results of a single pipeline, click the pipeline name in the H2O Document AI Pipelines column.

The column displaying all available pipelines.

Documents

The documents you have already processed are available in the order they were added to H2O Document AI - Viewer. You can see your added document after it has finished processing.

A processed document available to review on the H2O Document AI - Viewer dashboard

Each document row has the following information:

  • Filename: The name of the document you added (multiple documents with the same filename can be added)
  • Pipeline: The pipeline you used to process the document
  • State: The current status of your document; one of:
    • Importing : This document is currently importing
    • Import Failed : This document failed to import and can only be deleted
    • Imported : This document has been imported but not processed
    • Processing : This document is currently processing
    • Process Failed : This document failed to process and can only be deleted
    • Processed : This document has successfully been processed and is ready to be reviewed
    • Reviewing : This document is in the process of being reviewed
    • Reviewed : This document has been reviewed
    • Deleted : This document has been deleted and cannot be interacted with any longer
  • Pages: The number of pages the document has
  • Updated: The last time your document was updated

You can review your document any time after it has successfully processed. Reviewing your document allows you to access your Labels and values.
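
If you track document status outside the UI, the states listed above can be modeled in a few lines of Python. This is a minimal sketch, not an H2O Document AI API: the enum, the set names, and both helper functions are hypothetical, and treating Processed, Reviewing, and Reviewed as the reviewable states is an assumption inferred from the state descriptions.

```python
from enum import Enum


class DocumentState(Enum):
    """Document states as listed on the Viewer dashboard (hypothetical model)."""
    IMPORTING = "Importing"
    IMPORT_FAILED = "Import Failed"
    IMPORTED = "Imported"
    PROCESSING = "Processing"
    PROCESS_FAILED = "Process Failed"
    PROCESSED = "Processed"
    REVIEWING = "Reviewing"
    REVIEWED = "Reviewed"
    DELETED = "Deleted"


# Per the list above, failed documents can only be deleted.
DELETE_ONLY = {DocumentState.IMPORT_FAILED, DocumentState.PROCESS_FAILED}

# Assumption: a document can be opened for review once it has processed.
REVIEWABLE = {
    DocumentState.PROCESSED,
    DocumentState.REVIEWING,
    DocumentState.REVIEWED,
}


def can_review(state: DocumentState) -> bool:
    """Return True if a document in this state can be opened for review."""
    return state in REVIEWABLE


def filter_by_state(rows, state):
    """Keep only the rows (dicts with a 'state' key) in the given state."""
    return [row for row in rows if row["state"] is state]
```

The `filter_by_state` helper mirrors the dashboard's state filter for a list of rows you hold locally.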

Filtering by state

You can filter through your documents by their state.

Only show the documents in a certain state.

Deleting a document

You can delete a document from the dashboard.

  1. Click the meatball menu (the three-dot icon) at the end of your document row.
  2. Click Delete.

The workflow to delete your document by clicking the meatball menu at the end of the document's row and then clicking delete.

Workflow: Using H2O Document AI - Viewer

The following steps describe the workflow for H2O Document AI - Viewer.

Step 1: Add a document to be processed by an available pipeline

Start by adding a new document to be scored by H2O Document AI - Viewer.

  1. Click Add Document in the upper right corner to open the Add Document panel.
  2. Select which pipeline you want to use to score your document.
  3. Select the file you want to process (you can only add one document at a time, but your document can have multiple pages).
  4. Click Add Document to process the document.

info

H2O Document AI - Viewer only accepts the following file types:

  • A PDF file
  • A JPEG file
  • A JPG file
  • A PNG file
  • A ZIP file
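
If you prepare files in bulk, you may want to check extensions before uploading. The following is a minimal sketch under stated assumptions: the function name and constant are hypothetical, the check is case-insensitive, and it looks only at the filename extension. How H2O Document AI - Viewer itself validates files is not described here.

```python
from pathlib import Path

# Extensions taken from the accepted file types listed above.
ACCEPTED_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".zip"}


def is_accepted(filename: str) -> bool:
    """Return True if the filename ends in one of the accepted extensions.

    Assumption: matching is case-insensitive and based on the final
    extension only; file contents are not inspected.
    """
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS
```

For example, `is_accepted("invoice.PDF")` passes the check, while `is_accepted("notes.docx")` does not.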

The Add Document panel. This panel lets you select your desired scoring pipeline and add a new file to be processed.

Your chosen pipeline then processes your file. A processing bar will show that the file is actively processing. You will be able to interact with your document after it has finished processing.

tip

Processing a new file takes at least a few seconds, and longer for documents with more pages or more predicted values.

When your document has finished processing, click Review at the end of the document row to access the document results page.

Step 2: Access the processed document and review it for inconsistencies and inaccuracies

You review your document from the document results page. The page is split into the information panel on the left side of the screen and the marked document on the right.

The document results page you can review after your document has finished processing.

Information panel

The top of the information panel displays:

  1. The name of your document
  2. The number of values (e.g. Results: 10)
  3. The Review button
  4. The Export button

If you click the drop-down arrow next to your document's name, you can access additional information:

  1. The pipeline used to process your document
  2. The state of your document (either Reviewing or Reviewed)
  3. When your document was processed
  4. The most recent export time

The top of the information panel which has general information about the document including the name of the document, the pipeline used to process it, and the number of values.

The information panel also shows the breakdown of your Labels and values. You can interact with the values predicted by your selected pipeline. When you select a value, it will point to that value's location on the marked document.

Clicking on a value will point to that value's location on the document.

The information about that value will also be displayed when you click the value. The original value predicted by your pipeline is displayed along with the OCR confidence percentage and token classification percentage.

Interacting with values

You can review your predicted values. Go through each predicted value and check that the pipeline correctly processed the information from the document.

For any incorrect predictions, change the content of the value:

  1. Click the pencil icon that appears when you hover over your value. This lets you edit your value.
  2. Correct the predicted value by typing your correction in the value box. Your changes will automatically save.

After you have updated the value, your value box will be marked with a gold dot to show that you have changed the value.

If you want to change your value back to the original value, you can revert your changes by clicking Revert. You can also clear out the contents of the value box by clicking Clear.

The information displayed by clicking a predicted value including the original value, OCR confidence, and classification confidence.

After you have finished reviewing all of your document's values, click the Review button to mark your document as Reviewed.

Current workaround

If a published pipeline was built from a model trained on a single document, scoring a document with that pipeline in H2O Document AI - Viewer might return zero results. Whether it does depends heavily on the quality of the trained model. If you score the same file that was used for training, the pipeline will probably, though not certainly, find at least one entity. The outcome also depends on the quality and size of the bounding boxes created during edit in page view in H2O Document AI - Publisher.

Labels and values

Labels are created in H2O Document AI - Publisher and assign meanings to regions of a document. For example, you can label a region on a document as contact_name. When that region has tokens in it, those tokens are assigned the label contact_name. Each label can have multiple values.

Values are the tokens within labeled regions. They are the text predicted and post-processed by the pipeline. For example, if the tokens JOHN and SMITH are in the labeled region for contact_name, then the predicted value for the labeled region contact_name is JOHN SMITH.

The labels and values are located on the left side of the document results screen. This label is "contact_person" and its values are "Jarrod Heaney" and "Justin Soto".
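
The relationship between tokens, labels, and values described in this section can be illustrated with a short sketch. It is a simplification, not the pipeline's actual post-processing: the function name and the (text, label) token representation are hypothetical, and it assumes a single labeled region per label, so every token with the same label is joined into one value.

```python
from collections import defaultdict


def values_by_label(tokens):
    """Join labeled tokens into one value per label.

    tokens: list of (text, label) pairs in reading order, where label is
    None for tokens outside any labeled region (hypothetical format).
    """
    grouped = defaultdict(list)
    for text, label in tokens:
        if label is not None:
            grouped[label].append(text)
    # Assumption: tokens in a region are joined with single spaces.
    return {label: " ".join(texts) for label, texts in grouped.items()}


tokens = [
    ("Contact:", None),
    ("JOHN", "contact_name"),
    ("SMITH", "contact_name"),
]
print(values_by_label(tokens))
```

With the tokens JOHN and SMITH labeled contact_name, the sketch yields the value JOHN SMITH for that label, matching the example above.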

Step 3: Export the values of the document to your local computer

You can export your values to your local computer.

  1. Click Export in the information panel on the document results page. This will export a JSON file of your correct values.
  2. Access your JSON file from your Downloads folder on your local computer.
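
Once the file is downloaded, you can inspect it with any JSON tool. The sketch below uses Python's standard library only. The layout of the exported JSON is not described on this page, so the code makes no assumptions about field names and only reports the top-level structure. The function names are hypothetical helpers for illustration.

```python
import json
from pathlib import Path


def load_export(path):
    """Load an exported JSON file from disk."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(data):
    """Summarize the top-level structure without assuming a schema."""
    if isinstance(data, dict):
        return f"object with keys: {sorted(data)}"
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return f"{type(data).__name__} value"
```

To use it, pass the path of the downloaded file to `load_export` and print `describe(...)` of the result to see what the export contains before writing any field-specific code.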

Returning to the dashboard

After you have finished reviewing your document and exporting your values, you can return to the H2O Document AI - Viewer's dashboard by clicking the product name in the upper left corner.

H2O Document AI - Viewer product name always located in the upper left corner of the UI.

