Add a Document(s) to a Collection
Overview
A Collection can contain multiple Documents. Added Documents are indexed and stored in a database. When you ask a question about the Document(s), h2oGPTe searches the indexed Document(s) in the Collection for relevant content and uses the H2O LLM to summarize that content into a concise answer. You can add Documents while creating a Collection or after creating it.
To learn how to create a Collection, see Create a Collection.
Instructions
To add a Document(s) to a Collection, follow these steps:
- In the Enterprise h2oGPTe navigation menu, click Collections.
- In the Collections table, select the name of the Collection you want to add a Document(s) to.
- Click + Add documents.
Info: You can upload certain text, image, and audio file types to a Collection. To learn more, see Supported file types for a Collection.
- In the Choose method list, select a method to import a Document(s):
- Upload documents
- Import from file system
- Import from URL
- Upload plain text
- Select a document
- Select a Collection
- Import from S3
- Import from Azure Blob Storage
- Import from Google Cloud Storage
Then complete the steps for the method you selected.
Upload documents:
- Click Browse....
- Upload documents.
Import from file system:
- In the Directory to import documents from box, enter a directory to import Documents from.
- In the Glob pattern to match files box, enter a glob pattern to match the files (Documents).
Import from URL:
- In the URL to import box, enter a valid URL.
Upload plain text:
- In the Plain text to upload box, paste the text you copied from another source to create a Document.
Select a document:
- In the Search for a document list, select a Document that is imported to another Collection.
Note: The selected Document will be imported into this Collection.
Select a Collection:
- In the Search for a collection list, select an existing Collection.
Note: All Documents from the selected Collection will be imported into the new Collection.
Import from S3:
- In the S3 Path box, enter the URL of the Document in the Amazon S3 bucket.
- Enter the Region Name.
- Optional: Enter the Access Key ID.
- Optional: Enter the Secret Access Key.
- Optional: Enter the Session Token.
- Click Add selected.
Import from Azure Blob Storage:
- In the Container box, enter the URI for the container.
- Optional: In the Path box, enter the URL of the blob.
- In the Account name box, enter the account name.
- Optional: In the Account Key box, enter the account key.
- Optional: In the SAS token box, enter the shared access signature (SAS) token.
- Click Add selected.
Import from Google Cloud Storage:
- In the Google Storage path box, enter the Google Cloud Storage resource path.
- Optional: In the Service Account Key box, enter the service account key.
- Click Add selected.
Note:
- Toggle the Create short document summaries button to auto-generate a summary of your document.
- Toggle the Create sample questions for documents button to receive auto-suggested sample questions based on your document.
- From the Spoken language in audio files dropdown list, select the language spoken in the uploaded audio files.
- From the OCR model dropdown, select the OCR (Optical Character Recognition) model to identify and extract text from images and PDFs.
- Click Add.
- If you try to add an empty Document, indexing of the files fails, and as a result the Job associated with the Collection fails.
- To learn how to Chat with a Collection, see Chat with a Collection.
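Before using the Import from file system method, it can help to preview which files a directory and glob pattern would match, and to spot empty files that would cause indexing to fail. The following is a minimal sketch, not part of h2oGPTe: the function name `preview_import` is hypothetical, it uses Python's own `pathlib` glob rules (which may differ from the matching h2oGPTe applies), and it assumes you can run it on the machine that holds the directory.

```python
from pathlib import Path
from typing import List, Tuple


def preview_import(directory: str, pattern: str) -> Tuple[List[Path], List[Path]]:
    """Return (non_empty_files, empty_files) matched by `pattern` under `directory`.

    Hypothetical helper for illustration only; it does not call h2oGPTe.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory}")

    non_empty: List[Path] = []
    empty: List[Path] = []
    # Path.glob follows Python's glob rules: "*" matches within one path
    # segment, and "**" matches this directory and all subdirectories.
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        if path.stat().st_size == 0:
            empty.append(path)  # zero-byte files are the ones to fix or exclude
        else:
            non_empty.append(path)
    return non_empty, empty
```

For example, `preview_import("/data/reports", "**/*.pdf")` (an illustrative path) returns the non-empty PDF files found at any depth, plus a separate list of zero-byte PDFs to remove or fix before you click Add.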