What's new in h2oGPTe v1.6
We are excited to announce the release of h2oGPTe 1.6! Read on to learn about the new features in the 1.6.x series, which improve your ability to find answers and generate new content based on your private data.
Agent
The main new feature is the h2oGPTe Agent, which introduces powerful autonomous tool use capabilities through code execution, while using LLMs for code generation and reasoning.
The h2oGPTe Agent placed #1 on the GAIA leaderboard, which measures the usefulness of General AI assistants.
- Blog: https://h2o.ai/blog/2024/h2o-ai-tops-gaia-leaderboard/
- Webinar: https://www.youtube.com/watch?v=m3I3ro_ZAnE
It is a highly capable deep research assistant, with full transparency into its inner workings.
For every agentic chat conversation, the output now includes:
- Agentic Analysis
  - Utilized Tools: shows the subset of tools used by the agent
  - Summary View: all the steps taken by the agent
- Agentic Internal Chat
  - A transcript of all agentic internal conversations, available in summary or detail view
- Downloadable files
  - All files newly created by the agent, including final artifacts such as PDF, Excel, and PowerPoint files, as well as all code snippets and logs used to create those documents
  - All files available to the agent, especially relevant for existing collection documents and/or for enabled chat history
Controlling the Agent
The agent can be enabled or disabled directly in the chat message input form. It acts autonomously based on the user prompts and the prompt templates in use. There are four presets for agent accuracy, each with a default number of agentic conversation turns and a default maximum time per turn. The user can adjust both values.
- Quick
- Basic (default)
- Standard
- Maximum
The following agent tools can be enabled or disabled for each agentic chat query:
- Aider Code Generation
- Mermaid Chart-Diagram Renderer
- Image Generation
- Ask Question About Image
- Audio-Video Transcription
- Convert Document to Text
- Download Web Video
- Screenshot Webpage
- Google Search
- Browser Navigation
- Scholar Papers Search
- Wolfram Alpha Math Science Search
- H2O Driverless AI Data Science
- Bing Search
- Web Image Search
- Ask Question About Documents
- Wikipedia Articles Search
- Wayback Machine Search
- Advanced Reasoning
- Evaluate Answer
- Shell Scripting
- Python Coding
- RAG Vision
- RAG Text
- Internet Access
- Intranet Access
Each of these tools can be enabled/disabled by the admin.
Roles and Permissions
This version introduces a list of system-defined roles, each associated with a set of permissions. Users and groups configured by the federated authentication and identity providers (LDAP) are reflected in the system, and admins can modify the roles and permissions for each user and group.
Permissions managed through role-based access control (RBAC) include:
- Delete chats
- Submit chat feedback
- Add collections
- Delete collections
- Edit collections
- Make collection public
- Share collections
- Show admin center
- Allow device pairing when configured
- Show extractors
- Show live logs
- Show models page
- Show private button
- Add documents
- Delete documents
- Delete prompt templates
- Edit prompt templates
- Share prompt templates
- Manage roles
- Display system notifications
- Display developer settings
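The role and permission model described above can be illustrated with a minimal sketch. This is not the h2oGPTe implementation: the `Role` and `User` classes, the permission identifiers, and the example role names are all hypothetical, chosen only to mirror entries from the list above. It assumes that a user holds a permission when at least one assigned role grants it.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Hypothetical permission identifiers, named after entries in the list above.
DELETE_CHATS = "delete_chats"
ADD_COLLECTIONS = "add_collections"
MANAGE_ROLES = "manage_roles"


@dataclass(frozen=True)
class Role:
    """A named role carrying a fixed set of permissions (illustrative only)."""
    name: str
    permissions: FrozenSet[str]


@dataclass
class User:
    """A user with zero or more assigned roles (illustrative only)."""
    name: str
    roles: List[Role] = field(default_factory=list)

    def has_permission(self, permission: str) -> bool:
        # Assumption: a permission is granted if any assigned role includes it.
        return any(permission in role.permissions for role in self.roles)


# Example roles; the names and groupings are invented for this sketch.
viewer = Role("viewer", frozenset())
editor = Role("editor", frozenset({DELETE_CHATS, ADD_COLLECTIONS}))
admin = Role("admin", frozenset({DELETE_CHATS, ADD_COLLECTIONS, MANAGE_ROLES}))

alice = User("alice", [editor])
print(alice.has_permission(ADD_COLLECTIONS))  # True
print(alice.has_permission(MANAGE_ROLES))     # False
```

Keeping permissions on roles rather than on users means an admin changes access for a whole group by editing one role.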
Custom GPT via collection settings
Collection + Collection Settings + Default Chat Settings = Custom GPT
Each collection now has a set of default chat settings that are applied to each new chat with this collection. The current settings of any chat session can be saved as the collection defaults by clicking the 'Apply current settings as collection defaults' button or via the API. The collection settings page shows the current set of default settings.
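One way to picture how collection defaults seed a new chat is the sketch below. It is an illustration under stated assumptions, not the product's logic: the function name `effective_chat_settings` and the setting keys (`llm`, `temperature`, `use_agent`) are hypothetical, and it assumes that values changed in a chat take precedence over the collection defaults.

```python
from typing import Any, Dict


def effective_chat_settings(
    collection_defaults: Dict[str, Any],
    chat_overrides: Dict[str, Any],
) -> Dict[str, Any]:
    """Start from the collection defaults, then apply per-chat changes.

    Assumption: an override set to None means "not changed in this chat".
    """
    merged = dict(collection_defaults)  # copy so the defaults stay untouched
    for key, value in chat_overrides.items():
        if value is not None:
            merged[key] = value
    return merged


# Hypothetical setting names and values, for illustration only.
defaults = {"llm": "model-a", "temperature": 0.0, "use_agent": False}
settings = effective_chat_settings(defaults, {"temperature": 0.7, "llm": None})
print(settings)  # {'llm': 'model-a', 'temperature': 0.7, 'use_agent': False}
```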
REST API
A new REST API has been implemented, in addition to the existing Python RPC client. It conforms to the OpenAPI standard and is exposed via a built-in Swagger UI.
Auto-generated bindings are available on the API page:
- Python REST API
- JavaScript REST API
- Go REST API
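Because the API is plain HTTP, it can in principle also be called without the generated bindings. The sketch below only builds a request object and does not send it. The endpoint path `/api/v1/collections`, the bearer-token header, and the function name are assumptions made for illustration; the Swagger UI of your deployment is the authoritative source for the actual paths and authentication scheme.

```python
from urllib.request import Request


def build_list_collections_request(base_url: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for a hypothetical endpoint."""
    # Assumption: the path below is illustrative, not taken from the release notes.
    url = base_url.rstrip("/") + "/api/v1/collections"
    return Request(
        url,
        headers={
            # Assumption: the API key is passed as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


req = build_list_collections_request("https://h2ogpte.example.com/", "sk-placeholder")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is omitted here because it requires a running server and a valid key.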
Reasoning models
The models page shows which models support reasoning and, for each non-reasoning model, which reasoning model is enabled as its supporting model, similar to how vision capabilities are handled. Reasoning models can be used for chat, RAG, and agentic use cases.
Improved Vision capabilities
Various improvements to Vision model capabilities have been made.
Improved Document Parsing capabilities
Various improvements to parsing:
- layout detection
- chunking
- image captioning
- text conversion
- document highlighting
- Excel document handling (large tables are summarized, but still available in full to the agent for data science)
Improved Handwriting support
The H2O Mississippi model is now used by default for transcription of handwriting to text. It is shipped out of the box.
Support of the latest LLMs
We support all widely available proprietary and open-source models. Some noteworthy new models include:
- Claude 3.5 (Bedrock)
- OpenAI o1 (Azure)
- OpenAI o1-mini (Azure)
- DeepSeek V3
- DeepSeek R1
- Gemini 2.0 Flash
- Gemini 2.0 Flash Thinking
- MiniMaxAI
- Qwen/Qwen2.5
- Qwen/Qwen2-VL
- Qwen/QwQ
- Llama-3.3-70B
- Llama-3.2-11B-Vision
- Llama-3.2-90B-Vision
- H2O Mississippi
Improved Scalability and Speed
- Models Service: The backend has been redesigned to allow horizontal scaling. A dedicated models service, shared by the chat, crawl, and core services, provides faster speed and higher throughput for document ingestion and chat.
- Optional auto-scaling of the models service via KEDA
- Several changes make the application more responsive
- The Vex Docker image has been reduced in size
- Parallelized Vex DB operations
- Text to PDF conversion has been sped up
UX improvements
- Guardrails and PII settings can be separately enabled/disabled
- Custom guardrails can now be entered from the GUI
- PDF display has been improved
- Thumbnails for collections
- Improved models page
- Improved scrolling and pagination
- Syntax highlighting for markdown and code blocks
- Improved job cancellation
- Faster automatic RAG type detection
Improved Models Self-Test
- More functional self-tests, which now include multimodal RAG with guided JSON
- More likely to expose flaws in configuration of model endpoints
White Labeling
- Custom logo, colors, greeting message, and personality in prompt templates
Topic Model
For every collection, a topic model visualization can be created with the click of a button. It shows clusters of similar phrases and concepts found in the documents, giving a quick overview of the content. It can also reveal areas of low information density, which helps with quick visual debugging of the collection.
Code for chat messages
Every chat message now shows the client code needed to run the same query from the Python client.
Collection Expiration and Size limits
Admins can now set collections to expire after a certain amount of time. Collection size limits can be set as well.
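The two kinds of limits can be pictured with the following sketch. It is a hypothetical model, not the product's behavior: the `CollectionPolicy` class and both helper functions are invented for illustration, and it assumes that expiration is measured from the collection's creation time and that size is counted in bytes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class CollectionPolicy:
    """Hypothetical admin-defined limits; None means "no limit"."""
    max_age: Optional[timedelta] = None
    max_size_bytes: Optional[int] = None


def is_expired(created_at: datetime, policy: CollectionPolicy, now: datetime) -> bool:
    # Assumption: age is measured from creation time, not from last use.
    if policy.max_age is None:
        return False
    return now - created_at >= policy.max_age


def can_add_document(current_size: int, doc_size: int, policy: CollectionPolicy) -> bool:
    # Assumption: the limit applies to the total size after the upload.
    if policy.max_size_bytes is None:
        return True
    return current_size + doc_size <= policy.max_size_bytes


policy = CollectionPolicy(max_age=timedelta(days=30), max_size_bytes=1000)
created = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(is_expired(created, policy, now=created + timedelta(days=31)))  # True
print(can_add_document(900, 101, policy))                             # False
```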
Faster chat sharing
Public sharing of chats has been sped up.
Security Vulnerability Fixes
No critical or high CVEs at the time of release.