What's new in h2oGPTe v1.6


We are excited to announce the release of h2oGPTe 1.6! Read on to learn about the new features in the 1.6.x series, which improve your ability to find answers and generate new content based on your private data.

Agent

The main new feature is the h2oGPTe Agent, which introduces powerful autonomous tool use capabilities through code execution, while using LLMs for code generation and reasoning.

The h2oGPTe Agent placed #1 on the GAIA leaderboard, which measures the usefulness of General AI assistants.

It is a highly capable deep research assistant, with full transparency into its inner workings.

For every agentic chat conversation, the output now includes:

  • Agentic Analysis
    • Utilized Tools: shows the subset of tools used by the agent
    • Summary View: all the steps taken by the agent
  • Agentic Internal Chat
    • A transcript of all agentic internal conversations, available in summary or detail view
  • Downloadable files
    • All files newly created by the agent, including final artifacts such as PDF/Excel/PowerPoint as well as all code snippets and logs used to create those documents
    • All files available to the agent, especially relevant for existing collection documents and/or for enabled chat history

Controlling the Agent

The agent can be enabled or disabled directly in the chat message input form. It acts autonomously based on the user prompts and the prompt templates in use. There are four presets for agent accuracy. Each preset sets a default for the number of agentic conversation turns and for the maximum time each turn can take, and the user can override both values.

  • Quick
  • Basic (default)
  • Standard
  • Maximum
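The relationship between presets and user overrides can be pictured as a small lookup. This is only a sketch: the names `AgentBudget`, `PRESETS`, and `resolve_budget` are hypothetical, and the numbers are placeholders, since this post does not state the actual defaults.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AgentBudget:
    max_turns: int        # number of agentic conversation turns
    turn_timeout_s: int   # maximum time per turn, in seconds


# Placeholder numbers for illustration only; the real defaults are not given here.
PRESETS = {
    "quick": AgentBudget(max_turns=5, turn_timeout_s=60),
    "basic": AgentBudget(max_turns=10, turn_timeout_s=120),
    "standard": AgentBudget(max_turns=20, turn_timeout_s=300),
    "maximum": AgentBudget(max_turns=40, turn_timeout_s=600),
}


def resolve_budget(preset: str = "basic",
                   max_turns: Optional[int] = None,
                   turn_timeout_s: Optional[int] = None) -> AgentBudget:
    """Start from the preset defaults, then apply any user overrides."""
    budget = PRESETS[preset.lower()]
    if max_turns is not None:
        budget = replace(budget, max_turns=max_turns)
    if turn_timeout_s is not None:
        budget = replace(budget, turn_timeout_s=turn_timeout_s)
    return budget


print(resolve_budget())                         # Basic preset, no overrides
print(resolve_budget("Maximum", max_turns=15))  # preset with one user override
```

Keeping the preset as the starting point means a user only has to specify the value they want to change.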

Agent Tools that can be enabled/disabled for every agentic chat query:

  • Aider Code Generation
  • Mermaid Chart-Diagram Renderer
  • Image Generation
  • Ask Question About Image
  • Audio-Video Transcription
  • Convert Document to Text
  • Download Web Video
  • Screenshot Webpage
  • Google Search
  • Browser Navigation
  • Scholar Papers Search
  • Wolfram Alpha Math Science Search
  • H2O Driverless AI Data Science
  • Bing Search
  • Web Image Search
  • Ask Question About Documents
  • Wikipedia Articles Search
  • Wayback Machine Search
  • Advanced Reasoning
  • Evaluate Answer
  • Shell Scripting
  • Python Coding
  • RAG Vision
  • RAG Text
  • Internet Access
  • Intranet Access

Each of these tools can be enabled/disabled by the admin.
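One way to think about the two levels of control is as a set intersection: a query can only use tools that the user requested and the admin has left enabled. The sketch below assumes that reading; `effective_tools` and the example sets are hypothetical and do not come from the product.

```python
from typing import Iterable, List, Set


def effective_tools(requested: Iterable[str], admin_enabled: Set[str]) -> List[str]:
    """Keep only the requested tools that the admin has left enabled."""
    return sorted(tool for tool in set(requested) if tool in admin_enabled)


# Hypothetical admin configuration, using tool names from the list above.
admin_enabled = {"Python Coding", "Shell Scripting", "RAG Text", "Google Search"}

# Hypothetical per-query selection made by the user.
requested = ["Python Coding", "Bing Search", "RAG Text"]

print(effective_tools(requested, admin_enabled))
```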

Roles and Permissions

This version introduces a set of system-defined roles, each associated with a set of permissions. Users and groups configured by federated authentication and identity providers (LDAP) are reflected in the system, and admins can modify the roles and permissions for each user and group.

Permissions covered by role-based access control (RBAC) include:

  • Delete chats
  • Submit chat feedback
  • Add collections
  • Delete collections
  • Edit collections
  • Make collection public
  • Share collections
  • Show admin center
  • Allow device pairing when configured
  • Show extractors
  • Show live logs
  • Show models page
  • Show private button
  • Add documents
  • Delete documents
  • Delete prompt templates
  • Edit prompt templates
  • Share prompt templates
  • Manage roles
  • Display system notifications
  • Display developer settings
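The role-to-permission mapping can be sketched as follows. The role names, the snake_case permission identifiers, and the rule that a user receives the union of permissions from their own roles and their groups' roles are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set

# Hypothetical roles; permission identifiers are loosely derived from the list above.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"submit_chat_feedback"},
    "editor": {"add_collections", "edit_collections", "add_documents"},
    "admin": {"show_admin_center", "manage_roles", "delete_collections"},
}


def permissions_for(roles: Iterable[str]) -> Set[str]:
    """Union of the permissions granted by each role; unknown roles grant nothing."""
    granted: Set[str] = set()
    for role in roles:
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


def is_allowed(user_roles: Iterable[str], group_roles: Iterable[str], permission: str) -> bool:
    """Check a permission against roles assigned directly and via groups."""
    return permission in permissions_for([*user_roles, *group_roles])


print(is_allowed(["viewer"], ["editor"], "add_documents"))  # granted through the group role
print(is_allowed(["viewer"], [], "manage_roles"))           # not granted
```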

Custom GPT via collection settings

Collection + Collection Settings + Default Chat Settings = Custom GPT

Each collection now has a set of default chat settings that are applied to every new chat with that collection. The defaults can be set from any chat session by clicking the 'Apply current settings as collection defaults' button, or via the API. The collection settings page shows the current defaults.
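Conceptually, a new chat starts from the collection's defaults. The sketch below assumes that individual settings can still be changed per chat on top of those defaults; the function name and the setting keys are hypothetical.

```python
from typing import Any, Dict, Optional


def effective_chat_settings(collection_defaults: Dict[str, Any],
                            chat_overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the settings a new chat starts with: defaults first, overrides on top."""
    merged = dict(collection_defaults)  # copy so the stored defaults are not mutated
    merged.update(chat_overrides or {})
    return merged


# Hypothetical defaults stored on a collection.
defaults = {"llm": "example-model", "temperature": 0.0, "use_agent": False}

print(effective_chat_settings(defaults))
print(effective_chat_settings(defaults, {"use_agent": True}))
```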

REST API

A new REST API has been implemented in addition to the existing Python RPC client. It conforms to the OpenAPI standard and is exposed via a built-in Swagger UI.

Auto-generated bindings are available on the API page.

  • Python REST API
  • JavaScript REST API
  • Go REST API
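For readers who prefer not to use the generated bindings, a request can also be assembled with the Python standard library. The sketch below only builds the request and does not send it. The host, the `/api/v1/collections` path, and the bearer-token header are assumptions; check the Swagger UI of your deployment for the actual paths and authentication scheme.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(base_url: str, api_key: str, path: str,
                  payload: Optional[Any] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request; GET without a payload, POST with one."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Accept": "application/json",
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Placeholder host, key, and path; none of these are taken from this post.
req = build_request("https://h2ogpte.example.com", "sk-placeholder", "/api/v1/collections")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it, which requires a reachable server and a valid key.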

Reasoning models

The models page shows which models support reasoning and which reasoning model is enabled as a supporting model for each non-reasoning model, similar to how vision capabilities are handled. Reasoning models can be used for chat, RAG, and agentic use cases.

Improved Vision capabilities

Various improvements to Vision model capabilities have been made.

Improved Document Parsing capabilities

Various improvements to parsing:

  • layout detection
  • chunking
  • image captioning
  • text conversion
  • document highlighting
  • Excel document handling (large tables are summarized, but still available in full to the agent for data science)

Improved Handwriting support

The H2O Mississippi model is now used by default for transcription of handwriting to text. It is shipped out of the box.

Support of the latest LLMs

We support all widely available proprietary and open-source models. Some noteworthy new models include:

  • Claude 3.5 (Bedrock)
  • OpenAI o1 (Azure)
  • OpenAI o1-mini (Azure)
  • DeepSeek V3
  • DeepSeek R1
  • Gemini 2.0 Flash
  • Gemini 2.0 Flash Thinking
  • MiniMaxAI
  • Qwen/Qwen2.5
  • Qwen/Qwen2-VL
  • Qwen/QwQ
  • Llama-3.3-70B
  • Llama-3.2-11B-Vision
  • Llama-3.2-90B-Vision
  • H2O Mississippi

Improved Scalability and Speed

  • Models Service: The backend has been redesigned around a dedicated models service shared by the chat, crawl, and core services. This allows horizontal scaling, giving faster speed and higher throughput for document ingestion and chat.
  • Optional auto-scaling of the models service via KEDA
  • Several changes have been made to improve the responsiveness of the application
  • The Vex Docker image has been reduced in size
  • Parallelized Vex DB operations
  • Text to PDF conversion has been sped up

UX improvements

  • Guardrails and PII settings can be separately enabled/disabled
  • Custom guardrails can now be entered from the GUI
  • PDF display has been improved
  • Thumbnails for collections
  • Improved models page
  • Improved scrolling and pagination
  • Syntax highlighting for markdown and code blocks
  • Improved job cancellation
  • Faster automatic RAG type detection

Improved Models Self-Test

  • More functional self-tests, which now include multimodal RAG with guided JSON
  • Self-tests are more likely to expose flaws in the configuration of model endpoints

White Labeling

  • Custom logo, colors, greeting message, and personality in prompt templates

Topic Model

For every collection, a topic model visualization can be created with the click of a button. It shows clusters of similar phrases and concepts found in the documents, giving a quick overview of the content. It can also reveal areas of low information density, which helps with quick visual debugging of the collection.

Code for chat messages

Every chat message now shows the client code needed to run the same query from the Python client.

Collection Expiration and Size limits

Admins can now let collections expire after a certain amount of time. Collection size limits can be set as well.
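The two checks can be sketched as follows. How the product actually measures collection age and size is not described in this post, so the function names, the time-to-live model, and the byte-based limit are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(created_at: datetime, ttl: Optional[timedelta],
               now: Optional[datetime] = None) -> bool:
    """A collection with no TTL never expires; otherwise compare its age to the TTL."""
    if ttl is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now >= created_at + ttl


def fits_size_limit(current_bytes: int, incoming_bytes: int,
                    limit_bytes: Optional[int]) -> bool:
    """Return True when adding the incoming data keeps the collection within its limit."""
    return limit_bytes is None or current_bytes + incoming_bytes <= limit_bytes


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
check_time = datetime(2025, 2, 15, tzinfo=timezone.utc)

print(is_expired(created, timedelta(days=30), now=check_time))  # 45 days old, 30-day TTL
print(fits_size_limit(900, 200, limit_bytes=1000))              # 1100 bytes exceeds 1000
```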

Faster chat sharing

Public sharing of chats has been sped up.

Security Vulnerability Fixes

No critical or high CVEs at the time of release.

