A team of scientists from the University of Science and Technology of China and Tencent’s YouTu Lab have developed a tool to combat “hallucination” by artificial intelligence (AI) models.
Hallucination is the tendency of an AI model to generate outputs, delivered with high confidence, that are not grounded in the information present in its training data. The problem permeates large language model (LLM) research, and its effects can be seen in models such as OpenAI’s ChatGPT and Anthropic’s Claude.
The USTC/Tencent team developed a tool called “Woodpecker” that they claim is capable of correcting hallucinations in multi-modal large language models (MLLMs).
This subset of AI covers models such as GPT-4 (particularly its vision variant, GPT-4V) and other systems that combine vision and/or other processing with text-based language modelling in a single generative model.
According to the team’s pre-print research paper, Woodpecker uses three separate AI models, in addition to the MLLM being corrected, to perform hallucination correction.
These are GPT-3.5 Turbo, Grounding DINO, and BLIP-2-FlanT5. Together, they work as evaluators that identify hallucinations and instruct the model being corrected to regenerate its output in line with the underlying data.
To correct hallucinations, the AI models powering “Woodpecker” use a five-stage process that involves “key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction.”
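The paper describes these stages as a sequential pipeline, with each stage consuming the output of the one before it. The Python sketch below is only an illustration of that flow: the function names and bodies are hypothetical placeholders, not the researchers’ actual code, and a real implementation would call out to the evaluator models named above at each stage.

```python
# Illustrative sketch of a Woodpecker-style five-stage correction pipeline.
# All function names and logic here are hypothetical placeholders; the real
# system queries GPT-3.5 Turbo, Grounding DINO and BLIP-2-FlanT5 at these steps.

def extract_key_concepts(answer: str) -> list[str]:
    """Stage 1: pull the main objects/entities mentioned in the MLLM's answer."""
    return [word.strip(".,") for word in answer.split() if word[:1].isupper()]

def formulate_questions(concepts: list[str]) -> list[str]:
    """Stage 2: turn each concept into verification questions about the image."""
    return [f"Is there a {c} in the image?" for c in concepts]

def validate_visual_knowledge(questions: list[str], image) -> dict[str, str]:
    """Stage 3: answer the questions with vision models (detection / VQA)."""
    # Placeholder: a real implementation would query Grounding DINO and BLIP-2 here.
    return {q: "no supporting evidence found" for q in questions}

def generate_visual_claims(findings: dict[str, str]) -> list[str]:
    """Stage 4: convert the verified findings into explicit visual claims."""
    return [f"{question} -> {answer}" for question, answer in findings.items()]

def correct_hallucinations(answer: str, claims: list[str]) -> str:
    """Stage 5: rewrite the answer so it only states what the claims support."""
    # Placeholder: the real system prompts an LLM with the claims as evidence.
    return answer + "\n[corrected against visual evidence: " + "; ".join(claims) + "]"

def woodpecker_style_correction(answer: str, image) -> str:
    """Run the five stages in order and return the corrected answer."""
    concepts = extract_key_concepts(answer)
    questions = formulate_questions(concepts)
    findings = validate_visual_knowledge(questions, image)
    claims = generate_visual_claims(findings)
    return correct_hallucinations(answer, claims)
```

Because each stage produces an inspectable intermediate result (concepts, questions, findings, claims), the output can be traced back to the visual evidence, which is the transparency benefit the researchers describe.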
The researchers claim these techniques provide additional transparency and “a 30.66%/24.33% improvement in accuracy over the baseline MiniGPT-4/mPLUG-Owl.” They evaluated numerous off-the-shelf MLLMs using their method and concluded that Woodpecker could be “easily integrated into other MLLMs.”
Related: Humans and AI often prefer sycophantic chatbot answers to the truth — Study
An evaluation version of Woodpecker is available on Gradio Live, where anyone curious can check out the tool in action.