Focused language models: A solution for GenAI hallucinations

Generative AI (GenAI) tools like ChatGPT, Claude and Gemini are no longer making headlines for the impacts of hallucinations the way they did a year or two ago. But that doesn't mean fabricated responses have stopped being an industry problem and an impediment to widespread GenAI adoption.

In fact, the latest configurations of large language models (LLMs), called reasoning systems, are producing more errors, not fewer. One test of new GenAI reasoning systems from companies like OpenAI, Google and DeepSeek revealed hallucination rates as high as 79%.

Furthermore, although the potential harmful effects of hallucinations produced by discrete LLMs already pose significant business risk, agentic AI applications can cause damage to rapidly cascade. That is particularly true at a time when many organizations are still trying to figure out what agentic AI really is.

There is a solution today for curbing hallucinations in GenAI language models, both in highly regulated environments such as financial services and in the agentic workflows of tomorrow. It's called a focused language model (FLM), a new type of small language model (SLM) that produces consistent, auditable answers.

How small language models get focused

Like LLMs, small language models have garnered data scientists’ attention in recent years. SLMs function similarly to their LLM cousins but are less complex. They are designed to efficiently perform specific language tasks and use fewer parameters, less training data and fewer energy resources to train and operate. As a result, SLMs can deliver greater accuracy at lower cost, compared with generalist LLMs.

Focused language models are a built-from-scratch SLM variant designed to provide consistent answers that comply with a company’s policies, standards and government regulations, if applicable. FLMs fundamentally reduce hallucinations in two important ways.

First, these smaller, built-from-scratch models are trained on well-curtailed data, meaning data that is purposely tightly curated and limited. The importance of controlling training data cannot be overemphasized if organizations want to confidently use GenAI to make critical, hallucination-free business and customer-facing decisions. By contrast, it is remarkably easy for off-the-shelf LLMs to veer into hallucinations: researchers at New York University found that when even 0.001% of the data used to train an LLM is "poisoned" with deliberately planted misinformation, the resulting model becomes significantly more likely to propagate errors, posing significant risk.
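The kind of "curtailed" curation described above can be pictured as a simple allowlist-and-review filter over candidate training records. This is only an illustrative sketch: the record schema and field names (`source`, `text`, `approved_by`) are hypothetical, not part of any FLM product.

```python
# Illustrative sketch of tightly curated training data: keep only
# records from an approved corpus that carry an expert sign-off.
# All field names and sources here are hypothetical assumptions.

APPROVED_SOURCES = {"policy_manual", "reviewed_transcripts"}  # assumed allowlist

def curate(records):
    """Return only records from approved sources with a reviewer sign-off."""
    curated = []
    for rec in records:
        if rec.get("source") not in APPROVED_SOURCES:
            continue  # drop anything outside the controlled corpus
        if not rec.get("approved_by"):
            continue  # drop records lacking expert review
        curated.append(rec)
    return curated

records = [
    {"source": "policy_manual", "text": "Refunds within 30 days.", "approved_by": "SME-1"},
    {"source": "web_scrape", "text": "Unverified claim.", "approved_by": None},
]
print(len(curate(records)))  # → 1 (the unreviewed web-scraped record is dropped)
```

In practice the review step would be a human workflow rather than a boolean field, but the principle is the same: nothing enters the training set that was not deliberately selected and audited.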

Second, in alignment with responsible GenAI principles, FLMs are focused not just on a very narrow domain, such as a company's stores of customer purchase data, but also on a particular task to perform. This fine level of task specificity ensures that the appropriate task training data is selected and audited, a must-have for keeping hallucinations firmly locked out and ensuring that the model's performance accurately reflects the curtailed task-completion examples in its training data.

Built to deliver consistent, compliant responses

The steps below illustrate how FLM principles can be operationalized into production systems delivering answers that comply with company rules and government regulations, balancing the exciting potential of GenAI with the safety and governance that responsible AI imparts.

  • Define the task. The data science team works with business owners and domain experts to define the new capability, including the specific problem to be solved. For example, that may include proper responses from contact center agents to a specific type of customer inquiry. It may also include the required success criteria, internal data sources and the time span of historical data to be considered. This phase produces a highly curtailed set of seed data, composed of correct and incorrect ways that contact center agents treat customers.

  • Train the model with synthetic data. The data science team then uses the experts' input captured in the seed data to create huge volumes of synthetic language data. For instance, hundreds of customer treatment examples can translate into millions of synthetic language examples, which are used to train the FLM task model. Using synthetic data enforces consistent behavior by the FLM in production and allows personally identifiable information to be avoided, a key privacy concern across industries.

  • Augment the training data. Accurate treatment decisions often can’t be made without factoring in the customer’s history and other possible variables, such as length of relationship and purchase history. Therefore, for optimal decision-making, the FLM can be enhanced with previous interaction data and transaction records, pulled in through linkages to enterprise customer relationship management and other systems, providing an individualized view.

  • Give the agent context-sensitive, real-time guidance. As the agent communicates with the customer, the FLM provides an appropriate script in real time. The agent, assured of acting compliantly, can focus on having an empathetic conversation to solve the issue that prompted the customer’s call. In the background, the FLM also scores the appropriateness of potential actions, presenting the agent with an appropriate next step at the right moment.
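The four steps above can be sketched end to end in a few lines. This is a hedged toy sketch, not a product API: the seed examples, the synthetic-variant generator, the CRM lookup and the keyword-based scorer are all hypothetical stand-ins for what a trained FLM and its surrounding systems would do.

```python
import random

# Toy sketch of the four-step FLM workflow described above.
# Every name and data structure here is an illustrative assumption.

SEED = [  # Step 1: a small, expert-curated set of seed examples
    ("customer reports a duplicate charge", "apologize and open a dispute"),
    ("customer asks about refund timing", "state the 5-7 business day policy"),
]

def synthesize(seed, n_per_example=3, rng=None):
    """Step 2: expand seed pairs into synthetic variants (no real PII)."""
    rng = rng or random.Random(0)
    openers = ["Hi, ", "Hello, ", "Good morning, "]  # trivial paraphrase stand-in
    return [(rng.choice(openers) + inquiry, response)
            for inquiry, response in seed
            for _ in range(n_per_example)]

def augment(example, crm_record):
    """Step 3: attach customer history pulled from a (hypothetical) CRM."""
    inquiry, response = example
    return {"inquiry": inquiry, "response": response,
            "tenure_years": crm_record["tenure_years"]}

def score_actions(candidate_actions, issue_keyword):
    """Step 4: rank next-best actions in real time; a keyword match
    stands in for the trained FLM's appropriateness scoring."""
    return max(candidate_actions, key=lambda a: issue_keyword in a)

training_data = synthesize(SEED)
print(len(training_data))  # 2 seed examples x 3 variants = 6
```

A real deployment would replace `synthesize` with a generative paraphrasing pipeline and `score_actions` with the FLM itself; the sketch only shows how the pieces hand data to one another.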

FLMs operationalize GenAI responsibly

Focused language models offer strong benefits to organizations in the financial services industry and across many others.

First and foremost, FLMs provide an immediately accessible, responsible approach to stamping out GenAI hallucinations and achieving compliance in a global regulatory environment. Zooming out, FLMs are ideal building blocks for autonomous agentic applications, executing multitask workflows with high precision.

Paired with responsible AI principles and AI governance frameworks, FLMs' focus on domain data and task training establishes this analytic construct as a trusted, hallucination-free decisioning foundation, both today and for the agentic future.
