Victor Isaac Oshimua

Understanding Why Large Language Models Hallucinate

Large language models (LLMs) like GPT-4 and Gemini are estimated to produce factually incorrect responses around 10 to 30% of the time.

As a user of these LLMs, you might have come across scenarios like this:

You ask the LLM a question such as:
"Who was the first person to walk on Mars?"

And the LLM responds:
"The first person to walk on Mars was Alexei Ivanov in 2024 as part of the Mars One mission."

In reality, no human has ever walked on Mars as of 2025. The LLM has fabricated a name, a date, and even a mission (Mars One) that do not correspond to actual events.

This phenomenon is called hallucination.

I personally find the metaphor "hallucination" somewhat misleading: a hallucination is a sensory perception without real sensory input, and LLMs have no senses at all, so "incorrect output" would be a more fitting term. Nevertheless, we will stick with the standard terminology used by the AI community.

In this article, you will learn what LLM hallucination is, why it happens, and how to reduce it.

What Does Hallucination Mean in LLMs?

Hallucinations in LLMs occur when their generated outputs deviate from facts or contextual logic. As the Mars example above shows, an LLM may generate output that appears correct but is not grounded in factual information.
This can result in misinformation, particularly in critical industries like education or law, where accuracy in generated outputs is essential.

Categories of LLM Hallucination

Let’s break down LLM hallucinations into different levels to better understand how they occur.

Output Contradiction: This occurs when an LLM generates a sentence that contradicts a previously generated output. For instance, an LLM might generate a sentence stating that Argentina won the World Cup, and in the next sentence, it could claim that America won the World Cup.

Input Contradiction: This occurs when an LLM generates responses that contradict the input prompt or instruction.
For example, if an LLM is prompted with:
"Write a story about a person who enjoys going on long walks with their dog."

But it generates:
"The character dislikes walking and avoids their dog at all costs."

The response does not align with the original prompt, which makes this an input contradiction.

Factual Contradiction: As the term suggests, this is output generated by an LLM that is inaccurate or false. For example, an LLM stating that Michael Jordan is a boxer.

Why Do LLMs Hallucinate?

There is no single, straightforward answer to why LLMs hallucinate. Even the engineers who develop these models often struggle to understand the "black box" nature of how LLMs generate their outputs. However, there are a few key reasons we can point out:

Input Context: The prompts provided to LLMs play a crucial role in guiding the model to generate relevant outputs. However, if the prompt is vague or ambiguous, it can confuse the model and lead to less accurate or irrelevant responses.
For example, if a user asks, "Tell me about the history of space exploration," the LLM can provide a detailed and accurate response.
But if the prompt is vague, such as "Tell me about space," the model might struggle to determine whether the user is asking about astronomy, space travel, science fiction, or something else entirely.
This ambiguity can result in a response that is too broad, off-topic, or even factually incorrect.

Data Quality: The data used to train LLMs is often filled with noise, errors, and biases. For instance, if an LLM is trained on data scraped from platforms like Reddit, there is a high likelihood of inaccuracies.
For example, a Reddit user might claim that aliens are currently living among us on Earth. Since the LLM cannot verify the accuracy of such claims, it may inadvertently learn and reproduce these inaccuracies in its outputs.

How To Reduce Hallucination

What can you do to reduce hallucinations? Do you just keep prompting different LLMs and hope one avoids hallucinations better than the others? While trial-and-error experimentation can teach us about LLMs' strengths and performance, there are far more reliable ways to tackle inaccuracies.
Here are some strategies to reduce hallucinations and improve the quality of your interactions with LLMs:

1. Use Clear and Specific Prompts
Vague prompts can confuse the model, leading to less accurate responses. Detailed prompts help the model understand exactly what information you’re seeking.

Example: Instead of asking, "What happens on December 25th?" try, "Can you explain the major holiday celebration that happens every year on December 25th?"
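
To make this concrete, here is a minimal sketch using the `openai` Python client (the model name "gpt-4o" is only an example; swap in whichever model and SDK you actually use). The only real difference between a vague and a specific request is the prompt string you send:

```python
# Minimal sketch: same question, vague vs. specific prompt.
# Assumes the `openai` Python package and an OPENAI_API_KEY in your environment;
# "gpt-4o" is just an example model name.
from openai import OpenAI

client = OpenAI()

vague_prompt = "What happens on December 25th?"
specific_prompt = (
    "Can you explain the major holiday celebration that happens "
    "every year on December 25th?"
)

# The specific prompt narrows the space of plausible completions,
# making an off-topic or fabricated answer less likely.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": specific_prompt}],
)
print(response.choices[0].message.content)
```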

2. Adjust Model Parameters
Many LLMs allow you to control parameters like temperature when prompting; this influences the randomness of outputs.
For instance, a lower temperature produces more conservative and focused responses, reducing the likelihood of hallucinations.
A higher temperature increases creativity but also raises the risk of inaccuracies.
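
As a rough illustration (again assuming the `openai` Python client; other SDKs expose temperature under a similar name), the same kind of call can be made more conservative or more creative just by changing this one parameter:

```python
# Minimal sketch: temperature controls the randomness of the sampled output.
# Assumes the `openai` Python package; "gpt-4o" is an example model name.
from openai import OpenAI

client = OpenAI()

# Low temperature: more deterministic, focused answers for factual questions.
factual = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.2,
    messages=[{"role": "user",
               "content": "Who was the first person to walk on the Moon?"}],
)

# High temperature: more varied, creative output, with a higher risk of inaccuracies.
creative = client.chat.completions.create(
    model="gpt-4o",
    temperature=1.0,
    messages=[{"role": "user",
               "content": "Write a short story about a Moon landing."}],
)

print(factual.choices[0].message.content)
print(creative.choices[0].message.content)
```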

3. Employ Multi-Shot Prompting
Instead of single-shot prompting (one input), provide the model with multiple examples of the desired output format or context. This helps the model recognise patterns and expectations more effectively.
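
Here is a minimal few-shot sketch, again assuming the `openai` client; the example question/answer pairs are invented purely to show the format and the "say you don't know" behaviour we want the model to imitate:

```python
# Minimal few-shot sketch: the example Q&A pairs are made up for illustration.
# Assumes the `openai` Python package; "gpt-4o" is an example model name.
from openai import OpenAI

client = OpenAI()

few_shot_messages = [
    {"role": "system",
     "content": "Answer factual questions in one short sentence. "
                "If you are not sure, say you don't know."},
    # Example 1: demonstrates the expected format for a well-known fact.
    {"role": "user", "content": "Who was the first person to walk on the Moon?"},
    {"role": "assistant", "content": "Neil Armstrong, in 1969."},
    # Example 2: demonstrates that admitting uncertainty is acceptable.
    {"role": "user", "content": "Who was the first person to walk on Mars?"},
    {"role": "assistant", "content": "No one. No human has walked on Mars."},
    # The real question comes after the examples.
    {"role": "user", "content": "Who was the first person to orbit the Earth?"},
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=few_shot_messages,
)
print(response.choices[0].message.content)
```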

4. Understand the Causes of Hallucination
Hallucinations often come from ambiguous inputs or insufficient context. By identifying these factors, you can trace the reasons for hallucination and improve the reliability of the model’s outputs.

Final Thoughts

LLM hallucination remains a significant challenge in generating accurate responses from LLMs. However, this issue is not unavoidable. By adopting strategies like clear prompting, parameter tuning, and multi-shot examples, we can mitigate hallucinations and enhance output reliability.

Furthermore, advancements in reasoning model architectures and training methodologies, such as improved fact-checking mechanisms, are reducing the rate of LLM hallucinations.
