AI Hallucinations Aren’t Accidents: They’re Built Into the System. Who’s Responsible?

When AI lies to you, how can you stay “digitally aware”?

The growing integration of generative AI into professional, informational, and everyday decision-making has opened a new area of risk: the so-called “AI hallucination” effect. A well-publicised example from Australia illustrates the stakes. In 2025, Deloitte had to partially refund the Albanese government after a AU$440,000 report it had prepared for a federal department was found to contain numerous errors, including fabricated references linked to AI-assisted content generation. The incident drew sharp criticism of the reliability of AI in high-level consultancy and of the insufficient human oversight involved.

Partial refund to be issued after several errors were found in a report into a department’s compliance framework. Credit: Adam Constanza/Shutterstock.com.

A similar tension is surfacing in China, where the country’s first major lawsuit over AI hallucination was recently adjudicated. In that case, Liang noticed that an AI system had produced factually false university-related information while he was helping a student with the college application process. When he challenged the system, the AI replied that it would compensate him 100,000 yuan if its answer proved incorrect. Liang then sued the platform’s developer, demanding damages on the basis of this statement. The Hangzhou Internet Court rejected the claim in its first-instance judgment, pointing to legal uncertainty around liability for AI-generated content.

These episodes, occurring in different jurisdictions and institutional contexts, share a theme: AI hallucination is neither a random accident nor a problem confined to one nation or legal system. As governments, corporations, and individuals rely increasingly on AI output for decision-making, legal systems everywhere face unresolved questions about responsibility, liability, and trust in machine-generated knowledge.

What Causes AI Hallucinations?

Understanding why large language models (LLMs) produce fabricated legal content requires examining how these systems are fundamentally designed. Hallucinations are not accidental glitches but structural consequences of the underlying architecture of generative AI.

wildpixel/iStock via Getty Images

First, LLMs are based on next-token prediction. After being trained on massive text corpora, the model produces output by repeatedly selecting a statistically probable next token given the preceding context. This means that even when an LLM generates text that appears coherent, authoritative, or legally sophisticated, it is not “understanding” the law; it is simply extending linguistic patterns learned from its training data. LLMs “do not ‘understand’ content in a human sense, but instead assemble sequences that appear correct based on statistical likelihood.”
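To make the mechanism concrete, here is a minimal sketch of greedy next-token selection. The candidate tokens and their scores are invented for illustration and do not come from any real model; production systems use far larger vocabularies and usually sample from the distribution instead of always taking the top token.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def pick_next_token(logits):
    """Greedy decoding: return the most probable token and all probabilities."""
    probs = softmax(logits)
    return max(probs, key=probs.get), probs


# Hypothetical scores for continuations of
# "The leading case on this point is Smith v. ..."
# None of these names refers to a real case.
toy_logits = {"Jones": 2.1, "Brown": 1.9, "Taylor": 1.2, "[unsure]": -0.5}

token, probs = pick_next_token(toy_logits)
print(token, round(probs[token], 3))
```

Nothing in this procedure checks whether the chosen continuation is true. The token that best fits the pattern wins, and the option that signals doubt receives the lowest probability in this toy example.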

Second, modern LLMs are shaped by reinforcement learning from human feedback (RLHF), which rewards useful, fluent, and agreeable responses. This training fosters sycophancy: the habit of providing confident responses that match what users expect, even when the model lacks reliable information. LLMs are “trained to please the user, not to admit uncertainty,” meaning they are much more likely to generate plausible-sounding legal rules, citations, or facts than to respond with “I don’t know.”

Third, LLMs are poor at expressing uncertainty. Even where the internal probability distributions suggest low confidence, the model’s output does not consistently convey it. Post-training alignment generally degrades calibration, so the model may sound most sure of itself precisely when it is most inaccurate. The result is a dangerous problem for the legal system, where confidence can be mistaken for credibility.

Lastly, LLMs show a well-documented tendency toward overconfidence. Frontier models such as GPT-4 have been found to be paradoxically “more likely to be confidently wrong” than earlier versions, even though they have a lower raw hallucination rate. This combination of high fluency, high confidence, and low epistemic awareness creates ideal conditions for generating fabricated legal content that looks legitimate.
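The gap between stated confidence and actual accuracy can be quantified. One common measure, which this article does not itself name, is expected calibration error (ECE): answers are grouped by confidence, and each group’s average confidence is compared with its observed accuracy. The sketch below uses made-up records purely for illustration.

```python
def expected_calibration_error(records, n_bins=5):
    """Compute ECE from (confidence in [0, 1], was_correct) pairs."""
    bins = [[] for _ in range(n_bins)]
    for confidence, correct in records:
        # Confidence 1.0 would index past the end, so clamp to the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_confidence = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's gap by the share of answers it contains.
        ece += (len(bucket) / len(records)) * abs(avg_confidence - accuracy)
    return ece


# Invented data: a model that claims 90% confidence but is right 1 time in 4,
# and one that claims 75% confidence and is right 3 times in 4.
overconfident = [(0.9, True), (0.9, False), (0.9, False), (0.9, False)]
well_calibrated = [(0.75, True), (0.75, True), (0.75, True), (0.75, False)]

ece_overconfident = expected_calibration_error(overconfident)
ece_calibrated = expected_calibration_error(well_calibrated)
print(ece_overconfident, ece_calibrated)
```

A score near zero means the stated confidence tracks reality. A large score, as in the first toy dataset, is the numerical signature of a system that sounds sure while being wrong.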

What Went Wrong in the Deloitte Case?

The Deloitte incident is better understood through the eight-category error taxonomy developed in recent AIGC research. Although Deloitte framed the problem as a quality-control lapse, the errors in the report fit neatly into several defined categories of AI-generated distorted information. This shows that the problem was not random but structurally predictable.

Factual Errors

Factual errors occur when AI systems make statements that are inconsistent with the truth: faulty data, misdescribed policies, or inaccurate institutional information. Deloitte’s report contained several such inaccuracies, including inaccurate statistics and incorrect descriptions of government programs. These were not minor mistakes: they showed a failure to check even basic statements of fact. In the taxonomy’s terms, this is a classic misstatement of objective fact, one that routine validation could have caught.

Unfounded Fabrication

Unfounded fabrication is content that is wholly invented: nonexistent sources, fake references, or hypothetical analytical models. Deloitte’s report cited documents that did not exist and referenced analytical models that had no basis in prior research or policy practice. These are not “mistakes” in the traditional sense; they are inventions, produced because the model was nudged into generating authoritative-sounding material where none existed.

Reasoning Errors

Reasoning errors arise when AI systems generate plausible-sounding arguments built on faulty logic, incorrect causal premises, or a misapplication of legal or policy principles. Deloitte’s report showed several such issues, including unsupported causation claims, recommendations poorly aligned with policy principles, and analytical leaps not backed by evidence. These errors are symptomatic of a model that imitates the form of reasoning without actually performing it, producing logic that sounds coherent but is not fundamentally meaningful.

Text Output Errors

Text output errors concern structure, redundancy, transitions, and style. Deloitte’s report included sections that were repetitive, poorly structured, or inconsistent with professional practice. Such problems are common in AI-generated text, which often prioritizes fluency over coherence and so produces verbose or circular passages. These errors fall squarely within the taxonomy’s text output error category.

Collectively, these four categories suggest the Deloitte report was not merely “low quality”: it combined several well-known kinds of AI-generated error. Viewed through the taxonomy, Deloitte’s failure was not incidental but systemic; the organization did not have the frameworks needed to identify, classify, and manage AI-related risks.

The Deeper Problem: A Lack of AI Risk Classification Awareness

The real problem exposed by the Deloitte affair is not the existence of mistakes but the lack of a structured approach to understanding and controlling AI-generated risks. As the academic literature has noted, the industry is widely perceived as lacking standardized criteria for detecting and classifying AI errors. Most organizations rely on “subjective and inconsistent” approaches and fail to recognize the full range of distorted information that generative models produce.

Deloitte’s workflow mirrors this larger industry issue. The firm incorporated AI into its report-writing process without setting up clear verification protocols, error-classification frameworks, human-in-the-loop review standards, or disclosure requirements for AI-generated content.

In other words, Deloitte treated AI as a productivity tool rather than a high-risk system requiring specialized oversight. This is precisely the kind of institutional vulnerability the literature warns about: legal and professional environments are particularly susceptible to AI-generated fabrications because they rely heavily on patterned, citation-driven text that AI can mimic with deceptive fluency.
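As a rough illustration of what a basic verification protocol could look like, the sketch below checks each citation in a draft against a trusted index and routes anything unmatched to a human reviewer. The function names, the index, and the sample citations are all hypothetical; a real workflow would query library catalogues, DOI resolvers, or legal databases instead of an in-memory list.

```python
def normalise(text):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def triage_citations(cited, verified_index):
    """Split citations into confirmed entries and entries needing human review."""
    known = {normalise(entry) for entry in verified_index}
    confirmed, needs_review = [], []
    for citation in cited:
        if normalise(citation) in known:
            confirmed.append(citation)
        else:
            needs_review.append(citation)
    return confirmed, needs_review


# Placeholder entries, invented for this example only.
verified_index = ["Example Agency. (2021). Placeholder compliance review."]
draft_citations = [
    "Example Agency. (2021).  Placeholder Compliance Review.",
    "Invented Author. (2023). A study that is in no registry.",
]

confirmed, needs_review = triage_citations(draft_citations, verified_index)
print(len(confirmed), "confirmed;", len(needs_review), "flagged for review")
```

The design choice that matters is the default: a citation that cannot be matched is flagged for a person to check, never silently accepted. Exact matching is deliberately strict here, so a genuine source cited in an unusual format would also be flagged, which is the safer failure mode.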

The Deloitte case also illustrates the vulnerability of the “AI-first” consulting model. Organizations that aggressively embrace generative AI for efficiency gains typically overlook its epistemic dangers. Without strong classification systems, AI-first workflows can produce outputs that seem authoritative but are structurally unreliable. The peril is not so much that errors happen as that they happen systematically and invisibly, hidden beneath a polished, professional-looking surface where they simply fail to register.

In the end, Deloitte’s failure was not technological but conceptual. The firm lacked an understanding of how AI errors manifest, how they can be classified, and how they can be controlled. Unless organizations adopt robust error-classification frameworks and embed them in their daily operations, similar problems are likely to recur even as the underlying AI models improve.

What Are the Consequences—and Who Is Responsible?

AI hallucinations pose serious risks to legal and administrative institutions. Their effects reach beyond ordinary human error to the foundations of a system that relies on institutional trust, evidentiary principles, and professional responsibility.

Consequences: Erosion of Trust and Distortion of Legal Reasoning

Hallucinated content can corrupt legal arguments, distort precedents, and embed fictitious facts into formal records. In the legal field, where authority is established by citation, coherence, and textual fidelity, fabricated content undermines the legitimacy of decisions and erodes public confidence in judicial procedures and outcomes. Hallucinations “constitute a profound epistemic challenge,” revealing how easily legal discourse, which is heavily patterned and citation-driven, can be replicated without understanding.

Responsibility: A Multi‑Layered Accountability Structure

Responsibility for AI hallucinations cannot be placed on a single party. Instead, it needs to be distributed across three levels:

  • Individual Users

Users bear the primary responsibility for verifying the accuracy of any AI-generated content they use in professional work. Courts in several jurisdictions have made this plain: AI tools can support, but never replace, legal research and professional judgment. The negligence lies in the failure to check, not in the failure of the technology.

  • Institutional Actors

Organizations need internal protocols for AI usage, including obligations covering disclosure, verification, and quality control. The Deloitte case illustrates the danger of institutional overreliance on AI when insufficient safeguards are in place. Institutions that deploy AI tools must ensure that their staff understand what these systems can and cannot do.

  • AI Developers and Vendors

Developers also have an obligation not to overstate the reliability of their systems, especially through claims such as “hallucination-free legal citations,” which have been found to be misleading. Vendors should be transparent about model limitations, provide citation verification tools, and avoid marketing that encourages users to trust AI outputs blindly.

Taming AI Hallucinations: A Shared Governance Challenge

Effectively governing AI hallucinations requires a multi-level approach that recognises their structural origins rather than treating them as isolated technical failures. Since hallucinations emerge from probabilistic generation, optimisation for fluency, and the production of distorted information, governance must combine regulatory oversight, institutional safeguards, and user-level responsibility.

Photo by Igor Omilaev on Unsplash

At the government level, the priority is to establish clear accountability frameworks and risk-based regulation. Given that hallucinations can generate persuasive yet false content with real social and economic consequences, policymakers should require transparency in AI systems, including disclosure of limitations, data provenance, and known error rates. Regulatory mechanisms could mandate auditing processes, especially in high-stakes domains such as law, healthcare, and finance, where hallucinated outputs may lead to material harm. In addition, public investment in AI literacy and independent evaluation bodies would further strengthen these regulatory efforts by improving public understanding of AI limitations and supporting the continuous assessment of system reliability.

At the individual level, users must adopt a critical and reflexive stance toward AI-generated content. Because LLMs are designed to produce confident and coherent answers even in the absence of factual grounding, individuals should treat outputs as probabilistic suggestions rather than authoritative truths. Practical strategies include cross-checking information with reliable sources, avoiding blind reliance in high-stakes decisions, and developing basic AI literacy to understand how and why hallucinations occur. Professionals, in particular, bear heightened responsibility to verify AI-assisted outputs before use, as demonstrated by increasing cases of fabricated citations in legal contexts.

In short, good governance combines institutional regulation with individual vigilance. Governments can establish the rules and infrastructure for trustworthy AI systems, but without informed and critical users those measures are not enough. Conversely, individual caution alone cannot compensate for structural weaknesses in system design. A sustainable governance framework should therefore align technical design, regulatory environments, and user awareness, so that the risks of AI hallucination can be reduced without losing the benefits of generative AI.

References

The Guardian. (2025, October 6). Deloitte to pay money back to Albanese government after using AI in $440,000 report. https://www.theguardian.com/australia-news/2025/oct/06/deloitte-to-pay-money-back-to-albanese-government-after-using-ai-in-440000-report

Stanford Human-Centered Artificial Intelligence Institute. (2024). AI trial: Legal models hallucinate 1 out of 6 (or more) benchmarking queries. https://hai.stanford.edu/news/ai-trial-legal-models-hallucinate-1-out-6-or-more-benchmarking-queries

Sohu. (n.d.). AI hallucinations are no illusion: The structural challenges behind distorted AI-generated information. https://www.sohu.com/a/981976312_121106884

Xinhua News Agency. (2026, February 6). Beware of the chaos of AI-generated content: Do not let academic integrity be compromised. https://www.news.cn/sikepro/20260206/0532d0265252439e8f52b15cab3b293b/c.html

Flew, T. (2021). Regulating platforms. John Wiley & Sons.

Sun, Y., Sheng, D., Zhou, Z., et al. (2024). AI hallucination: Towards a comprehensive classification of distorted information in artificial intelligence-generated content. Humanities and Social Sciences Communications, 11, 1278. https://doi.org/10.1057/s41599-024-03811-x

Maleki, N., Padmanabhan, B., & Dutta, K. (2024). AI hallucinations: A misnomer worth clarifying. In Proceedings of the 2024 IEEE Conference on Artificial Intelligence (CAI) (pp. 133–138). IEEE. https://doi.org/10.1109/CAI59869.2024.00033

Charlotin, D. (2025). GenAI’s legal fictions: Addressing hallucinations in the international dispute resolution arena. In YY (forthcoming).
