Why Generative AI Stalls in Insurance: The “Data Landfill” Problem

jquigley

Keynote Recap

At the recent Altaworld Insurance Tech & Innovation Conference in Chicago, the agenda made one thing clear: insurance operations are changing at breakneck speed. Sessions across the two-day event went deep on everything from AI-driven underwriting and claims automation to cyber insurance, IoT, ESG integration, personalization, and emerging risk strategies.

Yet as carriers sprint to operationalize these advanced frameworks, a frustrating reality is setting in behind closed doors: most downstream automation and GenAI initiatives stall or never make it past the proof-of-concept phase.

During a packed keynote addressing over 150 insurance executives, Ondexx CEO Paul Quigley and SVP Michael Greenhow delivered a sobering boardroom reality check: the bottleneck isn’t a failure of the AI models. It’s a systemic crisis of unstructured, uncurated enterprise data—the “Data Landfill.”

The Hidden Problem: The 85-95% “Data Landfill”

Insurance carriers run on massive, complex knowledge ecosystems with terabytes of unstructured content and documents—treaties, underwriting guidelines, and multi-page policy manuals. Yet, decades of relying on legacy, IT-managed document repositories such as SharePoint have left organizations with a structural nightmare: in a typical corporate repository with millions of documents, only 5% to 15% represents the verified, legally active “final version” of truth. The remaining 85% to 95% is a landfill of obsolete duplicates, unvetted drafts, and unapproved regional files.

Feed this unstructured clutter into an advanced Large Language Model (LLM) via Copilot or a similar assistant, and the outcome is predictable:

Dirty Data + GenAI = An AI lying with absolute confidence.

Because LLMs are predictive engines that cannot determine organizational authority on their own, they blend conflicting data points into a single fluent answer. The result is a highly confident hallucination that can carry multi-million-dollar consequences.

The Keyword Trap in Action

Consider a mid-level commercial property underwriter typing: “What is our wood frame construction secondary water damage deductible requirement for 2025?”

In a standard SharePoint environment, a general keyword search scans the entire drive and returns 38 conflicting results, including:

An unvetted Q2 2024 actuary template left in the folder.
An unapproved localized regional instruction sheet.
An outdated 2023 version that was never marked historical.

Traditional Repository + Copilot

The Breakdown: The chatbot ingests all 38 files simultaneously. Because it cannot isolate authority or decipher context, it averages and “smooths” the conflicting data points.

The Hallucination Answer: “According to our corporate underwriting guidelines, the deductible is $5,000.”

The Severity: The system validates an unapproved, localized rule. The underwriter issues the policy at the lower threshold, exposing the carrier to an unhedged, multi-million-dollar liability.

The Ondexx Curated API Pipeline

The Breakthrough: Ondexx actively sanitizes information at the source. It prevents data landfills, structureless metadata, and behavioral clutter from reaching downstream architectures.

The Compliant Answer: “Per 2025 Master Guidelines, the mandatory corporate-wide minimum deductible for secondary water damage on wood frame construction is $25,000. This rule is legally binding across all regions with zero exceptions.”

The Guardrail: The carrier immediately eliminates underwriting exposure and enforces structural compliance in seconds.

In short: with the underwriter racing to hit a deadline, a generic corporate chatbot synthesizes the messy search results, validates the unapproved draft, and reports a $5,000 deductible instead of the mandatory $25,000 corporate minimum. The carrier is instantly exposed to catastrophic, unhedged liability.
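To make the contrast concrete, here is a minimal Python sketch of selecting a single authoritative document before anything reaches a model. It is an illustration of the idea only, not the Ondexx implementation: the Doc fields, the status and scope values, and the effective dates are all assumptions made up for this example. Only the $5,000 and $25,000 figures come from the scenario above; deductibles the scenario does not state are left as None.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Doc:
    title: str
    status: str                    # hypothetical values: "approved", "draft", "superseded"
    scope: str                     # hypothetical values: "corporate", "regional"
    effective: date                # placeholder dates, not taken from the article
    deductible_usd: Optional[int]  # None where the scenario gives no figure


def authoritative(docs, as_of):
    """Return the approved, corporate-wide document in force on as_of, or None."""
    candidates = [
        d for d in docs
        if d.status == "approved" and d.scope == "corporate" and d.effective <= as_of
    ]
    # The most recent effective date wins; values are never averaged or blended.
    return max(candidates, key=lambda d: d.effective, default=None)


docs = [
    Doc("Q2 2024 actuary template", "draft", "corporate", date(2024, 4, 1), None),
    Doc("Regional instruction sheet", "draft", "regional", date(2025, 1, 1), 5_000),
    Doc("2023 guidelines (never marked historical)", "superseded", "corporate", date(2023, 1, 1), None),
    Doc("2025 Master Guidelines", "approved", "corporate", date(2025, 1, 1), 25_000),
]

best = authoritative(docs, as_of=date(2025, 6, 1))
print(best.title, best.deductible_usd)  # prints: 2025 Master Guidelines 25000
```

The point of the sketch is that authority is decided by explicit metadata before retrieval, so the model never sees the draft or the regional sheet and has nothing to reconcile.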

The Regulatory Shift: “The LLM Hallucinated” Is No Longer an Excuse

Data hygiene is no longer just an IT optimization project; it is an urgent legal mandate. In a highly regulated environment, a model hallucination isn’t an IT glitch—it’s an unauthorized payout and a direct compliance violation.

The NAIC Model AI Bulletin: Currently active across 20 U.S. states, with 20 more coming online over the next 6 to 18 months. Writing business in just one active state exposes your entire corporate AI pipeline to that state’s auditing standards.
Canada’s OSFI Guideline E-23: Mandates comparably stringent AI data integrity and governance requirements, effective May 1, 2027.

Regulators have made the boardroom reality clear: Carriers are fully, legally responsible for their AI’s data integrity. You can no longer blame the algorithm or your third-party vendor. If your automated systems quote flawed rules, your organization faces major compliance penalties and legal liability.

Why SharePoint Fails as an AI Data Source

Ondexx CEO Paul Quigley pulled back the curtain on why general-purpose document repositories are inherently unequipped to fuel downstream AI systems:

The “Free” Illusion: Legacy solutions like SharePoint are rarely free, costing an estimated $250K to $750K annually in hidden IT headcount per 1,000 users just to manage file clutter.
The Pedigree Gap: These solutions are designed, built, and managed by IT teams rather than by software and application engineers who understand downstream knowledge delivery, structure, and integrity.
The 30-Year-Old User Behavior Problem: Built on a rigid file-and-folder mentality, these repositories allow users to constantly create fragmented, unvetted duplicates on shared drives, continually feeding the data landfill.

Figure 2: Why traditional storage repositories fail as an AI data source—generic bots smooth over chaotic data to present incorrect rules with absolute confidence.

The Solution: Why Flashy AI Fails Without Rock-Solid Architecture

While horizontal tech vendors sprint toward flashy, unvetted AI automation, Ondexx takes a contrarian stance. We embrace what the AI tech world dramatically underestimates and treat it as a client-winning position: excellence in design and rock-solid information architecture.

Ondexx doesn’t replace your existing storage infrastructure; it sits cleanly on top of your current Microsoft investments. True transformation requires Knowledge Demolition—stripping away the 85-95% landfill at the source so your private AI models only ever touch a validated Single Version of Truth.

Fueling Your AI Strategy via the Ondexx 6.0 Content Extraction API

Through the newly released Ondexx Content Extraction API, downstream systems, RAG pipelines, and internal corporate copilots can securely receive authoritative knowledge with its critical business context intact:

Consistent Semantic Structure: Knowledge is organized into a rock-solid hierarchy, removing ambiguity so LLMs don’t have to guess business intent from folder locations.
Granular Permission Guardrails: User access can be controlled down to the lowest level in Ondexx. The API passes this access-control context directly to downstream systems, ensuring restricted underwriting rules or procedures are never surfaced to the wrong audience.
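For readers wiring this into a RAG pipeline, the two properties above can be sketched in Python as follows. This is a hypothetical illustration and does not reflect the actual Ondexx Content Extraction API: the Chunk type, its fields, the group names, and both helper functions are assumptions invented for the example. It shows content arriving with a hierarchy path and an access-control list, and a downstream system filtering on that list before building prompt context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    path: tuple                # hypothetical: position in the knowledge hierarchy
    text: str
    allowed_groups: frozenset  # hypothetical: access-control context sent with the content


def visible_to(chunks, user_groups):
    """Keep only chunks whose allowed groups overlap the user's groups."""
    groups = frozenset(user_groups)
    return [c for c in chunks if c.allowed_groups & groups]


def to_prompt_context(chunks):
    """Prefix each chunk with its hierarchy path so business context is explicit."""
    return "\n".join(f"[{' > '.join(c.path)}] {c.text}" for c in chunks)


chunks = [
    Chunk(
        ("Underwriting", "Commercial Property", "Water Damage"),
        "Minimum deductible for secondary water damage on wood frame construction: $25,000.",
        frozenset({"underwriters"}),
    ),
    Chunk(
        ("Reinsurance", "Treaties"),
        "Restricted treaty terms.",
        frozenset({"reinsurance"}),
    ),
]

# An underwriter's request: the restricted treaty chunk is dropped before prompting.
context = to_prompt_context(visible_to(chunks, {"underwriters"}))
print(context)
```

Filtering before the prompt is assembled, rather than asking the model to withhold restricted text, keeps the guardrail deterministic and auditable.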

Secure Your Foundation

A brilliant AI model is entirely useless—and legally dangerous—if it is grounded in an unmanaged internal landfill. By shifting your strategy from fixing chaotic data to actively preventing it at the source, your organization can capture significant ROI while mitigating massive regulatory liabilities.

Coming Soon: We are putting the finishing touches on two brand-new, deep-dive white papers on AI Governance designed to help insurers navigate this landscape:

Why RAG Fails Without Governed Enterprise Knowledge
Why AI Audit Trails Fail with Your Current Source Repository

Connect with Michael Greenhow at mgreenhow@ondexx.com or visit www.ondexx.com to learn more.
