Financial Chatbot Design Failures: Rearchitecting for User Behavior and Compliance

The Real Bottleneck in AI: When Models Meet Messy Reality

In the breathless rush to celebrate every new large language model, it’s easy to get lost in what the models *can* do. We hear about extraordinary capabilities, and we see impressive demos. But talk to engineers who deploy these systems at scale, in genuinely high-stakes environments, and you'll find a different conversation entirely. It’s not about the raw power of the underlying AI anymore. The critical challenge has shifted to the "plumbing" around those models: the complex systems thinking required to make AI genuinely useful, reliable, and trustworthy for real users.

This is a problem Vijay Kumar Sridharan has focused on for well over a decade. As a Vice President of Software Engineering at Goldman Sachs, he currently leads Conversational AI initiatives. His experience reveals a truth often overlooked in the hype cycle: the hardest problem isn't natural language understanding or entity extraction. Those elements of the NLP stack, while challenging, are largely solved for many enterprise use cases today. The real friction point is designing a conversational AI system that truly serves a user's journey, rather than just processing isolated inputs based on a company's internal logic.

Building for How Humans Think, Not How Companies Organize

Here’s the thing: most chatbots are built to reflect the company's information architecture. They mimic product categories, departmental silos, or database schemas. But users don’t think that way. They have a problem, a question, or a need, and they expect the system to understand *their* mental model. This disconnect is where frustration creeps in, and where even technically capable bots fail.

This principle was put to the test during the early days of the COVID-19 pandemic. In 2020, Sridharan was leading engineering at OneMain Financial, the largest personal installment loan company in the U.S. As branches restricted operations and call centers buckled under unprecedented demand, customers still needed to manage loans, defer payments, and get answers, often outside business hours. OneMain's existing chatbot infrastructure was simply not up to the task. Two previous development efforts had already stalled.

What Sridharan inherited was a pile of failed attempts and plenty of speculation about why they hadn't worked. His team went straight to the source: user interaction logs from those earlier failures. The patterns were stark. Bots faltered on anything outside a narrow set of expected phrasings. Customer intent was often ambiguous, and the systems offered no graceful way to navigate that ambiguity. They'd just error out or deliver a generic, unhelpful fallback.

His team tackled this by rethinking the NLP pipeline from the ground up. They trained classification models not just on pristine examples, but deliberately on the messy edge cases found in actual logs: the misspellings, abbreviations, and fragmented inputs that real people use. They also engineered context management that could maintain session state across the tangents and interruptions inherent in human conversation. Crucially, they moved away from the common chatbot approach of hitting a wall with a generic "I don't understand" or sending users back to a main menu. Instead, they built graduated confidence thresholds. If the model was uncertain about intent, the system would ask a clarifying question designed to move the conversation forward, rather than resetting it entirely. The user experienced a slight bump in the road, not a dead end.

The operational impact at OneMain was immediate and significant. Call volumes shifted, response times plummeted, and the system absorbed pandemic-level demand that would have been impossible for human agents alone. But the metric that truly signaled success was user adoption. People kept coming back to the bot, which doesn't happen with a system that creates frustration. It showed that building around the user's mental model, even under extreme pressure, yields tangible results.
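To make the idea of graduated confidence thresholds concrete, here is a minimal sketch in Python. It is not OneMain's implementation: the threshold values, the intent names, the `route` function, and the choice to hand off to a person at the lowest tier are all assumptions made for illustration. It only shows the general shape, where a mid-confidence prediction triggers a clarifying question instead of a reset.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would be tuned against interaction logs.
HIGH_CONFIDENCE = 0.80
LOW_CONFIDENCE = 0.40


@dataclass
class Action:
    kind: str      # "fulfill", "clarify", or "handoff"
    message: str


def route(intent_scores: dict) -> Action:
    """Choose the next step from classifier scores (intent name -> probability)."""
    if not intent_scores:
        return Action("handoff", "Let me connect you with someone who can help.")

    # Rank candidate intents from most to least likely.
    ranked = sorted(intent_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_score = ranked[0]

    if top_score >= HIGH_CONFIDENCE:
        # Confident enough to act on the top intent directly.
        return Action("fulfill", f"Okay, let's {top_intent}.")

    if top_score >= LOW_CONFIDENCE:
        # Uncertain: ask a question that narrows the options and keeps
        # the conversation moving instead of resetting it.
        options = [name for name, _ in ranked[:2]]
        return Action("clarify", "Did you want to " + " or ".join(options) + "?")

    # Very uncertain: assumed fallback is a handoff, not a dead end.
    return Action("handoff", "Let me connect you with someone who can help.")
```

With scores such as `{"make a payment": 0.55, "defer a payment": 0.35}`, `route` returns a `clarify` action that names both candidates, so the user is asked to pick between plausible readings of their message.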

Challenging the "Technically Unfeasible" Myth

Sridharan's career also points to a broader lesson in engineering leadership: always question core assumptions. A prime example is the CheckCapture project at OneMain. The company needed to enable mobile check payments and document processing for customers and staff. After seven months, the project was stuck. Senior engineers had concluded that existing architectural constraints made the solution "technically unfeasible."

Sridharan's take on this is pointed. "‘Technically unfeasible’ usually means ‘unfeasible given our current assumptions.’" He found the team had been trying to build *on top* of the existing architecture, rather than *alongside* it. The constraint wasn’t the technology itself, but a design choice. By rethinking the problem and building a solution that integrated with, but didn’t depend entirely on, the legacy architecture, his team successfully shipped CheckCapture. Payment processing that once took days was reduced to hours, and the system integrated with other critical platforms like Kofax TotalAgility, Mobius, and ELF for document ingestion. The lesson: when a problem is declared unsolvable, the first step is to scrutinize the assumptions that created that roadblock.

The Stakes are Higher: AI in Regulated Financial Environments

His move to Goldman Sachs in 2022 brought a new dimension to these challenges: navigating the rigorous regulatory environment of a major investment bank. The scale alone is different, but the compliance frameworks are what truly reshape AI system design. At a consumer lending company, the consequences of a faulty chatbot interaction, while real, are generally bounded. At a major investment bank, the cost of an error can be far larger, and regulatory scrutiny is intense. As Sridharan notes, "In finance, the cost of being confidently wrong is much higher than in, say, a retail recommendation system." If a bot provides incorrect information about a financial instrument or suggests the wrong product, it's not just a bad user experience; it's a significant regulatory and trust issue.

This reality has sharpened his focus on a problem the broader AI industry is only just beginning to grapple with: the very fluency that makes LLMs so impressive in demos also makes them dangerous in production settings where accuracy is non-negotiable. They're excellent at sounding right, even when they’re not.

His solution involves what he terms "verification layers." These aren't simple rule filters. They represent architectural decisions about where in the AI pipeline to introduce checks, how to identify high-risk outputs, and critically, how the system degrades gracefully when its confidence is low, rather than projecting false certainty. The goal isn't just a capable model, but a system that behaves predictably and safely at the edge of its knowledge. Financial services is being forced to solve this problem now, and the approaches being developed there are likely to become blueprints for other high-stakes domains like healthcare, legal, and insurance as they scale their LLM deployments.
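One way to picture a verification layer is as a checkpoint between the model's draft answer and the user. The sketch below is an illustration under stated assumptions, not a description of any system at Goldman Sachs: the `Draft` fields, the topic labels, the `MIN_CONFIDENCE` value, and the three outcomes (`answer`, `hedge`, `escalate`) are all hypothetical. It shows a check placed after generation, a rule for spotting high-risk outputs, and a degraded response when confidence is low.

```python
from dataclasses import dataclass, field

# Hypothetical labels for topics where a wrong answer carries regulatory risk.
HIGH_RISK_TOPICS = {"product_recommendation", "instrument_details"}
MIN_CONFIDENCE = 0.75  # hypothetical cutoff


@dataclass
class Draft:
    text: str                 # what the model wants to say
    topic: str                # label from an upstream classifier (assumed)
    confidence: float         # model or calibrator score in [0, 1] (assumed)
    sources: list = field(default_factory=list)  # IDs of supporting documents


def verify(draft: Draft) -> tuple:
    """Return (decision, text_to_show) for a drafted answer."""
    low_confidence = draft.confidence < MIN_CONFIDENCE

    if draft.topic in HIGH_RISK_TOPICS:
        # High-risk output must be both grounded and confident;
        # otherwise it never reaches the user.
        if not draft.sources or low_confidence:
            return ("escalate",
                    "I want to be sure you get an accurate answer, "
                    "so I'm routing this to a specialist.")
        return ("answer", draft.text)

    if low_confidence:
        # Degrade gracefully: share the draft, but state the uncertainty.
        return ("hedge",
                "I'm not certain about this, so please treat it as a "
                "starting point: " + draft.text)

    return ("answer", draft.text)
```

The design choice worth noting is that the high-risk branch is stricter than the general one: an ungrounded or low-confidence answer about a financial product is withheld entirely, while a low-confidence answer on a routine topic is shown with its uncertainty made explicit.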

Toward an Adaptive, Persistent Assistant

Alongside his practical engineering efforts, Sridharan maintains an active research track, tackling questions that transcend individual product sprints. His work delves into the evolution of NLP and LLMs, adaptive chatbot architectures, and the design of AI-driven IVR (interactive voice response) systems. He's also an IEEE Senior Member and a judge for the Globee Customer Excellence Awards, a clear signal of peer recognition within the field.

A recurring theme in his research is adaptive intelligence. Many current chatbot systems are essentially stateless; every conversation starts from scratch, every user is a stranger, and all prior context is lost. Sridharan sees a "meaningful difference between a system that can answer your question and a system that knows you well enough to anticipate it." The first is merely a search engine with a conversational interface. The second approaches a trusted assistant. Bridging this gap requires persistent user modeling: building and maintaining representations of individual preferences, communication styles, and behavioral patterns across sessions. The technical hurdles are considerable, from privacy compliance to computational cost, but they are solvable with deliberate design.

His work on IVR systems explores a similar principle. We've all endured frustrating phone menus designed around a company's internal structure. Sridharan envisions LLM-based speech understanding replacing these rigid trees with systems that respond more naturally and contextually to how people actually speak when seeking help.
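As a rough illustration of the difference between a stateless bot and one with persistent user modeling, the sketch below keeps a per-user count of past intents in a JSON file so that a later session can guess what a returning user is likely to want. Everything here is an assumption for illustration: the `ProfileStore` class, the file-based storage, and the use of intent frequency as the only signal. A real system would model far more than this and would need to address the privacy and cost concerns mentioned above.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Optional


class ProfileStore:
    """Hypothetical store of per-user intent counts that survives across sessions."""

    def __init__(self, path: Path):
        self.path = path
        # Load earlier sessions' data if the file already exists.
        self.profiles = json.loads(path.read_text()) if path.exists() else {}

    def record(self, user_id: str, intent: str) -> None:
        """Add one observation of an intent and write the profiles to disk."""
        counts = Counter(self.profiles.get(user_id, {}))
        counts[intent] += 1
        self.profiles[user_id] = dict(counts)
        self.path.write_text(json.dumps(self.profiles))

    def likely_intent(self, user_id: str) -> Optional[str]:
        """Return the user's most frequent past intent, or None for a new user."""
        counts = self.profiles.get(user_id)
        if not counts:
            return None
        return max(counts, key=counts.get)
```

A new `ProfileStore` opened on the same file in a later session sees the earlier observations, so the bot can open with the user's most common request instead of treating them as a stranger.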

The Enduring Mission for Practical AI

There's a prevailing narrative today that AI models are advancing at warp speed, and applications are everywhere. Sridharan doesn't dispute the raw capability. But he believes that narrative misses the essential point. "The story about how good the models are is true. What’s less discussed is how much of the value of these systems depends on the plumbing around them." This includes retrieval architecture, context management, escalation logic, and the system's ability to recognize when it simply doesn't know. These aren't the flashy bits, but they're often the difference between a production deployment that actually works and one that frustrates users and costs enterprises dearly.

What drives his work is the gap between a chatbot that works in a controlled demo and one that truly works for real users, at scale, in a financially consequential context. The tools are better now than they were even a few years ago, but the fundamental challenge hasn’t disappeared. For him, progress isn’t some abstract AI benchmark. It's when a user reaches out with a problem, the system genuinely understands their need (not just their literal words), has their context, and either resolves the issue smoothly or connects them to the right person without friction. And the user comes away feeling heard.

This isn't an exotic vision. Anyone who's worked in customer service would recognize it immediately. The best AI systems today still fall short of that standard in significant ways. But for Sridharan, solving this fundamental problem matters deeply. The financial services industry, for example, serves countless individuals who lack good alternatives. If technology can genuinely help them, not just function technically but truly improve their lives, that is a goal worth building an entire career on.