*Raj Bhowmik, a machine learning engineer at Cognizant who achieved 98% automation of EDI mapping work, shares his approach to making GenAI systems work with legacy infrastructure.*
Enterprise IT departments face a problem: their data is stored in incompatible repositories. SAP HANA runs alongside BigQuery and Azure SQL. EDI files come in dozens of formats. Customer data is spread across systems that don't talk to each other. According to a McKinsey report, although 23% of organisations are scaling agentic AI and 39% are experimenting with it, most efforts fail when confronted with this reality. Raj Bhowmik, Machine Learning Engineer at Cognizant with a Master's in Machine Learning and Data Science, built GenAI systems that actually work in these environments: an intelligent EDI analyser achieving 98% automation and reducing weeks of work to minutes, an Enterprise AI Data Copilot that queries SAP HANA in natural language, and a sentiment analysis system improving picker efficiency by 80 per cent. With two papers accepted at IEEE conferences (ICNGN 2025 in Singapore and FMLDS 2025) and experience judging AI hackathons across the U.S., he bridges academic research and production deployment. This interview examines the technical architecture needed to make AI systems work with legacy enterprise infrastructure, and why hybrid approaches outperform pure LLM solutions.
*Raj, your electronic data interchange analyser achieved 98% automation, reducing weeks of work to a couple of minutes by automatically generating mappings, Extensible Stylesheet Language Transformations (XSLT), and Extensible Markup Language (XML) for X12, EDIFACT, and IDoc formats. What prompted you to use a hybrid approach instead of a pure large language model (LLM) or a pure rule-based system?*
We chose a hybrid approach because EDI work sits at the intersection of high variability and high consequence. Deterministic components provide predictable, testable behavior for structural compliance, while an LLM helps with interpretation and edge cases where rigid logic doesn’t scale. The guiding idea is simple: keep anything that must be correct and auditable in a controlled layer, and use generative AI where it adds flexibility—under strict validation and human-in-the-loop safeguards. It also helps with long-term maintainability: instead of constantly expanding brittle rules, we can adapt to new partner patterns without redesigning the whole system. Overall, it’s a pragmatic balance between accuracy, scalability, and governance for production-grade integrations.
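The layered design he describes can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not his actual system: the rule table, the `llm_suggest` stub, and the segment names are all assumptions made for the example. The key point is that the generative layer's output passes through the same deterministic validator as everything else, and anything that fails is routed to a human rather than guessed at.

```python
# Hybrid mapping sketch: deterministic rules handle known structures;
# an LLM (stubbed here) proposes mappings for unfamiliar segments, but
# every proposal must pass a deterministic validator before acceptance.
# All names (KNOWN_RULES, llm_suggest, map_segment) are illustrative.

KNOWN_RULES = {
    "N1*ST": "ShipToParty",   # X12 N1 segment, ship-to qualifier
    "N1*BT": "BillToParty",
}

ALLOWED_TARGETS = {"ShipToParty", "BillToParty", "SupplierParty"}

def llm_suggest(segment: str) -> str:
    """Stand-in for a real LLM call that proposes a target field."""
    return {"N1*SU": "SupplierParty"}.get(segment, "Unknown")

def validate(target: str) -> bool:
    """Deterministic guardrail: only whitelisted targets pass."""
    return target in ALLOWED_TARGETS

def map_segment(segment: str) -> str:
    # 1. Controlled layer: exact rules win and need no review.
    if segment in KNOWN_RULES:
        return KNOWN_RULES[segment]
    # 2. Generative layer: LLM proposal, gated by validation.
    proposal = llm_suggest(segment)
    if validate(proposal):
        return proposal
    # 3. Fallback: route to a human instead of guessing.
    return "NEEDS_HUMAN_REVIEW"
```

The design choice worth noting is that adding a new partner pattern means adding a rule or a whitelist entry, not rewriting the pipeline, which is the maintainability benefit he mentions.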
*After earning a degree in computer science from one of India's top universities, a Master's degree from IUB, and working for four years as a software engineer at a leading Indian IT company, what prompted you to move into machine learning, and how does your engineering background influence the creation of artificial intelligence production systems today?*
Software engineering taught me that models make up 30 per cent of production systems. The other 70 per cent is tooling, error handling, fallback logic, and integration. When I was building a sentiment analysis system for chats between salespeople and customers in retail, the machine learning model was simple. Most of the effort went into making it reliable in production: handling message queue failures, database load, drift monitoring, and graceful degradation. Data scientists often create notebooks that never leave their laptops because they ignore these issues. My background in computer science means that I treat database performance, API design, deployment pipelines, and failure modes as primary constraints, not secondary considerations. Systems must work on Tuesday morning when the infrastructure is overloaded and someone's deployment has broken upstream services.
*You work as an end-to-end GenAI owner from problem definition through deployment. How does this bridge between business teams, data engineers, and stakeholders actually work when most of them don't understand the capabilities of LLMs?*
I act as a two-way translator. Business teams describe problems, such as slow customer service response times or lengthy supplier onboarding processes. I study actual workflows to find bottlenecks, then determine whether AI will help and what approach is appropriate. Not everything requires GenAI; sometimes simpler solutions work better. When AI makes sense, I design the architecture: database queries, prompt structures, vector stores, and human involvement points. Data teams build pipelines, and engineers handle deployment. For stakeholders, I set realistic expectations and measure results using business metrics: time savings, error reduction, and increased throughput. When presenting demos, my goal is to show concrete value in numbers, not impressive terminology. This level of translation prevents both overpromising and underutilising AI capabilities.
*We’re entering the age of AI agents: companies want assistants that can answer questions and complete tasks across tools like data platforms, CRMs, ticketing systems, and internal wikis. If I wanted to build an enterprise-grade agent, what architecture would you recommend to keep it accurate, secure, and reliable?*
I’d start with an agent orchestration layer that breaks a request into steps (plan → act → verify) and routes each step to the right system. The LLM is best used for intent understanding and decisioning, but execution should happen via approved connectors/tools (APIs, SQL runners, workflow services) with strict input/output schemas. Next, I’d add guardrails: permission checks tied to the user’s identity, policy rules (what the agent can and can’t do), and validation that blocks risky actions before they run. For reliability, I’d include retrieval from trusted sources, testable prompts/templates, and automated checks (schema validation, unit tests for critical workflows, and fallbacks to human review). Finally, I’d build in observability: logging, audits, and feedback loops, so you can measure accuracy, catch failures, and continuously tighten policies as the agent scales.
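The execution half of that loop, approved tools with schemas, permission checks, and output validation, can be sketched as follows. The tool registry, role model, and schemas here are assumptions for illustration, not any specific product's API; a real system would plug in actual executors and a proper policy engine.

```python
# Illustrative "act -> verify" step of an agent loop: the LLM decides
# WHICH tool to call; execution goes through an approved registry with
# permission checks and output-schema validation. Names are hypothetical.

TOOLS = {
    "sql_runner": {
        "required_role": "analyst",
        "run": lambda args: {"rows": [{"orders": 42}]},  # stub executor
        "output_keys": {"rows"},                          # expected output schema
    },
}

def check_permission(user_roles: list, tool_name: str) -> bool:
    # Guardrail 1: tool access tied to the user's identity, not the LLM's.
    return TOOLS[tool_name]["required_role"] in user_roles

def verify_output(tool_name: str, output: dict) -> bool:
    # Guardrail 2: block malformed results before they reach the next step.
    return set(output) == TOOLS[tool_name]["output_keys"]

def run_step(user_roles: list, tool_name: str, args: dict) -> dict:
    if tool_name not in TOOLS:
        return {"status": "unknown_tool"}
    if not check_permission(user_roles, tool_name):
        return {"status": "denied"}
    output = TOOLS[tool_name]["run"](args)
    if not verify_output(tool_name, output):
        return {"status": "failed_validation"}
    return {"status": "ok", "output": output}
```

Every branch returns a structured status, which is what makes the observability layer he mentions possible: denied and failed-validation outcomes are loggable events, not silent retries.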
*You have two papers accepted at ICNGN 2025 and FMLDS 2025, IEEE conferences held in Singapore and Los Angeles. What research questions linking academic machine learning and enterprise implementation do they address?*
Both papers explore practical applications of machine learning in a corporate context, documenting patterns that work in production but aren't obvious from theoretical research. Academic venues sometimes underestimate the challenges of industrial implementation. These conferences, the International Conference on Intelligent Computing and Next Generation Networks and the International Conference on the Future of Machine Learning and Data Science, attract practitioners facing similar challenges: reliability, data quality, and measuring real-world impact. Presenting internationally also exposes you to different regulatory environments and technical standards, which matters for systems deployed around the world.
*You judge artificial intelligence hackathons across the United States and have reviewed the technical content of a book. How do these activities influence your production systems work?*
Judging hackathons lets me spot new trends several months before they reach enterprises: what tools teams choose, what obstacles they hit, what excites them. That is forward-looking information that helps me anticipate enterprise needs. Reviewing a book's technical content exposed me to different architectural patterns and honed my critical eye for documentation. Evaluating other people's approaches helps clarify my own. Both activities require explaining complex concepts to different audiences: hackathon participants, authors, and my colleagues. That translation skill matters as much as technical depth in corporate AI. Stakeholders need to understand what you are building, and engineers need the context of constraints. Teaching others about AI use cases forces you to codify what actually works versus what merely impresses in a demo.
*Your next step is artificial intelligence for ecology. How does your experience integrating multiple enterprise databases apply to environmental data problems?*
Environmental problems involve messy, distributed data: sensors, satellites, government databases, and local measurements. These are integration tasks much like combining multiple databases within an enterprise. Optimising energy consumption, predicting environmental impact, and monitoring ecosystems all require systems that turn disparate sources into actionable conclusions. The stakes are higher and the feedback cycles longer than in business applications: you can't A/B test a real ecosystem. But the core skills transfer: reliable data pipelines, stakeholder trust, rigorous measurement. I am interested in applications that connect environmental data to policy decisions, much as I currently connect technical capabilities to business needs. The goal is to make complex systems understandable and usable, whether they are corporate databases or environmental data streams.