Why Most AI Projects Fail in Production: Architect Vladislav Lubov Explains

According to the 2025 report "The GenAI Divide: State of AI in Business" from MIT's NANDA initiative, only about 5% of generative AI pilot programs deliver measurable, sustained business value, while the vast majority stall at the pilot or proof-of-concept stage. One key reason experts highlight is technical friction: solutions that shine in demos often falter when integrated into real, messy business environments with diverse data sources, legacy systems, and complex processes.

This happens because data comes from multiple sources, systems were originally designed for different tasks, and business processes are complex and fragmented. Under these conditions, even promising AI solutions can become unstable: a model that worked in a pilot may start to struggle as data volumes grow, usage scenarios become more complex, or system load increases.

To understand which engineering principles help digital products withstand real-world demands — and why architecture often makes or breaks advanced technologies like AI in production — we spoke with Vladislav Lubov, an experienced full-stack engineer specializing in scalable backend systems. He has built and scaled infrastructure for high-traffic platforms, including Skyeng, a leading online education service that delivers millions of lessons monthly, and Webbankir, a major fintech that processes hundreds of thousands of applications per month, both requiring rock-solid stability under heavy, unpredictable loads.

Data Readiness as a Critical Barrier to AI Deployment

Data readiness remains one of the biggest barriers to successful AI adoption. According to a recent Gartner report, up to 60% of AI projects are expected to fail by 2026 due to a lack of AI-ready data. Even well-trained models can struggle in production environments when underlying data pipelines are unstable, delayed, or unable to handle increasing data volumes.

Vladislav Lubov has faced these issues firsthand. In one high-load project, analytical queries started overwhelming the main production database, causing noticeable slowdowns for the entire service during busy periods. To fix it, he redesigned the data pipeline: key event streams were routed straight from Kafka to ClickHouse for fast analytics, with intermediate results cached in a NoSQL store. This broke heavy computations into smaller, incremental steps instead of big batches. The changes eliminated overloads during peak periods, enabled the system to handle significantly larger data volumes reliably, and sped up processing in key scenarios by up to 30 times.

"Many teams design data pipelines around the needs of a prototype," Lubov says. "But production systems operate under very different conditions — real traffic, large data volumes, and unpredictable load. If the pipeline isn't modeled for the scale the product is expected to reach, bottlenecks will inevitably appear as soon as the system starts growing."
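The shift from big batches to smaller, incremental steps described above can be illustrated with a minimal Python sketch. It is not Lubov's implementation: the class name `IncrementalAggregator`, the event shape, and the in-memory dictionaries (standing in for the cached intermediate results) are all assumptions made for illustration. The idea is only that each new slice of events updates stored running totals, so nothing has to rescan the full history.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementalAggregator:
    """Hypothetical helper: keeps running per-key totals between batches.

    The two dictionaries play the role of cached intermediate results;
    in a real system they would live in an external store.
    """
    counts: dict = field(default_factory=dict)
    sums: dict = field(default_factory=dict)

    def apply(self, events):
        """Fold a small batch of (key, value) events into the running totals."""
        for key, value in events:
            self.counts[key] = self.counts.get(key, 0) + 1
            self.sums[key] = self.sums.get(key, 0.0) + value

    def average(self, key):
        """Return the running average for a key, or None if it was never seen."""
        count = self.counts.get(key, 0)
        if count == 0:
            return None
        return self.sums[key] / count


def full_recompute_average(all_events, key):
    """The big-batch alternative: rescan every event on each query."""
    values = [value for k, value in all_events if k == key]
    return sum(values) / len(values) if values else None


# Two small batches arrive over time; the aggregator never rereads the first.
batch_1 = [("lesson_started", 2.0), ("lesson_started", 4.0)]
batch_2 = [("lesson_started", 6.0)]

agg = IncrementalAggregator()
agg.apply(batch_1)
agg.apply(batch_2)
```

Both approaches give the same answer, but the work per incremental update depends only on the size of the new batch, while a full recompute grows with the entire event history.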

Observability and Reliability in Modern Digital Infrastructure

Service outages remain one of the biggest headaches for tech companies. New Relic's 2025 Observability Forecast reports that high-impact outages cost a median $2 million per hour, resulting in an annual median impact of $76 million for the businesses surveyed. Engineering teams spend about 30% of their time — roughly 12 hours a week in a standard schedule — dealing with incidents and firefighting instead of building features.

In many technology projects, however, teams focus mainly on developing product features while giving limited attention to monitoring system health once the system goes into production. As a result, after launch, teams often lack visibility into infrastructure performance, and problems become apparent only when users begin experiencing failures or slowdowns. For example, a service may gradually slow down, message queues may grow, or a database may approach its resource limits. Identifying the root cause can then become a lengthy process requiring engineers to analyze logs, check server states, and manually reproduce issues—sometimes taking hours or days in complex systems.

In each project, Lubov implements metrics systems that let the team track key indicators, including request processing time, error rates, service load, and queue status. With these in place, the team can detect emerging issues and respond before users notice them.

"Introducing technical metrics can significantly improve service reliability," Lubov says. "In several projects, availability was brought close to 100%, and incidents began to be detected at the stage of the first technical anomalies rather than after user complaints."
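As a rough illustration of the indicators mentioned in this section, the Python sketch below tracks request processing time and error rate over a rolling window and flags values that cross a threshold. Everything in it is an assumption made for the example: the `ServiceMetrics` class, the window size, and the threshold values are hypothetical, and a production setup would typically rely on a dedicated monitoring stack rather than in-process code.

```python
from collections import deque


class ServiceMetrics:
    """Hypothetical rolling-window tracker for latency and error rate."""

    def __init__(self, window=1000):
        # Each sample is (duration_ms, is_error); old samples fall off the left.
        self.samples = deque(maxlen=window)

    def record(self, duration_ms, is_error=False):
        self.samples.append((duration_ms, is_error))

    def error_rate(self):
        if not self.samples:
            return 0.0
        errors = sum(1 for _, is_error in self.samples if is_error)
        return errors / len(self.samples)

    def p95_latency_ms(self):
        """95th percentile latency using the nearest-rank method."""
        if not self.samples:
            return 0.0
        durations = sorted(duration for duration, _ in self.samples)
        rank = (95 * len(durations) + 99) // 100  # integer ceil(0.95 * n)
        return durations[rank - 1]

    def anomalies(self, max_error_rate=0.01, max_p95_ms=500.0):
        """Return the names of indicators above their (assumed) thresholds."""
        alerts = []
        if self.error_rate() > max_error_rate:
            alerts.append("error_rate")
        if self.p95_latency_ms() > max_p95_ms:
            alerts.append("p95_latency")
        return alerts


metrics = ServiceMetrics()
for duration in range(1, 101):        # 100 healthy requests, 1 ms to 100 ms
    metrics.record(float(duration))
healthy_alerts = metrics.anomalies()  # nothing crosses a threshold yet

for _ in range(5):                    # a burst of slow, failing requests
    metrics.record(900.0, is_error=True)
degraded_alerts = metrics.anomalies()
```

In this run the error-rate alert fires after the burst even though the 95th percentile latency is still within bounds, which is the kind of early technical anomaly that can surface before users start complaining.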

The Risks of Overengineering and Trend-Driven Architecture

New architectural approaches regularly emerge in the technology industry and quickly become "trendy." Microservices, Kubernetes, and complex distributed systems are often perceived as hallmarks of modern infrastructure. However, industry surveys suggest that adopting these architectures does not always lead to the expected benefits.

For example, according to a survey published by O'Reilly Media, 37% of respondents said they had achieved only "some" success with microservices, and about 30% reported increased system complexity after adopting this architectural approach.

In Lubov's experience, many teams choose a system architecture based on the popularity of technologies rather than the product's actual needs. For example, a system may be divided from day one into dozens of services, each deployed independently and communicating with the others over the network. For large-scale platforms, this approach can make sense, but for smaller or early-stage products, such an architecture often introduces more problems than it solves.

Every additional service becomes another potential point of failure, another infrastructure component that must be configured, and another layer that must be monitored. As a result, the system becomes harder to maintain, and even minor issues can propagate across multiple services.

Lubov notes that he has repeatedly encountered situations in which excessive architectural complexity has become the primary source of system instability. In such cases, the first step was often to reconsider the architecture itself: instead of prematurely splitting the system into microservices, the team would return to a simpler structure and gradually extract separate services only where they were truly required by system load or product logic.

"In many situations, it is far more effective to begin with a well-designed monolith," Lubov explains. "At early stages, it lets teams move faster — easier testing, quicker deploys, simpler changes. As the product grows and real metrics reveal the pain points, you can gradually carve out microservices only where they truly solve problems. Architecture should follow actual needs and data, not hype cycles. In today's rush to production-scale AI, this pragmatic mindset often separates projects that deliver real value from those that stay forever in pilot limbo."

Dan Agbo