
Posts

The Validation Gap Is Costing You More Than You Think

Our latest State of Software Delivery report analyzed more than 28 million CI workflows and found a pattern that should give engineering leaders pause. Average throughput grew 59% year over year. Main branch activity for the median team declined 7%. Teams are generating more code than ever before. Less of it is reaching production. The cost of poor validation used to show up mostly in developer hours: debugging, blocked deployments, context switching. That cost hasn’t gone away. But there is a second bill now. Every failed build means agent retries. Every slow pipeline is compute burning while an agent waits. Main branch success rates have fallen to a five-year low of 70.8% against a 90% benchmark, and the AI spend attached to every failed cycle is climbing alongside it. The teams doing well are catching failures earlier and keeping their pipelines healthier. They are running the same tools as everyone else. What they have structured differently is where and when validation ha...
Recent posts

Why Enterprise AI Infrastructure Is Becoming a DevOps Problem

Most enterprise AI projects start with retrieval. You connect Jira, Confluence, SharePoint, and Slack. Maybe a few internal databases nobody has touched in five years. You tune embeddings, optimize chunking, wire up a vector database, and convince yourself you’ve built an AI-powered knowledge system. Then the model server crashes. And suddenly, you discover the uncomfortable truth about enterprise AI: The hard part was never retrieval. It was infrastructure. For the past two years, the industry has treated LLM deployment like a feature integration problem. In reality, it is rapidly becoming a platform engineering problem, one involving GPU orchestration, scaling economics, governance boundaries, workload scheduling, observability, and operational resilience. The moment organizations move beyond prototypes, the conversation changes fast.   Search Was Never the Product Enterprise search already exists. Most organizations have had it for years. But what teams actually want is synthe...

AI Is Changing How We Write Infrastructure, But It’s Not Solving How We Control It

Over the past year, AI has fundamentally changed how software is written. Infrastructure code is no exception. Tasks that once required deep familiarity with tools, syntax, and workflows can now be handled through natural language. Engineers are no longer starting from a blank file. In many cases, reviewing and modifying code generated for them has become the norm. At a high level, this looks like progress, and in many ways, it is. Teams can move faster, the barrier to entry is lower, and experimentation is easier. But there is a growing gap that many organizations are only beginning to recognize: AI is accelerating how infrastructure is created, but it is not solving how infrastructure is understood, controlled, or governed. The Shift from Writing to Generating Traditionally, infrastructure as code assumed that the person writing the configuration understood what it would do. That assumption no longer holds. Today, it is increasingly common for infrastructure definitions to be gen...

Why Your AI Agent is a Black Box and How to Fix It With OpenTelemetry

You built the agent. It works in testing. Then it hits production and starts giving wrong answers, timing out or burning through your token budget, and you have no idea why. This is when developers discover that print statements and log files weren’t designed for this.    LLM applications fail in ways that traditional tooling can’t see. A hallucination doesn’t throw an exception. A slow retrieval step doesn’t show up in CPU metrics. A prompt that worked yesterday silently degrades today.   The fix is observability, and the standard for doing it right is OpenTelemetry (OTel).   What OpenTelemetry Actually Is   OTel isn’t a monitoring product; it’s a vendor-neutral specification under the CNCF that defines a standard way to collect observability data: What gets collected, what it’s called and how it’s shipped. You instrument your application once and can send that data to Grafana, Datadog, Jaeger or a purpose-built LLM platform without rewriting your instrumentation.   That portabil...

Agentic SRE: The Next Frontier of Reliability 

Agentic SRE is the evolution of site reliability engineering where AI agents help observe systems, reason over telemetry and take bounded operational actions under human-defined guardrails. The goal is not to replace SREs, but to reduce toil, speed up diagnosis and make incident response more consistent and scalable.   Why This Matters   Modern systems are too distributed, noisy and fast-moving for purely manual operations to keep up. Engineers spend significant time correlating dashboards, reading logs, checking recent deploys and hunting for context before they can even start fixing the problem. Agentic SRE addresses this by turning telemetry into actionable context and automating safe parts of the response loop.   This shift is especially important because reliability work is full of repetitive, high-pressure tasks that are easy to standardize but hard to execute perfectly at 2 a.m. That makes it a perfect fit for agents that can summarize, correlate, recommend and execute wit...

5 Ways Agentic AI is Redefining DevOps Architecture for Self-Healing CI/CD Systems 

In the past, the flaky test was a problem: A race condition, a timeout, an annoyance that needed to be rerun and forgotten. That’s no longer the case. As enterprises transition from deterministic applications to agentic AI, the flakiness problem has become a structural issue.   Old CI/CD systems rely on binary assertions: Assert X == Y. But with AI agents, the output isn’t Y; it’s Y-like answers. Run the same agent twice, and it will likely produce two defensible but different results. So the test suite, built on a scenario that no longer exists, calls this a failure.   DevOps teams and engineers face the challenge not just of building agents but also of rebuilding the entire pipeline.   In this post, we will share how agentic AI is transforming the DevOps architecture for self-healing CI/CD.   What Does the Term “Agentic” Mean Here?   Agentic AI is an automated system capable of receiving a target state, sensing its surroundings using telemetry and APIs, reasoning about the act...

JFrog Report Surfaces Need for Rapid DevSecOps Change in AI Era

A report published by JFrog finds that cybercriminals are increasingly targeting the artificial intelligence (AI) tools and platforms used by application development teams. Based on an analysis of 18.2 billion artifacts managed via the JFrog Platform, security researchers discovered 969 AI agent skills carrying high-impact payloads in addition to 495 malicious AI models on the Hugging Face platform for hosting open source AI models. Additionally, 56 malicious extensions were discovered on the OpenVSX registry. The survey also finds 41% of respondents work for organizations that are actively using AI libraries, with organizations on average employing 9.3 AI libraries each. At the same time, a separate global survey of 1,508 security and DevOps professionals conducted by JFrog finds more organizations are struggling to secure code generated by AI coding tools. Nearly half of respondents (45%) said reviewing and hardening AI-generated code is now a major time drain, with an eq...