Posts

The “Day 2” AI Problem: Why Standard API Gateways Fail at GenAI Scale

Injecting GenAI into applications is deceptively easy. Need a new chatbot backed by an LLM? Grab an OpenAI API key and you can throw together an MVP in an afternoon. This is the pattern teams have used to push AI features into apps for the last few years. The problem, as with previous tech hype cycles, is the “Day 2” hangover: the operational nightmare where the telltale signs of architectural debt appear. Once these apps hit production, reality bites: you wake up to a $10,000 bill because some logic went rogue, or you discover that 50 different developers have hardcoded 50 different API keys in their .env files. The remedy isn’t just better discipline; it’s better architecture, specifically the AI Gateway pattern. This middleware sits between your internal developers and external model providers and acts as a critical control plane. It also gives developers an easy way to implement solutions to pressing problems in the AI space, including AI guardrail...
Recent posts

GitHub Breach Tied to Malicious VS Code Extension Exposes Thousands of Internal Repositories

GitHub says attackers accessed thousands of internal repositories after a company employee’s device was compromised through a malicious Visual Studio Code extension. The company said it has removed the malicious extension, isolated the compromised endpoint, and launched an investigation. It confirmed that approximately 3,800 internal repositories were affected. GitHub stated that investigators have not found evidence of impact to customer repositories or enterprise environments outside GitHub’s own systems. The hacking group TeamPCP later claimed responsibility for the intrusion in a post on the Breached cybercrime forum. The group alleged it had obtained source code and thousands of private repositories and sought at least $50,000 for the data. GitHub has not formally attributed the attack to TeamPCP, though the company acknowledged that the group’s public claims are generally consistent with the scope of the ongoing investigation. The GitHub breach is the latest e...

OpenSSF’s CRob: ‘The Runway Is Rapidly Running Out’ on EU CRA Readiness

The EU’s Cyber Resilience Act kicks into high gear this September, and companies are still clueless about how they must obey its strictures. MINNEAPOLIS: At Open Source Summit North America, Christopher “CRob” Robinson, Chief Security Architect for the Open Source Security Foundation (OpenSSF), spoke about the European Union’s (EU) Cyber Resilience Act (CRA). CRob warned that companies are still “running straight at that wall” as the first CRA enforcement date draws ever closer. The CRA, for those who don’t know it, sets mandatory cybersecurity rules for nearly all “products with digital elements,” which means hardware and software, sold on the EU market, with most obligations falling on manufacturers but some also on importers and distributors. That means if you sell pretty much anything in the EU, you must include a security risk assessment; design products with secure default configurations and the ability to restore to a secure state; eliminate known exploitable ...

1Password Allies With OpenAI to Secure Codex AI Coding Tool

1Password and OpenAI today revealed they have integrated a Model Context Protocol (MCP) server into the Codex artificial intelligence (AI) coding tool to better secure developer credentials. As a result, Codex credentials can now be issued on a just-in-time basis to ensure secrets are not logged, cached, reused across sessions or surfaced in unexpected outputs. Instead of sharing .env files or hardcoding credential values, application developers access a shared environment where secrets are made available at runtime, without the values ever appearing in code, terminals, or model context. 1Password CTO Nancy Wang said that, with this approach, developers can in effect grant Codex access to credentials directly inside their coding workflows while keeping secrets outside of code. The MCP server does not read or return secret values through the MCP channel, surface secrets in the model’s context window, or write them to disk. Codex can create environments, list variable names, and invoke appl...

We Spent 15 Years Automating Infrastructure. Now We’re Automating Decisions

For most of the last 15 years, DevOps has been engaged in a massive automation project. First, it was server provisioning, then configuration management, then infrastructure as code. CI/CD pipelines followed, along with containers, Kubernetes, GitOps and eventually platform engineering. Each wave built on the previous one, steadily pushing infrastructure and operations further away from manual processes and deeper into programmable systems. The industry became extraordinarily successful at it. Tasks that once required ticket queues, weekend maintenance windows and large operations teams became automated workflows that could execute repeatedly and reliably. Infrastructure stopped being something organizations manually assembled and increasingly became something they declared, versioned and continuously reconciled through software. What is important, though, is that most of this automation was still fundamentally deterministic. Engineers wrote scripts. Teams defined workflows. Desired ...

Cursor’s Composer 2.5 Brings Smarter, More Reliable AI Coding Agents

AI-assisted coding tools are getting a meaningful upgrade. Cursor has released Composer 2.5, the latest version of its proprietary coding agent model, and the improvements go well beyond a version bump. Composer 2.5 is described as a substantial improvement in intelligence and behavior over its predecessor, Composer 2. It handles sustained work on long-running tasks better, follows complex instructions more reliably, and is easier to work with overall. For development teams already using Cursor or evaluating AI coding tools, that combination matters. Raw capability is one thing. But an agent that can stay on task across a lengthy workflow, without drifting, hallucinating tool calls, or needing constant correction, is a different story.

Built on Open-Source Foundations

Composer 2.5 is built on the same open-source checkpoint as Composer 2, Moonshot’s Kimi K2.5. That’s worth noting because it reflects a broader trend in the AI industry: frontier-quality capabilities are ...

When Millions Arrive in a Minute: Why Reactive Autoscaling Fails and the Predictive Fix

Reactive autoscaling is a critical safety net. Demand rises, metrics spike, policies trigger, and capacity increases. But flash-crowd events, product drops, major campaigns, and limited-inventory moments do not ramp. They cliff. Users arrive at once, and reactive scaling is structurally late because “scale triggered” is only the start of the journey to usable capacity.

If your demand spike arrives faster than your system can warm up, reactive scaling will lag no matter how well you tune it. The fix is planning and verification: scale before the event and prove the system is ready before customers arrive.

This article outlines a practitioner approach: schedule-aware, tier-based predictive scaling using capacity targets and an executor that verifies readiness.

Why Reactive Scaling Loses Against Flash Crowds

Reactive scaling assumes:

Demand ramps gradually enough to be detected early.
Signals (CPU, request rate, latency) change soon enough to trigger action.
Pro...