AI Coding: New Research Shows Even the Best Models Struggle With Real-World Software Engineering


New OpenAI research reveals that frontier AI models like Claude 3.5 and GPT-4o solve fewer than half of real-world software engineering tasks from a $1M benchmark.

from DevOps.com https://ift.tt/49mig5X


Shift Left to the Developer’s Machine: Building Local Git Security Gates

A developer pushes one file. It contains an AWS access key left in a configuration block. Five minutes later, CI catches it. By then, the secret is in the remote repository, cached by mirrors and potentially forked. The developer rotates the key, scrubs the commit history and spends the rest of the afternoon on incident response. The real question isn’t how to clean up faster; it’s why the secret left the developer’s machine in the first place.

The Five-Minute Gap

Most engineering teams have invested in CI-based secret scanning. Tools such as GitHub Advanced Security, GitGuardian and TruffleHog’s CI integration catch leaked credentials in pull requests and pushed branches. This is good, but it’s also too late. The GitGuardian 2026 State of Secrets Sprawl report found that 29 million secrets were detected on GitHub in 2025 alone, a 34% year-over-year increase and the largest single-year jump ever recorded. Worse, 64% of secrets leaked back in 202...
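The idea of stopping the secret before it leaves the developer's machine can be sketched as a local check on staged changes. The snippet below is a minimal illustration, not the article's own tooling: the function names are hypothetical, and it only looks for strings shaped like AWS access key IDs (assumed here to be 20 characters starting with `AKIA` or `ASIA`), whereas real scanners cover many more credential types.

```python
import re
import subprocess

# Assumption: AWS access key IDs are 20 characters long and start with
# "AKIA" (long-term keys) or "ASIA" (temporary keys).
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text):
    """Return every substring of text that looks like an AWS access key ID."""
    return AWS_KEY_ID.findall(text)


def staged_additions():
    """Return the lines added in the Git staging area, joined into one string."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Added lines start with "+". File headers ("+++ b/path") also match,
    # which simply means file paths get scanned as well.
    return "\n".join(
        line[1:] for line in diff.splitlines() if line.startswith("+")
    )


def main():
    """Return 1 if the staged changes contain a suspected key ID, else 0."""
    hits = find_aws_key_ids(staged_additions())
    for hit in hits:
        # Print only a prefix so the check does not echo the full value.
        print(f"possible AWS access key ID staged: {hit[:8]}...")
    return 1 if hits else 0
```

To use this as a gate, a `.git/hooks/pre-commit` script could call `sys.exit(main())`; Git aborts the commit when a pre-commit hook exits with a non-zero status, so the flagged change never reaches the remote.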

CloudBees Is a Launch Partner for Google Cloud Run as Product Goes GA

Industry Leaders Collaborate to Offer Streamlined Deployment of Containerized Applications, Better Access to CloudBees Solutions on Google Cloud Platform Marketplace

GOOGLE CLOUD NEXT, LONDON AND SAN JOSE, CA. – November 20, 2019 – CloudBees, the enterprise DevOps leader powering the continuous economy, today announced an extension of its partnership with Google. As a Google […]

from DevOps.com https://ift.tt/2XuFTxc

Why Endpoint Protection Matters More than Ever in CI/CD Environments

CI/CD environments depend on far more than repositories and deployment infrastructure. Developer endpoints hold sensitive data: cloud credentials, SSH keys, deployment permissions, direct access to internal systems. Endpoint security and control are part of daily operational risk management. Engineering teams are shifting more and more toward distributed workflows, so discussions around CI/CD security include the security posture of the devices connected to the pipeline. Many organizations already focus their CI/CD security efforts on secrets management, dependency scanning and supply chain controls. However, advanced endpoint security solutions are also relevant in cloud-native development environments, where local devices maintain direct access to production workflows.

Endpoint Compromise Can Bypass Mature CI/CD Controls

CI/CD security discussions mostly focus on repositories, containers, infrastructure, and deployment automation. Developer endpoints are often overlooked as a par...

News and Tech Update
