
GitHub Copilot is popular. The AI-powered code completion tool (originally developed by GitHub and OpenAI) works to give software application developers a so-called “AI pair programmer” buddy that offers suggested code snippets and (when called upon) entire functions – and it happens directly within an engineer’s Integrated Development Environment (IDE) of choice.
And GitHub Copilot isn’t just popular in terms of total usage; GitHub is reporting an increase in patterns of high concurrency (individual developers performing similar operations or, more likely, different developers requesting the same types of functions at the same time) and intense usage among power users.
No Foul Play, Probably
The GitHub blog itself doesn’t necessarily point the finger at nefarious usage techniques – the team understands that spikes “can be driven by legitimate workflows” here – but indirect prompt injection (placing malicious instructions inside a public repository or pull request) could exist at some level.
However it happens, these usage patterns place a significant strain on the shared infrastructure and operating resources that underpin the service.
New Limits
Given all of the above, the new limits for GitHub Copilot should come as no surprise. Set to roll out over the next few weeks at the time of writing, two types of usage limit will come into force:
- Limits for overall service reliability
- Limits for specific models or model family capacity
When a developer hits what is being labelled a “service reliability limit”, they will need to wait until their current session resets. This will be indicated in the error message shown when a developer is rate-limited.
Switch Models, Auto Mode
There are workarounds here: a programmer who hits the limit on one specific model (or model family) has the option to switch to an alternative model or use Auto mode.
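To make that fallback logic concrete, here is a minimal sketch of how a team's own tooling might decide which model to try next. Everything in it is an assumption for illustration: the `choose_model` helper, the `AUTO` label and the placeholder model names are hypothetical and do not correspond to any real Copilot API or setting.

```python
from typing import Iterable, Set

# Hypothetical label standing in for "let Auto mode decide"; not a real identifier.
AUTO = "auto"


def choose_model(preferred: str, fallbacks: Iterable[str], limited: Set[str]) -> str:
    """Return the first candidate model that is not currently rate-limited.

    If the preferred model and every fallback are limited, defer to AUTO.
    """
    for model in [preferred, *fallbacks]:
        if model not in limited:
            return model
    return AUTO


if __name__ == "__main__":
    # Placeholder model names, chosen purely for the example.
    print(choose_model("model-a", ["model-b"], limited={"model-a"}))
    print(choose_model("model-a", ["model-b"], limited={"model-a", "model-b"}))
```

The point of the sketch is simply the ordering: try the preferred model, then any named alternatives, and only then hand the decision over to automatic selection.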
A core toolset function, Copilot Auto mode, enables the tool itself to intelligently choose the best available model on the developer’s behalf. Model selection is based on real-time system health and model performance. This means (in theory, if not also in practice) that software engineers benefit from reduced rate limiting, lower latency and fewer errors. Note that Auto model selection for Copilot cloud agent is available for GitHub Copilot Pro and GitHub Copilot Pro+ plans.
“We recommend distributing requests more evenly over time when possible, rather than sending them in large, concentrated waves. [Developers] can also upgrade their plan for higher limits,” writes the GitHub Copilot blog bot. “We know limits can be frustrating and are actively exploring new ways to offer increased capacity for all users. We will share updates as we identify durable solutions.”
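For teams that script their requests, the advice to spread load over time can be sketched in a few lines. This is an illustrative helper only: `spread_evenly` and its parameters are hypothetical names, and the right spacing for any real workload depends on limits that GitHub has not published in numeric form here.

```python
from typing import List


def spread_evenly(count: int, window_seconds: float) -> List[float]:
    """Return send offsets (seconds from the start of the window) that space
    `count` requests evenly across `window_seconds` instead of sending them
    all at once."""
    if count <= 0:
        return []
    interval = window_seconds / count
    return [i * interval for i in range(count)]


if __name__ == "__main__":
    # Six requests over one minute: one every 10 seconds rather than a burst at t=0.
    print(spread_evenly(6, 60.0))
```

A scheduler could sleep until each offset before sending the next request, which turns one concentrated wave into a steady trickle.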
At its core, GitHub defines rate limiting as a mechanism used to control the number of requests a user or application can make in a given time period. The organization uses rate limits to ensure everyone has fair access to GitHub Copilot and to protect against abuse.
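That definition maps onto a well-known pattern. The sketch below shows a sliding-window limiter, one common way to cap requests per time period; it is not a description of how GitHub enforces its limits, which the company has not detailed. The class name, parameters and the numbers in the usage example are all assumptions.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any rolling `window_seconds` period."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()  # times of accepted requests

    def allow(self, now: float) -> bool:
        """Record and accept a request at time `now`, or reject it if the window is full."""
        # Drop accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow(t))
```

In the usage example, the third request (at 2 seconds) is rejected because two requests already sit inside the 10-second window; by 10 seconds the first has aged out, so a new request is accepted.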
Goodbye, Opus 4.6 Fast
To further improve service reliability, the GitHub team is streamlining its model offerings and focusing resources on the models its users use the most. As a first step, it is retiring Opus 4.6 Fast for Copilot Pro+ users, effective immediately.
Opus 4.6 Fast for Copilot Pro+ is (or more accurately, was) a high-speed configuration for GitHub Copilot, delivering output 2.5x faster than standard Opus 4.6. The technology itself prioritized low latency for complex workflows.
The team recommends using Opus 4.6 as an alternative model with similar capabilities.
Food For Thought
Developers “indulging” in GitHub Copilot who may have been caught somewhat off guard by this development should perhaps remember that rate limiting happens all the time in order to preserve capacity. Rate limiting helps prevent the system from being overloaded.
The team also reminds us that popular features and models may receive bursts of requests, so rate limits ensure no single user or group can monopolize these resources. Without rate limits, malicious actors could exploit Copilot more easily, leading to degraded service for everyone or even denial of service. So let’s not let that happen.
from DevOps.com https://ift.tt/IYdetC0