Overview
Users may encounter client-side throttling with CircleCI's container runner on Kubernetes, particularly when the Kubernetes API server is under heavy load, for example because the Helm values define a large number of resources. This can leave jobs stuck in the "Task lifecycle" stage. The container-agent logs will typically show an error like the following:
waited for 3s due to client-side throttling, not priority and fairness, request:
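To confirm that the log message above is what is affecting your jobs, it can help to count how often it appears. The following is a minimal sketch, not an official CircleCI tool: the helper name count_throttling is hypothetical, and the Deployment name container-agent in the usage comment is an assumption based on the Helm release name used in this article.

```shell
#!/bin/sh
# count_throttling: hypothetical helper that reads log lines on stdin and
# prints how many of them contain the client-side throttling message.
count_throttling() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are no matches, so "|| true" keeps the function's status at 0.
  grep -c 'due to client-side throttling' || true
}

# Example usage (assumes a Deployment named "container-agent" and that
# NAMESPACE is set to the namespace where the runner is installed):
#   kubectl logs deployment/container-agent -n "$NAMESPACE" | count_throttling
```

A count that keeps growing while jobs sit in "Task lifecycle" suggests the agent is being throttled and the steps in the Solution section are worth trying.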
Prerequisites
Access to the Kubernetes cluster where the CircleCI container runner is deployed.
Familiarity with Kubernetes and Helm configurations.
Ability to modify the values.yaml file for the container runner.
Solution
To address the client-side throttling issue, consider the following steps:
Increase Agent Replica Count: Distribute the API request load by increasing the number of container agent replicas. This can help prevent any single pod from reaching the throttling limits.
Update your values.yaml file with the following configuration:
agent:
  replicaCount: 2
Deploy the change. This adjustment spreads the API requests more evenly across multiple pods.
helm upgrade container-agent container-agent/container-agent -n <namespace> -f values.yaml
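After the upgrade, you may want to check that the new replica count has taken effect. The sketch below is one way to do that with kubectl. The helper name check_replicas is hypothetical, and it assumes the chart creates a Deployment named container-agent; adjust the name if your release differs.

```shell
#!/bin/sh
# check_replicas NAMESPACE EXPECTED
# Hypothetical helper: compares the Deployment's configured replica count
# with the expected value and reports the result.
check_replicas() {
  namespace="$1"
  expected="$2"
  # Assumption: the Deployment is named "container-agent".
  actual=$(kubectl get deployment container-agent -n "$namespace" \
    -o jsonpath='{.spec.replicas}')
  if [ "$actual" = "$expected" ]; then
    echo "ok: $actual replicas"
  else
    echo "mismatch: expected $expected, got $actual"
    return 1
  fi
}

# Example usage, with NAMESPACE set to the runner's namespace:
#   check_replicas "$NAMESPACE" 2
```

This only checks the configured replica count, not whether the pods are ready; kubectl get pods in the same namespace shows their current status.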
Additional Resources
Looking for a broader runner troubleshooting guide? This article covers one specific error. For a complete guide covering resource class errors, jobs stuck in "Not Running", Launch Agent EOL, container runner issues, log locations, and more, see: Troubleshooting Self-Hosted Runners (Machine Runner & Container Runner)