In the world of Kubernetes, mastering the kubectl command has always been an essential skill for both cluster administrators and development teams. And let’s face it—getting there takes time and effort. Wouldn’t it be great to have a tool that uses AI to help overcome that initial barrier?

In this post, we’ll explore kubectl-ai, an intelligent assistant developed by the Google Cloud Platform team that enables users to interact with Kubernetes using natural language. This tool can significantly simplify complex operations by turning natural language queries or instructions into precise kubectl commands—plus, it provides contextual help along the way.

What is kubectl-ai?

kubectl-ai is a command-line tool that acts as a smart interface between users and Kubernetes. It interprets queries written in natural language, translates them into precise kubectl commands, executes them, and provides clear and enriched results with explanations. The tool works as a kubectl plugin and can also be run as a standalone application, offering maximum flexibility for different working environments.
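For example, both invocation styles accept the same natural-language query (the query text here is purely illustrative):

kubectl ai "which pods in the default namespace are not in a Running state?"
kubectl-ai "which pods in the default namespace are not in a Running state?"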

Although it was released by the Google Cloud Platform team, it's not limited to using Gemini—Google’s AI model—as its provider. It offers support for multiple providers, including Google Gemini (default), Vertex AI, OpenAI, Azure OpenAI, Grok by xAI, and local models via Ollama or llama.cpp. This versatility allows organizations to choose the AI model that best fits their needs, security requirements, and budget.

Installation and Configuration

The installation of kubectl-ai is straightforward. For Linux and macOS systems, it can be installed with a simple command:

curl -sSL https://raw.githubusercontent.com/GoogleCloudPlatform/kubectl-ai/main/install.sh | bash

Alternatively, it can be installed using Krew, the plugin manager for kubectl:

kubectl krew install ai

For the basic configuration using the default Gemini model, you need to set an environment variable with your API key:

export GEMINI_API_KEY=your_api_key_here

If you don't have a Gemini API key, you can generate one from Google AI Studio.
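To use a different provider, export that provider's API key and point kubectl-ai at it with the --llm-provider and --model flags. For example, something like this for OpenAI (flag and variable names follow the project's README and may change between releases, so check kubectl-ai --help for your version):

export OPENAI_API_KEY=your_api_key_here
kubectl ai --llm-provider openai --model gpt-4.1 "show me the pods in kube-system"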

Kubernetes Administration Perspective

Below are examples of common tasks performed by a Kubernetes administrator using kubectl, followed by the same tasks with kubectl-ai.

Troubleshooting a Problematic Node

To get information about a node's status in Kubernetes, an admin might run a series of commands like:

kubectl get nodes --show-labels
kubectl describe node worker-3
kubectl top nodes
kubectl get pods --all-namespaces --field-selector spec.nodeName=worker-3
kubectl get events --field-selector involvedObject.kind=Node

Simplified Troubleshooting

With kubectl-ai, the same operations can be performed in natural language, in this case from an interactive session:

kubectl ai --model gemini-2.0-flash
 Hey there, what can I help you with today?
>>> show me the health status of all nodes and highlight any issues
  I need to examine the nodes in your Kubernetes cluster to assess their health
  status and identify any potential issues. I will use  kubectl get nodes  to
  retrieve information about the nodes and then analyze the output for any
  problems.
  Running: kubectl get nodes
  The nodes worker-1, worker-2, and worker-3 are in the cluster. worker-1 and worker-2 are
  ready. worker-3 is Ready, but SchedulingDisabled. This means that no new pods can
  be scheduled on worker-3. I will get more details about the nodes to see if there
  are any issues.
  Running: kubectl describe node worker-1
  Running: kubectl describe node worker-2
  Running: kubectl describe node worker-3
  Here's a summary of the health status of your nodes:
  • worker-1: Ready. The node's resource allocation is overcommitted. CPU is at 48%
  requests and 552% limits, and memory is at 46% requests and 726% limits.
  • worker-2: Ready. The node's resource allocation is overcommitted. CPU is at 42%
  requests and 1462% limits, and memory is at 60% requests and 1996% limits.
  There's also a warning about an invalid disk capacity.
  • worker-3: Ready, but SchedulingDisabled. This node has a taint that prevents new
  pods from being scheduled on it. Additionally, there are warnings about
  certificates expiring in 90 days and a suggestion to restart k3s on this node.

As we can see, kubectl-ai not only runs the appropriate commands, but also provides contextual analysis and explanations about the issues found. This is especially valuable during critical incidents where every second counts.

Developer Perspective

Traditional Development with kubectl

Development teams deploying applications on Kubernetes face different but equally complex challenges.

Application Deployment

kubectl create deployment nginx-app --image=nginx:1.20
kubectl expose deployment nginx-app --port=80 --type=LoadBalancer
kubectl scale deployment nginx-app --replicas=3

Application Debugging

kubectl logs -f deployment/nginx-app
kubectl exec -it nginx-app-pod-name -- /bin/bash
kubectl describe pod nginx-app-pod-name


Configuration Management

kubectl create configmap app-config --from-file=config.properties
kubectl create secret generic app-secrets --from-literal=db-password=secret123

Enhanced Experience

For developers, kubectl-ai transforms Kubernetes development into a more natural, conversational experience.

Simplified Deployment

kubectl ai "create a deployment with nginx and a service that exposes port 80"
kubectl ai "scale my nginx deployment to 5 replicas"

kubectl-ai automatically generates complete YAML manifests and asks if you want to apply them directly to the cluster.
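For reference, the manifest generated for the nginx example above would look roughly like this standard Deployment spec (shown as an illustration, not verbatim tool output):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-app
spec:
  replicas: 1   # later scaled via "scale my nginx deployment to 5 replicas"
  selector:
    matchLabels:
      app: nginx-app
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
        ports:
        - containerPort: 80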

Conversational Debugging

kubectl ai "show me the logs for the nginx app in the hello namespace"
kubectl ai "explain why my pod is stuck in Pending state"

The tool not only displays the logs but can also analyze common errors and suggest solutions.

Smart Configuration Management

kubectl ai "create a ConfigMap for my Flask app with database configuration"
kubectl ai "check if my app has the correct permissions to access the database secret"

Comparison

Speed and Efficiency

kubectl requires users to:

• memorize exact command syntax, flags, and resource names
• chain several commands together to build a complete picture
• interpret raw output on their own

kubectl-ai allows users to:

• describe their intent in plain language
• have the right commands generated, executed, and explained in a single step
• receive contextual analysis and suggested fixes alongside the output

Learning Curve

Let's look at troubleshooting a problematic pod as an example.

kubectl requires at least 4 different commands with various options:

kubectl get pods -A --field-selector=status.phase=Failed
kubectl describe pod failing-pod-name -n namespace
kubectl logs failing-pod-name -n namespace --previous
kubectl get events --field-selector involvedObject.name=failing-pod-name -n namespace

kubectl-ai, on the other hand, allows you to perform a single query to obtain not only the status information, but also an explanation of the possible causes:

kubectl ai "find all failed pods and explain what went wrong"

This difference is especially noticeable for developers and administrators who do not use Kubernetes on a daily basis.

Use cases: where can we use it?

Administrators can especially benefit in scenarios such as the following:

kubectl ai "check if any users have excessive permissions to delete deployments"
kubectl ai "show me which pods are consuming the most CPU and memory"
kubectl ai "find all pods that restarted in the last hour and explain why"

And development teams can take advantage of it in the following scenarios:

kubectl ai "create a complete YAML for a Flask app with 2 replicas and resource limits"
kubectl ai "check if my deployment follows Kubernetes best practices"
kubectl ai "why is my application returning 503 errors?"

Security considerations and limitations

kubectl-ai operates in read-only mode by default and requests permission before executing any command that could modify resources in the cluster. This feature is especially important when working in production environments where disruptive changes should be minimized.

It’s important to note that, like any AI-powered tool, kubectl-ai can occasionally generate incorrect commands or misinterpret instructions, so users should have at least a basic understanding of Kubernetes before using the tool.

For organizations with strict security requirements, it’s possible to use local AI models with Ollama, allowing AI usage without relying on external services.
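A minimal local setup, assuming Ollama is already installed and again taking the flag names from the project's README (where --enable-tool-use-shim is described as helping local models that lack native tool calling; verify both flags against your installed version):

ollama pull gemma3
kubectl ai --llm-provider ollama --model gemma3 --enable-tool-use-shim "show me pods with issues"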

Common errors

During initial testing using the “gemini-2.0-flash” model, we occasionally encountered error messages like this:

kubectl ai --model gemini-2.0-flash "show me pods with issues"
Error: reading streaming LLM response: iterateResponseStream: invalid stream chunk: {
  "error": {
    "code": 503,
    "message": "The model is overloaded. Please try again later.",
    "status": "UNAVAILABLE"
  }
}

For this reason, I would recommend considering local models served through Ollama, which don't depend on shared, remote infrastructure.

Conclusions

kubectl-ai represents a paradigm shift in how we interact with Kubernetes. The tool doesn’t replace the need to understand Kubernetes fundamentals and how it works, but it reduces friction and simplifies complex operations, significantly speeding up day-to-day work.

For experienced administrators, kubectl-ai can be seen as a productivity accelerator, allowing them to express complex intentions using natural language. For development teams and junior administrators, it serves as an intelligent assistant that not only executes commands but also provides additional insights and possible explanations when issues arise.

While traditional kubectl will remain essential for precise operations and automated scripting, kubectl-ai emerges as the perfect companion for exploration, diagnostics, and quick execution of complex tasks. The combination of both tools promises to make Kubernetes more accessible and manageable for teams of all experience levels.

The AI revolution in DevOps is just beginning, and kubectl-ai is a perfect example of how artificial intelligence can transform complex tools into intuitive conversational interfaces without sacrificing power or precision.
