We ❤️ Open Source

A community education resource

How LLMs and AI agents eliminate DevOps toil with intelligent automation

From invisible cognitive load to AI that meets engineers at their level.

DevOps professionals are invisible until something breaks. In this presentation at All Things Open, Kedar Kulkarni, Senior DevOps Architect at Apple, shares why the real pain in DevOps isn't just busy work; it's the cognitive load of parsing inconsistent logs, recalling outdated runbooks during incidents, and hunting for tribal knowledge trapped in Slack threads.


Kedar reframes toil beyond just manual tasks. It's context switching between tools with different formats, constantly translating YAML to JSON, and that one expert fielding calls from 10 different projects. This invisible toil varies wildly with experience: a network admin might fix DNS in five minutes while someone else spends two hours under middle-of-the-night page pressure. The pain is real, measurable, and costs organizations productivity they don't track.

LLMs and AI agents address this differently. LLMs answer questions because they’ve read everything. AI agents actually do things. Kedar demonstrates with kubectl AI, showing how a junior engineer gets quick troubleshooting steps for a broken pod, while a senior engineer asks for pattern analysis across deployment revisions. A security person checks for malicious activity, and a business person estimates time to fix for customers. Same tool, different perspectives, all getting value based on their role and needs.
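The role-based prompts from the demo can be sketched as natural-language queries against the same cluster. A minimal illustration, assuming the open source kubectl-ai plugin is installed and wired to an LLM backend; the pod and deployment names here are hypothetical, and exact invocation may differ by version:

```shell
# Illustrative only: assumes the kubectl-ai plugin is installed and pointed
# at a configured LLM backend. Pod and deployment names are made up.

# Junior engineer: quick troubleshooting steps for a broken pod
kubectl-ai "why is pod checkout-7d4b8 failing, and what should I check first?"

# Senior engineer: pattern analysis across deployment revisions
kubectl-ai "summarize what changed across the last three revisions of the checkout deployment"

# Security reviewer: scan for suspicious activity
kubectl-ai "are any pods running privileged containers or pulling images from unfamiliar registries?"

# Business stakeholder: customer-facing impact estimate
kubectl-ai "roughly how long until the checkout service is healthy again for customers?"
```

Same tool, four prompts: the value comes from the question each role asks, not from different tooling.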

Read more: 5 forces driving DevOps and AI in 2026

His second demo takes this further with self-healing systems. Using kubectl AI with K8sGPT operator, he shows how multi-agent workflows can detect problems, analyze root causes, and automatically apply fixes without human intervention. A pod goes from crash loop to running on its own. But Kedar emphasizes keeping humans in the loop because most DevOps knowledge lives in private Slack threads and war rooms, not GitHub repos where AI trains.

AI adoption looks different for every team. Startups need speed and rapid iteration. Mid-size companies need automation and process. Enterprises need coordination and enforced standards. Kedar’s roadmap starts simple: Week one, identify top three toil sources. Then in the first month, focus on automating common troubleshooting. For long-term success, get 80 percent of the team knowing which AI tools are approved and how to use them.

Key takeaways

  • LLMs help troubleshoot, monitor, and analyze security concerns by meeting engineers at their experience level and providing context-aware guidance.
  • AI-powered workflows enable self-healing systems through multi-agent collaboration, but human oversight remains critical for production decisions.
  • Adoption strategies differ by company size, from startup speed to enterprise coordination, but all benefit from starting small and measuring impact.

AI amplifies human capabilities in DevOps; it doesn't replace judgment. The goal isn't eliminating humans — it's eliminating toil so teams can focus on higher-value work while maintaining the oversight that only experience provides.

👉 Watch the DevOps track playlist from All Things Open 2025.
