Modern software systems operate in complex, dynamic environments where failures are inevitable. Traditional monitoring and manual incident response are no longer sufficient to ensure resilience or customer satisfaction. This talk explores how to design and implement self-healing software systems by combining telemetry data with an AI-driven agentic approach. We’ll start by examining how high-quality telemetry forms the foundation for detecting anomalies and predicting failures. Next, we’ll show how modern GenAI (LLMs) can transform this telemetry into actionable insights for AI agents that interpret data, pinpoint root causes, and apply automated fixes. Through a practical, real-world example, you’ll see how telemetry and AI work together to create adaptive feedback loops that continuously improve system reliability, while freeing engineers from repetitive operational tasks.
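As a rough illustration of the feedback loop described above, the following minimal sketch flags an anomalous telemetry reading and maps it to a remediation. Everything in it is an assumption made for illustration: the function names, the metric names, the z-score threshold, and the fixed playbook that stands in for the AI agent are hypothetical and are not taken from the talk.

```python
from statistics import mean, stdev


def detect_anomaly(history, latest, threshold=3.0):
    """Return True when `latest` lies more than `threshold` standard
    deviations away from the mean of `history` (a simple z-score check)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any different value counts as anomalous.
        return latest != history[0]
    return abs(latest - mean(history)) / sigma > threshold


def choose_remediation(metric_name):
    """Hypothetical stand-in for the AI agent: a fixed playbook lookup.
    A real agent would reason over richer telemetry, for example via an LLM."""
    playbook = {
        "latency_ms": "restart_service",
        "error_rate": "rollback_release",
    }
    return playbook.get(metric_name, "page_on_call")


def healing_step(metric_name, history, latest):
    """One pass of the loop: detect, then decide. Returns the chosen
    action name, or None when the reading looks normal."""
    if detect_anomaly(history, latest):
        return choose_remediation(metric_name)
    return None


if __name__ == "__main__":
    baseline = [100, 102, 98, 101, 99]
    print(healing_step("latency_ms", baseline, 180))
    print(healing_step("latency_ms", baseline, 101))
```

In a real system the returned action would be executed and its effect observed in subsequent telemetry, which is what closes the loop.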

Most people think the cloud is just files floating in the sky. Spoiler: it's not. In this episode, I’m breaking down what “the cloud” really is, why everything you’ve been told is probably wrong, and why it's the engine behind everything from Netflix to AI to your online checkout. This is the kickoff to WTH is the Cloud?!, a fun series that makes cloud technology make sense.

Cloud infrastructure has fueled innovation for nearly two decades, yet cost control remains a challenge. Unforeseen expenses and complicated billing can hamper agility, forcing teams to overspend just to stay competitive. What if you could evaluate costs in real time, identify inefficiencies, and optimize deployments without slowing development? Imagine adjusting parameters based on on-the-fly estimates and usage. In this presentation, Malcolm Matalka, Co-founder and CTO of Terrateam, explores how OpenInfraQuote, a new open-source command-line tool, transforms Terraform and OpenTofu code into actionable cost insights. Learn how to automate price checks, compare scenarios, and avoid financial surprises, and see how it differs from other solutions and how to integrate it into your workflow.

Resources
- GitHub: https://github.com/terrateamio/openinfraquote
- Documentation: https://openinfraquote.readthedocs.io/en/latest/
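To make the idea of deriving a cost estimate from infrastructure code concrete, here is a minimal sketch that walks the JSON form of a Terraform plan (the structure produced by `terraform show -json`) and sums prices for resources that will be created. It is not how OpenInfraQuote works internally, which the description above does not specify; the price sheet, its values, and the function name are made up for illustration.

```python
# Hypothetical monthly prices in USD, keyed by (resource type, instance type).
# These numbers are invented for the example and are not real cloud prices.
PRICE_SHEET = {
    ("aws_instance", "small"): 10.0,
    ("aws_instance", "large"): 40.0,
}


def estimate_monthly_cost(plan, price_sheet):
    """Sum the price of every resource the plan will create.

    `plan` is a dict shaped like the output of `terraform show -json`:
    a top-level "resource_changes" list whose entries carry "type" and
    a "change" object with "actions" and the planned "after" values.
    Resources missing from the price sheet contribute 0.
    """
    total = 0.0
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        if "create" not in change.get("actions", []):
            continue  # only newly created resources add cost in this sketch
        after = change.get("after") or {}
        key = (rc.get("type"), after.get("instance_type"))
        total += price_sheet.get(key, 0.0)
    return total


if __name__ == "__main__":
    example_plan = {
        "resource_changes": [
            {"type": "aws_instance",
             "change": {"actions": ["create"],
                        "after": {"instance_type": "small"}}},
            {"type": "aws_instance",
             "change": {"actions": ["no-op"],
                        "after": {"instance_type": "large"}}},
        ]
    }
    print(estimate_monthly_cost(example_plan, PRICE_SHEET))
```

Comparing scenarios then amounts to running the same function over two plans and looking at the difference.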

Terraform 1.6 introduced native Terraform tests, but running them against real cloud resources leads to long deployment times and unnecessary costs. With LocalStack's Terraform integration (tflocal), you can validate configurations locally while closely emulating real cloud behaviour. By combining Terraform tests with LocalStack, developers can run integration tests in CI/CD environments, test event-driven serverless workflows, and establish a rapid feedback loop for cloud development. In this presentation, Harsh Mishra provides a hands-on demo of Terraform testing with LocalStack, exploring how to configure tests, validate infrastructure locally, and reduce costs and complexity while improving confidence in deployments.

## Resources
- Blog: https://blog.localstack.cloud/efficient-infrastructure-testing-localstack-terraform-tests-framework/
- Documentation: https://developer.hashicorp.com/terraform/language/tests
- tflocal: https://docs.localstack.cloud/user-guide/integrations/terraform/#tflocal-wrapper-script