Creating self-healing software systems via effective usage of telemetry data and AI agents

Modern software systems operate in complex, dynamic environments where failures are inevitable. Traditional monitoring and manual incident response are no longer sufficient to ensure resilience or customer satisfaction. This talk explores how to design and implement self-healing software systems by combining telemetry data with an AI-driven agentic approach. We’ll start by examining how high-quality telemetry forms the foundation for detecting anomalies and predicting failures. Next, we’ll show how modern GenAI (LLMs) can transform this telemetry into actionable insights for AI agents that interpret data, pinpoint root causes, and apply automated fixes. Through a practical, real-world example, you’ll see how telemetry and AI work together to create adaptive feedback loops that continuously improve system reliability, while freeing engineers from repetitive operational tasks.

Related Talks

Simulating outages with LocalStack Chaos API

LocalStack Chaos API enables you to simulate outages in any AWS region or service. It provides an easy way to run chaos engineering experiments that test a wide variety of simulated outages and failures within your application safely, without impacting your production users.

Common examples include:

- Region-wide outages
- DNS failovers
- Service failures
- Network faults

All of the testing scenarios described above can be executed within LocalStack, providing thorough coverage for critical situations in a matter of minutes rather than hours or days.

In this presentation by Viren Nadkarni, we explore how Chaos API is used to simulate service failures in a local environment while using robust error handling to address and mitigate such issues.

## Resources

- Documentation: https://docs.localstack.cloud/user-guide/chaos-engineering/chaos-api/
- Get access: https://www.localstack.cloud/contact

Watch recording
Unified Kubernetes Support in LocalStack

LocalStack now provides enhanced support for running AWS services in Kubernetes environments. In this presentation from the LocalStack 4.0 community meetup by Simon Walker, we explore how to deploy and manage local AWS resources within Kubernetes clusters with LocalStack, to help developers maintain consistency between development and production environments.

The session further covers LocalStack’s Kubernetes integration, including deployment via Helm charts, configuration of services like Lambda and RDS as Kubernetes pods, and networking between components. A demo illustrates provisioning a serverless application (Lambda functions interacting with a MySQL database) using Terraform, with all resources managed within a local Kubernetes cluster.

You'll also learn practical approaches for local testing and infrastructure emulation when moving from Docker to Kubernetes-native solutions, as well as upcoming features, including broader service support and new container runtime options.

## Resources

- Documentation: https://docs.localstack.cloud/user-guide/localstack-enterprise/kubernetes-executor/
- Get access: https://www.localstack.cloud/contact

Watch recording
New Bedrock Support in LocalStack

Bedrock is a fully managed service provided by Amazon Web Services (AWS) that makes foundation models from various LLM providers accessible via an API. LocalStack allows you to use the Bedrock APIs to test and develop AI-powered applications in your local environment.

In this video, Silvio showcases how LocalStack 4.0, with our new Bedrock support, is keeping up with advancements in the Generative AI (GenAI) and large language model (LLM) ecosystems. You'll learn what Amazon Bedrock is and what benefits Bedrock emulation offers, and see a live demo of how it works.

## Resources

- Documentation: https://docs.localstack.cloud/user-guide/aws/bedrock/
- Get access: https://www.localstack.cloud/contact

Watch recording
