Creating self-healing software systems via effective usage of telemetry data and AI agents

Modern software systems operate in complex, dynamic environments where failures are inevitable. Traditional monitoring and manual incident response are no longer sufficient to ensure resilience or customer satisfaction. This talk explores how to design and implement self-healing software systems by combining telemetry data with an AI-driven agentic approach. We’ll start by examining how high-quality telemetry forms the foundation for detecting anomalies and predicting failures. Next, we’ll show how modern GenAI (LLMs) can transform this telemetry into actionable insights for AI agents that interpret data, pinpoint root causes, and apply automated fixes. Through a practical, real-world example, you’ll see how telemetry and AI work together to create adaptive feedback loops that continuously improve system reliability, while freeing engineers from repetitive operational tasks.

Related Talks

Multi-account and multi-region in LocalStack

Multi-account and multi-region compatibility enables users to manage and deploy resources across multiple AWS accounts and geographic regions. This functionality enhances the robustness of the deployments by offering improved fault tolerance, scalability, and regulatory compliance. By segregating resources into separate accounts and distributing them across various regions, users can minimize the impact of potential failures and optimize performance. In this session from LocalStack Community Meetup May '24, Sannya Singhal discussed how you could use LocalStack to emulate multi-account and multi-region environments locally for testing and development purposes, ensuring that applications were resilient and scalable before deployment to the cloud.

Watch recording
Building LocalStack with LocalStack

LocalStack’s core cloud emulator allows us to run our own cloud application, including its infrastructure, locally, which provides an efficient developer experience at the start of the entire software development lifecycle (SDLC). This experience enables us to build our product features in a way that closely matches what our customers are looking for: a comprehensive developer platform that facilitates local multi-cloud development across different providers and services. In this session from LocalStack Community Meetup April '24, Lukas Pichler showcases how to use the LocalStack core cloud emulator and other novel solutions to build, test, and integrate new features in our LocalStack Web Application. He broadly discusses:
• Application Overview
• How do we enable local cloud development?
• How do we use LocalStack in CI?
• How do we use LocalStack to enable application previews and E2E testing?
• Conclusion

Watch recording
Deploy and Test Data Migration Pipeline with DMS

AWS Database Migration Service provides migration solutions from databases, data warehouses, and other types of data stores (e.g. S3, SAP). The migration can be homogeneous (source and target have the same type), but it is often heterogeneous, as DMS supports migration from various sources to various targets (self-hosted and AWS services). LocalStack supports DMS with selected use cases. In this session from LocalStack Community Meetup July '24, Mathieu Cloutier explores how to use LocalStack to migrate from a MariaDB database to an AWS Kinesis Stream. He goes over the differences between CDC and full load, and as a bonus you will see how easy it is to migrate from an external database to your Kinesis Stream, all tested on your local machine!
Docs: https://docs.localstack.cloud/user-guide/aws/dms/

Watch recording
