RESPONSIBILITIES:
- Build up, lead, and improve existing processes to provide 24x7 operational response for applications on public cloud platforms.
- Maintain services once they are live by setting up monitoring and alerting and by measuring availability, latency, and overall system health.
- Own and review work for accuracy, quality, completeness, and application performance.
- Review release readiness through activities such as system design consulting, observability and monitoring reviews, capacity planning, and launch reviews.
- Understand the core principles of DevSecOps.
- Partner with architects and engineers to design and implement automation, operations, and support solutions.
- Partner management.
- Design and implement end-to-end observability frameworks using tools such as Prometheus, Grafana, CloudWatch, ELK/EFK, and OpenTelemetry, ensuring service reliability through dashboard design, SLOs/SLIs, and alerting systems.