Job Responsibilities
- Guide and assist in building designs and gaining consensus from peers.
- Collaborate with software engineers to design deployment approaches using automated CI/CD pipelines.
- Design, develop, test, and implement availability, reliability, and scalability solutions.
- Implement infrastructure, configuration, and network as code for applications and platforms.
- Resolve complex problems with technical experts and stakeholders.
- Utilize service level objectives to proactively resolve issues.
- Support the adoption of site reliability engineering best practices.
Required Qualifications, Capabilities, and Skills
- Formal training or certification in software engineering concepts and 3+ years of applied experience.
- Proficient in site reliability culture and principles.
- Proficient in programming languages and frameworks such as Python, Java/Spring Boot, and .NET.
- Knowledge of software applications and technical processes within a technical discipline.
- Experience in observability using tools such as Grafana, Dynatrace, Prometheus, Datadog, or Splunk.
- Experience with CI/CD and infrastructure-as-code tools such as Jenkins, GitLab, or Terraform.
Preferred Qualifications, Capabilities, and Skills
- Exposure to EAC deployments on AWS.
- Experience in complex and large-scale batch environments.
- Familiarity with containers and container orchestration, such as Docker, ECS, and Kubernetes.