
Nvidia Senior DevOps Automation Engineer Fabric Networking - GPU 
Israel, Tel Aviv District, Tel Aviv-Yafo 
299842285

Locations: Israel, Tel Aviv; Israel, Yokneam
Time type: Full time
Posted on: Today

What you will be doing:

  • Build and maintain CI/CD pipelines that support fast, reliable integration and deployment across complex systems.

  • Design tools and automation workflows that simplify software releases, manage dependencies, and increase reliability.

  • Accelerate development by modularizing systems and enabling independent release cycles.

  • Build infrastructure automation for provisioning, scaling, and maintaining GPU clusters.

  • Automate software updates and monitor system health to improve reliability and availability.

  • Troubleshoot and resolve operational issues across distributed infrastructure.

  • Manage firmware and software rollouts to minimize downtime and ensure consistency.

  • Work with global engineering teams to align infrastructure tools and support project goals.

What we need to see:

  • BS or MS in Computer Science, Computer Engineering, or a related field.

  • 5+ years of experience managing infrastructure or systems in high-performance or distributed environments.

  • Expertise in scripting and automation using Python, Ansible, and Shell.

  • Practical experience with modern CI/CD tools and infrastructure-as-code frameworks.

  • Strong understanding of Linux, networking, and distributed system design.

  • Proven ability to break down monolithic systems into scalable, loosely coupled components.

  • Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.

Ways to stand out from the crowd:

  • Experience with cluster management tools like Slurm.

  • Familiarity with NVIDIA DGX/HGX systems and GPU-based clusters.

  • Knowledge of observability tools such as Prometheus and Grafana.

  • Proven ability to lead DevOps process improvements and drive team efficiency.