Senior Engineer - Site Reliability Engineering

Experience
5+ years
Function
Engineering
Work mode
Onsite, India
Company
Tier 2
What you will work on
MontyCloud is seeking a Senior SRE to own the reliability, scalability, and performance of their cloud management SaaS platform. The role involves designing automation solutions, monitoring system health, leading disaster recovery efforts, and participating in on-call rotations. Candidates must have deep expertise in AWS, Python, and configuration management tools like Ansible. This position focuses on driving operational excellence through chaos engineering and post-mortem analysis.
TAL's take
Solid role scope and clear requirements in cloud SRE, though company brand presence is not top-tier.
Highly specific JD with clear responsibilities, tech stack, and experience requirements.
Must haves
- 5 years experience as an SRE in a SaaS environment
- 3 years expert hands-on AWS experience
- 4 years automation experience with Ansible, Puppet, or Chef
- 4 years scripting in Python
- 3 years experience with monitoring tools like Splunk, Datadog, or New Relic
- 4 years active involvement in on-call rotations
Tools and skills
Nice to have: application development, CI/CD, chaos engineering tools.
About the company
Unfamiliar company; rated mid-tier by default as a specialized cloud platform provider.