This role is for one of Weekday's clients
Salary range: Rs 20,00,000 - Rs 40,00,000 (i.e., INR 20-40 LPA)
Min Experience: 6 years
Location: Hyderabad
Job Type: Full-time
We are seeking an experienced Senior Systems Engineer - SRE (Storage) to join our Site Reliability Engineering (SRE) team. The ideal candidate will be responsible for ensuring high availability, reliability, and performance of our storage infrastructure across cloud and on-prem environments. You will design, deploy, and maintain enterprise-scale storage solutions leveraging technologies such as Pure Storage (FlashArray and FlashBlade), NetApp, and AWS, while driving automation through Terraform, Ansible, and Python/Shell scripting.
This role is ideal for professionals who combine strong technical depth in storage systems with a mindset focused on scalability, automation, and operational excellence.
Requirements
Key Responsibilities:
- Storage Infrastructure Management: Design, implement, and maintain scalable and resilient storage infrastructure using Pure Storage, NetApp, and AWS storage services (EBS, S3, FSx).
- Performance Optimization: Monitor and fine-tune storage performance and capacity utilization, proactively addressing latency or throughput issues.
- Automation & Infrastructure as Code (IaC): Develop and manage automation scripts using Terraform, Ansible, and Python/Shell to streamline provisioning, monitoring, and maintenance of storage systems.
- Site Reliability Engineering: Apply SRE principles to improve system reliability, availability, and disaster recovery processes across hybrid environments.
- Monitoring & Incident Management: Set up robust observability frameworks for storage systems; ensure efficient incident detection, root cause analysis, and post-mortem reviews.
- Backup & Disaster Recovery: Implement and manage backup, snapshot, and replication strategies for critical data using FlashArray, FlashBlade, and NetApp ONTAP.
- Cloud Integration: Architect and support hybrid storage environments integrating on-premises and AWS cloud platforms.
- Security & Compliance: Enforce best practices for data security, encryption, and access controls aligned with organizational policies and compliance standards.
- Collaboration & Documentation: Partner with development, operations, and cloud engineering teams to optimize workflows, while maintaining detailed system documentation and runbooks.
Required Skills & Experience:
- 6-12 years of experience in storage engineering, systems administration, or SRE roles.
- Strong hands-on expertise with Pure Storage (FlashArray, FlashBlade) and NetApp storage systems.
- Solid understanding of AWS storage services (EBS, S3, EFS, FSx) and integration with hybrid environments.
- Proven experience with Terraform and Ansible for infrastructure automation and configuration management.
- Proficiency in Python and Shell scripting for automation, monitoring, and troubleshooting tasks.
- Deep knowledge of storage protocols (iSCSI, NFS, CIFS, Fibre Channel) and data protection techniques.
- Experience implementing SRE practices including SLIs, SLOs, monitoring, incident response, and capacity planning.
- Strong analytical and problem-solving skills with a focus on root cause identification and continuous improvement.
- Excellent communication and documentation abilities to collaborate across technical teams.
Preferred Qualifications:
- Certifications in Pure Storage, AWS, or NetApp.
- Familiarity with container storage integration (Kubernetes CSI drivers).
- Experience with CI/CD pipelines and DevOps toolchains.