Posted: 2 months ago
Work from Office
Full Time
We are seeking a Site Reliability Engineer (SRE) with strong DevOps expertise to ensure the reliability, availability, and performance of critical systems and services. This role bridges the gap between development and operations teams by employing automation, monitoring, and best practices to enhance system scalability, reduce downtime, and improve overall operational efficiency. The SRE will focus on optimizing development pipelines, managing infrastructure, and implementing proactive monitoring and alerting systems while upholding the principles of DevOps and reliability engineering.

Key Responsibilities:

1. Reliability Engineering: Design, implement, and maintain high-availability systems. Create and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs). Conduct root cause analysis for system failures and implement post-mortem processes to prevent recurrence.

2. DevOps Automation: Automate infrastructure provisioning, deployment pipelines, and operational processes. Build and maintain CI/CD pipelines using tools like Jenkins, GitHub Actions, or GitLab CI/CD. Develop Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Ansible.

3. Monitoring and Incident Management: Implement robust monitoring, logging, and alerting solutions using tools like Datadog or Splunk. Establish proactive incident response processes and manage on-call rotations. Ensure effective documentation for incident handling and resolution.

4. Performance and Scalability: Optimize system performance through capacity planning and resource management. Enable horizontal scaling of services to handle increasing loads. Work closely with development teams to improve application resilience and performance.

5. Security and Compliance: Enforce security best practices in infrastructure and application development. Conduct vulnerability assessments and implement remediation measures. Ensure compliance with organizational and industry standards.

6. Collaboration and Culture: Act as a bridge between development and operations teams to foster a DevOps culture. Coach teams on best practices in reliability, automation, and DevOps. Advocate for a culture of ownership and continuous improvement.

Key Skills and Competencies:

Technical Skills:
Expertise in cloud platforms like AWS, Azure, or GCP.
Proficiency in Linux system administration and networking concepts.
Strong programming/scripting skills (e.g., Python, Go, Bash).
Understanding of Terraform creation and management.
Familiarity with containerization and orchestration tools like Docker and Kubernetes.
Knowledge of database management (SQL and NoSQL).
UST