Job Title: Manager - DevOps Engineering
Location: Chennai, India
Position Summary:
The Manager - DevOps Engineering will lead a team of skilled engineers responsible for designing, implementing, and managing scalable, reliable, and secure infrastructure and software delivery pipelines. This role involves driving DevOps best practices, promoting automation, and ensuring high availability and performance across environments. The ideal candidate will have strong technical expertise, leadership capabilities, and a passion for continuous improvement.
Key Responsibilities:
Lead, mentor, and manage the DevOps Engineering team while fostering collaboration, accountability, and continuous learning.
Define and champion the strategic direction for infrastructure, automation, CI/CD, and Site Reliability Engineering (SRE).
Develop and maintain CI/CD pipelines for automated deployment and testing.
Collaborate with development teams to streamline the software delivery process and support fast, high-quality releases.
Manage and monitor cloud infrastructure (AWS), ensuring optimal performance, cost, and security.
Implement and maintain containerization (Docker) and orchestration tools (Kubernetes).
Automate configuration management (Ansible) and implement Infrastructure as Code (Terraform).
Troubleshoot complex production and staging issues; lead incident response and root cause analysis.
Monitor system performance and implement reliability, scalability, and efficiency enhancements.
Establish and enforce infrastructure security best practices and compliance standards.
Create documentation on processes, procedures, and configurations.
Own system reliability by implementing incident response processes and automated recovery mechanisms.
Manage budgeting and capacity planning for cloud resources.
Qualifications:
B.E/B.Tech/M.E/M.Tech in CSE/IT/ECE or equivalent.
10+ years of experience in software development or infrastructure engineering.
Minimum 3 years of experience in a technical leadership or DevOps/SRE management role.
Strong analytical, problem-solving, communication, and team leadership skills.
Ability to work independently, be proactive, and drive initiatives to completion.
Technical Experience Required:
Strong hands-on experience with AWS (architecture, advanced services).
Proficiency in Python / Shell scripting.
Deep experience with Docker, Kubernetes, Helm.
Expertise in Terraform, Ansible, and Infrastructure as Code concepts.
CI/CD tools such as Jenkins, ArgoCD, or equivalent.
Experience with reverse proxies such as Nginx, Traefik, or APISIX.
Knowledge of networking: TCP/IP, DNS, Load Balancers, Firewalls, Ingress, HTTPS/HTTP.
Hands-on experience with databases: MongoDB, PostgreSQL, Redis.
Monitoring & logging platforms: Prometheus, Grafana, ELK Stack, CloudWatch, Loki/Promtail/Tempo.
Strong understanding of microservices architecture, distributed systems, and debugging.
Linux internals and performance tuning.
Experience with SRE practices (SLIs/SLOs, error budgets, on-call rotation).
Experience implementing DevSecOps including SAST/DAST integration in CI/CD.
Familiarity with security monitoring and vulnerability management.