Principal Developer - R&D Infrastructure

10 - 15 years

12 - 17 Lacs

Posted: 3 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role Summary
Lead SRE initiatives across all technology stacks. Own reliability, scalability, and performance of production systems. Drive automation, observability, and incident response maturity. Act as technical authority across DevOps domains.
Core Responsibilities
  • Architect and manage multi-cloud infrastructure (AWS, Azure) with IaC (Terraform, Helm).
  • Lead Kubernetes deployments across multi-cluster environments; optimize for scale and resilience.
  • Build and maintain CI/CD pipelines using GitLab, Jenkins, GitHub Actions; enforce GitOps workflows.
  • Implement and manage the observability stack: Prometheus, Grafana, ELK, Splunk, Datadog.
  • Drive incident response, root cause analysis, and postmortem processes; reduce MTTR.
  • Enforce SRE principles: SLIs, SLOs, error budgets, chaos engineering.
  • Integrate DevSecOps practices: IAM, RBAC, secrets management, vulnerability scanning.
  • Enable MLOps workflows for AI/ML model deployment and lifecycle management.
  • Mentor junior SREs and DevOps engineers; establish technical standards and best practices.
  • Collaborate with product, engineering, and security teams to align reliability goals.
  • Own disaster recovery planning, backup strategies, and environment consistency.
  • Lead cost optimization and performance tuning across infrastructure layers.
Required Qualifications
  • 10+ years in DevOps/SRE/Platform Engineering.
  • Deep expertise in Kubernetes, Docker, Terraform, Helm.
  • Strong proficiency in scripting (Python, Bash, Go).
  • Proven experience with cloud-native architectures and distributed systems.
  • Hands-on experience with CI/CD tooling and automation frameworks.
  • Familiarity with security frameworks and compliance requirements.
  • Demonstrated leadership in scaling systems and mentoring teams.
  • Strategic thinking across infrastructure and reliability domains.
  • Technical leadership with cross-functional influence.
  • High accountability and ownership of production systems.
  • Clear, concise communication across global teams.
  • Decision-making under pressure during incidents.
  • Ability to mentor and elevate team capabilities.
  • Bias for automation and continuous improvement.
  • Strong stakeholder management and alignment with business goals.

Arctic Wolf Networks

Computer and Network Security

Eden Prairie, Minnesota
