Principal Software Engineer

4 - 5 years

13 - 15 Lacs

Posted: -1 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

  • Lead Platform Reliability Initiatives: Design and optimize multi-region, highly available cloud architectures using services such as container orchestration, compute instances, managed databases, and object storage, with SLIs, SLOs, and error budgets that target availability above 99.99%.
  • Drive Automation and IaC: Build and maintain Infrastructure as Code (IaC) pipelines with tools like CDK, Terraform, or CloudFormation; automate deployments via CI/CD tools and serverless functions to accelerate delivery while minimizing operational overhead.
  • Reliability, Availability & Resilience: Establish, track, and enforce SLIs, SLOs, and error budgets. Ensure system availability, latency, and throughput meet targets. Build strategies for redundancy, high availability, multi-AZ / multi-region failover, backups, and disaster recovery.
  • Enhance Observability and Monitoring: Implement comprehensive monitoring stacks with cloud-native metrics, open-source monitoring, and visualization tools; define alerting thresholds, conduct root cause analyses (RCAs), and optimize performance for distributed systems including message brokers, caching layers, and relational databases.
  • Champion Security and Compliance: Enforce cloud best practices for identity and access management, encryption, networking, and policy-as-code with tools like OPA; integrate security into CI/CD pipelines to protect sensitive data in regulated environments.
  • Innovate on Scalability: Evaluate and implement advanced cloud features like serverless architectures, service meshes, and autoscaling solutions to support growing user demands and reduce latency.
  • Operational Excellence: Participate in and lead incident response for production issues, and continuously improve processes to balance feature velocity with system reliability.
  • Cost & Performance: Monitor and optimize cloud spend and resource usage through rightsizing, discount strategies, and waste elimination.
  • Mentor and Influence: Guide junior engineers through design reviews, incident post-mortems, and adoption of SRE practices; collaborate with stakeholders to shape cloud strategy, cost optimization, and capacity planning for enterprise-scale workloads.
Educational Qualification:
  • Bachelor's degree or equivalent in Computer Science or a STEM major (Science, Technology, Engineering, and Math).
Technical skills:
  • 15+ years in software engineering, site reliability engineering, or cloud platform roles, with significant exposure to AWS production systems.
  • Deep hands-on expertise with core cloud services including container orchestration, compute, databases, storage, monitoring, identity management, serverless, and networking.
  • Expert-level skill in Infrastructure as Code: Terraform, CloudFormation, AWS CDK, or similar.
  • Proficiency in programming languages like Python, Go, or Java for automation, scripting, and building tools.
  • Deep understanding of observability tooling: metrics, logging, distributed tracing, and alerting (e.g., CloudWatch, Prometheus, Grafana, ELK).
  • Strong experience with incident management: debugging, performance tuning, root cause analysis.
  • Proven track record of cost optimization in cloud environments.
  • Security mindset: knowledge of AWS security services, governance, and compliance standards.
  • Proven track record in implementing SRE practices: SLIs/SLOs, error budgets, monitoring/alerting, and incident management.
  • Strong communication and collaboration abilities to influence without authority and translate technical concepts for non-technical stakeholders.
