This role is for one of our clients
Company Name: Neemtree
Industry: Technology, Information and Media
Seniority level: Mid-Senior level
Min Experience: 4 years
Location: Gurugram, Delhi, NCR
Job Type: Full-time
We’re looking for a Site Reliability & Automation Engineer who thrives at the intersection of infrastructure, automation, and reliability. In this role, you’ll design and operate scalable cloud environments that power mission-critical applications. You’ll take ownership of cloud infrastructure, CI/CD pipelines, observability systems, and DevOps automation—helping development teams deliver faster, safer, and more reliable software.
This is an exciting opportunity for an experienced engineer who loves to optimize systems, automate everything, and build the backbone for high-performing teams.
What You’ll Do
Design & Build Cloud Infrastructure — Architect and maintain scalable, secure, and resilient AWS environments for production and staging systems.
Automate Everything — Develop Infrastructure-as-Code (IaC) using Terraform to streamline environment provisioning and ensure consistency.
Own CI/CD Pipelines — Build, optimize, and maintain GitLab CI/CD workflows for rapid, reliable deployments.
Implement GitOps Principles — Use tools like ArgoCD to enable declarative, version-controlled deployments across multiple environments.
Enable Observability — Set up and manage metrics, dashboards, and alerts using Grafana, Prometheus, and CloudWatch to ensure system health and uptime.
Drive Collaboration — Work closely with developers and QA teams to optimize application performance and troubleshoot issues across environments.
Champion Security & Cost Efficiency — Implement best practices for identity management, networking, and cloud cost governance.
Incident Management — Participate in on-call rotations, perform root cause analysis, and continuously improve incident response processes.
Evolve Tooling & Processes — Stay ahead of DevOps trends and evaluate new tools, frameworks, and practices to enhance delivery velocity and reliability.
What You Bring
Experience: 4–10 years in DevOps, SRE, or cloud infrastructure roles with hands-on AWS expertise.
Cloud Mastery: Deep understanding of AWS (EC2, EKS/ECS, S3, RDS, Lambda, CloudWatch, IAM, VPC, etc.).
Automation Skills: Proficient in Terraform and Infrastructure-as-Code principles.
Pipeline Experience: Strong command of GitLab CI/CD or equivalent CI/CD platforms.
Observability Know-How: Experience with Prometheus, Grafana, or similar monitoring systems.
System Proficiency: Solid grasp of Linux systems, shell scripting, and troubleshooting distributed environments.
Containerization: Familiarity with Docker and Kubernetes for microservices deployment and orchestration.
Collaboration: Excellent communicator who thrives in cross-functional and agile setups.
Mindset: Analytical, proactive, and driven to automate manual processes.
Bonus Points
Exposure to configuration management tools (Ansible, Chef, Puppet).
Experience building microservices-based infrastructure on Kubernetes.
Knowledge of DevSecOps practices and cloud security automation.
Understanding of self-healing systems, automated monitoring, and automated remediation.