Posted: 2 days ago
On-site
Full Time
We’re looking for a DevOps / Platform Engineer to support and enhance the infrastructure powering our AI-driven platform. You’ll work on production systems running at scale on Kubernetes and help improve deployment automation, observability, and developer workflows. You’ll also be responsible for provisioning infrastructure and building new Kubernetes deployments from the ground up as we roll out new services.

This is a hands-on position for someone who enjoys solving real-world infrastructure problems, thrives in high-scale environments, and is excited to collaborate closely with software engineers to ensure reliable, secure, and efficient delivery of services.

Key Responsibilities
· Operate and enhance our Kubernetes-based deployment pipelines.
· Design and provision new infrastructure and Kubernetes deployments for emerging services.
· Improve and maintain CI/CD pipelines using GitHub Actions, GitLab CI, or similar tools.
· Automate infrastructure using Terraform or similar Infrastructure-as-Code tools.
· Manage monitoring, logging, and alerting using tools like Prometheus, Grafana, and ELK.
· Set up secure networking, secrets management, and IAM configurations.
· Troubleshoot deployment and runtime issues, collaborating closely with engineers.
· Write and maintain internal documentation and runbooks.

Qualifications

Required
· Bachelor’s degree in Engineering or a related field.
· 1–6 years of experience in DevOps, SRE, or Platform Engineering roles.
· Strong foundation in Docker and Kubernetes.
· Experience building infrastructure or deployments from scratch.
· Proficiency with at least one major cloud provider (AWS, GCP, or Azure).
· Competent with Linux systems, shell scripting, and Git.
· Working knowledge of CI/CD tools and deployment automation.
· Basic understanding of networking (TLS, DNS, routing, load balancing).

Nice to Have
· Hands-on experience with Helm and GitOps tools such as ArgoCD or Flux.
· Familiarity with HashiCorp Vault or similar tools for secrets management.
· Exposure to distributed compute frameworks like Ray, Apache Spark, or similar technologies.
· Knowledge of modern load balancers and service meshes, including Traefik, Envoy, or Istio.
· Experience supporting production-grade ML/AI workloads in a cloud-native environment.
· Cloud certifications (AWS, GCP, Azure) or prior involvement in deploying and managing production-scale ML systems.
· Strong plus: prior experience working on highly scaled, production-grade platforms serving millions of users.
· Proven ability to manage and optimize infrastructure for high availability, reliability, and performance in large-scale distributed systems.
Aigility AI