GCP DevOps Engineer

6 - 11 years

20 - 35 Lacs

Posted: 5 days ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Key Responsibilities:

Cloud Infrastructure & Platform Engineering

  • Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.
  • Deploy and manage containerized workloads using Docker and Kubernetes (GKE).
  • Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.
  • Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.
  • Ensure business continuity through backup, disaster recovery, and multi-region deployments.

Automation & Reliability

  • Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.
  • Adopt GitOps practices (Flux) for infrastructure lifecycle management.
  • Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.
  • Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.

Security, Governance & Compliance

  • Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.
  • Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).
  • Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.

Monitoring, Observability & Cost Optimization

  • Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infrastructure health and ML-specific metrics (model drift, data anomalies).
  • Define KPIs to monitor system health, performance, and adoption across AI workloads.
  • Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.

Collaboration & Enablement

  • Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.
  • Provide onboarding, documentation, and reusable templates to enable faster adoption of AI infrastructure.
  • Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.

Required Education

Bachelor’s or master’s degree in Computer Science, Software Engineering, or a related field.

Required Experience

  • 5+ years of experience in cloud infrastructure engineering, DevOps, or platform engineering.
  • Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).
  • Strong hands-on expertise with Google Cloud Platform (GCP), especially Vertex AI.
  • Experience with IBM Watsonx for AI application deployment and management.
  • Proven skills in Docker, Kubernetes (GKE), and container orchestration at scale.
  • Proficiency in Python, Bash, or other relevant scripting languages.
  • Strong understanding of cloud networking, IAM, and security best practices.
  • Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).
  • Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).
  • Excellent problem-solving, debugging, and communication skills.

Preferred Experience

  • Experience in MLOps practices for model deployment, monitoring, and retraining.
  • Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).

Centific

IT Services and IT Consulting

Redmond, Washington
