Principal Cloud Engineer - Data & AI

Experience: 15 - 20 years

Salary: 30 - 35 Lacs (INR)

Posted: 1 day ago | Platform: Naukri

Work Mode: Work from Office

Job Type: Full Time

Job Description

Job Summary
  • We're looking for a Principal Cloud Engineer with a strong foundation in multi-cloud and multi-region deployment, data architecture, distributed systems, and modern cloud-native platforms to architect, build, and maintain the intelligent infrastructure and systems that power our AI, GenAI, and data-intensive workloads.
  • You'll work closely with cross-functional teams, including data scientists, ML and software engineers, and product managers, and play a key role in designing a highly scalable platform to manage the lifecycle of data pipelines, APIs, real-time streaming, and agentic GenAI workflows, while enabling federated data architectures.
  • The ideal candidate will have a strong background in building and maintaining scalable AI and data platforms, optimizing workflows, and ensuring the reliability and performance of data platform systems.
Responsibilities

Cloud Architecture & Engineering

  • Deep expertise in designing, implementing, and managing architectures across multiple cloud platforms (e.g., AWS, Azure, GCP)
  • Proven experience in architecting hybrid and multi-cloud solutions, including interconnectivity, security, workload placement, and DR strategies
  • Strong knowledge of cloud-native services (e.g., serverless, containers, managed databases, storage, networking)
  • Experience with enterprise-grade IAM, security controls, and compliance frameworks across cloud environments
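
For illustration, a minimal sketch of the kind of provider-agnostic abstraction this multi-cloud work tends to involve. The ObjectStore interface and class names are hypothetical, and the AWS/GCP implementations assume boto3 and google-cloud-storage are installed and credentials are configured.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Provider-agnostic object storage interface (hypothetical names)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class S3Store(ObjectStore):
    """AWS implementation using standard boto3 S3 calls."""

    def __init__(self, bucket: str):
        import boto3  # assumes AWS credentials are configured
        self._s3 = boto3.client("s3")
        self._bucket = bucket

    def put(self, key: str, data: bytes) -> None:
        self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

    def get(self, key: str) -> bytes:
        return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()


class GCSStore(ObjectStore):
    """GCP implementation using google-cloud-storage."""

    def __init__(self, bucket: str):
        from google.cloud import storage  # assumes google-cloud-storage is installed
        self._bucket = storage.Client().bucket(bucket)

    def put(self, key: str, data: bytes) -> None:
        self._bucket.blob(key).upload_from_string(data)

    def get(self, key: str) -> bytes:
        return self._bucket.blob(key).download_as_bytes()
```

Application code depends only on ObjectStore, which keeps workload placement and provider choice a deployment-time decision rather than a code change.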

AI & GenAI Platform Integration

  • Integrate LLM APIs (OpenAI, Gemini, Claude, etc.) into platform workflows for intelligent automation and enhanced user experience
  • Build and orchestrate multi-agent systems using frameworks like CrewAI, LangGraph, or AutoGen for use cases such as pipeline debugging, code generation, and MLOps
  • Experience in developing and integrating GenAI applications using MCP and orchestration of LLM-powered workflows (e.g., summarization, document Q&A, chatbot assistants, and intelligent data exploration)
  • Hands-on expertise building and optimizing vector search and RAG pipelines using tools like Weaviate, Pinecone, or FAISS to support embedding-based retrieval and real-time semantic search across structured and unstructured datasets
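
A minimal sketch of the embedding-based retrieval step of a RAG pipeline using FAISS, as referenced above. The embed() helper is a placeholder for whichever embedding model or API the platform standardizes on, so the retrieved results here are only structural.

```python
import numpy as np
import faiss  # assumes faiss-cpu is installed


def embed(texts: list[str], dim: int = 384) -> np.ndarray:
    """Placeholder embedder: swap in a real embedding model/API in practice."""
    rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
    return rng.standard_normal((len(texts), dim)).astype("float32")


# Index a small document corpus.
docs = ["invoice schema docs", "kafka topic runbook", "terraform module guide"]
vectors = embed(docs)
index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search; fine for small corpora
index.add(vectors)

# Retrieve the top-k documents for a query; in a full RAG flow these chunks
# would be passed to an LLM as grounding context.
query_vec = embed(["how do I provision a new kafka topic?"])
distances, ids = index.search(query_vec, 2)
print([docs[i] for i in ids[0]])  # results are arbitrary with the random placeholder embedder
```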

Engineering Enablement

  • Create extensible CLIs, SDKs, and blueprints to simplify onboarding, accelerate development, and standardize best practices
  • Streamline onboarding, documentation, and platform implementation & support using GenAI and conversational interfaces
  • Collaborate across teams to enforce cost, reliability, and security standards within platform blueprints
  • Partner with engineering teams to introduce platform enhancements, observability practices, and cost optimization techniques
  • Foster a culture of ownership, continuous learning, and innovation
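
A minimal sketch of the self-service CLI idea above, built on argparse; the platformctl name, subcommand, and flags are illustrative only.

```python
import argparse


def scaffold_pipeline(name: str, schedule: str) -> None:
    # A real CLI would render a blueprint here (DAG, Terraform module, CI config).
    print(f"Scaffolding pipeline '{name}' with schedule '{schedule}'")


def main() -> None:
    parser = argparse.ArgumentParser(
        prog="platformctl", description="Platform self-service CLI (illustrative)"
    )
    sub = parser.add_subparsers(dest="command", required=True)

    scaffold = sub.add_parser("scaffold-pipeline", help="Generate a data pipeline from a blueprint")
    scaffold.add_argument("--name", required=True)
    scaffold.add_argument("--schedule", default="@daily")

    args = parser.parse_args()
    if args.command == "scaffold-pipeline":
        scaffold_pipeline(args.name, args.schedule)


if __name__ == "__main__":
    main()
```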

Automation, IaC, CI/CD

  • Mastery of Infrastructure as Code (IaC) tools, especially Terraform, Terragrunt, and CloudFormation / ARM templates / Deployment Manager
  • Experience building and managing cloud automation frameworks (e.g., using Python, Go, or Bash for orchestration and tooling)
  • Hands-on experience with CI/CD pipelines (e.g., GitHub Actions) for cloud resource deployments
  • Expertise in implementing policy-as-code and compliance-as-code (e.g., Open Policy Agent, Sentinel)
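
One possible shape for a Python-based automation wrapper of the kind listed above: run a Terraform plan, then apply a simple tag policy over the machine-readable plan output. In practice such rules usually live in OPA or Sentinel; the required tag names below are assumptions.

```python
import json
import subprocess


def terraform_plan_json(workdir: str) -> dict:
    """Run `terraform plan` and return the JSON form of the plan."""
    subprocess.run(["terraform", "init", "-input=false"], cwd=workdir, check=True)
    subprocess.run(["terraform", "plan", "-out=tfplan", "-input=false"], cwd=workdir, check=True)
    show = subprocess.run(
        ["terraform", "show", "-json", "tfplan"],
        cwd=workdir, check=True, capture_output=True, text=True,
    )
    return json.loads(show.stdout)


def check_required_tags(plan: dict, required: set[str]) -> list[str]:
    """Flag planned resources missing mandatory tags (a simple policy-as-code stand-in)."""
    violations = []
    for rc in plan.get("resource_changes", []):
        after = (rc.get("change") or {}).get("after") or {}
        tags = after.get("tags") or {}
        if isinstance(tags, dict) and not required.issubset(tags):
            violations.append(rc.get("address", "<unknown>"))
    return violations


if __name__ == "__main__":
    plan = terraform_plan_json(".")
    missing = check_required_tags(plan, {"owner", "cost-center"})  # assumed tag policy
    if missing:
        raise SystemExit(f"Tag policy violations: {missing}")
```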

Security, Governance & Cost

  • Strong background in implementing cloud security best practices (network segmentation, encryption, secrets management, key management, etc.).
  • Experience with multi-account / multi-subscription / multi-project governance models, including landing zones, service control policies, and organizational structures
  • Ability to design for cost optimization, tagging strategies, and usage monitoring across cloud providers
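
A hedged sketch of cost reporting by cost-allocation tag using the AWS Cost Explorer API via boto3, in the spirit of the tagging and usage-monitoring point above; the tag key and date range are placeholders, and the account is assumed to have Cost Explorer enabled.

```python
import boto3  # assumes AWS credentials and Cost Explorer access are configured


def monthly_cost_by_tag(tag_key: str, start: str, end: str) -> dict[str, float]:
    """Summarize unblended cost per value of a cost-allocation tag (illustrative)."""
    ce = boto3.client("ce")
    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},  # dates as YYYY-MM-DD
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "TAG", "Key": tag_key}],
    )
    totals: dict[str, float] = {}
    for period in resp["ResultsByTime"]:
        for group in period["Groups"]:
            key = group["Keys"][0]  # e.g. "cost-center$analytics"; empty value means untagged spend
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[key] = totals.get(key, 0.0) + amount
    return totals


if __name__ == "__main__":
    print(monthly_cost_by_tag("cost-center", "2024-01-01", "2024-02-01"))
```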

Monitoring & Operations

  • Familiarity with cloud monitoring, logging, and observability tools (e.g., CloudWatch, Azure Monitor, GCP Operations Suite, Datadog, Prometheus)
  • Experience with incident management and building self-healing cloud architectures

Platform & Cloud Engineering

  • Develop and maintain real-time and batch data pipelines using tools like Airflow, dbt, Dataform, and Dataflow/Spark
  • Design and develop event-driven architectures using Apache Kafka, Google Pub/Sub, or equivalent messaging systems
  • Build and expose high-performance data APIs and microservices to support downstream applications, ML workflows, and GenAI agents
  • Architect and manage multi-cloud and hybrid cloud platforms (e.g., GCP, AWS, Azure) optimized for AI, ML, and real-time data processing workloads
  • Build reusable frameworks and infrastructure-as-code (IaC) using Terraform, Kubernetes, and CI/CD to drive self-service and automation
  • Ensure platform scalability, resilience, and cost efficiency through modern practices like GitOps, observability, and chaos engineering
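
A minimal Airflow DAG sketch (Airflow 2.4+ style) for the batch-pipeline work described above; the task bodies are placeholders for the real extract/transform/load logic.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw events from object storage")


def transform():
    print("clean and model events (dbt or Spark would typically run here)")


def load():
    print("publish curated tables to the warehouse")


with DAG(
    dag_id="events_batch_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```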

Leadership & Collaboration

  • Experience leading cloud architecture reviews, defining standards, and mentoring engineering teams
  • Ability to work cross-functionally with security, networking, application, and data teams to deliver integrated cloud solutions
  • Strong communication skills to engage stakeholders at various levels, from engineering to executives
Qualifications
  • 15+ years of hands-on experience in Platform or Data Engineering, Cloud Architecture, Multi-Cloud / Multi-Region Deployment & Architecture, and AI Engineering roles
  • Strong programming background in Java, Python, SQL, and one or more general-purpose languages
  • Deep knowledge of data modeling, distributed systems, and API design in production environments
  • Proficiency in designing and managing Kubernetes, serverless workloads, and streaming systems (Kafka, Pub/Sub, Flink, Spark)
  • Experience with metadata management, data catalogs, data quality enforcement, semantic modeling, and automated integration with the Data Platform
  • Proven experience building scalable, efficient data pipelines for structured and unstructured data
  • Experience with GenAI/LLM frameworks and tools for orchestration and workflow automation
  • Experience with RAG pipelines, vector databases, and embedding-based search
  • Familiarity with observability tools (Prometheus, Grafana, OpenTelemetry) and strong debugging skills across the stack
  • Experience with ML Platforms (MLFlow, Vertex AI, Kubeflow) and AI/ML observability tools
  • Prior implementation of data mesh or data fabric in a large-scale enterprise
  • Experience with Looker Modeler, LookML, or semantic modeling layers
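
A small OpenTelemetry tracing sketch related to the observability expectations above; it exports spans to the console for local debugging, whereas a real deployment would export via OTLP to the observability backend. Span names and attributes are illustrative, and opentelemetry-sdk is assumed to be installed.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export spans to stdout for local debugging; production setups typically
# ship spans to Grafana Tempo, Datadog, etc. via an OTLP collector.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("pipeline.ingest")

with tracer.start_as_current_span("ingest_batch") as span:
    span.set_attribute("source", "kafka://events")  # illustrative attribute
    with tracer.start_as_current_span("validate_records"):
        pass  # validation logic would go here
```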

Preferred Certifications

  • AWS Certified Solutions Architect Professional
  • Google Professional Cloud Architect
  • Microsoft Certified: Azure Solutions Architect Expert
  • HashiCorp Certified: Terraform Associate
  • Other relevant certifications (CKA, CKS, CISSP cloud concentration) are a plus.

Why You'll Love This Role

  • Drive technical leadership across AI-native data platforms, automation systems, and self-service tools
  • Collaborate across teams to shape the next generation of intelligent platforms in the enterprise
  • Work with a high-energy, mission-driven team that embraces innovation, open-source, and experimentation

Equinix

Technology, Information and Internet

Redwood City, California
