In this role you will get to:
AI/ML Deployment & GitOps Automation
- Contribute to Argo-based GitOps deployment workflows, ensuring AI/ML services are correctly onboarded and configured for delivery across GKE, Dataproc, Vertex AI, Dataflow, and other platform-managed compute environments.
- Package AI/ML services into Docker images and manage lifecycle/versioning in Google Artifact Registry.
- Implement CI/CD pipelines using GitHub Actions for builds, tests, image scans, and deployments.
Infrastructure as Code & Cloud Platform Engineering
- Develop and manage Terraform modules for Data & AI/ML platform resources such as Composer, Vertex AI, Dataproc, BigQuery, GCS, and other Data/AI/ML services.
- Ensure infrastructure repeatability, reliability, and alignment with platform and infosec standards.
ML Pipeline Orchestration & Model Lifecycle
- Build, maintain, and troubleshoot AI/ML pipelines, batch jobs, and custom training workflows through Vertex AI.
- Use Cloud Composer (Airflow) to orchestrate multi-stage ML workflows spanning data prep, training, evaluation, and deployment.
- Integrate LLMOps patterns through Gemini Enterprise, LiteLLM, or similar model gateways.
Kubernetes & Istio-Based Service Operations
- Operate AI/ML-powered microservices on GKE using Istio gateways and service mesh patterns.
- Collaborate with platform teams on security tooling such as NexusIQ, StackRox, and Wiz.
Monitoring, Observability & Model Quality
- Use Arize (or similar tools) for model drift/quality monitoring, embeddings monitoring, and LLM evaluation patterns.
- Implement logging, alerting, and SLOs for ML workloads and pipelines with Splunk, New Relic, PagerDuty, etc.
- Assist with incident response, root-cause analysis, and long-term platform improvements.
DevOps Support for Internal Web Framework
- Provide operational guidance and deployment automation for internal Python-based frameworks used in AI/ML services.
- Improve developer productivity through standardized templates, CI/CD patterns, and tooling.
Who you are:
- Bachelor's Degree in Computer Science or relevant experience.
- 4-6 years of experience in DevOps, Cloud Engineering, or MLOps roles.
- Strong experience with GCP (Vertex AI, GKE, Dataflow, Dataproc, Composer, BigQuery, etc.) or other major cloud providers (Azure/AWS).
- Hands-on expertise with Kubernetes, Vertex AI, Docker, and image-based deployment workflows.
- High proficiency with Python or a similar scripting language, especially for automating ML/infra tasks.
- Strong knowledge of Terraform and IaC patterns at scale.
- Experience deploying apps via GitOps using Argo.
- Proven ability to support AI/ML models in production: monitoring, pipelines, debugging, retraining loops.
- Illustrated history of living the values necessary to Priceline: Customer, Innovation, Team, Accountability and Trust.
- The Right Results, the Right Way is not just a motto at Priceline; it's a way of life. Unquestionable integrity and ethics are essential.
Nice-to-Haves
- Experience with LLMOps toolchains (RAG pipelines, vector stores, prompt/version management, agent frameworks).
- Good understanding of infosec and RBAC best practices, and security posture management.
- Exposure to SRE best practices and error budgets for ML systems.