Cloud FinOps AI Engineer

5 years

0 Lacs

Posted: 6 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a highly technical Senior Cloud FinOps Engineer who specializes in designing, developing, and deploying AI-powered agents and automation systems that proactively monitor, analyze, and optimize multi-cloud spend (AWS, Azure, GCP) in a large-scale research and academic healthcare environment.


Responsibilities:

  • Research, design, and develop AI/ML-driven agents and automation workflows that continuously ingest cloud billing, usage, and tagging data (via APIs such as AWS Cost Explorer, Azure Cost Management + Billing, GCP Billing exports, CUR, etc.).
  • Build predictive models to forecast spend, identify upcoming eligibility for Savings Plans/Reservations, and recommend optimal purchase strategies (term length, payment option, instance family/region/zone/SKU, convertible vs standard) while factoring in performance SLAs and workload variability typical of research computing.
  • Implement real-time anomaly and spike detection with intelligent alerting (Slack, email, ServiceNow, etc.) that includes root-cause analysis and suggested corrective actions.
  • Develop automated tagging governance engines that detect missing/incorrect tags, suggest or auto-apply corrections (via Lambda/Functions/Azure Automation), and enforce research grant and department chargeback policies.
  • Create “recommendation-as-code” pipelines that generate executable Infrastructure-as-Code (Terraform/CloudFormation/Bicep) or direct API calls to purchase/commit to the optimal savings instruments.
  • Design and maintain a centralized FinOps AI dashboard (Power BI + custom web frontend if needed) that surfaces agent-generated insights, confidence scores, projected savings, and one-click approval workflows.
  • Integrate the AI platform with existing tooling (AWS Cost Anomaly Detection, Azure Advisor, third-party FinOps platforms) and extend that tooling where native capabilities fall short.
  • Collaborate on containerized/microservice architecture (Kubernetes/EKS/AKS/GKE) for the agent platform and ensure all components meet healthcare security and compliance standards.
  • Continuously measure savings attribution, model accuracy, and automation adoption; iterate on models using retraining pipelines and feedback loops.
  • Document architectures, create runbooks, and mentor FinOps analysts and cloud engineers on using the new AI capabilities.


Requirements:

  • Education: Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related quantitative field; advanced degree in a healthcare or research-related discipline is a plus.
  • 5+ years of hands-on cloud engineering and architecture experience with at least two major providers (AWS and Azure required; GCP a plus).
  • 3+ years building production-grade data pipelines, ML models, or intelligent automation in a cloud cost-management or FinOps context.
  • Proven track record of implementing Savings Plans, Reserved Instances, and committed-use discount strategies at scale (> $10M annual cloud spend preferred).
  • Strong software development skills in Python (mandatory) and at least one additional language (Go, TypeScript/Node.js, Java, etc.).
  • Hands-on experience with ML frameworks (scikit-learn, TensorFlow, PyTorch, XGBoost/LightGBM) and MLOps tools (MLflow, SageMaker, Azure ML, Vertex AI).
  • Expertise in cloud billing APIs, Cost and Usage Reports (CUR), Cost Explorer, Azure Consumption APIs, and building enriched data lakes (S3 + Athena/Glue, Azure Data Lake + Synapse, BigQuery).
  • Proficiency in Infrastructure as Code (Terraform primary; CloudFormation/Bicep acceptable) and CI/CD pipelines (GitHub Actions, GitLab CI, Azure DevOps).
  • Experience with event-driven architectures (EventBridge, Azure Event Grid, Pub/Sub) and serverless compute for real-time processing.
  • Solid understanding of tagging strategies, cost allocation, and showback/chargeback models in decentralized research/academic environments.


Nice to have:

  • Previous work in healthcare, academic medical centers, or grant-funded research environments.
  • FinOps Certified Practitioner or Platform Engineer certification.
  • Contributions to open-source FinOps or cloud-cost tools (e.g., Kubecost, Cloud Custodian, Infracost, custom agents).
  • Experience with generative AI/LLMs for explaining recommendations to non-technical stakeholders.
  • Familiarity with Apache Airflow, dbt, Databricks, or similar for orchestration and transformation.
  • Knowledge of HIPAA/HITECH-compliant data handling and encryption standards in analytics workloads.
