Job Description
As an ML Platform Specialist, you will design, implement, and maintain robust machine learning infrastructure and workflows on the Databricks Lakehouse Platform. The role is critical to the smooth deployment, monitoring, and scaling of machine learning models across the organization.

Key Responsibilities:
- Design and implement scalable ML infrastructure on the Databricks Lakehouse Platform
- Develop and maintain continuous integration and continuous deployment (CI/CD) pipelines for machine learning models using Databricks Workflows
- Create automated testing and validation processes for machine learning models with Databricks MLflow
- Implement and manage model monitoring systems using Databricks Model Registry and monitoring tools
- Collaborate with data scientists, software engineers, and product teams to optimize machine learning workflows on Databricks
- Develop and maintain reproducible machine learning environments using Databricks Notebooks and clusters
- Implement advanced feature engineering and management using Databricks Feature Store
- Optimize machine learning model performance using Databricks Runtime and optimization techniques
- Ensure data governance, security, and compliance within the Databricks environment
- Create and maintain comprehensive documentation for ML infrastructure and processes
- Work across teams from several suppliers, including IT Provision, system development, business units, and Programme management
- Drive continuous improvement and transformation initiatives for MLOps / DataOps in RSA

Qualifications Required:
- Bachelor's or master's degree in Computer Science, Machine Learning, Data Engineering, or a related field
- 3-5 years of experience in MLOps with demonstrated expertise in Databricks and/or Azure ML
- Advanced proficiency with the Databricks Lakehouse Platform
- Strong experience with Databricks MLflow for experiment tracking and model management
- Expert-level programming skills in Python, with advanced knowledge of PySpark, MLlib, Delta Lake, and the Azure ML SDK
- Deep understanding of Databricks Feature Store and feature engineering techniques
- Experience with Databricks Workflows and job scheduling
- Proficiency in machine learning frameworks compatible with Databricks and Azure ML, such as TensorFlow, PyTorch, and scikit-learn
- Strong knowledge of cloud platforms, including Azure Databricks, Azure DevOps, and Azure ML
- Strong exposure to Terraform and ARM/Bicep
- Understanding of distributed computing and big data processing techniques

Note: Experience with containerization, Web Apps, Kubernetes, Cognitive Services, and other MLOps tools will be a plus.