Role: MLOps Engineer
Experience: 5 to 8 years
Role Brief: We are seeking a skilled and experienced MLOps Engineer to join our team and drive the operationalization of machine
learning models and pipelines at scale. The ideal candidate will be responsible for automating, deploying, monitoring, and
maintaining AI/ML solutions, with a primary focus on turning prototypes into robust, customer-ready solutions while mitigating
risks such as production pipeline failures. This role requires expertise in infrastructure management, CI/CD pipelines, cloud
services, and model orchestration, as well as collaboration with cross-functional teams to ensure seamless deployment into
diverse customer environments.
Primary Responsibilities:
- Strategize and implement scalable infrastructure for ML and LLM model pipelines using cloud services
- Manage auto-scaling mechanisms to handle varying workloads and ensure high availability of REST APIs
- Automate CI/CD pipelines and Lambda functions for model testing, deployment, and updates, reducing manual errors
and improving efficiency.
- Build and maintain Amazon SageMaker Pipelines for end-to-end ML workflow automation, and optimize orchestration using AWS Step Functions
- Conduct drift analysis to detect and respond to data drift, concept drift, and label drift. Implement mitigation
strategies such as automated alerts, model retraining triggers, and performance audits.
- Set up reproducible workflows for data preparation, model training, and deployment.
- Provision and optimize cloud resources (e.g., GPUs, memory) to meet the computational demands of large models such as
those used in RAG systems
- Automate retraining workflows to keep models updated as data evolves
- Work closely with data scientists, ML engineers, and DevOps teams to integrate models into production environments.
- Implement monitoring tools to track model performance and detect issues like drift or degradation in real time, build
monitoring dashboards with real-time alerts for pipeline failures or performance issues, and implement model observability
frameworks.
Required Skills:
- Education: Any Engineering degree (BE/BTech/ME/MTech)
- Minimum 4 years of experience with AWS services such as Lambda, Bedrock, Batch with Fargate, RDS
(PostgreSQL), DynamoDB, SQS, CloudWatch, API Gateway, and SageMaker
- Should have hands-on experience in drift analysis, including detecting and mitigating data, concept, and label drift
in production ML systems
- Knowledge of ML frameworks (e.g., PyTorch, TensorFlow) to understand model requirements during deployment
- Experience with REST API frameworks like FastAPI and Flask
- Familiarity with model observability tools like Evidently, NannyML, and Phoenix; monitoring tools such as Grafana; and retraining/orchestration tools
like MLflow, Kubeflow, and Airflow
- AWS Certified Machine Learning - Specialty certification is good to have
About Us: We are revolutionizing how the world plans, builds, and manages infrastructure projects with Masterworks, our industry-leading
platform, which is setting new standards for project delivery and asset management. Recognized as one of the Top 25 AI Companies of
2024 and a Great Place to Work for three consecutive years, we are leveraging artificial intelligence to create a smarter,
more connected future for customers in transportation, water and utilities, healthcare, higher education, and other sectors.
We don’t just develop software; we shape the future. If you’re excited to join a fast-growing company and
collaborate with some of the brightest minds in the industry to solve real-world challenges, let’s connect.