We are seeking an experienced Senior MLOps Architect / Lead ML Engineer to design, implement, and maintain end-to-end ML engineering and MLOps solutions. In this role, you will be responsible for architecting pipelines using tools like MLflow, Feast Feature Store, advanced model serving frameworks, and monitoring systems. You will collaborate with cross-functional teams to ensure efficient, robust, and scalable machine learning operations across the entire lifecycle from development to deployment and beyond.
Key Responsibilities
- MLOps Architecture & Strategy
  - Drive the design and architecture of advanced MLOps frameworks, ensuring seamless integration between data ingestion, model training, deployment, and monitoring.
  - Evaluate, select, and implement appropriate MLOps tools and platforms (MLflow, Feast, etc.) to standardize processes and accelerate ML deployments.
  - Establish best practices for infrastructure, security, and governance that meet enterprise-grade requirements.
- ML Pipeline Development & Automation
  - Design and maintain CI/CD pipelines that facilitate automated model building, testing, and deployment.
  - Leverage MLflow for experiment tracking, model versioning, and reproducible pipelines.
  - Integrate Feast Feature Store to streamline feature engineering, management, and versioning across different ML models.
- Model Serving & Monitoring
  - Implement robust, low-latency model serving solutions (e.g., using Docker/Kubernetes, REST APIs, or specialized serving frameworks).
  - Set up comprehensive model monitoring systems to track performance metrics, model drift, and data quality in production.
  - Establish alerting mechanisms and feedback loops for continuous model improvements and proactive issue resolution.
- Data & Feature Management
  - Collaborate with Data Engineering teams to ensure efficient data pipelines, data governance, and data quality.
  - Architect and maintain the Feast Feature Store for consistent, reusable, and high-quality features.
  - Develop strategies for feature lifecycle management, including feature discovery, validation, and retirement.
- Technical Leadership & Mentorship
  - Guide and mentor a team of ML engineers and data scientists, promoting a culture of knowledge sharing and innovation.
  - Conduct technical reviews, provide feedback, and ensure adherence to coding standards, design principles, and best practices.
  - Collaborate with stakeholders (product managers, DevOps, IT) to align ML solutions with business objectives and technical feasibility.
- Research & Innovation
  - Stay updated on emerging trends and technologies in MLOps, cloud-native solutions, and AI frameworks.
  - Evaluate new tools, libraries, and methodologies, incorporating the most effective ones into the development lifecycle.
  - Advocate for continuous improvement and experimentation within the ML engineering function.
- Documentation & Compliance
  - Produce and maintain detailed architectural diagrams, system configurations, and operational runbooks.
  - Ensure compliance with data privacy, security, and regulatory requirements throughout the ML lifecycle.
  - Support audit and compliance requests by maintaining clear, consistent documentation of processes and systems.
Required Qualifications
- Overall Experience: 14+ years in software engineering, data engineering, or data science roles.
- Relevant MLOps Experience: 4-5 years of hands-on experience designing and implementing MLOps frameworks.
- Technical Expertise:
  - MLflow: Proficient with experiment tracking, model registry, and model packaging.
  - Feast Feature Store: Demonstrated experience implementing and managing feature stores at scale.
  - Model Serving: In-depth understanding of deployment strategies (batch/streaming/real-time) using Docker, Kubernetes, or similar tools.
  - Model Monitoring: Experience setting up performance metrics, alerting, and drift detection.
  - Python: Extensive coding experience (data manipulation, building APIs, scripting).
  - Cloud & DevOps: Familiarity with AWS, Azure, or GCP services, CI/CD tools (Jenkins, GitLab CI, etc.), and infrastructure-as-code (Terraform, CloudFormation).
- Soft Skills:
  - Strong communication skills to articulate complex technical solutions to various stakeholders.
  - Leadership and team-building capabilities, with a track record of mentoring engineers and data scientists.
  - Problem-solving mindset with the ability to handle ambiguity and drive results in a fast-paced environment.
Preferred / Bonus Skills
- Experience with distributed data processing frameworks like Spark or Hadoop.
- Knowledge of container orchestration and service mesh technologies (e.g., Istio, Envoy).
- Familiarity with feature engineering methodologies, advanced ML/DL frameworks (TensorFlow, PyTorch), or specialized libraries for NLP or computer vision.
- Exposure to big data tools, streaming data platforms (Kafka), or real-time analytics solutions.