Posted: 2 days ago | Platform: LinkedIn


Work Mode: On-site

Job Type: Full Time

Job Description

Job Title: DevOps/MLOps Expert

Location:

Employment Type: Full Time

Experience: 4-8 years

Qualification: Bachelor's/Master's degree in Computer Science, Engineering, or a related technical field


About the Role

We are seeking a highly skilled DevOps/MLOps Expert to join our rapidly growing AI-based startup building and deploying cutting-edge enterprise AI/ML solutions. This is a critical role: you will shape our infrastructure and deployment pipelines and scale our ML operations to serve large-scale enterprise clients.

As our DevOps/MLOps Expert, you will be responsible for bridging the gap between our AI/ML development teams and production systems, ensuring seamless deployment, monitoring, and scaling of our ML-powered enterprise applications. You’ll work at the intersection of DevOps, Machine Learning, and Data Engineering in a fast-paced startup environment with enterprise-grade requirements.


Key Responsibilities

MLOps & Model Deployment

• Design, implement, and maintain end-to-end ML pipelines from model development to production deployment

• Build automated CI/CD pipelines for ML models using tools like MLflow, Kubeflow, and custom solutions (see the sketch after this list)

• Implement model versioning, experiment tracking, and model registry systems

• Monitor model performance, detect drift, and implement automated retraining pipelines

• Manage feature stores and data pipelines for real-time and batch inference

• Build scalable ML infrastructure for high-volume data processing and analytics
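
To give a concrete flavor of this work, the sketch below shows experiment tracking and model registration with MLflow, one of the tools named in this posting. It is a minimal, hypothetical example: the tracking URI, experiment name, and registered model name are placeholders, and the toy scikit-learn model stands in for a real training job.

```python
# Hypothetical sketch: track a training run and register the resulting model
# with MLflow. The tracking URI and all names below are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://localhost:5000")  # placeholder tracking server
mlflow.set_experiment("demo-enterprise-classifier")

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"C": 0.5, "max_iter": 500}
    model = LogisticRegression(**params).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Log hyperparameters, the evaluation metric, and the model artifact;
    # registering the model creates a new version in the MLflow Model Registry.
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "model", registered_model_name="demo-classifier")
```

In a production pipeline, a step like this would typically run inside the CI/CD system, with the new registry version then promoted through staging and production.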

Enterprise Cloud Infrastructure & DevOps

• Architect and manage cloud-native infrastructure with focus on scalability, security, and compliance

• Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, or Pulumi (see the sketch after this list)

• Design and maintain Kubernetes clusters for containerized ML workloads

• Build and optimize Docker containers for ML applications and microservices

• Implement comprehensive monitoring, logging, and alerting systems

• Manage secrets, security, and enterprise compliance requirements
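
As an illustration of the Infrastructure-as-Code responsibilities above, here is a small, hypothetical Pulumi program in Python (Pulumi is one of the IaC options listed). The choice of AWS, the resource names, and the tags are assumptions made for the example.

```python
# Hypothetical Pulumi program (runs inside a Pulumi project via `pulumi up`):
# declares an S3 bucket for ML artifacts and an ECR repository for the Docker
# images that package ML services. All names and tags are placeholders.
import pulumi
import pulumi_aws as aws

# Versioned object storage for model artifacts and experiment outputs.
artifact_bucket = aws.s3.Bucket(
    "ml-artifacts",
    versioning=aws.s3.BucketVersioningArgs(enabled=True),
    tags={"team": "mlops", "env": "dev"},
)

# Container registry for ML service images, with image scanning on push.
image_repo = aws.ecr.Repository(
    "ml-services",
    image_scanning_configuration=aws.ecr.RepositoryImageScanningConfigurationArgs(
        scan_on_push=True,
    ),
)

# Export identifiers so other stacks (e.g. a Kubernetes deployment) can consume them.
pulumi.export("artifact_bucket_name", artifact_bucket.id)
pulumi.export("image_repository_url", image_repo.repository_url)
```

An equivalent setup could be written in Terraform or CloudFormation; the point is that infrastructure is declared in version-controlled code rather than configured by hand.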

Data Engineering & Real-time Processing

• Build and maintain large-scale data pipelines using Apache Airflow, Prefect, or similar tools (see the sketch after this list)

• Implement real-time data processing and streaming architectures

• Design data storage solutions for structured and unstructured data at scale

• Implement data validation, quality checks, and lineage tracking

• Manage data security, privacy, and enterprise compliance requirements

• Optimize data processing for performance and cost efficiency
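
To make the pipeline work above concrete, below is a minimal, hypothetical Airflow DAG using the TaskFlow API (Airflow 2.4+ for the `schedule` argument). The schedule, task bodies, and data are placeholders standing in for real sources, validation rules, and sinks.

```python
# Hypothetical daily batch pipeline: extract -> validate -> load.
# Task bodies are placeholders for real source systems and warehouses.
import pendulum
from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
    tags=["example", "mlops"],
)
def daily_feature_pipeline():
    @task
    def extract() -> list[dict]:
        # Placeholder: pull raw records from an upstream source system.
        return [{"user_id": 1, "events": 12}, {"user_id": 2, "events": 7}]

    @task
    def validate(rows: list[dict]) -> list[dict]:
        # Minimal data-quality check: drop rows missing required fields.
        return [r for r in rows if "user_id" in r and "events" in r]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: write validated rows to a feature store or warehouse.
        print(f"loading {len(rows)} validated rows")

    load(validate(extract()))


daily_feature_pipeline()
```

A production pipeline would add retries, SLAs, and richer validation and lineage tracking, but the extract -> validate -> load shape is the same.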

Enterprise Platform Operations

• Ensure high availability (99.9%+) and performance of enterprise-grade platforms

• Implement auto-scaling solutions for variable ML workloads

• Manage multi-tenant architecture and data isolation

• Optimize resource utilization and cost management across environments

• Implement disaster recovery and backup strategies

• Build 24x7 monitoring and alerting systems for mission-critical applications (see the sketch below)
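
As a sketch of the monitoring and alerting side of the role, the hypothetical example below instruments a service with the prometheus_client library so that Prometheus can scrape request counts and latencies. The metric names, port, and simulated workload are placeholders.

```python
# Hypothetical service instrumentation: expose request counts and latencies
# on a /metrics endpoint for Prometheus to scrape. Names and the fake
# workload are placeholders for a real inference service.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("inference_requests_total", "Total inference requests", ["status"])
LATENCY = Histogram("inference_latency_seconds", "Inference request latency in seconds")


def handle_request() -> None:
    """Stand-in for a real inference request handler."""
    with LATENCY.time():  # records how long the block takes
        time.sleep(random.uniform(0.01, 0.1))
    status = "ok" if random.random() > 0.05 else "error"
    REQUESTS.labels(status=status).inc()


if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        handle_request()
```

Alert rules and Grafana dashboards would then be built on top of these metrics, for example alerting when the error-labelled request rate or a latency percentile breaches an SLO.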


Required Qualifications

Experience & Education

• 4-8 years of experience in DevOps/MLOps, with at least 2 years focused on enterprise ML systems

• Bachelor’s/Master’s degree in Computer Science, Engineering, or related technical field

• Proven experience with enterprise-grade platforms or large-scale SaaS applications

• Experience with high-compliance environments and enterprise security requirements

• Strong background in data-intensive applications and real-time processing systems


Technical Skills

Core MLOps Technologies

• ML Frameworks: TensorFlow, PyTorch, Scikit-learn, Keras, XGBoost

• MLOps Tools: MLflow, Kubeflow, Metaflow, DVC, Weights & Biases

• Model Serving: TensorFlow Serving, TorchServe, Seldon Core, KServe (formerly KFServing)

• Experiment Tracking: MLflow, Neptune.ai, Weights & Biases, Comet

DevOps & Cloud Technologies

• Cloud Platforms: AWS, Azure, or GCP with relevant certifications

• Containerization: Docker, Kubernetes (CKA/CKAD preferred)

• CI/CD: Jenkins, GitLab CI, GitHub Actions, CircleCI

• IaC: Terraform, CloudFormation, Pulumi, Ansible

• Monitoring: Prometheus, Grafana, ELK Stack, Datadog, New Relic

Programming & Scripting

• Python (advanced) - primary language for ML operations and automation

• Bash/Shell scripting for automation and system administration

• YAML/JSON for configuration management and APIs

• SQL for data operations and analytics

• Basic understanding of Go or Java (advantage)

Data Technologies

• Data Pipeline Tools: Apache Airflow, Prefect, Dagster, Apache NiFi

• Streaming & Real-time: Apache Kafka, Apache Spark, Apache Flink, Redis

• Databases: PostgreSQL, MongoDB, Elasticsearch, ClickHouse

• Data Warehousing: Snowflake, BigQuery, Redshift, Databricks

• Data Versioning: DVC, LakeFS, Pachyderm


Preferred Qualifications

Advanced Technical Skills

• Enterprise Security: Experience with enterprise security frameworks and compliance standards (SOC 2, ISO 27001)

• High-scale Processing: Experience with petabyte-scale data processing and real-time analytics

• Performance Optimization: Advanced system optimization, distributed computing, caching strategies

• API Development: REST/GraphQL APIs, microservices architecture, API gateways

Enterprise & Domain Experience

• Previous experience with enterprise clients or B2B SaaS platforms

• Experience with compliance-heavy industries (finance, healthcare, government)

• Understanding of data privacy and compliance regulations (GDPR, SOX, HIPAA)

• Experience with multi-tenant enterprise architectures

Leadership & Collaboration

• Experience mentoring junior engineers and technical team leadership

• Strong collaboration with data science teams, product managers, and enterprise clients

• Experience with agile methodologies and enterprise project management

• Understanding of business metrics, SLAs, and enterprise ROI

Growth Opportunities

• Career Path: Clear progression to Lead DevOps Engineer or Head of Infrastructure

• Technical Growth: Work with cutting-edge enterprise AI/ML technologies

• Leadership: Opportunity to build and lead the DevOps/Infrastructure team

• Industry Exposure: Work with government and MNC enterprise clients on cutting-edge technology stacks


Success Metrics & KPIs

Technical KPIs

• System Uptime: Maintain 99.9%+ availability for enterprise clients

• Deployment Frequency: Enable daily deployments with zero downtime

• Performance: Ensure optimal response times and system performance

• Cost Optimization: Achieve 20-30% annual infrastructure cost reduction

• Security: Zero security incidents and full compliance adherence


Business Impact

• Time to Market: Reduce deployment cycles and improve development velocity

• Client Satisfaction: Maintain 95%+ enterprise client satisfaction scores

• Team Productivity: Improve engineering team efficiency by 40%+

• Scalability: Support rapid client base growth without infrastructure constraints


Why Join Us

• Be part of a forward-thinking, innovation-driven company with a strong engineering culture.
• Influence high-impact architectural decisions that shape mission-critical systems.
• Work with cutting-edge technologies and a passionate team of professionals.
• Competitive compensation, flexible working environment, and continuous learning opportunities.


How to Apply

Please submit your resume and a cover letter outlining your relevant experience and how you can contribute to Aaizel Tech Labs' success. Send your application to hr@aaizeltech.com, bhavik@aaizeltech.com, or anju@aaizeltech.com.

