Overview
We are seeking a highly skilled DevOps Engineer with strong MLOps expertise to join our team. The ideal candidate will have a solid foundation in DevOps practices (infrastructure automation, CI/CD, container orchestration, networking, monitoring, Linux system administration, and security and compliance) and will extend this expertise to operationalizing ML/AI workloads. You will collaborate with data scientists, ML engineers, and software teams to ensure reliable, secure, and scalable deployments of applications and ML models.
Website: https://coreops.ai
Key Responsibilities
DevOps
- Design, build, and maintain CI/CD pipelines for applications and AI/ML workloads.
- Implement Infrastructure as Code (Terraform, Ansible, CloudFormation).
- Deploy and manage containerized environments using Docker, Kubernetes, and Helm.
- Manage and optimize cloud infrastructure across AWS, Azure, or GCP.
- Ensure system reliability, security, and performance with strong Linux administration skills.
- Manage web servers, DNS, CDN, and databases (SQL/NoSQL).
- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or Datadog.
- Apply best practices for networking (VPCs, load balancers, DNS, firewalls, service meshes), scaling, and cost optimization.
MLOps
- Deploy, monitor, and maintain ML models in production.
- Build automated training, testing, retraining, and data drift detection pipelines.
- Support data pipelines, versioning, and reproducibility (DVC, MLflow, CML).
- Collaborate with data scientists and ML engineers to productionize ML models.
- Integrate ML/AI workflows into CI/CD processes.
- Work with APIs (REST/gRPC) for model/service integration.
Security and Compliance
- Design secure, compliant systems (IAM, RBAC, secrets management, audit readiness).
- Implement DevSecOps practices in CI/CD pipelines.
- Ensure alignment with industry standards and regulations (GDPR, SOC 2, ISO 27001).
Must-Have Skills
- Linux expertise: Strong hands-on Linux administration (Ubuntu, RHEL, CentOS).
- DevOps foundation: CI/CD, Kubernetes, Docker, Terraform/Ansible, monitoring, and security.
- Cloud experience: Hands-on experience with AWS, Azure, or GCP.
- Networking expertise: Strong knowledge of VPCs, load balancers, DNS, firewalls, and service meshes.
- Web infrastructure: Experience with Nginx, Apache, DNS management, CDN integration.
- Database experience: SQL (MySQL, PostgreSQL) and NoSQL (MongoDB, Redis).
- Programming: Proficiency in Python, Bash/Shell scripting, and SQL.
- MLOps knowledge: Model deployment, pipelines, monitoring, retraining, and data drift detection.
- Version control and automation: GitHub/GitLab, Jenkins, GitHub Actions.
- ML frameworks: Familiarity with TensorFlow, PyTorch, or Scikit-learn.
- RAG and AI pipeline exposure: RAG pipelines, vector databases (Pinecone, Weaviate, FAISS), embeddings, LangChain/LlamaIndex.
- Collaboration tools: Jira, Azure DevOps, Confluence.
Preferred Skills
- Observability and monitoring for ML/AI systems.
- Familiarity with cloud-native ML platforms (SageMaker, Vertex AI, Azure ML).
- Experience with workflow/data orchestration (Airflow, Argo, Kubeflow).
- Security practices in DevOps/MLOps (secrets management, IAM, RBAC, compliance).
- Knowledge of LLMOps best practices (monitoring, evaluation, guardrails).
- Certifications (optional but attractive):
  - AWS/Azure/GCP Certified Solutions Architect or DevOps Engineer.
  - Kubernetes (CKA/CKAD/CKS).
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 4+ years of experience in DevOps (cloud, containers, automation, Linux, networking) and 2+ years of MLOps exposure in production environments.
- Strong understanding of DevOps and MLOps principles and best practices.