AI Ops Engineer

3 - 6 years

8.0 - 18.0 Lacs P.A.

Hyderabad

Platform: Naukri

Skills Required

Terraform, Containerization, CI/CD, CloudFormation, Kubernetes, Any Cloud

Work Mode

Work from Office

Job Type

Full Time

Job Description

AI-Ops Engineering Qualifications & Skills

Education: Bachelor's or Master's degree in Computer Science, Data Engineering, AI, or a related field. Relevant certifications in cloud platforms (AWS, Azure, GCP) or MLOps frameworks are a plus.

Experience: 3+ years of experience in AI/ML operations, MLOps, or DevOps for AI-driven solutions. Hands-on experience deploying and managing AI models, including LLMs and GenAI solutions, in production environments. Experience working with cloud AI platforms such as Azure AI, AWS SageMaker, or Google Vertex AI.

Technical Skills: Proficiency in MLOps tools and frameworks such as MLflow, Kubeflow, or Airflow. Hands-on experience with monitoring tools (Prometheus, Grafana, ELK Stack) for AI performance tracking. Experience with containerization and orchestration tools (Docker, Kubernetes) to support AI workloads. Familiarity with automation scripting using Python, Bash, or PowerShell. Understanding of GenAI-specific operational challenges such as response monitoring, token management, and prompt optimization. Knowledge of CI/CD pipelines (Jenkins, GitHub Actions) for AI model deployment. Strong understanding of AI security principles, including data privacy and governance considerations.

Key Responsibilities

AI Model Deployment & Integration: Deploy and manage AI/ML models, including traditional machine learning and GenAI solutions (e.g., LLMs, RAG systems). Implement automated CI/CD pipelines for seamless deployment and scaling of AI models. Work with AI Engineers to integrate models efficiently into existing enterprise applications and workflows. Optimize AI infrastructure for performance and cost efficiency in cloud environments (AWS, Azure, GCP).

Monitoring & Performance Management: Develop and implement monitoring solutions to track model performance, latency, drift, and cost metrics. Set up alerts and automated workflows to manage performance degradation and retraining triggers. Support responsible AI by monitoring GenAI outputs for issues such as bias, hallucinations, and security vulnerabilities. Collaborate with Data Scientists to establish feedback loops for continuous model improvement.

Automation & MLOps Best Practices: Establish scalable MLOps practices to support the continuous deployment and maintenance of AI models. Automate model retraining, versioning, and rollback strategies to ensure reliability and compliance. Use infrastructure-as-code (Terraform, CloudFormation) to manage AI pipelines.

Security & Compliance: Implement security measures to prevent prompt injections, data leakage, and unauthorized model access. Work closely with compliance teams to ensure AI solutions adhere to privacy and regulatory standards (HIPAA, GDPR). Regularly audit AI pipelines for ethical AI practices and data governance.

Soft Skills: Strong problem-solving skills with the ability to troubleshoot complex AI operational issues. Excellent communication skills for collaborating with cross-functional stakeholders. Proactive and results-driven mindset with a focus on operational efficiency and scalability. Ability to work effectively in a fast-paced, dynamic environment.
