Posted:1 week ago|
Platform:
On-site
Contractual
About Company Our client is a trusted global innovator of IT and business services. We help clients transform through consulting, industry solutions, business process services, digital & IT modernization and managed services. Our client enables them, as well as society, to move confidently into the digital future. We are committed to our clients’ long-term success and combine global reach with local client attention to serve them in over 50 countries around the globe Job Title: Senior AI Cloud Operations Engineer Location: Chennai Experience: 4 to 5 yrs Job Type : Contract to hire Notice Period:- Immediate joiner OffShore Profile Summary: We’re looking for a Senior AI Cloud Operations Engineer to start building a new for AI Cloud Operations team, starting with this strategic position. We are searching for an experienced Senior AI Cloud Operations Engineer with deep expertise in AI technologies to lead our cloud-based AI infrastructure management. This role is integral to ensuring our AI systems' scalability, reliability, and performance, enabling us to deliver cutting-edge solutions. The ideal candidate will have a robust understanding of machine learning frameworks, cloud services architecture, and operations management. Key Responsibilities: Cloud Architecture Design: Design, architect, and manage scalable cloud infrastructure tailored for AI workloads, leveraging platforms like AWS, Azure, or Google Cloud. System Monitoring and Optimization: Implement comprehensive monitoring solutions to ensure high availability and swift performance, utilizing tools like Prometheus, Grafana, or CloudWatch. Collaboration and Model Deployment: Work closely with data scientists to operationalize AI models, ensuring seamless integration with existing systems and workflows. Familiarity with tools such as MLflow or TensorFlow Serving can be beneficial. Automation and Orchestration: Develop automated deployment pipelines using orchestration tools like Kubernetes and Terraform to streamline operations and reduce manual interventions. Security and Compliance: Ensure that all cloud operations adhere to security best practices and compliance standards, including data privacy regulations like GDPR or HIPAA. Documentation and Reporting: Create and maintain detailed documentation of cloud configurations, procedures, and operational metrics to foster transparency and continuous improvement. Performance Tuning: Conduct regular performance assessments and implement strategies to optimize cloud resource utilization and reduce costs without compromising system effectiveness. Issue Resolution: Rapidly identify, diagnose, and resolve technical issues, minimizing downtime and ensuring maximum uptime. Qualifications: Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field. Master's degree preferred. Professional Experience: 5+ years of extensive experience in cloud operations, particularly within AI environments. Demonstrated expertise in deploying and managing complex AI systems in cloud settings. Technical Expertise: Deep knowledge of cloud platforms (AWS, Azure, Google Cloud) including their AI-specific services such as AWS SageMaker or Google AI Platform. AI/ML Proficiency: In-depth understanding of AI/ML frameworks and libraries such as TensorFlow, PyTorch, Scikit-learn, along with experience in ML model lifecycle management. Infrastructure as Code: Proficiency in infrastructure-as-code tools such as Terraform and AWS CloudFormation to automate and manage cloud deployment processes. Containerization and Microservices: Expertise in managing containerized applications using Docker and orchestrating services with Kubernetes. Soft Skills: Strong analytical, problem-solving, and communication skills, with the ability to work effectively both independently and in collaboration with cross-functional teams. Preferred Qualifications: Advanced certifications in cloud services, such as AWS Certified Solutions Architect or Google Cloud Professional Data Engineer. Experience in advanced AI techniques such as deep learning or reinforcement learning. Knowledge of emerging AI technologies and trends to drive innovation within existing infrastructure. List of Used Tools: Cloud Provider: Azure, AWS or Google. Performance & monitor: Prometheus, Grafana, or CloudWatch. Collaboration and Model Deployment: MLflow or TensorFlow Serving Automation and Orchestration: Kubernetes and Terraform Security and Compliance: Data privacy regulations like GDPR or HIPAA. Qualifications Bachelor's degree in Computer Science (or related field) Show more Show less
People Prime Worldwide
Upload Resume
Drag or click to upload
Your data is secure with us, protected by advanced encryption.
My Connections People Prime Worldwide
Thane, Bhiwandi
Experience: Not specified
2.0 - 2.5 Lacs P.A.
Experience: Not specified
2.25 - 3.25 Lacs P.A.
Noida, Uttar Pradesh, India
Experience: Not specified
Salary: Not disclosed
Hyderabad, Telangana, India
Salary: Not disclosed
Hyderabad / Secunderabad, Telangana, Telangana, India
6.0 - 8.0 Lacs P.A.
10.0 - 15.0 Lacs P.A.
3.0 - 6.0 Lacs P.A.
Pune, Mumbai (All Areas)
27.5 - 42.5 Lacs P.A.
Gurugram, Haryana, India
Salary: Not disclosed
Jaipur, Jodhpur
11.0 - 20.0 Lacs P.A.