3 Model Observability Jobs

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

3.0 - 8.0 years

0 Lacs

Thiruvananthapuram, Kerala

On-site

As a Senior AI/ML Engineer specializing in Python, you will develop and implement AI/ML models, with Keras and Pandas as the core tools for model development. You will also work with Generative AI and Large Language Models such as GPT-3 and other Transformer-based models. The role involves building and managing data pipelines using technologies such as Kafka and big data platforms. Experience in Time Series modeling and knowledge of a cloud platform (AWS, Azure, or GCP) for deployment and performance optimization are crucial. Familiarity with Aurora DB and the ELK stack is also beneficial.

You should be able to create reusable ML components for scalable systems and have excellent communication skills for collaborating with cross-functional teams. Exposure to GANs, synthetic data generation techniques, model compliance, and performance benchmarking is an advantage. Mentoring junior developers and an understanding of model observability and monitoring tools are desirable.

With at least 5 years of hands-on experience in AI, Machine Learning, and Generative AI, you should have strong programming skills in Python, Keras, Pandas, and FastAPI. Familiarity with advanced ML models and frameworks such as LangChain, GPT-3, GANs, and Transformers, and with analytical models such as Time Series Forecasting and Predictive Modelling, is required. Knowledge of vector databases such as ChromaDB or Pinecone is an added advantage.

This role is at UST, a global digital transformation solutions provider with over 30,000 employees in 30 countries.

Posted 1 week ago

5.0 - 10.0 years

22 - 30 Lacs

Pune

Hybrid

We are looking for a Machine Learning Engineer with expertise in MLOps (Machine Learning Operations) or LLMOps (Large Language Model Operations) to design, deploy, and maintain scalable AI/ML systems. You will work on automating ML workflows, optimizing model deployment, and managing large-scale AI applications, including LLMs (Large Language Models), ensuring they run efficiently in production.

Key Responsibilities:
Design and implement end-to-end MLOps pipelines for training, validation, deployment, monitoring, and retraining of ML models.
Optimize and fine-tune large language models (LLMs) for various applications, ensuring performance and efficiency.
Develop CI/CD pipelines for ML models to automate deployment and monitoring in production.
Monitor model performance, detect drift, and implement automated retraining mechanisms.
Work with cloud platforms (AWS, GCP, Azure) and containerization technologies (Docker, Kubernetes) for scalable deployments.
Implement best practices in data engineering, feature stores, and model versioning.
Collaborate with data scientists, engineers, and product teams to integrate ML models into production applications.
Ensure compliance with security, privacy, and ethical AI standards in ML deployments.
Optimize inference performance and cost of LLMs using quantization, pruning, and distillation techniques.
Deploy LLM-based APIs and services, integrating them with real-time and batch processing pipelines.

Key Requirements:

Technical Skills:
Strong programming skills in Python, with experience in ML frameworks (TensorFlow, PyTorch, Hugging Face, JAX).
Experience with MLOps tools (MLflow, Kubeflow, Vertex AI, SageMaker, Airflow).
Deep understanding of LLM architectures, prompt engineering, and fine-tuning.
Hands-on experience with containerization (Docker, Kubernetes) and orchestration tools.
Proficiency in cloud services (AWS/GCP/Azure) for ML model training and deployment.
Experience with monitoring ML models (Prometheus, Grafana, Evidently AI).
Knowledge of feature stores (Feast, Tecton) and data pipelines (Kafka, Apache Beam).
Strong background in distributed computing (Spark, Ray, Dask).

Soft Skills:
Strong problem-solving and debugging skills.
Ability to work in cross-functional teams and communicate complex ML concepts to stakeholders.
Passion for staying updated with the latest ML and LLM research and technologies.

Preferred Qualifications:
Experience with LLM fine-tuning, Reinforcement Learning from Human Feedback (RLHF), or LoRA/PEFT techniques.
Knowledge of vector databases (FAISS, Pinecone, Weaviate) for retrieval-augmented generation (RAG).
Familiarity with LangChain, LlamaIndex, and other LLMOps-specific frameworks.
Experience deploying LLMs in production (ChatGPT, LLaMA, Falcon, Mistral, Claude, etc.).

Posted 1 month ago


12.0 - 16.0 years

40 - 50 Lacs

Pune, Chennai, Bengaluru

Hybrid

AI Ops Senior Architect
12-17 Years
Work Location: Pune / Bengaluru / Hyderabad / Chennai / Gurugram

Tredence is a data science, engineering, and analytics consulting company that partners with some of the leading global Retail, CPG, Industrial, and Telecom companies. We deliver business impact by enabling last-mile adoption of insights, uniting our strengths in business analytics, data science, and data engineering. Headquartered in the San Francisco Bay Area, we partner with clients in the US, Canada, and Europe. Bangalore is our largest Centre of Excellence, with skilled analytics and technology teams serving our growing base of Fortune 500 clients.

JOB DESCRIPTION
At Tredence, you will lead the evolution of “Industrializing AI” solutions for our clients by implementing ML/LLM/GenAI and Agent Ops best practices. You will lead the architecture, design, and development of large-scale ML/LLMOps platforms for our clients. You will build and maintain tools for deployment, monitoring, and operations. You will be a trusted advisor to our clients in the ML/GenAI/Agent Ops space and a coach to ML engineering practitioners, helping them build effective solutions to industrialize AI.

THE IDEAL CANDIDATE WILL BE RESPONSIBLE FOR

AI Ops Strategy, Innovation, Research and Technical Standards
1. Conduct research and experiment with emerging AI Ops technologies and trends. Create POVs and POCs, and present Proofs of Technology for the latest tools, technologies, and services from hyperscalers focused on ML, GenAI, and Agent Ops.
2. Define and propose new technical standards and best practices for the organization's AI Ops environment.
3. Lead the evaluation and adoption of innovative MLOps solutions to address critical business challenges.
4. Conduct meetups, and attend and present at industry events, conferences, etc.
5. Ideate and develop accelerators to strengthen the service offerings of the AI Ops practice.

Solution Design & Architectural Development
6. Lead the design and architecture of scalable model training and deployment pipelines for large-scale deployments.
7. Architect and design large-scale ML and GenAI Ops platforms.
8. Collaborate with the Data Science and GenAI practices to define and implement strategies for model explainability and interpretability in AI solutions.
9. Mentor and guide senior architects in crafting cutting-edge AI Ops solutions.
10. Lead architecture reviews and identify opportunities for significant optimizations and improvements.

Documentation and Best Practices
11. Develop and maintain comprehensive documentation of AIOps architecture designs and best practices.
12. Lead the development and delivery of training materials and workshops on AIOps tools and techniques.
13. Actively participate in sharing knowledge and expertise with the MLOps team through internal presentations and code reviews.

Qualifications and Skills:
1. Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field, with a minimum of 12 years of experience.
2. Proven experience in architecting and developing AIOps solutions to streamline the Machine Learning and GenAI development lifecycle.
3. Proven experience as an AI Ops Architect (ML & GenAI) in architecting and designing ML and GenAI platforms.
4. Hands-on experience in model deployment strategies, designing ML and GenAI model pipelines that scale in production, and model observability techniques used to monitor the performance of ML models and LLMs.
5. Strong coding skills with experience in implementing best coding practices.

Technical Skills & Expertise
Python, PySpark, PyTorch, Java, Microservices, APIs
LLMOps: Vector DB, RAG, LLM Orchestration tools, LLM Observability, LLM Guardrails, Responsible AI
MLOps: MLflow, ML/DL libraries, Model & Data Drift Detection libraries & techniques
Real Time & Batch Streaming
Container Orchestration Platforms
Cloud platforms: Azure / AWS / GCP
Data Platforms: Databricks / Snowflake

Nice to Have:
Understanding of Agent Ops
Exposure to the Databricks platform

You can expect to:
Work with the world’s biggest Retail, CPG, Healthcare, Banking, and Manufacturing customers and help them solve some of their most critical problems.
Create multi-million-dollar business opportunities by leveraging an impact mindset, cutting-edge solutions, and industry best practices.
Work in a diverse environment that keeps evolving.
Hone your entrepreneurial skills as you contribute to the growth of the organization.

Posted 1 month ago
