0.0 - 5.0 years
0 - 12 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
IBM Research is the innovation and growth engine of the IBM corporation. It is the largest industrial research organization in the world, with 12 labs on 6 continents. IBM Research produces more breakthroughs (more than 9 patents every day) than any other organization in the world, and employs over 3,200 researchers worldwide. IBM Research India (IRL) is the leading industrial research lab in India, shaping the future of computing across AI, Hybrid Cloud and Quantum Computing. IRL has a long legacy of ground-breaking innovation in computer science and its applications to a wide variety of disciplines and offerings for IBM. IRL researchers work on projects that push the state of the art across foundation models, optimized runtime stacks for FM workloads such as tuning, large-scale data engineering and pre-training, multi-accelerator model optimization, agentic workflows, and modalities spanning language, code, time series, IT automation and geospatial. We are strong proponents of open-source, community-driven software and model development, and our work spans a wide spectrum from research collaborations with academia to developing enterprise-grade commercial software.

Your role and responsibilities
The Research Engineer position at IBM India Research Lab is a challenging, dynamic and highly innovative role. Some of our current areas of work where we are actively looking for top talent are:
- Optimized runtime stacks for foundation model workloads, including fine-tuning, inference serving and large-scale data engineering, with a focus on multi-stage tuning including reinforcement learning, inference-time compute, and data preparation needs for complex AI systems.
- Optimizing models to run on multiple accelerators, including IBM's AIU accelerator, leveraging compiler optimizations, specialized kernels, libraries and tools.
- Developing use cases that effectively leverage the infrastructure and models to deliver value.
- Pre-training language and multi-modal foundation models: working with large-scale distributed training procedures, model alignment, creating specialized pipelines for various tasks including effective LLM-generated data pipelines, creating frameworks for collecting human data, and deploying models in user-centric platforms.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
You should have one or more of the following:
- A master's degree in computer science, AI or related fields from a top institution
- 0-8 years of experience working with modern ML techniques, including but not limited to model architectures, data processing, fine-tuning techniques, reinforcement learning, distributed training and inference optimizations
- Experience with big data platforms like Ray and Spark
- Experience working with PyTorch FSDP and HuggingFace libraries (see the sketch below)
- Programming experience in one of the following: Python, web development technologies
- Growth mindset and a pragmatic attitude

Preferred technical and professional experience
- Peer-reviewed research at top machine learning or systems conferences
- Experience working with torch.compile, CUDA, Triton kernels, GPU scheduling, memory management
- Experience working with open-source communities
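To illustrate the kind of tooling this role mentions, here is a minimal sketch of wrapping a HuggingFace causal LM in PyTorch FSDP for a single fine-tuning step. The checkpoint name ("gpt2"), learning rate and toy batch are placeholders chosen for illustration, not part of the posting; the script assumes it is launched with torchrun on one or more GPUs.

```python
# Minimal FSDP fine-tuning sketch. Placeholders: the "gpt2" checkpoint, the
# learning rate and the single toy batch. Launch with:
#   torchrun --nproc_per_node=<num_gpus> fsdp_finetune.py
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM, AutoTokenizer


def main():
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    name = "gpt2"  # placeholder checkpoint for illustration
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name).cuda()

    # Shard parameters, gradients and optimizer state across ranks.
    model = FSDP(model)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # One toy training step: forward with labels, backward, optimizer update.
    batch = tokenizer("Hello, foundation models.", return_tensors="pt").to(local_rank)
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```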
Posted 1 week ago
5.0 - 10.0 years
3 - 7 Lacs
Hyderabad / Secunderabad, Telangana, India
On-site
Responsibilities:
- Designing, implementing, and optimizing CI/CD pipelines for cloud and hybrid environments.
- Integrating AI-driven pipeline automation for self-healing deployments and predictive troubleshooting.
- Leveraging GitOps (ArgoCD, Flux, Tekton) for declarative infrastructure management.
- Implementing progressive delivery strategies (Canary, Blue-Green, Feature Flags).
- Containerizing applications using Docker & Kubernetes (EKS, AKS, GKE, OpenShift, or on-prem clusters) (see the Python sketch below).
- Optimizing service orchestration and networking with service meshes (Istio, Linkerd, Consul).
- Implementing AI-enhanced observability for containerized services using AIOps-based monitoring.
- Automating provisioning with Terraform, CloudFormation, Pulumi, or CDK.
- Supporting and optimizing distributed computing workloads, including Apache Spark, Flink, or Ray.
- Using GenAI-driven copilots for DevOps automation, including scripting, deployment verification, and infra recommendations.

The Impact You Will Have:
- Enhancing the efficiency and reliability of CI/CD pipelines and deployments.
- Driving the adoption of AI-driven automation to reduce downtime and improve system resilience.
- Enabling seamless application portability across on-prem and cloud environments.
- Implementing advanced observability solutions to proactively detect and resolve issues.
- Optimizing resource allocation and job scheduling for distributed processing workloads.
- Contributing to the development of intelligent DevOps solutions that support both traditional and AI-driven workloads.

What You'll Need:
- 5+ years of experience in DevOps, Cloud Engineering, or SRE.
- Hands-on expertise with CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, ArgoCD, Tekton, etc.).
- Strong experience with Kubernetes, container orchestration, and service meshes.
- Proficiency in Terraform, CloudFormation, Pulumi, or other Infrastructure as Code (IaC) tools.
- Experience working in hybrid cloud environments (AWS, Azure, GCP, on-prem).
- Strong scripting skills in Python, Bash, or Go.
- Knowledge of distributed data processing frameworks (Spark, Flink, Ray, or similar).
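As an illustration of the Kubernetes-plus-Python scripting this role describes, the following sketch creates a Deployment with the official `kubernetes` Python client. The deployment name, image, port and replica count are placeholders, and this is one of several reasonable ways to script such automation (GitOps manifests or IaC tools being others).

```python
# Illustrative deployment-automation sketch using the official `kubernetes`
# Python client. Name, image, port and replica count are placeholders.
from kubernetes import client, config


def deploy(name: str, image: str, replicas: int, namespace: str = "default") -> None:
    config.load_kube_config()  # reads the local kubeconfig; in-cluster config is also possible
    apps = client.AppsV1Api()

    container = client.V1Container(
        name=name,
        image=image,
        ports=[client.V1ContainerPort(container_port=8080)],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": name}),
        spec=client.V1PodSpec(containers=[container]),
    )
    spec = client.V1DeploymentSpec(
        replicas=replicas,
        selector=client.V1LabelSelector(match_labels={"app": name}),
        template=template,
    )
    deployment = client.V1Deployment(metadata=client.V1ObjectMeta(name=name), spec=spec)

    # Submit the Deployment to the cluster's apps/v1 API.
    apps.create_namespaced_deployment(namespace=namespace, body=deployment)


if __name__ == "__main__":
    deploy("demo-api", "nginx:1.27", replicas=3)  # placeholder image and replica count
```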
Posted 2 weeks ago
1 - 3 years
4 - 8 Lacs
Hyderabad, Chennai, Bengaluru
Work from Office
Key Responsibilities:
- Collaborate with data scientists to support end-to-end ML model development, including data preparation, feature engineering, training, and evaluation.
- Build and maintain automated pipelines for data ingestion, transformation, and model scoring using Python and SQL (see the sketch below).
- Assist in model deployment using CI/CD pipelines (e.g., Jenkins) and ensure smooth integration with production systems.
- Develop tools and scripts to support model monitoring, logging, and retraining workflows.
- Work with data from relational databases (RDS, Redshift) and preprocess it for model consumption.
- Analyze pipeline performance and model behavior; identify opportunities for optimization and refactoring.
- Contribute to the development of a feature store and standardized processes to support reproducible data science.

Required Skills & Experience:
- 1-3 years of hands-on experience in Python programming for data science or ML engineering tasks.
- Solid understanding of machine learning workflows, including model training, validation, deployment, and monitoring.
- Proficient in SQL and working with structured data from sources like Redshift, RDS, etc.
- Familiarity with ETL pipelines and data transformation best practices.
- Basic understanding of ML model deployment strategies and CI/CD tools like Jenkins.
- Strong analytical mindset with the ability to interpret and debug data/model issues.

Preferred Qualifications:
- Exposure to frameworks like scikit-learn, XGBoost, LightGBM, or similar.
- Knowledge of ML lifecycle tools (e.g., MLflow, Ray).
- Familiarity with cloud platforms (AWS preferred) and scalable infrastructure.
- Experience with data or model versioning tools and feature engineering frameworks.
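The sketch below shows the shape of a minimal SQL-to-model pipeline of the kind this posting describes: pull a feature table with SQL, train a scikit-learn model, and score it. An in-memory SQLite table and toy rows stand in for the Redshift/RDS source, and the feature and label names are made up for illustration.

```python
# Minimal train-and-score pipeline sketch. The SQLite table, feature names and
# toy rows are placeholders for a Redshift/RDS feature table.
import sqlite3

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy in-memory table standing in for the real relational source.
conn = sqlite3.connect(":memory:")
pd.DataFrame(
    {"f1": [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],
     "f2": [1, 0, 1, 0, 1, 0],
     "label": [0, 1, 0, 1, 0, 1]}
).to_sql("features", conn, index=False)

# Ingest with SQL, then split, train and score.
df = pd.read_sql("SELECT f1, f2, label FROM features", conn)
X_train, X_test, y_train, y_test = train_test_split(
    df[["f1", "f2"]], df["label"], test_size=0.33,
    stratify=df["label"], random_state=0,
)
model = LogisticRegression().fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```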
Posted 1 month ago
3 - 8 years
5 - 10 Lacs
Pune
Work from Office
Job Summary:
Are you ready to join a game-changing open-source AI platform that harnesses the power of hybrid cloud to drive innovation? The Red Hat OpenShift AI team is looking for a Senior Software Engineer with Kubernetes and MLOps (Machine Learning Operations) experience to join our rapidly growing engineering team. Our focus is to create a platform, partner ecosystem, and community through which enterprise customers can solve problems and accelerate business success using AI. This is a very exciting opportunity to build and impact the next generation of MLOps platforms, participate in open source communities, contribute to the development of the OpenShift AI product, and be at the forefront of the exciting evolution of AI. You'll join an ecosystem that fosters continuous learning, career growth, and professional development.

You will be contributing as a core developer on the Model Training team, which owns the core model training tools (Ray, Kubeflow, PyTorch, etc.) for OpenShift AI (see the sketch below). You will work as part of an evolving development team to rapidly design, secure, build, test, and release new capabilities. This role is for an individual contributor who also leads other junior engineers in the team and collaborates closely with other developers and cross-functional teams. You will have the opportunity to actively participate in both our downstream efforts and the upstream projects. You should have a passion for working in open-source communities and for developing solutions that integrate Red Hat, open-source, and partner technologies into a cohesive platform.

What you will do
- Participate in the architecture and lead implementation of new features and solutions for OpenShift AI
- Innovate in the MLOps domain by contributing meaningfully to upstream communities
- Develop integrations between various portions of the OpenShift AI stack
- Participate in technical vision and leadership on critical and high-impact projects
- Ensure non-functional requirements including security, resiliency, and maintainability are met
- Write unit and integration tests and work with quality engineers to ensure product quality
- Use CI/CD best practices to deliver solutions as productization efforts into OpenShift AI
- Contribute to a culture of continuous improvement by sharing recommendations and technical knowledge with team members
- Collaborate with product management, other engineering and cross-functional teams to analyze and clarify business requirements
- Communicate effectively with stakeholders and team members to ensure proper visibility of development efforts
- Give thoughtful and prompt code reviews
- Help in mentoring, influencing, and coaching a distributed team of engineers

What you will bring
- Experience developing applications in Go
- Experience developing applications in Python
- Experience developing applications on Kubernetes, OpenShift, or other cloud-native technologies
- Ability to quickly learn and guide others on using new tools and technologies
- Proven ability to innovate and a passion for staying at the forefront of technology
- Experience with distributed systems (especially those that run on Kubernetes) and troubleshooting them
- Autonomous work ethic, thriving in a dynamic, fast-paced environment
- Experience providing technical leadership in a global team and delivering on a vision
- Excellent written and verbal communication skills

The following will be considered a plus:
- While a Bachelor's degree or higher in computer science or a related discipline is valued, we prioritize practical experience and technical prowess
- Understanding of how open source communities work
- Experience with development for public cloud services (AWS, GCE, Azure)
- Experience working with or deploying MLOps platforms
- Experience with AI/ML model training and tuning
- Experience writing Kubernetes/OpenShift controllers and operators
- Experience writing DSLs in Python or other languages
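Since Ray is one of the training tools named above, here is a minimal Ray sketch that fans a placeholder "training shard" function out across a cluster and gathers the results. The function body, the fake loss values and the shard count are illustrative assumptions, not OpenShift AI code.

```python
# Minimal Ray sketch: run a placeholder per-shard "training" task in parallel
# and collect the results. Shard count and fake loss values are placeholders.
import ray

# Connects to an existing cluster if RAY_ADDRESS is set; otherwise starts a
# local Ray instance for experimentation.
ray.init()


@ray.remote
def train_shard(shard_id: int) -> float:
    # Stand-in for per-shard training work; returns a fake loss value.
    return 1.0 / (shard_id + 1)


if __name__ == "__main__":
    losses = ray.get([train_shard.remote(i) for i in range(4)])
    print("mean loss:", sum(losses) / len(losses))
```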
Posted 3 months ago