On-site
Full Time
Basic Qualifications:
- BS or MS degree in CS or a related engineering or science field with 6-10+ years of relevant experience.
- Experience with benchmarking and troubleshooting or optimizing performance of a system.
- Experience with coding, scripting, and automation.
- Background in networking.
- General Linux skills.
- Demonstrated ability to lead complex projects, independently resolve ambiguity, collaborate with stakeholders across teams, and communicate effectively.
Desired Qualifications:
- Experience working on clusters, e.g., running HPC/AI workloads or maintaining an HPC/AI system.
- Experience troubleshooting or tuning performance on distributed systems.
- Familiarity with elements of the AI/HPC software stack, such as job schedulers (e.g., Slurm), NCCL, RCCL, MPI, or ML frameworks.
- Experience with RDMA networking, i.e., RoCE or InfiniBand.
- Experience architecting or developing solutions on a public cloud platform.
Responsibilities:
- Carry out performance studies on GPU clusters, with a focus on AI/ML workload performance, network performance, and tuning.
- Design and code solutions for performance benchmarking.
- Troubleshoot performance problems on RDMA clusters and perform cluster performance validation, including on novel systems that are not yet fully understood.
- Document new tools and procedures to a high standard.
- Write whitepapers to disseminate the findings of performance studies.
- Participate in architecture design and review and in code review, and contribute to roadmap development.
- Mentor junior engineers.
- Participate in operational rotations.
Career Level - IC4
 
Oracle