4 - 8 years
6 - 10 Lacs
Posted: 1 day ago
Work from Office
Full Time
About Marvell
Your Team, Your Impact
This team at Marvell develops Murals, a next-generation AI/ML infrastructure simulation and design platform that enables in-depth analysis and optimization of large-scale training and inference workloads. Leveraging trace-driven simulation, performance modeling, and hardware/software co-design, the team helps shape scalable and resilient solutions for advanced workloads such as LLMs, DLRMs, GenAI, and GNNs. Working closely with system architects, hardware designers, and ML practitioners, the team explores innovative ways to optimize compute, memory, and networking subsystems across complex datacenter environments.
What You Can Expect
Simulation & Modeling - Implement workflows to study AI/ML workloads using trace-driven and analytical models.
Performance Analysis - Profile and analyze system bottlenecks across compute, memory, and network layers.
Networking Studies - Evaluate collective communication performance (all-reduce, all-to-all, reduce-scatter) across different topologies and fabrics.
Tooling & Automation - Develop utilities for trace generation, merging, conversion, and visualization.
Prototype & Validation - Test distributed training and inference pipelines in simulated and real environments.
Hardware/Software Co-Design - Collaborate on emerging technologies (CXL, DPUs, NVLink, PCIe, UET/UEC, in-network compute).
Scaling Studies - Conduct performance projections and trade-off studies for next-gen AI infrastructure.
Knowledge Sharing - Document workflows, publish internal reports, and drive peer learning.
What We're Looking For
Bachelor's, Master's, or PhD in Computer Science, Electrical Engineering, or a related field with 4-12 years of relevant professional experience.
Strong foundation in computer architecture, distributed systems, AI/ML, and operating systems.
Solid networking fundamentals including TCP/IP, RDMA, RoCE, UET/UEC, and switching/routing.
Experience with simulation frameworks (e.g., Astra-Sim, Chakra, gem5, SST, NS-3).
Hands-on experience with PyTorch/TensorFlow and distributed training frameworks (DDP, Horovod, DeepSpeed).
Strong programming skills in Python, C++, and scripting for automation.
Familiarity with interconnect and memory technologies (CXL, PCIe, NVLink, UAL).
Experience with profiling, telemetry, observability, and debugging tools.
Knowledge of collective communication algorithms and topology-aware scheduling.
Exposure to AI accelerators, memory disaggregation, DPUs, and custom silicon.
Familiarity with visualization tools (Perfetto, Chrome Tracing, Chakra Timeline, Flamegraphs).
Experience with large-scale AI training pipelines and scaling studies.
Interest in energy/performance trade-offs and resilience techniques.
Marvell Semiconductors