This senior staff-level role is pivotal in advancing the Nutanix Distributed Storage Fabric (NDSF/DSF), the foundational high-performance, shared-nothing distributed file system that powers our hyper-converged and hybrid cloud solutions. As part of the Core Data Path (CDP) team, the "data engine" that handles IO requests, data placement, deduplication, compression, encryption, failure recovery, and more, you'll architect and develop innovative storage solutions for enterprise cloud environments, including cloud-native deployments such as Container Attached Storage (CAS) integrated with Kubernetes. This position combines deep technical expertise with leadership: mentoring junior engineers and driving complex projects in a fast-paced, collaborative setting. You'll contribute to software that delivers scalability, resiliency, and performance for diverse applications, virtualization platforms, and emerging workloads such as storage for AI systems.
About the Team
The Core Data Path (CDP) division is responsible for the core of NDSF/DSF. We manage metadata, data operations, and essential storage features such as flow control and data avoidance and reduction. Our work extends to cloud-native AOS, a Kubernetes-native solution that runs the AOS distributed storage fabric in pods, supporting dynamic provisioning, thin provisioning, data efficiency, snapshots, and full data management in hyperconverged or disaggregated environments. The team's primary technology stack is C++, with opportunities to work in Go, Python, and kernel programming. We foster a culture of ownership, innovation, and cross-team collaboration to deliver high-quality products that power the Nutanix Enterprise Cloud.
Your Role
- Architect, design, and develop storage software for converged computing+storage platforms, including HCI, hybrid cloud, cloud-native environments, and storage for AI workloads.
- Develop a deep understanding of complex distributed systems, resolving issues in large-scale data organization, algorithm scalability, concurrent and asynchronous programming, reliability, disaster recovery, and fault tolerance.
- Improve performance, scale-out, and resiliency of distributed data paths, control planes, and storage systems.
- Lead and mentor a team of 15-20 talented engineers, guiding technical direction and delivering complex projects across multiple teams.
- Collaborate closely with development, test, documentation, product management, and support teams to ensure high-quality releases in a dynamic environment.
- Engage with customers and support teams to troubleshoot and resolve production issues.
- Apply expertise in storage access protocols (e.g., NFS, CIFS/SMB, S3, and cloud object storage APIs) and features such as deduplication, compression, encryption, and self-healing after failures.
- Contribute to software development lifecycle processes, including Git-based version control, code reviews, and issue tracking in Jira.
What You Will Bring
- 15+ years of software development experience in product companies, with a proven track record in distributed systems, storage, file systems, operating systems, or related fields.
- Expert-level programming skills, with rock-solid proficiency in C++ and/or Go; familiarity with Python and kernel programming is a plus.
- Strong fundamentals in data structures, algorithms, analysis techniques, TCP/IP, OS internals, and design/implementation tradeoffs in clustered, high-performance, fault-tolerant systems.
- Extensive knowledge of UNIX/Linux, Kubernetes, and cloud-based storage technologies; experience with container-native solutions like CAS is highly desirable.
- Familiarity with virtualization technologies (e.g., VMware ESXi, KVM, Hyper-V, Xen) and storage management.
- Experience with large-scale distributed systems such as Hadoop, MapReduce, Cassandra, or ZooKeeper is preferred.
- Demonstrated leadership in mentoring engineers and owning cross-team problem-solving.
- Excellent written and verbal communication skills, with an ownership mindset for tackling difficult challenges.
- Bachelor's degree in Computer Science or a related field required; Master's or advanced degree preferred.