About The Job
The Red Hat Chaos Engineering team, part of the Performance and Scale department, is looking for a Senior Software Engineer to join us in Bangalore, India. You will work on chaos testing Red Hat OpenShift Container Platform, Red Hat OpenShift Virtualization and the related product portfolio to identify bottlenecks and tunings and to provide capacity planning guidance under failure conditions. Our goal is to make these products the platform of choice for Red Hat’s enterprise customers!
As a senior member of the team, you will be responsible for providing comprehensive resilience, reliability, performance and scalability assessments of the products and improving them. You will collaborate with various Engineering teams on driving features, bug fixes and tunings, and on providing guidance to ensure stable releases. You will also engage with customers to help them establish chaos and performance test pipelines, best practices and strategies to ensure a scalable environment. This role needs an engineer who thinks creatively, adapts to rapid change, and is willing to learn and apply new technologies. You will be joining a vibrant open source culture and helping promote performance and innovation in this Red Hat engineering team.
What will you do?
- Formulate test plans and carry out chaos testing and performance and scalability benchmarks against various components and features of the OCPv platform under failure conditions such as network faults, infrastructure failures and storage faults, in order to characterize reliability and resilience, drive product performance improvements, and detect regressions through data analysis and visualization
- Work on capacity planning guidance for the product to handle failures while still being performant
- Develop tools and automation related to fault injection, load generation and release CI
- Work on AI integration to improve test coverage
- Assist customers
- Collaborate with other engineering teams to resolve resilience and performance issues
- Triage, debug, and solve customer/partner cases related to virtualization reliability, performance and scale
- Publish results, conclusions, recommendations and best practices via internal test reports, presentations, external blogs and official documentation to support our partners and customers
- Participate in internal and external conferences about your work and results
What will you bring?
- Bachelor's or Master's degree in Computer Science or related field, or equivalent experience
- 5+ years of overall experience in software development
- 5+ years of programming experience in Python, Golang or a related programming language
- Experience with site reliability, chaos testing, performance benchmarking, data capture, analysis and debugging
- Very strong Linux system administration and system engineering skills
- Experience with container ecosystems like Docker, Podman and Kubernetes
- Ability to quickly learn technologies with guidance and maintain high attention to detail
- Experience with metrics collection and analysis tools such as iostat, vmstat, sar, perf, PCP, Prometheus, Grafana and Elasticsearch
- Familiarity with continuous integration and automation frameworks such as Jenkins, Airflow and Ansible, and with version control tools such as Git
- Experience working with public clouds like AWS, Azure, GCP, or IBM Cloud, as well as bare metal environments
- Excellent written and verbal language skills in English
The Following Are Considered a Plus
- Experience with chaos testing and maintaining reliability of infrastructure at large scale
- Experience working with virtualization technologies such as KubeVirt, VMware
- Knowledge of performance observability/profiling tools like eBPF, Flame Graphs
About Red Hat
Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.
Inclusion at Red Hat
Red Hat’s culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from different backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions that compose our global village.
Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application-assistance@redhat.com. General inquiries, such as those regarding the status of a job application, will not receive a reply.