Position Summary
We are looking for a Principal QA Engineer with expertise in data systems, observability, and automation to join our Distributed Cloud Data Platform team and lead the quality strategy for next-generation data infrastructure that serves mission-critical workloads across our SaaS platform.
This is a hands-on, technical role in which you will design test frameworks, define data validation strategies, lead performance benchmarking, and mentor the team on reliability, automation, and testing best practices for large-scale distributed data systems throughout the development lifecycle.
Quality Strategy & Automation Framework
- Design QA strategies for data streaming, storage, and observability services.
- Partner with Engineering, Product, SRE, and Platform teams to embed quality and reliability throughout the SDLC.
- Build an automation framework for data validation, regression, and integration testing.
- Extend automation to handle real-time data streams, schema evolution, workflows, and data consistency checks.
Data Systems & Observability Testing
- Design and execute tests for streaming platforms (e.g., Kafka), ETL pipelines, and data storage systems (ClickHouse, ElasticSearch, Iceberg, S3).
- Develop tools to validate ingestion, transformation, and query accuracy.
- Automate validation of logs, metrics, and traces for correctness and completeness.
- Validate telemetry and SaaS usage pipelines (e.g., Prometheus, OpenTelemetry).
- Simulate failure and recovery scenarios for distributed systems.
- Ensure systems are instrumented for high-coverage automated observability testing.
Cloud Infrastructure Quality & Performance
- Validate deployments across multi-cloud and K8s-native data clusters.
- Implement chaos and resilience testing for data system components.
- Collaborate with SRE/Ops to ensure test environments maintain parity with production.
- Establish performance and load testing frameworks for streaming (e.g., Kafka topics/partitions), ingestion APIs, and warehouse/data lake workloads (e.g., ClickHouse queries).
- Build synthetic data generators and benchmarking tools for large-scale test datasets.
- Analyze bottlenecks and help optimize system throughput and latency.
- Perform performance, scalability, and reliability testing to ensure our data platform can handle global-scale workloads.
QA Best Practices & Mentorship
- Integrate test frameworks into CI/CD pipelines and validate complex distributed systems across multi-cloud environments.
- Identify, document, and track defects through resolution.
- Create and maintain test plans, test cases, and documentation.
- Participate in design and code reviews to ensure quality is built into every stage of development.
- Mentor junior QA engineers and promote best practices in test automation and quality assurance.
- Investigate production issues and contribute to root cause analysis and remediation strategies.
Required Skills & Experience
- Be kind.
- Demonstrate a human-first, empathetic approach to teamwork and communication. Collaborate, value learning, humility, and shared ownership.
- 10+ years of experience in Quality Assurance, including at least 7 years focused on automation, plus a Computer Science degree or equivalent practical experience.
- Strong background testing data-intensive or observability systems (e.g., Kafka, Flink, Spark, ClickHouse, ElasticSearch, Prometheus, OpenTelemetry).
- Proficiency in coding/scripting with Python, Go, or Java for automation and tooling.
- Experience with automation frameworks (e.g., Selenium or similar).
- Expertise in performance testing tools (e.g., Locust, Gatling, k6, JMeter) and in benchmarking distributed systems.
- Expertise in streaming data validation, schema evolution, and event-driven architectures.
- Exposure to warehouse/data lake performance tuning and query optimization.
- Exposure to AI/Agentic/GenAI tools to accelerate test automation and productivity.
- Familiarity with compliance validation in data pipelines (e.g., PII masking).
- Familiarity with cloud-native architectures (K8s, Terraform, Helm, CI/CD pipelines).
- Experience testing cloud-based distributed systems, microservices, and APIs.
- Familiarity with CI/CD pipelines, version control (Git), and DevOps practices.
- Excellent analytical, debugging, and communication skills.
- Experience leading QA strategy in SaaS, observability, or analytics platforms.