Experience: 4 - 8 years
Salary: 8 - 15 Lacs P.A.
Posted: 3 weeks ago
Work from Office
Full Time
Job Summary
We are looking for a Resiliency & Chaos Testing Engineer with 5+ years of hands-on experience in performance engineering, resiliency testing, and chaos testing for enterprise-grade, cloud-native environments. The ideal candidate is proficient with performance testing tools (LoadRunner, JMeter), chaos engineering tools (Chaos Monkey, Chaos Studio, Chaos Mesh), and observability platforms (Dynatrace, AppDynamics, Prometheus, Grafana). The role focuses on identifying performance bottlenecks, improving application resiliency, conducting failure simulations, and supporting enterprise audit and recovery readiness.

Key Responsibilities

Performance Engineering & Observability
- Design, execute, and analyze performance tests using LoadRunner, JMeter, and similar tools.
- Identify bottlenecks and provide tuning recommendations across applications, databases, and infrastructure.
- Use APM tools (Dynatrace, AppDynamics, New Relic) alongside Prometheus, Grafana, and Azure Monitor.

Chaos Engineering & Resiliency Validation
- Execute chaos tests using tools like Chaos Studio, Chaos Monkey, and Gremlin to simulate pod failures, network latency, node crashes, and dependency outages.
- Perform CL0 validation, failover testing, and MTTR/MTBF analysis, and support disaster recovery strategies.
- Ensure systems are auto-healing and can withstand production-grade fault scenarios.

Audit & Risk Compliance
- Address and remediate enterprise audit issues like IS - 17556.
- Support operational resiliency efforts (e.g., Travis K space), ensuring enterprise uptime, compliance, and observability readiness.

Cloud & Container Platforms
- Test application performance and resiliency in Docker, Kubernetes, and OpenShift environments.
- Work with cloud-native solutions, Helm chart deployments, rolling updates, and secure TLS/mTLS configurations for microservices.

CI/CD & Agile Collaboration
- Integrate chaos and performance tests into CI/CD pipelines.
- Collaborate with Agile/DevOps teams to define NFRs, performance KPIs, and system readiness within sprints.
- Participate in backlog grooming, system hardening, and environment stability assessments.

Required Skills & Qualifications
- 5+ years in performance testing, resiliency validation, or chaos engineering.
- Expertise in LoadRunner, JMeter, Prometheus, Grafana, Chaos Studio, and Chaos Monkey.
- Experience with Kubernetes, OpenShift, Docker, and monitoring tools like Azure Monitor and New Relic.
- Familiarity with messaging systems like Kafka, RabbitMQ, and IBM MQ, and databases like MongoDB and MSSQL.
- Hands-on experience with mTLS/SSL configurations and Helm for container deployment.
- Strong collaboration, analytical, and documentation skills.

Preferred Skills
- Experience with languages like Java, Python, or Golang for scripting fault simulations or automation.
- Understanding of risk frameworks, performance profiling tools (MAT, Java VisualVM), and cloud security practices.
- Prior work in payment domains or regulated environments with SLAs and compliance constraints.
Clarium Tech