5.0 - 9.0 years
0 Lacs
Haryana
On-site
We are seeking individuals who offer informed and unique perspectives, enjoy collaborating with cross-functional teams, and continuously push boundaries to create reliable, scalable solutions and enhance user experiences.

Your main responsibilities will include analyzing the technologies currently used within the company and devising monitoring and notification tools to improve observability and visibility. You will ensure system stability by proactively identifying failure scenarios and implementing solutions that reduce MTTR. A key focus will be developing solutions that boost system performance, with a strong emphasis on high availability, scalability, and resilience. You will also integrate telemetry and alerting platforms to monitor and improve system reliability. Adherence to industry best practices for system development, configuration management, and deployment is essential.

Additionally, you will facilitate seamless information flow between teams by documenting acquired knowledge. Staying current with modern technologies and trends will enable you to advocate for incorporating them into products where they bring value. In incident management, you will troubleshoot production issues, conduct root cause analysis (RCA), and actively share insights to improve system reliability and internal knowledge.

The ideal candidate has experience troubleshooting and optimizing high-performance microservices architectures running on Kubernetes and AWS in highly available production environments. A minimum of 5 years of software development experience in languages such as Python, Java, or Go is required, along with a strong foundation in data structures, algorithms, problem-solving, and complexity analysis. The SRE selection process includes a coding challenge.
You should be curious and proactive in identifying performance bottlenecks, scalability issues, and resilience problem areas, and adept at resolving them. Familiarity with observability tools and data collection is essential. Knowledge of databases such as RDS, NoSQL, and distributed TiDB is preferred. Strong communication skills, a collaborative approach, and a proactive attitude toward delivering results are highly valued, as is a willingness to embrace challenges and see them through to completion.

Preferred qualifications include expertise in container image management and optimization; experience in large distributed system architecture and capacity planning; an understanding of Infrastructure as Code (IaC) and automation tools such as Terraform and CloudFormation; a background in SRE/DevOps concepts and implementation; and proficiency in managing monitoring tools such as CloudWatch, VictoriaMetrics, and Prometheus, and in reporting with Snowflake and Sigma. In-depth knowledge of web technologies such as CloudFront and Nginx is advantageous, as is experience designing, implementing, or maintaining disaster recovery strategies and multi-region architectures for high availability, resilience, and business continuity across critical systems. Proficiency in Japanese and English is a plus, although language skills are not mandatory because professional translators are available.

**Working Conditions**

**Employment Status:** Full Time

**Office Location:** Gurugram (WeWork)

The development center requires your presence at the Gurugram office to help establish a strong core team.
Posted 6 days ago