5.0 - 9.0 years
0 Lacs
Delhi
On-site
We are seeking Site Reliability Engineers to oversee critical cloud infrastructure for our global clients. Your role involves maintaining, enhancing, and ensuring seamless continuity across multiple production environments.

Responsibilities:
- Monitoring system availability and ensuring overall system health.
- Providing proactive insights on system health and recommending optimizations to prevent future issues.
- Developing software and systems to manage platform infrastructure and applications.
- Enhancing reliability, quality, and time-to-market for our cloud and on-premises software solutions.
- Optimizing system performance to meet evolving customer needs and drive continual innovation.
- Offering primary operational support and engineering for large-scale distributed infrastructure and related applications.

Requirements:
This is a deeply technical role focused on enhancing and maintaining production systems. We will evaluate candidates based on the following criteria:
- 5+ years of experience in supporting large-scale infrastructure and cloud systems.
- Proficiency in gathering and analyzing metrics for performance tuning and issue resolution.
- Collaboration with development teams to enhance services through rigorous testing and release processes.
- Involvement in system design consulting, platform management, and capacity planning.
- Creation of sustainable systems and services through automation.
- Balancing feature development speed and reliability with defined service level objectives.

Technical Requirements:
- Proficiency in automation technologies, particularly Terraform or Ansible.
- Strong knowledge of Linux, MySQL, and scripting languages like Bash and Python.
- Experience in maintaining on-premises cloud solutions such as OpenStack and CloudStack.
- Expertise in containers and container orchestration using Kubernetes.
- Familiarity with monitoring systems such as Prometheus and Nagios, and with implementing predictive analysis.
- Extensive experience in maintaining high-availability systems and ensuring business continuity.
- Solid understanding of distributed systems, storage, networking, SDN, and SDS.

Bonus Attributes:
- Familiarity with CloudStack, Citrix CloudPlatform, and related roles.
- Experience in data centers or ISPs in a similar capacity.
- Knowledge of GPU-based systems and virtualization techniques.
- Background in supporting AI/ML workloads.
Posted 3 days ago