Senior Cloud Site Reliability Engineer

3 - 7 years

0 Lacs

Posted: 1 day ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Senior Cloud Site Reliability Engineer (SRE) at our company, your main role will be to ensure the reliability, scalability, and efficiency of our cloud infrastructure. You will be responsible for designing, implementing, and managing cloud environments with a focus on availability, scalability, and performance. Your expertise in cloud infrastructure and system management (Linux and Windows) will be crucial to the success of our team.

**Key Responsibilities:**

- Design, implement, and manage cloud infrastructure to ensure high availability, scalability, and performance.
- Manage and optimize systems at the OS level for both Linux and Windows servers.
- Optimize web servers (Apache, Nginx, IIS), databases (MySQL, PostgreSQL, SQL Server, MongoDB), and caching systems (Redis, Memcached) to enhance performance and scalability.
- Perform cloud-to-cloud or on-premises-to-cloud migrations efficiently.
- Develop automation tools and frameworks to streamline deployments, monitoring, and incident response.
- Monitor system health using tools like CloudWatch, Prometheus, Grafana, and Nagios; troubleshoot issues; and proactively improve system performance.
- Establish and refine SRE best practices, including SLIs, SLOs, and error budgets.
- Implement and maintain CI/CD pipelines for seamless application deployment.
- Conduct root cause analysis and post-mortems for incidents to drive long-term fixes.
- Improve observability by implementing logging, tracing, and monitoring tools.
- Mentor and guide junior SREs, fostering a culture of continuous learning and improvement.
- Collaborate closely with DevOps and security teams to optimize cloud security and application performance, and document best practices.

**Qualifications Required:**

- 3+ years of experience as a Cloud Engineer, SRE, or in a similar role.
- Strong expertise in Linux administration (Ubuntu, CentOS, RHEL); Windows experience is a plus.
- Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
- Hands-on experience with cloud/server migrations.
- Proficiency in Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Strong scripting and automation skills in Python, Bash, or similar languages.
- Expertise in Kubernetes and container orchestration tools.
- Experience with monitoring and logging tools like Prometheus, Grafana, ELK, or Datadog.
- Excellent problem-solving skills and the ability to work in a fast-paced environment.
- Strong communication and collaboration skills.
