Site Reliability Engineer

1 - 3 years

5 - 9 Lacs

Posted: 2 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Roles & Responsibilities:
  • Ensure high system reliability and uptime.
  • Develop and maintain monitoring systems.
  • Lead incident response and root cause analysis.
  • Automate repetitive tasks for efficiency.
  • Perform capacity planning and resource scaling.
  • Lead infrastructure-as-code work (e.g., Terraform, Kubernetes).
  • Collaborate with development and operations teams.
  • Maintain clear documentation and share knowledge.
  • Optimize system and application performance.
  • Ensure security and compliance standards are met.
  • Define, measure, and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to align with business goals.
  • Drive continuous process and system improvements.
  • Define guidelines, standards, strategies, security policies, and organizational change policies to support the Data Lake.
What we expect of you
Basic Qualifications and Experience:
  • Master's degree in computer science or an engineering field and 1 to 3 years of relevant experience OR
  • Bachelor's degree in computer science or an engineering field and 3 to 5 years of relevant experience OR
  • Diploma and 8+ years of relevant work experience
Must-Have Skills:
  • Proficiency in programming/scripting (Python, Java).
  • Experience in Linux/Unix system administration.
  • Experience with cloud platforms (AWS, Databricks, Azure, Snowflake).
  • Proficiency in containerization and orchestration (Docker, Kubernetes).
  • Knowledge of Infrastructure as Code (Terraform, Ansible).
  • Familiarity with monitoring and logging tools (Prometheus, Grafana).
  • Understanding of CI/CD pipelines (Jenkins, GitLab CI/CD).
  • Strong networking knowledge and troubleshooting skills.
  • Understanding of security principles and compliance.
  • Familiarity with database management (SQL and NoSQL).
  • Strong troubleshooting and debugging skills.
  • Experience in performance optimization.
  • Experience with backup and storage solutions.
Good-to-Have Skills:
  • Familiarity with the use of AI for development productivity, such as GitHub Copilot, Databricks Assistant, Amazon Q Developer or equivalent.
  • Knowledge of Agile and DevOps practices.
  • Skills in disaster recovery planning.
  • Familiarity with load testing tools (JMeter, Gatling).
  • Basic understanding of AI/ML for monitoring.
  • Knowledge of distributed systems and microservices.
  • Data visualization skills (Tableau, Power BI).
  • Strong communication and leadership skills.
  • Understanding of compliance and auditing requirements.
Soft Skills:
  • Excellent analytical and problem-solving skills.
  • Excellent written and verbal communication skills (English), including translating technology content into business language for audiences at various levels.
  • Ability to work effectively with global, virtual teams.
  • High degree of initiative and self-motivation.
  • Ability to handle multiple priorities successfully.
  • Team-oriented, with a focus on achieving team goals.
  • Strong problem-solving and analytical skills.
  • Strong time and task management skills to estimate and meet project timelines, with the ability to bring consistency and quality assurance across projects.
